apps/deepstream-segmask/README

################################################################################
# SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

Prerequisites:
- DeepStreamSDK 7.0 **NOT CURRENTLY SUPPORTED BY 7.1**
- Python 3.10
- Gst-python
- NumPy package <2.0, >=1.22 (2.0 and above not supported)
- OpenCV package

To install required packages:
$ sudo apt update
$ sudo apt install python3-numpy python3-opencv -y

**NOTE**: this application is not currently supported by DeepStream 7.1, due to removal of segmentation models.

Download PeopleSegNet model:
  Please follow instructions from the README.md located at : /opt/nvidia/deepstream/deepstream/samples/configs/tao_pretrained_models/README.md
  to download latest supported PeopleSegNet model

To run:
  $ python3 deepstream_segmask.py -i <uri1> [uri2] ... [uriN] -o <FOLDER NAME TO SAVE FRAMES>
e.g.
  $ python3 deepstream_segmask.py file:///home/ubuntu/video1.mp4 file:///home/ubuntu/video2.mp4 -o frames
  $ python3 deepstream_segmask.py rtsp://127.0.0.1/video1 rtsp://127.0.0.1/video2 -o frames

This document describes the sample deepstream-segmask application.

This sample builds on top of the deepstream-test3 sample to demonstrate how to:

* Extract NvOSD_MaskParams from stream metadata
* Access segmentation mask information from NvOSD_MaskParams 
* Resize mask array to fit object boundaries and binarize according to threshold for interpretable segmentation mask
* Save the mask as image

This sample accepts one or more H.264/H.265 video streams as input. It creates
a source bin for each input and connects the bins to an instance of the
"nvstreammux" element, which forms the batch of frames. The batch of
frames is fed to "nvinfer" for batched inferencing. The batched buffer is
composited into a 2D tile array using "nvmultistreamtiler." The rest of the
pipeline is similar to the deepstream-test3 and deepstream-imagedata sample.

The "width" and "height" properties must be set on the stream-muxer to set the
output resolution. If the input frame resolution is different from
stream-muxer's "width" and "height", the input frame will be scaled to muxer's
output resolution.

The stream-muxer waits for a user-defined timeout before forming the batch. The
timeout is set using the "batched-push-timeout" property. If the complete batch
is formed before the timeout is reached, the batch is pushed to the downstream
element. If the timeout is reached before the complete batch can be formed
(which can happen in case of rtsp sources), the batch is formed from the
available input buffers and pushed. Ideally, the timeout of the stream-muxer
should be set based on the framerate of the fastest source. It can also be set
to -1 to make the stream-muxer wait infinitely.

The "nvmultistreamtiler" composite streams based on their stream-ids in
row-major order (starting from stream 0, left to right across the top row, then
across the next row, etc.).

A probe is added to the sink pad of the tiler to extract metadata. For each 
NvDsObjectMeta object, we access the NvOSD_MaskParams and NvOSD_RectParams
members to access the segmentation mask array, resize it, and binarize it to
create an interpretable representation of the mask. The mask is then saved to
file as an image in the output folder.