Most IP cameras, including security and surveillance cameras, support RTSP H.264 streaming so it made sense to implement a compatible stream processing element (SPE) for rt-ai Edge. The design above is a simple test design. The video stream from the camera is converted into JPEG frames using GStreamer within the SPE and then passed to the DeepLabv3 SPE. The output from DeepLabv3 is then passed to a MediaView SPE for display.
I have a few ONVIF/RTSP cameras around the property and the screen capture above shows the results from one of these. There’s a car sitting in its field of view that’s picked out very nicely. I am using the DeepLabv3 SPE here in its masked image mode where the output frames just consist of recognized object images and nothing else. Just for reference, this is the original frame:
Clearly the segmented image only retains what it is important for later processing.