Finally this is a ZeroSensor all ready to go into full time service, capturing video, audio and environmental data. The goal is to use this data, and that from other cameras around the space, as training data for machine learning systems.
One specific goal is to create an anomaly detector with minimal supervision. As much as possible, it will learn from experience. This is kind of tricky as it requires detection of unknown length sequences depending on the circumstances. I am intrigued by the ideas behind the Universal Translator but not sure how much could carry over to this application. This paper reviews some of the techniques usually applied, at least for video processing. The situation here is a little different as there are quite different types of features involved. My plan is to preprocess video and audio to recognize salient features (using object detection or whatever) and then input these features, along with environmental sensor data, in the form of uniform time-slotted data sets to the anomaly detector. This doesn’t help with detecting the length of an interesting sequence – that’s the fun part of the project.
One application for rt-ai is ubiquitous sensing leading to sentient spaces – spaces that can interact with people moving through and provide useful functionality, whether learned or programmed. A step on the road to that is the ZeroSensor, four prototypes of which are shown in the photo. Each ZeroSensor consists of a Raspberry Pi Zero W, a Pi camera module v2, an Adafruit BME 680 breakout and an Adafruit TSL2561 breakout. The combination gives a video stream and a sensor stream with light, temperature, pressure, humidity and air quality values. The video stream can be used to derive motion sensing and identification while the other sensors provide a general idea of conditions in the space. Notably missing is audio. Microphone support would be useful for general sensing and I might add that in real devices. A 3D printable case design is underway in order to allow wide-scale deployment.
Voice-based interaction is a powerful way for users to interact with sentient spaces. However, it is assumed that people who want to interact are using an AR headset of some sort which itself provides the audio I/O capabilities. Gesture input would be possible via the ZeroSensor’s camera. For privacy reasons video would not be viewed directly or stored but just used as a source of activity data and interaction.
This is the simple rt-ai design used to test the ZeroSensors. The ZeroSynth modules are rt-ai Edge synth modules that contain SPEs that interface with the ZeroSensor’s hardware and generate a video stream and a sensor data stream. An instance of a video viewer and sensor viewer are connected to each ZeroSynth module.
This is the result of running the ZeroSensor test design, showing a video and sensor window for each ZeroSensor. The cameras are staring at the ceiling because the four sensors were on a table. When the correct case is available, they will be deployed in the corners of rooms in the space.