Something conspicuously missing from the original ZeroSensor concept was audio support. That’s now been remedied with the SPH0645 board from Adafruit. I was originally a bit put off by the somewhat extended software installation but, in the end, this is the best approach.
This is the new ZeroSynth synth module for the ZeroSensor. It simply adds an audio capture stream processing element (SPE) to the existing video and sensor capture SPEs.
The capture above shows the simple test design, which uses the ZeroSynth module and, once again, the DeepLabV3 SPE.
rtaiView can be used to view and review the three streams captured from the ZeroSensor. The audio can also be played out of the rtaiView host’s speakers if desired.
The intention at this stage is not so much to process audio for content such as words. The presence or absence of sounds or transients, in conjunction with other sensed events, could be more useful. There seem to be two basic choices: either use just an average amplitude level during a specific timeslot, or actually try to recognize sound sources during a timeslot. Either of these then becomes a feature for input into an inference engine, along with other detected features from the video and sensor streams.
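To make the first choice a bit more concrete, here is a minimal sketch of a per-timeslot level feature. It is only an illustration and not the actual ZeroSensor code: the function names, the 16-bit signed mono PCM sample format, the use of RMS as the "average amplitude", and the 0.05 presence threshold are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def timeslot_levels(samples: Sequence[int], sample_rate: int,
                    slot_seconds: float) -> List[float]:
    """Return the RMS level of each complete timeslot, scaled to 0..1.

    Assumes 16-bit signed mono PCM samples (an assumption for this sketch).
    A trailing partial timeslot is dropped.
    """
    slot_len = int(sample_rate * slot_seconds)
    if slot_len <= 0:
        raise ValueError("a timeslot must contain at least one sample")

    levels = []
    for start in range(0, len(samples) - slot_len + 1, slot_len):
        slot = samples[start:start + slot_len]
        mean_square = sum(s * s for s in slot) / slot_len
        # Divide by full scale for 16-bit audio to get a 0..1 level.
        levels.append(math.sqrt(mean_square) / 32768.0)
    return levels


def sound_present(levels: Sequence[float], threshold: float = 0.05) -> List[bool]:
    """Turn levels into a presence/absence flag per timeslot.

    The threshold is a hypothetical value and would need tuning.
    """
    return [level >= threshold for level in levels]


if __name__ == "__main__":
    # One silent timeslot followed by one at half of full scale.
    pcm = [0] * 100 + [16384] * 100
    levels = timeslot_levels(pcm, sample_rate=100, slot_seconds=1.0)
    print(levels, sound_present(levels))
```

Either the raw level or the presence flag for each timeslot could then be passed on as the audio feature, alongside whatever is detected in the video and sensor streams for the same timeslot.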