As I had discovered, one Neural Compute Stick 2 (NCS 2) has pretty decent throughput. The question then is: what happens if you connect more than one of these to the same machine? I only have one NCS 2 and one of the older NCS devices to test this out but that combination worked ok with some tuning. OpenVINO manages allocation of requests to physical devices so there is no explicit way for this to be controlled via the API. However, it appears that multiple SPEs on the same node can be supported as then the NCSs are divided up between the SPEs. A reset error message is typically emitted but then everything seems to work fine.
To get the best performance, I ran in async mode using multiple ExecutableNetwork/InferRequest pairs, with the actual number being configurable from the rtaiDesigner GUI. In this case, 5 pairs gave the best results. The throughput is around 18 frames per second running ssd_mobilenet_v2_coco object detection.
Using one NCS at a time, the NCS 2 was able to process 12 frames per second (versus 9 frames per second in synchronous mode using the original SPE code) while the older NCS was able to process 6 frames per second, suggesting that both were being fully utilized.
Now I need to get a second NCS 2…