Up to now the only data sources in rtndf were video and audio. imu is a new Python PPE that can be used to stream IMU data (fused pose, sensor readings etc) into an rtndf data flow pipeline. Another new PPE is imuview, this time a C++ PPE, that can display the resulting stream. The screen capture above shows the data being streamed from a Raspberry Pi SenseHat which is a full 11-dof sensor.
One of the nice things about using a pub/sub system like MQTT is that it is possible to hook into any of the pipeline links to see what data is flowing. To this end, a future PPE will be a generic viewer. The user just gives it the topic and it determines the type of data and displays it appropriately. A very handy debugging tool!
speechdecode is a new PPE based on CMU Sphinx can be added to an rtndf data flow pipeline to decode speech in an audio stream. It’s the first PPE written in C++/Qt and the infrastructure will be used for PPEs that are a bit too heavy to work well in Python or integrate better with C and C++ libraries.
A simple pipeline is:
audio -> speechdecode
speechdecode outputs any recognized phrases in the stream. It is possible to use customized sets of phrases to limit the the range of speech recognized to key phrases and commands.
I had obtained some very nice results with OpenFace in a previous project and thought it would be fun to wrap it into an rtndf pipeline processing element (PPE). It’s also a good test to see whether docker containers can be used with rtndf. Turns out they work just fine. OpenFace has some complex dependencies and it is much easier just to pull a docker container than build it locally. One approach would have been to build a new container based on the original bamos/openface but instead facerec uses a bit of a hack involving host directory mapping.
To make it easy to use, there’s a bash script in the rtndf/facerec directory called facerecstart thattakes care of the docker command line (which is a bit messy). Of course, in order to recognize faces, the system needs to have been trained. rtndf/facerec includes a modified version of the OpenFace web demo that saves the data from the training in the correct form for facerec. There’s a bash script, trainstart, that starts it going and then a browser and webcam can be used to perform the training.
As with the recognize PPE, facerec can either process the whole frame or just segments that contain motion by using the output from themodet PPE. In fact both recognize and facerec can be used in the same pipeline to get combined recognition:
uvccam -> modet -> facerec -> recognize -> avview
This illustrates one of the nice features of the pipeline concept: metadata and annotation can be added progressively by multiple processing stages, adding significant value to the resulting stream.
Yes, that is me waving my Taylor (made in San Diego🙂 ) guitar around in a very careless manner. It’s all in a good cause though. Turns out that Inception-v3 is very good at recognizing acoustic and electric guitars. I put together a new rtndf PPE called recognize based on the code here from the TensorFlow repo.
In its simplest mode, the recognize PPE takes an incoming video stream and tries to recognize an object in the entire frame. If it finds something, it adds a label in the bottom left corner of the image and uses that to generate a new output stream. That’s ok, but what’s more interesting is when it works with another PPE, modet. modet detects moving objects in the stream and draws a box around them. It now also adds metadata to the outgoing pipeline messages that can be used by downstream PPEs to do something with the regions where motion has been detected.
recognize can work in a mode where it uses the modet metadata to recognize moving objects in the stream. The screen capture with the guitar is an example. That’s why I was waving it around – it had to be in motion to get detected and recognized. The box is that big because I am in motion too! However, Inception-v3 seems quite able to recognize the dominant object in the image segment. While there is only one recognized object in this example, if there were more regions they would be individually recognized.
Of course, the example data set for Inception-v3 only knows so many things, guitars being an example. However, something I want to use this for is to detect a UPS truck coming up the drive. I’ll probably have to try retraining the final layer to do this.
Decided that rtnDataFlow was a pain to type so the project name is now rtndf. It’s part of the usual gyrations at the start of a new project – things don’t settle down until there’s a lot of code to change and inertia sets in. The trick is making the right decisions at this stage. For example, the pipeline message format was too limited so it has also been generalized with data and metadata separated into different sections of the JSON message. The Python files for the current Pipeline Processing Elements (PPEs) have been restructured to make it a bit easier to add new ones.
I guess everyone knows that MVP stands for Minimum Viable Product. However, I just came across this post which resonated with a concept that I had been mulling today – that of the Minimum Viable prototype or MVp (to distinguish it from the product version or Tom Brady I assume).
I like the idea of the MVp – enough functionality to make someone believe that a concept will work without having to fuss about with every tiny detail. It’s basically what I do🙂
The idea of joining together separate, lightweight processing elements to form complex pipelines is nothing new. DirectX and GStreamer have been doing this kind of thing for a long time. More recently, Apache NiFi has done a similar kind of thing but with Java classes. While Apache NiFi does have a lot of nice features, I really don’t want to live in Java hell.
I have been playing with MQTT for some time now and it is a very easy to use publish/subscribe system that’s used in all kinds of places. Seemed like it could be the glue for something…
So that’s really the background for rtnDataFlow. It currently uses MQTT as its pub/sub infrastructure but there’s nothing too specific there – MQTT could easily be swapped out for something else if required. The repo consists of a number of pipeline processing elements that can be used to do some (hopefully) useful things. The primary language is Python although there’s nothing stopping anything being used provided it has an MQTT client and handles the JSON messages correctly. It will even be able to include pipeline processing elements in Docker containers. This will make deployment of new, complex, pipeline processing elements very simple.
The pipeline processing elements are all joined up using topics. Pipeline processing elements can publish to one or more topics and/or subscribe to one or more topics. Because pub/sub systems are intrinsically multicasting, it’s very easy to process data in multiple ways in parallel (for redundancy, performance or functionality). MQTT also allows pipeline processing elements to be distributed on multiple systems, allowing load sharing and heterogeneous computing systems (where only some machines might be fitted with GPUs for example).
Obviously, tools are required to design the pipelines and also to manage them at runtime. The design aspect will come from the old RTembedded project. While that actually generates C and Python code from a design that the user inputs via a graphical interface, the rtnDataFlow version will just make sure all topic names and broker addresses line up correctly and then produce a pipeline configuration file. A special app, rtnFlowControl, will run on each system and will be responsible for implementing the pipeline design specified.
So what’s the point of all of this? I’m tired of writing (or reworking) code multiple times for slightly different applications. My goal is to keep the pipeline processing elements simple enough and tightly focused so that the specific application can be achieved by just wiring together pipeline processing elements. There’ll end up being quite a few of these of course and probably most applications will still need custom elements but it’s better than nothing. My initial use of rtnDataFlow will be to assist with experiments to see how machine learning tools can be used with IoT devices to do interesting things.