Background
There is no convenient way to see the front of our house from inside, which is annoying when you are waiting for a visitor to park in the driveway or for a package to be delivered to the door. So for many years we have had a camera looking over the driveway, originally with no security features beyond displaying the video on a dedicated monitor inside. Still, that was a lot better than standing at the living room window, looking down at the driveway and waiting for whatever it was you were expecting.
Around 2016, reasonably priced IP (Internet Protocol) cameras became available, and I purchased one that used Wi-Fi and was weatherproof.
The camera was made by Foscam and promised some desirable security features: live viewing of the video through a smartphone app, motion detection with alerts, recording of video to the company’s cloud service, and even transfer of captured still images to a server via FTP.
It all sounds great until you actually try to get it to work properly, especially the motion detection. Make it sensitive enough to be useful and you get lots of false positives; turn the sensitivity down and it misses real events.
Then there are the usual Wi-Fi issues for a device located outside. In our case the camera was also quite far from the router, so it would sometimes disconnect without warning.
Eventually I set up an old laptop as a server, originally just to monitor whether the camera had gone offline. Later I had the camera transfer the still images and video clips captured after motion detection to the server via FTP. The server watched for the arrival of the still images and forwarded them to me using the messaging service Pushover, at which point I could use the smartphone app to look at the live video.
Over time, with a bunch of Python scripts, a lot of tweaking and tuning, and a Wi-Fi repeater, I got that setup working pretty well, and it certainly was a lot better than what I had before.
Making It a Little Bit Smarter
In March of 2020 I started looking into improving the system. The first thing I did was move the server software to a desktop PC running Debian Linux: a discarded Dell server, about 8 years old, that performed a lot better than the laptop, especially running Linux rather than Windows Server.
I also moved most of my IoT applications (see my other projects on the project page) to this new server with MQTT and Node-Red as the key components.
I use Node-RED a lot to experiment with data from sensors, so I investigated what could be done with images. Sure enough, I found a node in the npm repository that uses TensorFlow and the COCO-SSD model to detect objects in JPEG images.
I used this to analyze the still images coming from the camera and check whether any objects had been detected before doing anything else with the image. This simple change made a massive improvement in eliminating false detections: I could increase the sensitivity of the motion detection in the camera and simply discard the false positives, making the system much more reliable.
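In Node-RED this filter is just a function node sitting between the detector and the notification flow, but the gating logic amounts to something like the sketch below. The label set and the score threshold here are my illustrative assumptions, not values fixed by the camera or the model:

```python
# Sketch of the false-positive filter: a motion alert from the camera is
# only forwarded if the COCO-SSD detector found an object type we care
# about. INTERESTING_LABELS and MIN_SCORE are assumptions for illustration.

INTERESTING_LABELS = {"person", "car", "truck", "dog", "cat"}
MIN_SCORE = 0.5

def should_notify(detections):
    """detections: list of (label, score) pairs from the detector.

    Returns True if any detection is an interesting object above the
    confidence threshold; otherwise the motion alert is discarded.
    """
    return any(
        label in INTERESTING_LABELS and score >= MIN_SCORE
        for label, score in detections
    )
```

So a frame with `[("car", 0.82)]` gets through, while a wind-blown branch that triggered motion but produced no detections is silently dropped.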
So isn’t that good enough?
Well, actually, no. First, it takes about 30 seconds from the moment the camera detects motion to the moment the image is analyzed and the notification is sent and received. Then you still have to open the smartphone app to view the live video, which may or may not be useful by that point.
It would be much better to analyze the individual frames of the video to detect objects, and also to save video segments for viewing later.
Version 3, A Work in Progress
I started researching what it would take to run object detection on every frame of video in real time and discovered Google Coral, a hardware accelerator for TensorFlow Lite. It comes in several versions, including a USB dongle, two different M.2 cards, and a Mini PCIe card. The USB dongle turned out to be almost impossible to obtain, but I was able to get the Mini PCIe module working with my server by using an adapter intended for Mini PCIe Wi-Fi cards.
Most IP cameras similar to mine stream video encoded as H.264 or H.265 at 15 frames per second. That video has to be decoded by the host processor to extract the image frames, which are then passed to the Coral accelerator. I have found that the Coral accelerator easily keeps up with the real-time video while placing very little load on the host processor, leaving the system free to record the video to disk in segments and carry out other processing as well.
I am using a modified version of the Python object detection example that comes with the Coral library, using the OpenCV variant to handle the video stream from the IP camera.
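Stripped of the hardware specifics, that main loop looks roughly like the following. The capture and detector are passed in as plain functions (these names are mine, not the Coral example’s), so the same loop works whether the frames come from `cv2.VideoCapture` on an RTSP URL and a pycoral Edge TPU interpreter, or from test stubs:

```python
import time

def run_pipeline(read_frame, detect_objects, on_detections, max_frames=None):
    """Decode frames from the camera and run each one through the detector.

    read_frame()            -> a frame, or None if the stream dropped
    detect_objects(frame)   -> list of detections (possibly empty)
    on_detections(frame, d) -> called whenever detections is non-empty
    max_frames              -> stop after this many frames (None = run forever)
    """
    frames = 0
    while max_frames is None or frames < max_frames:
        frame = read_frame()
        if frame is None:
            time.sleep(1.0)  # brief back-off on a stream hiccup, then retry
            continue
        frames += 1
        detections = detect_objects(frame)
        if detections:
            on_detections(frame, detections)

# On the real system, read_frame wraps cv2.VideoCapture("rtsp://...").read()
# and detect_objects wraps the Edge TPU interpreter from the Coral example.
```

The back-off on a `None` frame matters in practice: a flaky outdoor camera connection should stall and recover, not crash the whole pipeline.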
The video is recorded to disk in 5-minute segments, and a segment is saved only if an object was detected while it was being recorded.
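The retention rule is just one piece of state per segment: a flag set by any detection during the 5-minute window and checked when the segment closes. A minimal sketch of that bookkeeping, with the actual video writing and file deletion omitted (the class and method names are my own):

```python
class SegmentRecorder:
    """Tracks whether the current 5-minute segment contained a detection.

    The video writing itself happens elsewhere; this only answers the
    question "keep or delete?" when a segment rolls over.
    """

    SEGMENT_SECONDS = 5 * 60  # assumed segment length, matching the write-up

    def __init__(self):
        self.detected = False

    def on_detection(self):
        # Any detection during the segment marks it worth keeping.
        self.detected = True

    def roll_over(self):
        """Close the current segment; return True if it should be kept."""
        keep = self.detected
        self.detected = False  # reset the flag for the next segment
        return keep
```

A segment with no detections rolls over to `False` and its file can be deleted immediately, which keeps the disk from filling up with empty driveway footage.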
I also apply some heuristics to validate the object detections that TensorFlow provides. For example, if the detected object is a vehicle, further detections of that type are ignored if they have the same size and location. Another refinement is to allow only one notification for a person every 5 minutes. This minimizes notifications while still keeping the video segments from being deleted.
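Those two heuristics can be sketched as a small stateful filter. I use intersection-over-union to decide "same size and location" for vehicles; the IoU threshold, the cooldown length, and all the names here are assumptions for illustration, not the exact rules in my scripts:

```python
import time

VEHICLE_LABELS = {"car", "truck", "bus"}
PERSON_COOLDOWN = 5 * 60   # seconds: at most one person alert per window
SAME_SPOT_IOU = 0.8        # assumed overlap threshold for "same vehicle"

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

class DetectionFilter:
    def __init__(self, now=time.time):
        self.now = now                 # injectable clock, handy for testing
        self.last_vehicle_box = None
        self.last_person_time = None

    def accept(self, label, box):
        """Return True if this detection should trigger a notification."""
        if label in VEHICLE_LABELS:
            if (self.last_vehicle_box is not None
                    and iou(box, self.last_vehicle_box) > SAME_SPOT_IOU):
                return False           # same vehicle, same spot: ignore it
            self.last_vehicle_box = box
            return True
        if label == "person":
            t = self.now()
            if (self.last_person_time is not None
                    and t - self.last_person_time < PERSON_COOLDOWN):
                return False           # still inside the 5-minute cooldown
            self.last_person_time = t
            return True
        return True                    # other labels pass through unfiltered
```

Note that a rejected detection only suppresses the *notification*; the detection still counts toward keeping the current video segment.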
The following images show the system in operation during a package delivery.
The system is working really well; however, I am evaluating the performance of some alternative hardware configurations:
- The current system, but without the Coral accelerator.
- The Raspberry Pi 4 with a solid-state drive, with and without the Coral USB accelerator (I managed to get my hands on one!).
- The NVIDIA Jetson Nano (4 GB) with a solid-state drive and the optional addition of a Coral M.2 card.
- The Seeed Studio ODYSSEY mini x86 PC with the optional addition of a Coral M.2 card.
Yes it is a work in progress!