In the previous blog, PYNQ-Z2 Dev Kit - CIFAR-10 Convolutional Neural Network, I verified the 3 hardware classifiers against the reference "deer" test image.  Now I'm going to see how the classifiers perform with captured webcam images.  I expect the performance to be degraded because the webcam will produce lower quality images due to issues like image brightness and focus.  CIFAR-10 has a small training set (5000 images per class), so I'm going to use a solid background to help keep the image simple.  Intuitively, I would expect the classification accuracy to improve as the quantization bit width (the precision of the weights and activations) increases.


Since CIFAR-10 has transportation classes (automobile, truck, ship, airplane), I decided to try to classify a few Matchbox vehicles that had escaped being recycled.


Here is my test setup.  I was a bit disappointed in the auto-focus and auto-exposure of the webcam, but I haven't figured out whether I can manually control focus and exposure on this particular camera with OpenCV.

Webcam Setup
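
One way to find out whether the camera accepts manual focus and exposure is to request each setting and read it back. The sketch below is only that, a sketch: `apply_camera_settings` and `demo` are hypothetical helper names, the focus and exposure values are placeholders, and the meaning of the auto-exposure value depends on the capture backend (here I assume a V4L2-style driver where 1 selects manual mode). Many webcams silently ignore some of these properties, so the read-back value matters more than the return code.

```python
def apply_camera_settings(cap, settings):
    """Request each (name, prop_id, value) and report what the driver did.

    `cap` only needs set()/get() methods, like cv2.VideoCapture.
    Returns {name: (accepted, value_read_back)}.
    """
    report = {}
    for name, prop_id, value in settings:
        accepted = cap.set(prop_id, value)  # False if the backend rejects it
        report[name] = (bool(accepted), cap.get(prop_id))
    return report


def demo(device_index=0):
    # Imported here so the helper above can be exercised without OpenCV.
    import cv2

    cap = cv2.VideoCapture(device_index)
    if not cap.isOpened():
        raise RuntimeError("could not open the webcam")

    settings = [
        ("autofocus", cv2.CAP_PROP_AUTOFOCUS, 0),          # request autofocus off
        ("focus", cv2.CAP_PROP_FOCUS, 30),                 # placeholder; range is camera-specific
        ("auto_exposure", cv2.CAP_PROP_AUTO_EXPOSURE, 1),  # assumption: 1 = manual on V4L2
        ("exposure", cv2.CAP_PROP_EXPOSURE, 150),          # placeholder; units are driver-specific
    ]
    try:
        for name, (accepted, value) in apply_camera_settings(cap, settings).items():
            print(f"{name}: accepted={accepted}, now reads {value}")
    finally:
        cap.release()
```

Calling `demo()` on the board prints one line per property. A property that reports `accepted=False`, or that reads back unchanged, is most likely not controllable through this backend.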



First try is the firetruck.  I expected this one to be easy, but only the W2A2 classifier got it right.

Firetruck Webcam image


Second try is the convertible.  All the classifiers thought this was an airplane.  I'm assuming that the odd colors may have been confusing.

Sportscar Webcam Image


Third try is a car that I thought should be easy.  Interestingly, the classifiers with higher quantization bit widths very clearly classified it as an airplane, and the binary classifier thought it was a ship.

Car Webcam



As a sanity check I grabbed one of the CIFAR-10 test images, and it was classified correctly, with increasing confidence as the quantization bit width increased.

Sportscar Test Image
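
To compare how decisively each classifier picks its answer, it helps to look at the full ranking rather than just the top class. The sketch below assumes the per-class scores can be read back as a plain list in the standard CIFAR-10 class order, which is an assumption about the notebook API rather than something I have confirmed. `rank_classes` and `top2_margin` are hypothetical helper names, and the scores in the example are made up.

```python
# Standard CIFAR-10 class order.
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def rank_classes(scores, class_names=CIFAR10_CLASSES):
    """Return (class_name, score) pairs sorted from best to worst."""
    if len(scores) != len(class_names):
        raise ValueError("expected one score per class")
    return sorted(zip(class_names, scores), key=lambda pair: pair[1], reverse=True)


def top2_margin(scores, class_names=CIFAR10_CLASSES):
    """Return (best, runner_up, margin) where margin = best score - runner-up score."""
    ranked = rank_classes(scores, class_names)
    (best, best_score), (second, second_score) = ranked[0], ranked[1]
    return best, second, best_score - second_score


# Made-up scores purely for illustration (not measured results).
example_scores = [300, 100, 50, 40, 310, 20, 10, 5, 60, 80]
best, second, margin = top2_margin(example_scores)
print(f"top: {best}, runner-up: {second}, margin: {margin}")
```

A small margin between the top two classes is a sign the classifier is close to a coin flip on that image, even when the top answer happens to be right.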


At this point I'm confused as to the primary cause of the poor classification of the webcam images.  Is it the image quality or the actual image content?  I suspect it is both.
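
One way to separate the two causes is to look at the webcam frame at the resolution the network actually works with. CIFAR-10 images are 32x32 RGB, so the sketch below center-crops a frame to a square and block-averages it down to 32x32 using NumPy. This is not necessarily how the example notebook resizes its input; `center_crop_square` and `downscale_to_cifar` are hypothetical helpers for eyeballing what survives the downscale.

```python
import numpy as np


def center_crop_square(frame):
    """Crop the largest centered square out of an H x W (x C) image array."""
    height, width = frame.shape[:2]
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return frame[top:top + side, left:left + side]


def downscale_to_cifar(frame, size=32):
    """Center-crop, then average non-overlapping blocks down to size x size."""
    square = center_crop_square(frame)
    block = square.shape[0] // size
    if block == 0:
        raise ValueError("frame is smaller than the target size")
    # Trim so the square divides evenly into size x size blocks.
    trimmed = square[:block * size, :block * size]
    # Split into (size, block, size, block, channels) and average each block.
    blocks = trimmed.reshape(size, block, size, block, -1)
    return blocks.mean(axis=(1, 3)).astype(np.uint8)


# Example with a synthetic 480x640 RGB frame standing in for a webcam capture.
frame = np.full((480, 640, 3), 100, dtype=np.uint8)
small = downscale_to_cifar(frame)
print(small.shape)
```

If the toy car is no longer recognizable to the eye at 32x32, the problem is more likely the capture (framing, focus, exposure) than the network.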


I noticed that the example notebook includes a sample webcam capture of an elk figurine which should classify as a "deer".

Elk Webcam


I would consider this a much harder image to classify because of the background.  It's interesting that two of the classifiers rank "airplane" and "deer" closely.  I'm getting the impression that CIFAR-10 trained networks are a good educational tool, but would not work well in the real world.  I'm going to move on to look at other neural network examples.