There comes a time in every roboticist's life when using distance sensors and light sensors just isn't enough. While navigating a micromouse maze or following a line might be achievable using rather simple electronics, if you want to build robots which interact with people and their surroundings in a more general way, using cameras and image processing is one approach you could take. In this blog post I'll go through the basics of getting OpenCV (the de facto library for computer vision) up and running, and give you some ideas about what to do next and where to look for commonly used algorithms.
For downloading and installing OpenCV see their wiki here - http://opencv.willowgarage.com/wiki/InstallGuide
Capturing Video
The first step to developing computer vision algorithms is getting access to a source of video. This can be either a camera or a video file. A video file is very handy when you are programming away and don't want to waste time running the robot, but ultimately you will want to run your program using a camera. Below is the code for reading frames from the default camera and displaying them.
/* video_stream.cpp */
#include <opencv/cv.h>
#include <opencv/highgui.h>

using namespace cv;

int main() {
    /* Take images from the default camera */
    VideoCapture cam(0);

    /* Create a window called stream which autosizes */
    namedWindow("stream", CV_WINDOW_AUTOSIZE);

    while(true) {
        /* Create a matrix to hold the image data */
        Mat img;

        /* Extract an image from the camera and place into the matrix */
        cam >> img;

        /* If no image was captured then break out of the while loop
           (showing an empty image would fail) */
        if(img.empty()) break;

        /* Show the image on the window named stream */
        imshow("stream", img);

        /* Wait for 30ms and if a key is pressed, place the
           ascii value of the key into variable ch */
        char ch = waitKey(30);

        /* If ESC was pressed then break out of the while loop */
        if(ch == 27) break;
    }
    return 0;
}
To compile the code above you need to link it with the cv and highgui libraries (on Linux, type "g++ video_stream.cpp -lcv -lhighgui -o video_stream").
When running this example you should see a window called stream with the video input playing (in my case that's a camera looking at me).
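A small aside on the key check in the loop above: waitKey returns an int, and gives -1 when no key was pressed before the timeout. On some platforms the returned value can reportedly carry extra bits above the low byte, which is why many examples mask it with 0xFF before comparing. The sketch below shows that check as a plain C++ function. Note that isEscapeKey is a made-up helper name for illustration, not part of OpenCV, and its argument stands in for the raw value waitKey would return.

```cpp
// Hypothetical helper (not an OpenCV function): decide whether the raw
// value returned by a waitKey-style call means "ESC was pressed".
bool isEscapeKey(int rawKey) {
    const int kEscAscii = 27;   // ASCII code of the ESC key

    // A negative value means no key was pressed before the timeout.
    if (rawKey < 0) return false;

    // Keep only the low byte so any extra high bits are ignored.
    return (rawKey & 0xFF) == kEscAscii;
}
```

Inside the loop this would be used as "if(isEscapeKey(waitKey(30))) break;". Storing the result in a char, as the program above does, has a similar effect in practice because only the low byte is kept.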
Processing Video
Now that we have a nice video stream it is time to do some actual image processing. A nice and simple OpenCV function to use is Canny, which performs edge detection on an image. The next program will have two windows, one showing the original video stream and the second showing the edges. To do this there are a few steps:
- Create a new window to display the edges
- Create a new matrix to store the edge data
- Convert the image to grayscale for the Canny function (because it only accepts single channel images whereas colour images have 3 channels)
- Apply the Canny function to get edge data
- Show the edge data on the window.
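To make the grayscale step less of a black box, here is a minimal sketch of what collapsing a 3-channel pixel into a single channel looks like. It assumes the commonly used luma weights (0.299 for red, 0.587 for green, 0.114 for blue). OpenCV's own conversion may round slightly differently, so treat this as an illustration rather than a copy of the library's code. BgrPixel, toGray and rowToGray are names invented for this example.

```cpp
#include <cstdint>
#include <vector>

// One colour pixel in blue, green, red order (the channel order that the
// CV_BGR2GRAY conversion code refers to).
struct BgrPixel {
    std::uint8_t b, g, r;
};

// Collapse one BGR pixel to a single intensity value using the assumed
// luma weights, rounded to the nearest integer.
std::uint8_t toGray(const BgrPixel& p) {
    const double y = 0.299 * p.r + 0.587 * p.g + 0.114 * p.b;
    return static_cast<std::uint8_t>(y + 0.5);
}

// Convert a whole row of pixels: three channels in, one channel out.
std::vector<std::uint8_t> rowToGray(const std::vector<BgrPixel>& row) {
    std::vector<std::uint8_t> gray;
    gray.reserve(row.size());
    for (const BgrPixel& p : row) {
        gray.push_back(toGray(p));
    }
    return gray;
}
```

With these weights a pure green pixel comes out brighter than a pure red or pure blue one, which roughly matches how bright those colours look to people. The full program for the steps listed above follows.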
/* video_stream.cpp */
#include <opencv/cv.h>
#include <opencv/highgui.h>

using namespace cv;

int main() {
    /* Take images from the source */
    //VideoCapture src("/home/h/video.avi"); // Video file
    VideoCapture src(0); // Default camera

    /* Create windows */
    namedWindow("stream", CV_WINDOW_AUTOSIZE);
    namedWindow("edges", CV_WINDOW_AUTOSIZE);

    while(true) {
        /* Create matrices to hold the image and edge data */
        Mat img, edges;

        /* Extract an image from the source and place into the matrix */
        src >> img;

        /* If image empty then break out of while loop */
        if(img.empty()) break;

        /* Convert the image to grayscale and temporarily store in edge matrix */
        cvtColor(img, edges, CV_BGR2GRAY);

        /* Find the edges and place in edge matrix
           1st arg is the source matrix, 2nd arg is the destination matrix
           3rd arg is the lower threshold, 4th arg is the upper threshold
           The two thresholds control which gradients the Canny algorithm
           keeps as edges (hysteresis thresholding) */
        Canny(edges, edges, 10, 100);

        /* Show the images */
        imshow("stream", img);
        imshow("edges", edges);

        /* Wait for 30ms and if a key is pressed, place the
           ascii value of the key into variable ch */
        char ch = waitKey(30);

        /* If ESC was pressed then break out of the while loop */
        if(ch == 27) break;
    }
    return 0;
}
The code above is compiled in the same way as before. Upon running the program this is what I see:
It's clear that edge detection might have some nice uses in text processing, as it captures the outlines of the text in the image quite well.
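If you are wondering what the two threshold arguments to Canny actually do, the idea is called hysteresis thresholding: gradients above the upper threshold are always kept as edges, gradients below the lower threshold are always thrown away, and anything in between is kept only if it connects to an edge that was already kept. The sketch below shows that idea on a single row of gradient magnitudes. It is a simplified 1-D illustration under my own assumptions, not OpenCV's implementation (the real algorithm works in 2-D and also thins the edges first), and hysteresis1D is a made-up name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 1-D illustration of hysteresis thresholding.
// magnitude: gradient strength at each pixel along a line.
// Returns true at every position that is kept as an edge.
std::vector<bool> hysteresis1D(const std::vector<int>& magnitude,
                               int lower, int upper) {
    const std::size_t n = magnitude.size();
    std::vector<bool> edge(n, false);

    // Pass 1: strong pixels (above the upper threshold) are always edges.
    for (std::size_t i = 0; i < n; ++i) {
        if (magnitude[i] > upper) edge[i] = true;
    }
    if (n < 2) return edge;

    // Pass 2, left to right: a weak pixel (above the lower threshold)
    // is kept if its left neighbour is already an edge.
    for (std::size_t i = 1; i < n; ++i) {
        if (!edge[i] && magnitude[i] > lower && edge[i - 1]) edge[i] = true;
    }

    // Pass 3, right to left: same rule using the right neighbour.
    for (std::size_t i = n - 1; i > 0; --i) {
        if (!edge[i - 1] && magnitude[i - 1] > lower && edge[i]) edge[i - 1] = true;
    }
    return edge;
}
```

For example, with magnitudes {5, 50, 120, 50, 5, 50, 5}, a lower threshold of 10 and an upper threshold of 100, the 120 is kept outright and the two 50s next to it are kept because they touch it, while the isolated 50 further along is dropped. Raising the lower threshold removes more of these weak, connected responses.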
Tweaking Parameters Easily
One last thing I want to show is the use of trackbars in OpenCV. Usually when working with image processing your algorithms will depend on many different parameters (e.g. the edge detection algorithm we just used required 2 threshold parameters). It's extremely useful to be able to change these values at runtime to tweak and tune until things are processed just right. This can be done in OpenCV using trackbars, which are basically sliders that alter the value of a variable inside your program whilst it is running. Here's what we have to do:
- Create a window to display the trackbars in
- Create variables for the trackbars
- Create the trackbars and reference both the variables they control and the windows they are to be displayed in
- Place the variables into the parameter list of our Canny function
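One thing to be aware of is that a trackbar controls an int that runs from 0 up to the maximum you give it. If you ever want to tune a fractional parameter, or one with a different range, a common trick is to rescale the slider position yourself. Here is a minimal sketch of that rescaling; sliderToValue and its parameters are names I have made up for illustration, not part of OpenCV.

```cpp
// Hypothetical helper: map an integer slider position in [0, maxPos]
// linearly onto the range [minValue, maxValue].
double sliderToValue(int pos, int maxPos, double minValue, double maxValue) {
    // A slider with no travel can only represent the minimum.
    if (maxPos <= 0) return minValue;

    // Clamp the position so out-of-range input cannot escape the range.
    if (pos < 0) pos = 0;
    if (pos > maxPos) pos = maxPos;

    return minValue + (maxValue - minValue) * pos / maxPos;
}
```

For instance, a slider that runs from 0 to 100 and sits at position 50 could drive a parameter between 1.0 and 3.0 by calling sliderToValue(50, 100, 1.0, 3.0), which lands halfway at 2.0. The Canny thresholds in this post are fine as plain integers, so the program uses the trackbar values directly.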
Here is the code:
/* video_stream.cpp */
#include <opencv/cv.h>
#include <opencv/highgui.h>

using namespace cv;

int main() {
    /* Take images from the source */
    //VideoCapture src("/home/h/video.avi"); // Video file
    VideoCapture src(0); // Default camera

    /* Create windows */
    namedWindow("stream", CV_WINDOW_AUTOSIZE);
    namedWindow("edges", CV_WINDOW_AUTOSIZE);
    namedWindow("trackbars", CV_WINDOW_AUTOSIZE);

    /* Create the variables and the trackbars, referencing the "trackbars"
       window in which the trackbars should be shown */
    int lowerThres = 10;
    createTrackbar("LowerThres", "trackbars", &lowerThres, 255); // 255 is the max value of the trackbar
    int upperThres = 100;
    createTrackbar("UpperThres", "trackbars", &upperThres, 255);

    while(true) {
        /* Create matrices to hold the image and edge data */
        Mat img, edges;

        /* Extract an image from the source and place into the matrix */
        src >> img;

        /* If image empty then break out of while loop */
        if(img.empty()) break;

        /* Convert the image to grayscale and temporarily store in edge matrix */
        cvtColor(img, edges, CV_BGR2GRAY);

        /* Find the edges and place in edge matrix
           1st arg is the source matrix, 2nd arg is the destination matrix
           3rd arg is the lower threshold, 4th arg is the upper threshold
           Both thresholds now come from the trackbar variables, so they
           can be changed while the program is running */
        Canny(edges, edges, lowerThres, upperThres);

        /* Show the images */
        imshow("stream", img);
        imshow("edges", edges);

        /* Wait for 30ms and if a key is pressed, place the
           ascii value of the key into variable ch */
        char ch = waitKey(30);

        /* If ESC was pressed then break out of the while loop */
        if(ch == 27) break;
    }
    return 0;
}
Here is what I see when running that code and tweaking it so that my hand is shown clearly against the background:
What's Next
So yeah, that's it for today! You should now be able to try out other things for yourself like face/object recognition, blob tracking or stereo vision. If you do, post them on here! Here's a cool video on a hot topic in research at the moment called SLAM (simultaneous localization and mapping):
OpenCV has functions for things like that! You can find them in the OpenCV documentation here - http://opencv.willowgarage.com/documentation/cpp/index.html
Stay tuned and there'll be some more interesting things on OpenCV as we are using it in our robot this year for localization.



