A couple of months back I received a ZC702 Xilinx development board from Randall after pitching an idea on how to build an embedded vision project. The board arrived a couple of weeks later (thank you Randall) and I started working on the steps to build an embedded vision application that targets real-time lane detection. In a nutshell, I wanted to build a custom vision application running on the FPGA fabric that would take images from a camera and threshold them in order to implement an edge detector.
The plan was to leverage a number of custom and standard IP cores to implement a system capable of lane detection at 30 frames per second. It is called a vision system, as opposed to an imaging system, because a number of intermediate image processing steps are performed. First, let's take a look at the board itself.
ZC702 development board
The ZC702 is a Zynq-7000 series development board that features an XC7Z020 SoC from Xilinx in a 484-pin BGA package. The IC merges in one chip a hard dual-core ARM Cortex-A9 processor and a programmable logic fabric from the Artix-7 series, hence the name 7000. The device sits somewhere in the middle of the family, with more logic cells and BRAM than entry-level chips like the XC7Z007S (MiniZed) or XC7Z010 (Zybo). This is the same device that is featured on the ZedBoard, by the way.
The board is shown below. It comes equipped with Ethernet, a JTAG programmer, an SD card slot, a CAN bus transceiver and an HDMI transmitter codec. In addition it has 2 FMC connectors, and all the power supply voltage rails can be queried via I2C.
The board comes pre-loaded with a reference design on the SD card which includes a Qt Linux application that tests a Sobel filter implemented on the PL. The technical reference design (TRD) dates back to 2014, so using it on Vivado 2018.3 requires updating a number of IP blocks. In practice it's easier to build one from scratch since the TRD makes use of IP cores which need a license. Plugging in the SD card and connecting the unit to an HDMI TV results in the screen below. Once a mouse is connected, the Linux app starts testing the Sobel core implemented on the PL section.
The ZC702 features 2 FMC LPC connectors. These are partially populated connectors with 40 pins per row. Xilinx uses the term LPC (Low Pin Count) for this type of FMC connector. Each FMC has a number of GPIOs brought out from the PL side of the SoC. In this project, these are used to interface with the camera sensor.
Image processing pipeline
The image processing pipeline can be logically split into three sections: the image source, the image processing stage and the image sink.
From a high-level point of view, one needs an image source, which can be a camera sensor or an image pattern generator, and a sink, which can be a display. The SoC sits right in the middle, capturing the image stream, performing on-the-fly processing and piping the results to the screen. The easiest way to test when no camera is available is to use a test pattern generator. Luckily for us Vivado 2018.3 makes the Video TPG IP free, unlike the previous versions which needed a license, hence I started experimenting with this first. This IP allows one to experiment with different patterns and features. The IP outputs data in AXIS format at different programmable resolutions. It is controlled via an AXI Lite bus, so one needs to configure it from the PS side via a driver API.
The next option for a source is obviously to use a bona fide camera sensor. There are a number of camera interfaces, but the simplest ones are CMOS sensors that use a parallel bus, like the venerable OV7670 or OV7690 VGA camera sensors. To use such a sensor, however, one has to implement a camera capture module that converts the byte stream piped from the sensor into the proper protocol, which in our case is the AXIS bus.
In addition there needs to be a camera configuration module, either implemented in the PL as an I2C module that reads the configuration from a BRAM or in the PS as a generic camera driver. The PS configuration is obviously more flexible so I picked this option. I used a custom FMC camera board with an OV7690 camera chip designed for stereo applications. This is a small camera sensor in a BGA package with embedded optics and VGA resolution (640 columns x 480 rows). The FMC card has three cameras. In this application I have used only one camera, connected to FMC J4. Initially I tried to use an OV7670 camera, however soldering the FMC connectors is a bit of a hassle without the proper tools.
The OV7690 outputs 2 bytes per pixel. Depending on the configuration code, the camera can be configured to use different color spaces. The IP is configured to use the RGB565 color space. The camera IP is controlled via the AXI interface and is assigned a location in memory. To start the camera, one has to set the enable bit at that location, which turns on the camera output data. The data itself is packed using the AXIS (AXI Stream) protocol.
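To make the two-bytes-per-pixel format concrete, here is a minimal software sketch that assembles RGB565 words from a byte stream and splits them into color fields. It assumes the high byte arrives first, which depends on the camera register settings and is not confirmed here; `pack_rgb565` and `split_rgb565` are hypothetical helper names, not part of the actual capture IP.

```python
def pack_rgb565(byte_stream, high_byte_first=True):
    """Combine consecutive byte pairs into 16-bit RGB565 words.

    The byte order is an assumption; flip high_byte_first if the
    camera is configured to send the low byte first.
    """
    data = list(byte_stream)
    if len(data) % 2:
        raise ValueError("RGB565 needs an even number of bytes")
    words = []
    for first, second in zip(data[0::2], data[1::2]):
        hi, lo = (first, second) if high_byte_first else (second, first)
        words.append(((hi & 0xFF) << 8) | (lo & 0xFF))
    return words


def split_rgb565(word):
    """Return the (r5, g6, b5) fields of one RGB565 word."""
    return (word >> 11) & 0x1F, (word >> 5) & 0x3F, word & 0x1F
```

For example, the four bytes `0xF8, 0x00, 0x07, 0xE0` form two pixels, one pure red and one pure green.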
The output of the AXIS from the camera is sent to a dual port asynchronous FIFO. An asynchronous FIFO is one method that is used whenever one needs to cross two different clock domains. The OV7690 camera IP operates on the PCLK clock domain.
The slave side of the FIFO is connected to PCLK while the master side of the FIFO is connected to the AXI master clock domain, which operates at 50 MHz. The master clock must always be faster than the slave clock, otherwise the FIFO will overflow, resulting in missed pixels.
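The overflow argument can be checked with a small software model. This is only a sketch: it assumes one word is written per write-clock cycle and one word can be read per read-clock cycle, and the 24 MHz write rate used in the example is an illustrative value, not a measured PCLK. The function name and the FIFO depth are hypothetical.

```python
def fifo_peak_and_drops(write_rate, read_rate, n_words, depth):
    """Model a FIFO written at write_rate and drained at read_rate.

    Returns (peak_occupancy, dropped_words) after n_words write attempts.
    Time is counted in units of 1 / (write_rate * read_rate) so that all
    arithmetic stays in integers.
    """
    write_period = read_rate   # ticks between two writes
    read_period = write_rate   # ticks between two reads
    occupancy = peak = dropped = 0
    next_read = read_period
    for i in range(1, n_words + 1):
        now = i * write_period
        # Drain every read slot that happened up to this write.
        while next_read <= now:
            if occupancy:
                occupancy -= 1
            next_read += read_period
        if occupancy < depth:
            occupancy += 1
            peak = max(peak, occupancy)
        else:
            dropped += 1       # FIFO full: this pixel is lost
    return peak, dropped
```

With a 24 MHz writer and a 50 MHz reader the FIFO never holds more than one word, while swapping the two rates fills it up and starts dropping words.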
After the pixel stream crosses the clock domain it is sent through a subset converter. This is an AXIS IP block which remaps the RGB565 data to a 24-bit packet. The 24-bit format is used by the image processing IP since each color is assigned 8 bits.
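The remapping from 16 to 24 bits can be sketched in software as follows. Two common variants are shown: padding the low bits with zeros, and replicating the top bits into the low bits so that full scale maps to 255. Which of the two the subset converter in this design uses is not stated, so treat both as assumptions; `rgb565_to_rgb888` is a hypothetical name.

```python
def rgb565_to_rgb888(word, replicate_msbs=True):
    """Expand one RGB565 word into an (r8, g8, b8) tuple.

    replicate_msbs=True copies the top bits into the new low bits;
    replicate_msbs=False simply pads the low bits with zeros.
    """
    r5 = (word >> 11) & 0x1F
    g6 = (word >> 5) & 0x3F
    b5 = word & 0x1F
    if replicate_msbs:
        r8 = (r5 << 3) | (r5 >> 2)
        g8 = (g6 << 2) | (g6 >> 4)
        b8 = (b5 << 3) | (b5 >> 2)
    else:
        r8, g8, b8 = r5 << 3, g6 << 2, b5 << 3
    return r8, g8, b8
```

Zero padding is the cheaper option in logic since it is pure wiring, but it leaves white at (248, 252, 248) instead of (255, 255, 255).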
The next element in line is the AXIS switch. This IP operates as a simple multiplexer for the AXIS stream. It can be configured either via an AXI Lite interface or automatically by leveraging the AXIS strobe signals. In this particular application the AXI Lite interface of the AXIS switch is enabled. This, however, requires implementing the SDK drivers for configuring the switch.
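As a rough idea of what such a driver has to do, the sketch below builds the list of register writes that route one input to a chosen output. The offsets and bit masks follow my reading of the Xilinx AXI4-Stream switch register map (a control register at 0x00 with an update bit, and one mux register per output starting at 0x40) and should be double-checked against the product guide. The function name and the port numbering are hypothetical.

```python
CTRL_OFFSET = 0x00            # assumed: control register
CTRL_REG_UPDATE = 0x2         # assumed: commits the new routing
MI_MUX_BASE = 0x40            # assumed: first per-output mux register
MI_DISABLE = 0x80000000       # assumed: disables an output port


def switch_route_writes(selected_output, n_outputs, source_input=0):
    """Return (offset, value) writes that connect source_input to
    selected_output and disable every other output of the switch."""
    if not 0 <= selected_output < n_outputs:
        raise ValueError("selected_output out of range")
    writes = []
    for port in range(n_outputs):
        value = source_input if port == selected_output else MI_DISABLE
        writes.append((MI_MUX_BASE + 4 * port, value))
    # The routing only takes effect once the update bit is written.
    writes.append((CTRL_OFFSET, CTRL_REG_UPDATE))
    return writes
```

On the real system each pair would be written relative to the base address that Vivado assigns to the switch.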
The image source in the Vivado design is a custom IP that interfaces with the OV7690 CMOS camera. The camera uses a parallel eight-bit interface. In addition, there are two synchronization signals, HSYNC and VSYNC, whose strobing patterns mark rows and frames. The camera outputs a pixel clock, PCLK, and the pixel rate is synchronized to it. The OV7690 camera also requires a 24 MHz camera clock (XCLK) as an input.
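The job of the capture module can be modelled in a few lines of software. This sketch assumes that HSYNC is high while a row carries valid data and that a VSYNC pulse marks the start of a frame; the actual polarities depend on the camera configuration. It follows the AXI4-Stream video convention in which TUSER flags the first pixel of a frame and TLAST flags the last pixel of a row. The function name is hypothetical and this is not the RTL of the custom IP.

```python
def capture_to_axis(samples, width):
    """Turn per-PCLK samples into AXI-Stream-like beats.

    samples: iterable of (vsync, hsync, byte) tuples, one per pixel clock.
    Returns a list of (tdata, tuser, tlast) tuples, one per 16-bit pixel.
    """
    beats = []
    frame_start = False   # set by VSYNC, cleared after the first pixel
    pending = None        # first byte of the pixel being assembled
    col = 0
    for vsync, hsync, byte in samples:
        if vsync:
            frame_start, pending, col = True, None, 0
            continue
        if not hsync:     # horizontal blanking: reset the row state
            pending, col = None, 0
            continue
        if pending is None:
            pending = byte
            continue
        tdata = (pending << 8) | byte
        pending = None
        col += 1
        tlast = col == width
        beats.append((tdata, frame_start, tlast))
        frame_start = False
        if tlast:
            col = 0
    return beats
```

Feeding it one VSYNC pulse followed by two rows of two pixels produces four beats, with the start-of-frame flag on the first one and the end-of-line flag on the second and fourth.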
Video applications are mostly implemented in streaming fashion since most FPGAs have very limited memory, which may not be enough for storing even one frame. The protocol of choice for streaming video is called AXI Stream. AXI (Advanced eXtensible Interface) is a family of bus specifications from ARM. To control the IP, the AXIS data input capture is wrapped in another high-level IP that uses an AXI Full bus to enable or disable the data output from the camera.
The video stream is composed of pixels which encode the colors. Colors, however, can be expressed in different color spaces. RGB is the most familiar color space. There are multiple RGB formats such as RGB888, RGB565 or RGB444, however RGB is not always the most efficient choice, depending on what one tries to accomplish. Xilinx AXIS IP cores make use of the 24-bit format, so for most of the image pipeline we will use the RGB888 format. When interfacing with the HDMI chipset, the YCbCr (YUV) color space has to be used due to the way the chipset is configured.
Hence, the RGB pixel data is converted into the YCbCr color space. The next step is to use chroma resampling to convert the full YCbCr 4:4:4 format into a decimated version, YCbCr 4:2:2 (YUV). This takes care of the format needed for the HDMI sink.
One important issue is that the standalone chroma resampler IP is not free, so one has to either buy it from Xilinx or roll their own.
The application can be as simple as an image filter with an AXIS stream input and output interface sitting right between the source and the sink. In this way, the filtering is done in real time using the least amount of memory. Or it can be as complex as multiple DMA, VDMA and image IP blocks, subject to the SoC's BRAM and LUT constraints. In the current application, a VDMA was used.
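The two conversion steps can be sketched in software as shown here. The coefficients are the common full-range BT.601 ones and the chroma decimation simply averages each horizontal pixel pair; both are assumptions for illustration, since the coefficients and the filter used by the actual IP blocks are configurable and not given. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> (Y, Cb, Cr), each clamped to 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(v):
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)


def ycbcr444_to_422(line):
    """Decimate one row of (Y, Cb, Cr) pixels to 4:2:2.

    Every output sample keeps its own luma; Cb and Cr are averaged
    over each pixel pair and sent on alternating samples, giving a
    16-bit (Y, C) word per pixel.
    """
    if len(line) % 2:
        raise ValueError("row length must be even")
    out = []
    for (y0, cb0, cr0), (y1, cb1, cr1) in zip(line[0::2], line[1::2]):
        cb = (cb0 + cb1 + 1) // 2
        cr = (cr0 + cr1 + 1) // 2
        out.append((y0, cb))
        out.append((y1, cr))
    return out
```

Each pair of pixels goes from six components to four, which is how the stream fits a 16-bit bus.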
To implement lane detection, the idea is to take the image and pass it through an edge detector IP. The best known one is the Sobel edge detector. The Canny edge detector has the advantage of programmable thresholds. Modifying the thresholds on the fly allows for adaptation to different lighting conditions.
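For reference, the gradient stage that both detectors build on can be written in a few lines of plain software. This is a sketch of a Sobel filter with a single threshold on a grayscale image stored as a list of rows; it is not the HLS code of the IP, and a full Canny detector adds smoothing, non-maximum suppression and a low/high threshold pair on top of it. The function name is hypothetical.

```python
def sobel_edges(img, threshold):
    """Return a binary edge map (0 or 255) of a grayscale image.

    img is a list of equally long rows of 0..255 values. The gradient
    magnitude is approximated as |gx| + |gy|; border pixels stay 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if abs(gx) + abs(gy) > threshold:
                out[y][x] = 255
    return out
```

Raising the threshold suppresses weak edges, which is the knob one would turn when the lighting changes.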
The Canny edge detector IP was built using Vivado HLS and coded in C. The main advantage of HLS is that it allows for rapid deployment of IP. The disadvantage is that the generated logic is obfuscated and usually not as efficient as hand-crafted Verilog code, although it can approach it. In any case, generating the Canny IP involved making use of Vivado HLS 2018.3 and the 2018.3 version of xfOpenCV. This is a set of C++ libraries that replicates the well-known OpenCV library for FPGA logic.
Xilinx has recently released Vivado 2019.1 as well as HLS 2019.1. That release requires using SDSoC with the reVISION framework, which is not free.
The main change made to the reVISION code was to modify the function to support the AXIS protocol for input and output. In addition, the AXI Lite bus was used to bundle all the programmable variables in a set and expose them to the PS memory map via an AXI interconnect connected to the GP0 Zynq bus. In addition to the AXIS version of the Canny edge detector I also implemented an RGB to gray IP. This simply takes the RGB channels and converts them to grayscale via a conversion formula. The grayscale data is then replicated on all three channels of the 24-bit bus.
Finally, to switch between the two different IP blocks, an AXIS switch was used. This IP can be used as a programmable multiplexer or demultiplexer. It allows the original image frame pixels to be routed either to the Canny edge detector, the RGB to grayscale converter or the passthrough channel, which contains no image processing IP.
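The exact formula used by the RGB to gray IP is not given here. A minimal sketch, assuming the common BT.601 luma weights (0.299, 0.587, 0.114) scaled to integers so that only multiplies and a shift are needed, could look like this; the function name is hypothetical.

```python
def rgb_to_gray_pixel(r, g, b):
    """Convert one RGB888 pixel to gray and replicate it on 3 channels.

    Assumed weights: 77/256, 150/256 and 29/256, an integer
    approximation of 0.299, 0.587 and 0.114 that sums to exactly 1.
    """
    gray = (77 * r + 150 * g + 29 * b) >> 8
    # The same value goes on all three channels of the 24-bit bus.
    return gray, gray, gray
```

Because the three weights add up to 256, white stays at 255 and no clamping is required.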
The output of the Canny edge detector is a 24 bit pixel stream so it is sent to another subset converter which converts the pixels back to 16 bit. The pixels are then piped to a VDMA IP. The VDMA is configured in both read and write mode. The purpose of the VDMA is to write the pixel stream in a contiguous area in DRAM so that the PS can access the images. The pixel stream is then read and written to AXIS video out.
The VDMA is configured in triple buffering mode. The VDMA needs to be configured prior to operation via the AXI lite interface which connects it to the main AXI Interconnect.
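The numbers that go into such a configuration are easy to work out. This sketch computes the line stride, the frame size and the three buffer start addresses for a VGA frame with 16-bit pixels. The base address used in the example is purely hypothetical, and the function is not part of any Xilinx driver.

```python
def frame_buffer_layout(base_addr, width, height, bytes_per_pixel, n_buffers=3):
    """Lay out n_buffers frames back to back in memory.

    Returns the line stride in bytes, the size of one frame in bytes
    and the start address of every buffer.
    """
    stride = width * bytes_per_pixel        # bytes per image row
    frame_bytes = stride * height           # bytes per frame
    addresses = [base_addr + i * frame_bytes for i in range(n_buffers)]
    return {"stride": stride, "frame_bytes": frame_bytes, "addresses": addresses}


# Example with a hypothetical DDR base address of 0x10000000.
layout = frame_buffer_layout(0x10000000, 640, 480, 2)
```

With three buffers the write side can fill one frame while the read side displays another, leaving a spare so that neither side has to wait for the other.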
The write (S2MM) side of the VDMA is sent to DDR3 RAM via the high-performance interface HP0, while the VDMA MM2S interface feeds the video sink block.
The video sink block is composed of the video timing IP, the AXIS Video Out IP and a clock which is dependent on the resolution of the display used.
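The dependence of the clock on the resolution is simple arithmetic: the pixel clock is the total number of pixels per frame, blanking included, multiplied by the frame rate. The sketch below uses the standard blanking figures for 640x480 and 1280x720 at 60 Hz as examples; which mode the display in this project actually runs is not stated, and the function name is hypothetical.

```python
def pixel_clock_hz(h_active, h_blank, v_active, v_blank, fps):
    """Pixel clock = total pixels per frame (with blanking) * frame rate."""
    return (h_active + h_blank) * (v_active + v_blank) * fps


# 640x480 at 60 Hz: 800 x 525 total, about 25.2 MHz
vga_clock = pixel_clock_hz(640, 160, 480, 45, 60)
# 1280x720 at 60 Hz: 1650 x 750 total, 74.25 MHz
hd_clock = pixel_clock_hz(1280, 370, 720, 30, 60)
```

The video timing IP has to be set up with the same totals, otherwise the display will not lock onto the signal.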
Video sink (HDMI)
The ZC702 makes use of an HDMI transmitter codec based on the Analog Devices ADV7511. This is an HDMI transmitter which takes a digital data bus together with the row and frame synchronization signals HSYNC and VSYNC, as well as a DE (data enable) and a clock signal.
One can build the IP in Verilog or VHDL to interface with an HDMI TV (in fact Digilent has an IP just for this purpose), but luckily Xilinx already has an IP called AXI Stream to Video Out. The ADV7511 HDMI transmitter codec is configured to accept a 16-bit data bus using the YUV 4:2:2 color space. This color format is a decimated version of the full YCbCr 4:4:4 format.
The setup below connects the ZC702 Zynq board to an HDMI TV. The first design implemented a simple video pipeline with the camera output routed directly to the HDMI screen with no processing in between. The hardware consists of the following Vivado block diagram. The next step was to include the image processing block right after the VDMA.
The main design takes almost 20% of the ZC702 logic resources. The next step was to launch the SDK and start the program.
To configure the camera one has to access the FMC I2C bus via the I2C multiplexer. The ZC702 uses two I2C level converters for SDA and SCL and one I2C multiplexer from TI that has 8 channels. So to implement the camera configuration, after initializing the I2C one has to first select the I2C bus channel and then address the camera itself.
The camera still has some issues when configured in RGB color. This still needs some troubleshooting, but it looks like the pixel stream is interlaced.
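The two-step access can be sketched against a fake bus that only records the traffic. Every number in this example is an assumption to be checked against the board and camera documentation: the multiplexer address (0x74), the camera address (0x21), the channel number and the register and value being written are placeholders. The one-hot channel select byte is how TI's 8-channel I2C switches are normally driven. All names are hypothetical.

```python
class FakeI2CBus:
    """Stand-in for the PS I2C controller; it just logs each write."""

    def __init__(self):
        self.log = []

    def write(self, addr, data):
        self.log.append((addr, bytes(data)))


MUX_ADDR = 0x74      # assumption: 7-bit address of the I2C multiplexer
CAMERA_ADDR = 0x21   # assumption: 7-bit address of the camera


def select_mux_channel(bus, channel):
    """Enable exactly one of the 8 downstream channels (one-hot byte)."""
    if not 0 <= channel < 8:
        raise ValueError("channel must be 0..7")
    bus.write(MUX_ADDR, [1 << channel])


def write_camera_reg(bus, channel, reg, value):
    """Select the FMC channel first, then write one camera register."""
    select_mux_channel(bus, channel)
    bus.write(CAMERA_ADDR, [reg, value])


bus = FakeI2CBus()
write_camera_reg(bus, 5, 0x12, 0x80)   # placeholder channel, register, value
```

Swapping `FakeI2CBus` for a thin wrapper around the real I2C driver keeps the configuration table testable on a PC.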
The camera has to be configured in RGB mode in this application, however the pixel stream has to be converted to YCbCr 4:2:2 at the end since that is the output format on the ZC702 development board.
Since the pixel stream has to cross into the video output clock domain, another asynchronous FIFO is used. Also, the AXIS to Video Out IP can only accept the 24-bit RGB pixel format, so another AXIS subset converter is used to reformat the pixels again from 16 bit to 24 bit.
I managed to find another monitor with a DVI interface. The final results are shown below.
A lot of time was also spent with the xfOpenCV framework. The Vivado 2019.1 version changes a lot of things, so care has to be taken to implement the application using the 2018.3 Vivado HLS version.
The main issue currently is determining the correct camera configuration settings. These are under NDA, so not much information is available for this specific camera, hence most of the testing time was spent peeking and poking the OVM7690 camera registers.