The main target for Zynq family FPGAs is compute systems with hardware acceleration.

Its architecture focuses on streaming data efficiently between the ARM and FPGA submodules.

The FPGA can then perform manipulations in hardware that take too long in software.

The tool chain supports this. The Vitis HLS IDE's only goal is to convert C/C++ functions to FPGA IPs.

 

Hardware Accelerated Image Resize Algorithm

 

As a proof of concept, Xilinx adapted OpenCV so that you can build functions and filters in hardware.

I'm reviewing such an example: image resize.

This demo performs the same exercise twice, resizing an image from 3840 × 2160 to 1920 × 1080:

  • using the ARM processor to execute OpenCV resize calls
  • using the same function, implemented inside the FPGA fabric
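To make the workload concrete: going from 3840 × 2160 to 1920 × 1080 halves each dimension, so every output pixel corresponds to a 2 × 2 block of source pixels. The sketch below is a pure-Python illustration on a tiny greyscale image, assuming plain block averaging. It is not the interpolation that the OpenCV or xf::cv resize functions use, and downscale_by_two is a made-up helper name.

```python
def downscale_by_two(image):
    """Halve width and height by averaging each 2 x 2 block (integer division)."""
    rows, cols = len(image), len(image[0])
    assert rows % 2 == 0 and cols % 2 == 0, "sketch assumes even dimensions"
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) // 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# 4 x 4 greyscale test image (hypothetical values, not taken from the demo).
src = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
dst = downscale_by_two(src)
print(dst)

# Same ratio as the demo: four source pixels per output pixel.
src_pixels = 3840 * 2160
dst_pixels = 1920 * 1080
print(src_pixels, dst_pixels, src_pixels // dst_pixels)
```

The point is only the data volume: more than 8 million source pixels have to be read for each resize.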

 

The results:

In software, the resize took 1.03 s. In hardware, it took 210 ms.
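Put as a ratio, that is a speedup of roughly 4.9 times. A small sketch of the arithmetic, using only the two timings and the source image size mentioned in this post (the pixel throughput is my own derived figure, not something the notebooks report):

```python
software_s = 1.03    # ARM / OpenCV time reported in this post
hardware_s = 0.210   # FPGA fabric time reported in this post

speedup = software_s / hardware_s
src_pixels = 3840 * 2160  # source image size used by the demo

print(f"speedup: {speedup:.1f}x")
print(f"software: {src_pixels / software_s / 1e6:.1f} Mpixel/s")
print(f"hardware: {src_pixels / hardware_s / 1e6:.1f} Mpixel/s")
```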

Both tests run in a loop, and the fastest execution is taken as the benchmark. This accounts for optimisation, caching, incidental OS activity, and so on.

The measured time runs from the moment the original image is in memory to the moment the algorithm has finished writing the resized image to memory.
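The measuring approach can be sketched in plain Python. This is a minimal illustration of "run it several times, keep the fastest", not the code from the notebooks: best_of and workload are hypothetical names, and the workload is a stand-in for the resize call. The input data is prepared before the timer starts, so only the operation itself is measured.

```python
import time


def best_of(func, repeats=5):
    """Run func() `repeats` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


# Stand-in workload: the data already sits in memory before timing starts.
data = list(range(200_000))


def workload():
    return sum(x * x for x in data)


fastest = best_of(workload, repeats=5)
print(f"fastest of 5 runs: {fastest * 1000:.2f} ms")
```

Taking the minimum rather than the mean filters out runs that were slowed down by a cold cache or by other processes.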

 

[Screenshots: Resize executed in ARM processors | Resize executed in FPGA fabric]

The Accelerated Resize Function Design

 

This is done in Vitis HLS. The example uses the Xilinx OpenCV port and calls its resize() function.

The C++ code (abridged):

 

void resize_accel (axis_t *src, axis_t *dst, int src_rows, int src_cols, int dst_rows, int dst_cols) {
  // ...
  // HEIGHT/WIDTH and NEWHEIGHT/NEWWIDTH are compile-time template parameters;
  // the actual image sizes are passed in at run time.
  xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPC_T> src_mat(src_rows, src_cols);
  xf::cv::Mat<TYPE, NEWHEIGHT, NEWWIDTH, NPC_T> dst_mat(dst_rows, dst_cols);
  // ...
  // Convert the axis_t input into an xf::cv::Mat, resize, convert back.
  axis2xfMat(src, src_mat, src_rows, src_cols);
  xf::cv::resize<INTERPOLATION, TYPE, HEIGHT, WIDTH, NEWHEIGHT, NEWWIDTH, NPC_T, MAXDOWNSCALE>(src_mat, dst_mat);
  xfMat2axis(dst_mat, dst, dst_rows, dst_cols);
}

 

Vitis HLS converts this into an FPGA IP (both Verilog and VHDL source are generated).

 

This IP can then be used in Vivado.

 

The Accelerated Resize Function Used

 

You can use the generated IP like any other IP in Vivado. In the image below, it's the orange block.

Inputs and outputs are connected to the ARM side via the AXI interface.

Install on Pynq Board

 

This is straightforward.

Open a Linux terminal on the board. One is available from the Jupyter home page of your board, or you can connect with PuTTY or a similar client.

Then follow the Quick Start. The two notebooks with the ARM and FPGA implementations will become available.

Building the example from source

If you want to have the Vivado and Vitis HLS projects available in your 2020.1 install, that's possible.

 

Clone the PYNQ-HelloWorld git repository, including its submodules:

git clone --recursive https://github.com/Xilinx/PYNQ-HelloWorld.git

Start the Vivado 2020.1 Tcl shell.

 

Vivado Project

Navigate to the directory you just cloned, and move to the Vivado root for your Pynq board.

cd PYNQ-HelloWorld/boards/Pynq-Z2/resizer

Generate the Vivado project:

exec vivado -mode batch -source resizer.tcl -notrace

 

Vitis HLS Project

 

First (if you have a version before 2021.1), edit two headers of the Vitis HLS install: exception_ptr.h and nested_exception.h.

Mine were located in D:\Xilinx\Vitis\2020.1\win64\tools\clang\include\c++\4.5.2. The location depends on your version, and on whether you installed Vitis with Vivado or Vivado with Vitis.
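If the headers are somewhere else on your machine, a short search script can find them. This is a sketch under assumptions: find_headers is a hypothetical helper, and the install root in the example is simply the start of my own path, so adjust it for your system. An install may contain more than one copy of these headers; the ones to edit are the copies under the clang include directory.

```python
from pathlib import Path

HEADERS = ("exception_ptr.h", "nested_exception.h")


def find_headers(install_root, names=HEADERS):
    """Return {header name: sorted list of matching paths} found under install_root."""
    root = Path(install_root)
    return {name: sorted(root.rglob(name)) for name in names}


if __name__ == "__main__":
    # Hypothetical install root; replace it with your own.
    for name, paths in find_headers(r"D:\Xilinx\Vitis\2020.1").items():
        for path in paths:
            print(name, "->", path)
```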

 

exception_ptr.h

 

add the last two lines:

#ifndef _EXCEPTION_PTR_H
#define _EXCEPTION_PTR_H

#ifdef __clang__
    class type_info;
#endif

 

 

nested_exception.h

 

replace line 110 with:

    __throw_with_nested(_Ex&&, const nested_exception*) // modified

 

replace line 122 with:

    __throw_with_nested(_Ex&& __ex, const nested_exception*) // modified

 

Then generate the project and build the IP.

cd PYNQ-HelloWorld/boards/ip/hls/resize

make

 

You now have the sources and projects for the accelerated function and the Vivado FPGA design.

 

Pynq - Zynq - Vivado series
Add Pynq-Z2 board to Vivado
Learning Xilinx Zynq: port a Spartan 6 PWM example to Pynq
Learning Xilinx Zynq: use AXI with a VHDL example in Pynq
VHDL PWM generator with dead time: the design
Learning Xilinx Zynq: use AXI and MMIO with a VHDL example in Pynq
Learning Xilinx Zynq: port Rotary Decoder from Spartan 6 to Vivado and PYNQ
Learning Xilinx Zynq: FPGA based PWM generator with scroll wheel control
Learning Xilinx Zynq: use RAM design for Altera Cyclone on Vivado and PYNQ
Learning Xilinx Zynq: a Quadrature Oscillator - 2 implementations
Learning Xilinx Zynq: a Quadrature Oscillator - variable frequency
Learning Xilinx Zynq: Hardware Accelerated Software
Automate Repeatable Steps in Vivado
Learning Xilinx Zynq: Try to make my own Accelerated OpenCV Function - 1: Vitis HLS
Learning Xilinx Zynq: Try to make my own Accelerated OpenCV Function - 2: Vivado Block Design
Learning Xilinx Zynq: Logic Gates in Vivado
Learning Xilinx Zynq: Interrupt ARM from FPGA fabric
Learning Xilinx Zynq: reuse and combine components to build a multiplexer