
FPGA Group

122 posts
I had a lot of interest in my giveaway for the Digilent Cmod S7 board, so I am offering another giveaway for two members who are interested in experimenting with a Spartan-6 FPGA. Let me tell you about the board first.   It's called the Cmod S6 -- a Breadboardable Spartan-6 FPGA Module.   It's small, featuring a 48-pin DIP form factor board built around a Xilinx Spartan-6 LX4 FPGA.   The board also includes a programming ROM, clock source, USB programming and data transfer circ ...
Sorting Networks - The Verification Problem   In this final post on parallel sorting using FPGAs I will talk about the difficult issue of how to verify that such a design actually works, that is, that given any set of N input numbers in arbitrary order the design outputs them in ascending order. There are two main verification strategies for hardware designs: formal proof and exhaustive testing. The first one uses logical mathematical methods to prove th ...
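As a rough illustration of the exhaustive-testing strategy mentioned in the excerpt above, the Python sketch below models a sorting network in software as a list of compare-exchange pairs and tries every ordering of N distinct values. The 4-input network, the function names and the software model itself are assumptions made for illustration; they are not taken from the post, which verifies a VHDL design.

```python
from itertools import permutations

# Hypothetical 4-input network: each pair (lo, hi) is one compare-exchange
# element that leaves the smaller value at index lo.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def run_network(network, values):
    """Apply the compare-exchange pairs in order and return the result."""
    x = list(values)
    for lo, hi in network:
        if x[lo] > x[hi]:
            x[lo], x[hi] = x[hi], x[lo]
    return x


def find_counterexample(network, n):
    """Return the first ordering of 0..n-1 that is not sorted, or None."""
    for p in permutations(range(n)):
        if run_network(network, p) != sorted(p):
            return p
    return None


if __name__ == "__main__":
    print(find_counterexample(NETWORK_4, 4))       # complete network
    print(find_counterexample(NETWORK_4[:-1], 4))  # last comparator removed
```

The number of orderings grows as N!, so this kind of brute-force check is only practical for small N.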
Sorting Networks - The results for the VHDL implementation of Batcher's sorting algorithm   So how good is this VHDL implementation of a parallel sorting network? Before we look at the results, here are the two modules that were missing from the previous post, starting with a generic DELAY module for the elements to be sorted:   library ieee;   use ieee.std_logic_1164.ALL;   use work.SORTER_PKG.all;   entity DELAY is   generic(SIZE:INTEGER:=1);   port(CLK:in STD_LOGIC; ...
A couple of months back I received a ZC702 Xilinx development board from Randall after pitching an idea on how to build an embedded vision project. The board arrived a couple of weeks later (thank you Randall) and I started working on the steps required to build an embedded vision application that targets lane detection. The main idea was to build a vision system capable of automatic lane detection. In a nutshell, I wanted to build a custom embedded vision app that would take images from ...
Sorting Networks - The VHDL implementation of Batcher's sorting algorithm   OK, enough with the preliminaries, it's time now for the real deal: how do we implement this parallel sorting network in an FPGA? The main design goals are a generic, reusable and efficient implementation. We want to be able to implement a sorting network of any size N with a single piece of code, we want to be able to sort all kinds of data, not just integers, and we want something close in size to ...
Sorting Networks - The Batcher or odd-even mergesort sorting algorithm   Now that we have defined the FPGA design engineering problem - parallel sorting of a set of N items in one clock, that is one new sorting operation starting every clock - we need to choose a sorting algorithm. While ideal in terms of performance (number of compare-exchange operations and latency), optimal sorting networks are irregular and, more importantly, they cannot be generated programmatically for an arbitr ...
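To make the idea of generating a network programmatically more concrete, here is a minimal Python sketch of the well-known recursive construction of Batcher's odd-even mergesort network. It assumes N is a power of two, and the function names are hypothetical; it is a software model for illustration, not the VHDL design described in these posts.

```python
def oddeven_merge(lo, hi, r):
    """Yield comparators that merge two sorted halves of indices lo..hi (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)      # even-indexed subsequence
        yield from oddeven_merge(lo + r, hi, step)  # odd-indexed subsequence
        for i in range(lo + r, hi - r, step):       # final compare-exchange row
            yield (i, i + r)
    else:
        yield (lo, lo + r)


def oddeven_merge_sort(lo, hi):
    """Yield comparators that sort indices lo..hi (inclusive)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort(lo, mid)
        yield from oddeven_merge_sort(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)


def batcher_network(n):
    """Return the comparator list for n inputs (n assumed to be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("this sketch only handles power-of-two sizes")
    return list(oddeven_merge_sort(0, n - 1))


def apply_network(network, values):
    """Run the comparators in order on a copy of values."""
    x = list(values)
    for a, b in network:
        if x[a] > x[b]:
            x[a], x[b] = x[b], x[a]
    return x


if __name__ == "__main__":
    net = batcher_network(8)
    print(len(net), "compare-exchange elements")
    print(apply_network(net, [5, 2, 7, 1, 8, 3, 6, 4]))
```

Because the comparator list depends only on N and never on the data, the same list can be laid out as fixed hardware, which is what makes this family of algorithms attractive for FPGAs.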
Sorting Networks   Sorting networks are an interesting and unsolved mathematical puzzle. They are quite different from the usual sorting algorithms one encounters in computer science like bubble sort, quick sort, merge sort, heap sort and so on. While sequential algorithms are generic, in the sense that they work on an input set of items of any size, their execution time grows with N, either as O(N·log₂N) for the fast algorithms or O(N²) for the more trivial ones and the ...
Here is the new release 0.1a of XXICC.  Rev 0.1a adds the return statement to GCHD (GalaxC for Hardware Design) which allows a hardware module to return a value without using an output port.  Rev 0.1a also adds GCHD comparison operators missing from earlier releases: x < y, x <= y, x > y, and x >= y.  Rev 0.1a also fixes some bugs, mostly involving n-bit integers.   XXICC (21st Century Co-design) is a not-for-profit research project which attempts to bring digit ...
Sorting and Searching Algorithms   Now that we have gone through a variety of VHDL design building blocks and seen the techniques for creating designs that are generic, reusable and at the same time efficient (in terms of speed and area), using either behavioral inference or primitive instantiations, it is time to put what we have learned into practice. We have already looked briefly at FIRs (Finite Impulse Response filters) in the context of introducing the DSP48 primitive; there is a lot more to be s ...
The DSP58 Primitive   Xilinx has recently announced a new 7nm FPGA family called Versal. Devices in this family will have an improved version of the UltraScale/UltraScale+ DSP48E2 primitive we studied in the last few posts. The new Versal primitive is called DSP58 and there are numerous improvements compared to the earlier DSP48.   First of all, the signed multiplier, which was 27x18 in DSP48, is now 27x24 and the 48-bit post-adder/accumulator is 58 bits, which is where the name of ...
The DSP48 Primitive - Complex multipliers   Traditionally a complex multiplication can be decomposed into four real multiplications and two additions:   x + i·y = (a + i·b)(c + i·s) = (a·c − b·s) + i·(a·s + b·c), where i = √(−1)   This maps well into four DSP48s, including the two additions, which can use the post-adders and the DSP48 P cascades. The latency of a fully pipelined DSP48 implementation is fo ...
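The decomposition above can be checked numerically with a few lines of Python. This is only an arithmetic sketch using the same symbols a, b, c and s as the formula; the function name is hypothetical and nothing here models DSP48 pipelining or bit widths.

```python
def complex_multiply(a, b, c, s):
    """Return (x, y) with x + i*y = (a + i*b) * (c + i*s).

    Uses four real multiplications and two additions/subtractions.
    """
    ac = a * c
    bs = b * s
    a_s = a * s
    bc = b * c
    x = ac - bs    # real part
    y = a_s + bc   # imaginary part
    return x, y


if __name__ == "__main__":
    x, y = complex_multiply(1, 2, 3, 4)
    # Cross-check against Python's built-in complex type.
    assert complex(x, y) == complex(1, 2) * complex(3, 4)
    print(x, y)
```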
The DSP48 Primitive - Inferring larger multipliers   The DSP48E2 primitive contains a signed 27x18 multiplier; any signed multiplier up to this size can be implemented with just one such primitive. If we need larger multipliers we can achieve that with multiple DSP48s.   The way larger multipliers are built uses a feature of the DSP48 primitive in which the 48-bit dedicated P cascade output of one DSP48 is right shifted by 17 bits before being added to the partial product calculate ...
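The 17-bit shift described above can be illustrated with plain integer arithmetic. The Python sketch below assumes one particular split, a signed 27x35 product built from two 27x18 partial products, and the function name is hypothetical; it models the arithmetic only and is not the VHDL or the exact configuration from the post.

```python
SHIFT = 17
LOW_MASK = (1 << SHIFT) - 1


def multiply_27x35(a, b):
    """Signed 27x35 multiply built from two 27x18 signed partial products."""
    assert -(1 << 26) <= a < (1 << 26), "a must fit in 27 signed bits"
    assert -(1 << 34) <= b < (1 << 34), "b must fit in 35 signed bits"

    b_low = b & LOW_MASK   # low 17 bits, unsigned (zero MSB on an 18-bit port)
    b_high = b >> SHIFT    # upper 18 bits, signed (arithmetic shift)

    # First stage: low partial product.
    p_low = a * b_low
    # Second stage: high partial product plus the first result shifted
    # right by 17 bits, as on the P cascade.
    p_high = a * b_high + (p_low >> SHIFT)

    # The low 17 bits of the first stage pass straight to the result.
    return (p_high << SHIFT) | (p_low & LOW_MASK)


if __name__ == "__main__":
    a, b = -12345678, 9876543210
    assert multiply_27x35(a, b) == a * b
    print(multiply_27x35(a, b))
```

The split works because b = b_high·2^17 + b_low, so the low partial product only needs to contribute its bits above position 17 to the second stage.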
The DSP48 Primitive - Small Multiplications - Two For the Price of One   The DSP48E2 primitive contains a signed 27x18 multiplier; any signed multiplier up to this size can be implemented with just one such primitive. Unsigned multiplications are of course possible if you add a zero MSB to the operands and then treat them as signed, but the largest unsigned multiplication that can be done that way with one DSP48E2 is 26x17.   When the operands are much smaller it becomes possib ...
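The 26x17 limit follows from how a signed port reads its most significant bit. The short Python sketch below, with a hypothetical helper name, interprets a bit pattern as a two's complement number of a given width to show why one bit of each port is lost to the zero MSB.

```python
def to_signed(pattern, width):
    """Interpret the low `width` bits of pattern as a two's complement value."""
    pattern &= (1 << width) - 1
    if pattern >> (width - 1):          # MSB set: value is negative
        return pattern - (1 << width)
    return pattern


if __name__ == "__main__":
    # Largest 26-bit unsigned value on the 27-bit port: MSB is 0, value kept.
    print(to_signed((1 << 26) - 1, 27))
    # A 27-bit unsigned value with its top bit set is read as negative.
    print(to_signed(1 << 26, 27))
    # Same reasoning on the 18-bit port: 17 unsigned bits are safe.
    print(to_signed((1 << 17) - 1, 18))
```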
From: https://thetinysynth.wordpress.com/ The Tiny Synth, with its synthesis core measuring only 1.5×1.5mm, is probably the smallest subtractive synthesizer out there. Based on the Artix7 device from the latest Xilinx FPGA family, it provides a total of seven oscillators, three LFOs (Low-Frequency Oscillators), two envelope generators, tremolo and vibrato effects and an SVF (State Variable Filter) with resonance and frequency control. The Artix7 device, the core of The Tiny Synth.   ...
The DSP48 Primitive - Wide XOR Mode   The DSP48 primitive can be used for more than just multiply and accumulate. It can, for example, implement very wide XOR functions. Apart from the obvious ability to XOR two 48-bit operands, using the A:B (A concatenated with B) and C inputs and producing a 48-bit result on the P output, which amounts to 48 XOR2 logic functions, it is also possible to implement 8 XOR12s, or 4 XOR24s, or 2 XOR48s or one XOR96 with a single DSP48E2. It is also possible to compute a ...
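The counts quoted above (8 XOR12s, 4 XOR24s, 2 XOR48s, one XOR96) all come from dividing 96 input bits into equal groups and XOR-reducing each group. The Python sketch below models only that counting argument: the function names are hypothetical, and the grouping of contiguous bits is an assumption made for illustration, not the actual bit mapping inside the DSP48E2.

```python
TOTAL_BITS = 96  # 48 bits from A:B plus 48 bits from C


def xor2_48(ab, c):
    """The plain case: 48 two-input XORs between A:B and C."""
    return (ab ^ c) & ((1 << 48) - 1)


def wide_xor(bits, group_width):
    """XOR-reduce each contiguous group of `group_width` bits.

    Returns one result bit per group, lowest group first.
    """
    if TOTAL_BITS % group_width:
        raise ValueError("group width must divide 96")
    mask = (1 << group_width) - 1
    results = []
    for g in range(TOTAL_BITS // group_width):
        chunk = (bits >> (g * group_width)) & mask
        results.append(bin(chunk).count("1") & 1)  # parity of the group
    return results


if __name__ == "__main__":
    word = (1 << 95) | 0b111
    for width in (12, 24, 48, 96):
        print(width, wide_xor(word, width))
```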
