
FPGA Group

95 posts
Using the Carry-Save Adder, The Constant Coefficient Multiplier  Multiplications in Xilinx FPGAs are done using DSP48s, which are primitives that consist of a 25x18 signed multiplier, a 25-bit preadder and a 48-bit postadder/accumulator. In UltraScale/UltraScale+ FPGA families the signed multiplier is 27x18 and the postadder has three inputs instead of just two. Depending on the FPGA size and family there are hundreds to thousands of such DSP48 primitives that are able to do one multiply ...
Using the Carry-Save Adder, A Generic Adder Tree  In this post I will show how to implement an efficient and generic adder tree. We need to compute the sum of N elements, where N can be any value. The numbers we add are also arbitrary precision fixed point values, all of the same range but otherwise unconstrained.   We can represent the input data as an unconstrained array of unconstrained SFIXED, which requires VHDL-2008 support - with Vivado we can synthesize and implement this but we ...
Using the Carry-Save Adder, Computing a Running Average  I will show in the next few posts some design examples where using a 3-input carry-save adder instead of the normal 2-input ripple-carry adder makes a significant difference. The first example is a running average, where we have a stream of input samples and we want to compute a continuous running average every clock, as the average of the last N samples. In mathematical terms:   y(n)=1/N*Sum(x(n-k)), k=0..N-1   As a firs ...
This is a little test board I'm putting together to experiment with the MAX32660 processor and the Lattice UP5K Ultra Plus FPGA.   The UP5K is a tiny FPGA, available in a hand-solderable 48-pin, 0.5mm pitch QFN with 5k LUTs, 15k bytes of embedded block RAM and 128k bytes of SPRAM (slowish RAM in 32k byte blocks). It has 8 multipliers (16 x 16) and draws a static current of 75uA. They cost pocket money, about £5 each when bought a few at a time.   It's a volatile FPGA so it needs something ...
The Carry-Save Adder, two for the price of one  This post is about buying two adders but paying only for one of them.   When developing software, the CPU and memory your code is running on are already paid for and there is little incentive to optimize your code to make it either smaller or faster. But as a hardware designer you literally pay for every LUT and FF in the FPGA you are using. If you could make your design smaller and faster you could do more with the same FPGA or you could ...
This is an update on my ongoing N64 HDMI conversion project. The N64=>HDMI Conversion Project As of right now I have identified that the RCP (or the GPU) has common outputs between board revisions for the video and audio. There is a seven-bit word, Dsync, and a clock that contribute to the digital video signal. These are connected to a DAC that outputs an RGB signal. The plan is to tap these signals on their way to the DAC in order to reduce the complexity and latency of the FPGA design. I have ...
Counters, Adders and Accumulators  One of the most common operations encountered in digital hardware design, especially for digital signal processing applications, is addition. This actually covers a large group of fundamental building blocks, like up/down binary counters, adders/subtractors, comparators, accumulators and so on. The signal types operated on can be IEEE.numeric_std SIGNED/UNSIGNED for integer operands, the user defined SFIXED introduced earlier, or the default VHDL-2008 type ...
This is a continuation of this post: Custom Vivado Parts/Board Creation and this post: Nintendo 64 Schematic   I plan on adding to this blog and over time creating a documented progression of the project.   For those who did not take the time to go through that post (though you definitely should find the time), here are the cliff notes. I am a computer engineering student. My end project is to convert raw N64 GPU data (and audio) to HDMI. It will require I design my own board a ...
The Universal MUX Building Block Part 3, the one with the Dutch Cocoa Box and the Ouroboros  We have seen in the previous post that Vivado Synthesis is able to optimally infer a mux from behavioral code for multiplexers with up to 16 inputs, but beyond that not so much. The synthesis results are not bad, but for high performance designs where every LUT and especially every logic level counts, "not bad" is not good enough.   So in this post I will present a solution to this problem, that ...
The Universal MUX Building Block Part 2  So the question now is: what is the most efficient implementation one should expect for arbitrary size multiplexers? If the result the synthesis tool infers from behavioral code is equal or very close to this, there is no need for a specialized MUX Building Block. If the difference is significant then there will be a definite need for such a block, especially for designs with large muxes and/or many of them.   To simplify the analysis we will f ...
I have a single seven-segment display with a common anode. The display is marked FJS-5161B. Here is brief information from the datasheet for this display: I have created a kind of PMOD module which contains a single seven-segment display, eight 150 ohm resistors to limit the LED current, and a pinout socket. Here is a photo of this module: Here is the connection between the Pmod JD connector and the seven-segment display: V15 jd[0] -> D U12 jd[1] -> C V13 jd[2 ...
The Universal MUX Building Block  The next example in the series of generic building blocks is a multiplexer. This is a combinatorial block - if we need pipelining we can always add that separately to keep it as generic as possible - with an input port I of N elements, an UNSIGNED SEL port and an output port O, which is one of the N elements of the input port I selected by SEL. We want of course the I and SEL ports to be unconstrained arrays. The most generic solution would be one with a g ...
Introduction They may seem like an unlikely or odd couple: art and technology. Art implies the vast realm of unbridled creativity, while technology is bounded by empiricism: rules, standards and rationality. But when art and technology are interwoven, new vistas are revealed, new ideas are born, and new technologies with a distinctive twist are realized. In his Art of FPGA Design Series, Xilinx employee and designer Catalin Baetoniu breathes life into the finer points of designing with FPGAs. E ...
Have you ever wanted to integrate a Microchip PHY into the Xilinx Ecosystem, but previously had no proven reference designs available to mitigate risk factors? Recently Avnet released the Network FMC (http://avnet.me/fmc-network1), which is a dual Microchip Ethernet 10/100/1000 PHY Low Pin Count (LPC) FMC expansion module. I will take you through the design choices made in the development of the Network FMC expansion module and various lessons learned during this development process. ...
Instantiating LUT6 Primitives Part 2  Today I will show a couple of examples where LUT6 primitive instantiations make sense. To keep things short and simple these are somewhat artificial examples but situations like these tend to show up all the time in hardware designs. Let's say we need a 48-input AND function. This can be coded very easily behaviorally, especially if we take advantage of the new VHDL-2008 features:   library IEEE; use IEEE.STD_LOGIC_1164.all;  entity WideAN ...
