Multichannel Symmetric FIRs In the last two posts we have considered the case when the FPGA clock frequency is faster than the FIR sample rate. The ratio between the system clock and the data sample rate is called the overclocking factor M. We have seen that there are two ways to take advantage of this, either implement M identical FIR channels with K DSP48s, where the filter order N is K for non-symmetric FIRs, 2*K for even-symmetric ones and 2*K-1 for the odd-symmetric case, or implem ...

Multichannel and Overclocking FIRs - The Single Rate Symmetric Case In the last post I created an overclocked or semi-parallel implementation of a systolic, non-symmetric FIR, where each DSP48 in the chain implements M taps of the filter. The filter sample rate is M times slower than the system clock rate, but the device utilization is also M times smaller, N/M DSP48s instead of the normal N, where N is the filter order. Many times the FIR filter is symmetric and the DSP4 ...

Multichannel and Overclocking FIRs - The Single Rate non-Symmetric Case We are looking now at the case of the single rate FIR filter where the sample rate is a sub-multiple of the FPGA clock rate. For example, let's say that the input and output sample rates of our single rate FIR of order N=8 are 200Msps, but we know we can run our FPGA DSP48s and fabric at 800MHz. We can take advantage of the extra FPGA speed in two ways. We can either implement four such filters for the price o ...
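The arithmetic in this teaser can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the N/M DSP48 count is the semi-parallel figure quoted for the non-symmetric case, and rounding up when M does not divide N is an assumption rather than something the posts state.

```python
def overclocking_factor(clock_hz, sample_rate_hz):
    """Ratio M between the system clock and the data sample rate."""
    # The sample rate is taken to be a sub-multiple of the clock rate.
    if clock_hz % sample_rate_hz != 0:
        raise ValueError("sample rate must be a sub-multiple of the clock rate")
    return clock_hz // sample_rate_hz


def semi_parallel_dsp48s(order_n, m):
    """DSP48 count when each DSP48 implements m taps (non-symmetric case)."""
    # Ceiling division; rounding up is an assumption for when m does not divide N.
    return -(-order_n // m)


M = overclocking_factor(800_000_000, 200_000_000)
print(M)                           # 4
print(semi_parallel_dsp48s(8, M))  # 2
```

With the 200 Msps and 800 MHz figures from the example, M is 4, so an order N=8 non-symmetric filter would need 2 DSP48s instead of 8.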

Taking advantage of coefficient symmetry in Polyphase FIRs We have seen in previous posts that when the FIR coefficients are symmetric, we can use a DSP48 feature called a pre-adder and cut the number of multipliers required in half. Essentially, an FIR of order N can be implemented with N/2 DSP48s. Taking advantage of the filter symmetry is important, especially when the FIR is large or there are many instances of such filters in a design. The DSP48s are a scarce resource and reducin ...

Polyphase Decimators The Polyphase Decimator FIR is the dual structure of the Polyphase Interpolator. The basic idea is that you can reduce the sample rate of a signal by a factor of M if you keep only one out of every M samples. This only works if the signal being decimated has a limited bandwidth; otherwise we will get aliasing artifacts. The bandwidth is limited by low-pass filtering the input signal before decimation, with a prototype filter which is very similar to the one we encountered in ...
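As a rough illustration of the filter-then-discard idea, here is a minimal Python sketch. The function names are hypothetical, and the 4-tap unscaled moving average is only a placeholder for a real low-pass prototype filter. It also shows why decimation invites a cheaper structure: the outputs that get thrown away never need to be computed.

```python
def fir_filter(h, x):
    """Full-rate FIR: y[n] = sum_k h[k] * x[n-k], with x[i] = 0 for i < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def decimate_naive(h, x, m):
    """Filter at the input rate, then keep one out of every m outputs."""
    return fir_filter(h, x)[::m]


def decimate_efficient(h, x, m):
    """Compute only the outputs that survive decimation."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(0, len(x), m)]


h = [1, 1, 1, 1]           # placeholder low-pass (unscaled moving average)
x = list(range(1, 13))     # arbitrary integer test signal
assert decimate_naive(h, x, 3) == decimate_efficient(h, x, 3)
```

Both functions return the same samples, but the second does roughly 1/M of the multiplications, which is the saving a polyphase decimator exploits in hardware.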

Polyphase Interpolators In the previous post we looked at an important class of FIR filters, namely Polyphase architectures, which are extensively used for changing the sample rate of a signal by an integer factor, a process called interpolation or decimation. We have looked at the particular case where the system clock rate is equal to the lower sample rate, that is, the input rate for an interpolator and the output rate for a decimator. In that case, a Polyphase FIR that interpol ...

Polyphase FIRs The half-band FIR is just one particular case of a larger class of FIR filter implementations called polyphase structures. The basic idea is to split the sum of products we need to compute for every filter output sample into multiple sub-sums or phases, using the associativity property of addition. From a mathematical point of view, this would be expressed by the following formula: What this means is that we have M partial sums which are computed separately and t ...
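The splitting of the sum of products into M partial sums can be checked numerically with a short Python sketch. The function names, coefficients and test signal below are made up for illustration. The indexing (phase m takes coefficients h[m], h[m+M], h[m+2M], ...) is the usual polyphase grouping and is assumed here, since the formula itself is not reproduced in this excerpt.

```python
def fir_direct(h, x, n):
    """Direct sum of products: y[n] = sum_k h[k] * x[n-k], x[i] = 0 outside range."""
    return sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))


def fir_polyphase(h, x, n, m_phases):
    """Same output, computed as m_phases partial sums that are then added."""
    phases = []
    for m in range(m_phases):
        # Phase m uses every m_phases-th coefficient, starting at h[m].
        partial = sum(h[k] * x[n - k]
                      for k in range(m, len(h), m_phases)
                      if 0 <= n - k < len(x))
        phases.append(partial)
    return sum(phases), phases


h = [1, 2, 3, 4, 5, 6, 7, 8]             # arbitrary integer coefficients
x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3]    # arbitrary integer input
direct = [fir_direct(h, x, n) for n in range(len(x))]
poly = [fir_polyphase(h, x, n, 4)[0] for n in range(len(x))]
assert direct == poly
```

Because addition is associative, regrouping the terms this way leaves every output sample unchanged. Integer data is used so the comparison is exact.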

The Single Rate Half-Band FIR Decimator A decimating filter will reduce the sample rate of a signal, while preventing aliasing. Decimation by a factor of 2x is achieved by simply throwing out every second input sample. For this to work, the input data must first be filtered and the upper half of the frequency spectrum attenuated to a point where it will not affect the desired signal after decimation. The half-band FIR is ideally suited for this task. We start again with the same single r ...

The Single Rate Half-Band FIR Interpolator In the previous post we looked at the single rate half-band FIR, a particular type of odd-symmetric FIR, where almost half of the filter coefficients are zero. Not computing multiplications with these zero coefficients and also taking advantage of the symmetry reduces the number of multiplications per output sample by a factor of 4x. From a mathematical point of view, that filter looked like this, for the particular case where the filter ...

The Single Rate Half-Band FIR We started by looking at the most general version of an FIR filter. From a mathematical point of view, this is all that is needed. There are countless variations, like the symmetric versions, both odd and even, but they are just particular cases of the general FIR algorithm and are of little interest to a mathematician. But from an implementation point of view, these particular filter versions do matter. One such example is the half-band FIR, which wi ...

The Single Rate symmetric FIR, low latency transposed architecture The question we need to answer now is this: for those applications that require very low latency FIRs, is there a way to avoid the increase in latency proportional to the filter order that is characteristic of direct systolic implementations, both non-symmetric and even/odd-symmetric ones? We have already seen in Post 5 that the answer to this question for the non-symmetric FIR case was yes. By using the transposed FIR i ...

The Single Rate odd-symmetric FIR In the last post we examined the even-symmetric FIR, a filter of order N=2*K. The main conclusion was that we only need K DSP48s to implement such a filter, and we came up with a basic building block that is efficient in terms of device utilization, generic and scalable. We can build filters of virtually any size simply by cascading this block, with no speed degradation as the filter gets larger. We will now consider the odd case, whe ...

The Single Rate even-symmetric FIR We have looked so far at the simplest and most generic FIR possible, the single rate non-symmetric FIR filter. But many FIR filter implementations have more particular structures and taking advantage of these can improve the filter efficiency in terms of resource utilization. The simplest possible variation is coefficient symmetry. As a matter of fact, more than half of the FIRs you will ever encounter will be symmetric in one form or another, and the g ...

The Single Rate non-symmetric FIR, direct and transpose architectures As I mentioned earlier, the single rate non-symmetric FIR filter has two possible implementations, the direct and the transpose forms. We will now apply again the retiming and pipeline cut methods to derive the transpose architecture from the direct one and prove that the two are equivalent. We started from the mathematical equation of the FIR and we derived from it the direct form implementation, which is the f ...

Register pushing and the pipeline cut It should be clear by now that a direct implementation of the DSP algorithm is not good enough. Every single individual computation block, the adders and the multipliers, will require pipeline registers and there are simply none available. If we had these registers already available in the design, we could move them around to where we actually need them, a process called re-timing, or more informally, register pushing. There are several re-timing tra ...