# FPGA: Waves: 4 Tinker, Taylor, Soldier, Sine

Posted by jc2048 in FPGA Group on Oct 14, 2019 7:56:00 AM |

Previous blogs:
FPGA: Making Waves
I have a small (Brevia 2)(Brevia 2) development board [1] from Lattice Semiconductor featuring one of their XP2-family FPGAs. I'm using it to explore, in a very simple, basic kind of way, digital signal generation and processing. These blogs aren't tutorials and there's no structured path to any of it: it's just me experimenting and trying things out.
The iterated sine oscillators that I looked at in Waves 3 are 'ballistic' in nature; that is, you launch them and hope they remain on course with the oscillation neither growing or shrinking. Curiously, at a fixed frequency, the two forms I looked at did maintain a constant amplitude, though the time I examined them for wouldn't suffice if you wanted to be sure of them operating for an extended period of time. It was very obvious, however, that changing the frequency of operation on the fly did affect the amplitude. Although it's possible to add some form of AGC [Automatic Gain Control] - the digital equivalent of what you'd do with a Wien bridge oscillator to stabilise the amplitude - that rapidly complicates something that was supposed to be very simple. A further problem with the iterated oscillators is that the control parameters are in terms of either sines or cosines, so changing frequency to anything other than fixed, pre-computed values starts to get complicated as well. As far as I can see, they are best suited to fixed-frequency tones, whether of short duration, or continuous if you can convince yourself of the long-term stability.
If you're interested in these kinds of oscillator, there's an interesting paper by Clay S. Turner [2] which analyses them in terms of a rotation matrix operating on a pair of variables. It gives the popular variants, with their advantages and disadvantages, and enough information to make up your very own unique oscillator if you've a mind to.
The most obvious alternative to an interative approach gets called 'phase accumulation'. Here the phase increment for each sample period is accumulated and the sine is then directly calculated from that. One simple and very popular way of doing that is via a sine look-up table, which is where I started my journey before I got diverted. Another way to calculate a sine, which was suggested in the comments of a previous blog, is via a CORDIC approach. That looks interesting and I may still try that but, for the moment, I'm going to try another idea that I came across and that is to calculate the sine using a Taylor's series (this one is really 'back to basics' - the maths is so far back I can't even remember being taught it).
On the surface this looks a bit crazy, with all the calculations that seem to be involved, but, because of the factorials, the series converges quickly (only for small numbers) so that not too many terms need to be used.
Programming books would steer you away from this one for general use for calculating sines because of the way the powers balloon quickly but, in this case, for use as an oscillator, I can work with the interval -pi/2 to +pi/2 and that keeps them to fairly modest values. A bit of work with a spreadsheet suggests that using six terms would result in a worst-case error of about half a bit for a 24-bit output (worst case error is when the sine is close to one). I won't achieve that, the way I'm going to do it with the 32-bit multipliers available in the DSP blocks in the FPGA and fixed-point arithmetic, but it will be more than good enough for the 12-bit DAC I'm using.
Apparently, calculating it is more efficient if you factor the series first using Horner's alorithm, but I like the simplicity of the basic form and so I'm going to try using that directly and see if I can get it working.
Here's how I've structured it:
Incidentally, today's apple is an Egremont Russet.
The first multiplier keeps multiplying the theta value (the phase of the sine wave at that sample) by itself to generate each power in turn [all the powers, not just the odd ones that we want]. The second multiplier then multiplies the result by a fractional coefficient which is the reciprocal of the factorial (from a small table). Finally, the results are added up to give the sine value. Skipping the even powers is achieved simply by having a value of zero for the particular coefficient. The alternating additions and subtractions come from working with signed numbers and building the required sign into the coefficient.
As a scheme, it's obviously not as efficient as it could be, but it has the virtue of simplicity. Two 32-bit multipliers, an adder and a small, 16-value coefficient table seems quite reasonable to me for a sine to almost 24-bit accuracy.
To accommodate the powers, my fixed point format needs to be able to handle numbers up to (pi/2)^11 which is around 144, so I have 8 bits ahead of the implied point like the top format here:
The coefficients in the table have a maximum value of 1, so I've stored them in the look-up table in the lower format. The multiplier outputs get mangled in order to cope with the shifts in the point placement.
Here's the VHDL I arrived at. Keep in mind I'm not a specialist in programmable logic and my coding style is probably a bit odd and quirky compared to the kind of thing a professional would come up with. Also, it hasn't been tested thoroughly nor had the output validated in any way. It's an experiment: use it for real at your peril!
------------------------------------------------------------------ -- ***** waves_taylor ***** -- -- Test of Taylor expansion sine calculation -- ------------------------------------------------------------------ -- JC 9th October 2019 -- ------------------------------------------------------------------ -- Rev Date Comments -- -- 01 09-Oct-2019 -- ------------------------------------------------------------------ library ieee; use ieee.std_logic_1164.all; use ieee.std_logic_arith.all; use ieee.std_logic_unsigned.all; entity wave_taylor_con is port( clk_in: in std_logic; --- system clock in (50 MHz oscillator) --- DAC connections mcp4821_ncs: out std_logic; --- DAC cs mcp4821_sck: out std_logic; --- DAC sck mcp4821_sdi: out std_logic; --- DAC sdi mcp4821_nldac: out std_logic; --- DAC load --- misc control signals on evaluation board that it might be good to hold at fixed levels spi_cs: out std_logic; --- hold_n: out std_logic; --- sram_cen: out std_logic; --- sram_oen: out std_logic; --- sram_wen: out std_logic; --- uart_tx: out std_logic); --- end wave_taylor_con; architecture arch_wave_taylor of wave_taylor_con is signal spi_send, spi_ncs, spi_ncs_del,taylor_reset,taylor_enable: std_logic; signal spi_output_sr_bit_count: std_logic_vector (5 downto 0); signal spi_output_sr: std_logic_vector (15 downto 0); signal interval_count: std_logic_vector (8 downto 0); signal acc_count: std_logic_vector (7 downto 0); signal taylor_count: std_logic_vector (3 downto 0); signal mult1_product: std_logic_vector (71 downto 0); signal mult2_product: std_logic_vector (71 downto 0); signal add1_product: std_logic_vector (35 downto 0); signal adder2_result: std_logic_vector (35 downto 0); signal theta: std_logic_vector (35 downto 0); signal phase_increment: std_logic_vector (35 downto 0); signal n1: std_logic_vector (35 downto 0); signal n2: std_logic_vector (35 downto 0); signal n3: std_logic_vector (35 downto 0); signal n4: std_logic_vector (35 downto 0); signal n5: std_logic_vector (35 downto 0); signal sine_value: std_logic_vector (35 downto 0); signal osc_reset: std_logic_vector (1 downto 0); signal coeff: std_logic_vector (35 downto 0); --- declare the multiplier, the adder, and the table modules as components component mult_module is port ( DataA: in std_logic_vector(35 downto 0); DataB: in std_logic_vector(35 downto 0); Result: out std_logic_vector(71 downto 0)); end component; component adder_module is port ( DataA: in std_logic_vector(35 downto 0); DataB: in std_logic_vector(35 downto 0); Result: out std_logic_vector(35 downto 0)); end component; component rom_module is port ( Address: in std_logic_vector(3 downto 0); Q: out std_logic_vector(35 downto 0)); end component; begin wave_taylor_stuff: process (clk_in) begin if (clk_in'event and clk_in='1') then --- interval_count counts at the clock rate (50MHz) --- counts down by 500, so spi_send occurs every 10us (100kHz) if (interval_count(8 downto 0) = b"000000000") then --- if zero interval_count(8 downto 0) <= b"111110011"; --- preset to 499 spi_send <= '1'; else interval_count(8 downto 0) <= interval_count(8 downto 0) - 1; --- count down spi_send <= '0'; end if; --- spi ncs goes low when triggered by spi_send, goes hi again when bitcount reaches 31 if (spi_send = '1') then spi_ncs <= '0'; elsif (spi_output_sr_bit_count(5 downto 0) = b"111111") then spi_ncs <= '1'; end if; spi_ncs_del <= spi_ncs; --- taylor reset occurs at end of spi cs if (spi_ncs = '1' and spi_ncs_del = '0') then taylor_reset <= '1'; else taylor_reset <= '0'; end if; --- taylor count runs through the series if (taylor_reset = '1') then taylor_enable <= '1'; elsif (taylor_count(3 downto 0) = b"1010") then taylor_enable <= '0'; end if; if (taylor_enable = '1') then taylor_count(3 downto 0) <= taylor_count(3 downto 0) + 1; else taylor_count(3 downto 0) <= b"0000"; end if; --- spi output bit count only counts when enable spi cs is low if (spi_ncs = '0') then spi_output_sr_bit_count(5 downto 0) <= spi_output_sr_bit_count(5 downto 0) + 1; else spi_output_sr_bit_count(5 downto 0) <= b"000000"; end if; --- phase accumulator if (spi_send = '1') then if (acc_count(7 downto 0) = b"0000000") then --- if zero acc_count(7 downto 0) <= b"01100100"; --- preset to 100 theta(35 downto 0) <= b"1_11111110_011011011110000001001010110"; --- set to -pi/2 else acc_count(7 downto 0) <= acc_count(7 downto 0) - 1; --- count down theta(35 downto 0) <= adder2_result(35 downto 0); end if; end if; --- taylor expansion if (taylor_reset = '1') then n2(35 downto 0) <= theta(35 downto 0); sine_value(35 downto 0) <= b"0_00000000_000000000000000000000000000"; elsif (taylor_enable = '1') then n2(35 downto 0) <= n1(35 downto 0); sine_value(35 downto 0) <= n4(35 downto 0); end if; --- spi output shift register if (spi_send = '1') then --- load... spi_output_sr(10 downto 0) <= sine_value(26 downto 16); --- spi_output_sr(11) <= not sine_value(35); --- inverse of sign bit spi_output_sr(15 downto 12) <= b"0011"; --- dac control bits elsif (spi_ncs = '0' and spi_output_sr_bit_count(1 downto 0) = b"11") then --- shift... spi_output_sr(15 downto 1) <= spi_output_sr(14 downto 0); --- the register contents spi_output_sr(0) <= '0'; end if; end if; phase_increment(35 downto 0) <= b"0_00000000_000010000000101011011111110"; --- pi/100 --- multiplier output connection mangling --- this corrects for the output shifts incurred by using fixed point binary representation --- (no point actually shifting it - we can simply just rewire it) n1(34 downto 0) <= mult1_product(61 downto 27); n1(35) <= mult1_product(71); n3(34 downto 0) <= mult2_product(68 downto 34); n3(35) <= mult2_product(71); --- mcp4821_ncs <= spi_ncs; mcp4821_sck <= spi_output_sr_bit_count(1); mcp4821_sdi <= spi_output_sr(15); mcp4821_nldac <= '0'; --- hold these device control pins at a fixed level to stop them flapping around spi_cs <= '1'; hold_n <= '1'; sram_cen <= '1'; sram_oen <= '1'; sram_wen <= '1'; uart_tx <= '1'; end process wave_taylor_stuff; --- instantiate the multipliers, adder-subtractor, and table mult_module1: mult_module port map( DataA => theta(35 downto 0), DataB => n2(35 downto 0), Result => mult1_product(71 downto 0)); mult_module2: mult_module port map( DataA => n2(35 downto 0), DataB => coeff(35 downto 0), Result => mult2_product(71 downto 0)); adder1: adder_module port map( DataA => sine_value(35 downto 0), DataB => n3(35 downto 0), Result => n4(35 downto 0)); adder2: adder_module port map( DataA => phase_increment(35 downto 0), DataB => theta(35 downto 0), Result => adder2_result(35 downto 0)); rom_module1: rom_module port map( Address => taylor_count(3 downto 0), Q => coeff(35 downto 0)); end arch_wave_taylor;
I used IPExpress to generate a signed 32-bit multiplier with no registers, a 32-bit signed adder with no registers, and a 32-bit wide distributed ROM table with 16 entries, which you'll see treated as components in the code. The ROM table was given this initialisation file - this is the coefficients in S1.34 binary format.
010000000000000000000000000000000000 000000000000000000000000000000000000 111101010101010101010101010101010101 000000000000000000000000000000000000 000000001000100010001000100010001001 000000000000000000000000000000000000 111111111111110010111111110011000000 000000000000000000000000000000000000 000000000000000000001011100011101111 000000000000000000000000000000000000 111111111111111111111111111001010010 000000000000000000000000000000000000 000000000000000000000000000000000000 000000000000000000000000000000000000 000000000000000000000000000000000000 000000000000000000000000000000000000
And here it is generating the fourth and first quadrants of a sinewave. I've got 200 steps from -pi/2 to +pi/2, so the time is 10uS x 200 = 2mS before it repeats. Generating a complete sinewave would then require using symmetry to get the values in the other two quadrants (and all the bother of manipulating the phase which is in terms of radians and more difficult to deal with than with a sine table which would normally have a power of 2 entries and thereby be very amenable to modulo arithmetic).
Using programmable logic in an FPGA to experiment with computational algorithms is very interesting. Casting things in a form suitable for the logic leads to a somewhat different mindset to how you'd look at them if you were computing using a processor, both from the point of view of the choice of algorithm but also in how it gets implemented.
For any of you who aren't British, the title is a rather convoluted pun on a pun. 'Tinker, tailor, soldier, sailor,...' is a traditional children's rhyme from centuries past. It was used last century as the title of a famous spy novel, 'Tinker, Tailor, Soldier, Spy', by the author John le Carre. And I'm now reusing it in the sense of tinkering with Taylor to work out a sine.
[1] http://www.latticesemi.com/en/Products/DevelopmentBoardsAndKits/LatticeXP2Brevia2DevelopmentKit.aspx |

## Comments