11 Replies Latest reply on Sep 11, 2021 6:54 PM by scottiebabe

    The RP2040 PIO Module is Pretty Nifty

    scottiebabe

      I recently had an excuse to try writing a PIO program for the Pi Pico. Before that, I had only copy-pasted the WS2812B PIO state machine code.

       

      Although this may or may not be the greatest example, 4-bit interfacing with the HD44780 LCD controller does take a small amount of bit wrangling.

       

      As a brief refresher, in 4-bit mode you strobe in the upper 4 bits and then the lower 4 bits of each character in succession.
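
      A minimal sketch of that nibble split in plain Python. The helper name `split_nibbles` is made up for illustration and is not part of any LCD library:

```python
def split_nibbles(char_code):
    """Return the (high, low) 4-bit halves of an 8-bit character code."""
    high = (char_code >> 4) & 0x0F   # sent first, latched by the first EN strobe
    low = char_code & 0x0F           # sent second, latched by the second EN strobe
    return high, low

print(split_nibbles(ord("A")))  # 'A' is 0x41, so this prints (4, 1)
```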

       

      The enable line is driven as a sideset output from the PIO module. LCD data bits are driven as PIO output pins.

       

      import rp2
      from rp2 import PIO

      @rp2.asm_pio(out_shiftdir=PIO.SHIFT_RIGHT,sideset_init=(PIO.OUT_LOW),out_init=([PIO.OUT_LOW]*4),fifo_join=PIO.JOIN_TX)
      def HD44780():
          # Actual program body follows
          wrap_target()
          pull(block)              .side(0)     [0] # Load 32bit word from fifo into OSR
          set(y,3)                 .side(0)     [0] # Loop 4 times for 4 8bit chars
          label('charout')
          out(x,4)                 .side(0)     [0] # stash lower 4 char bits
          out(pins, 4)             .side(1)     [1] # Write high 4 char bits, Drive EN High
          nop()                    .side(0)     [1] # 2 cycles   
          mov(pins,x)              .side(1)     [1] # Write low 4 char bits, Drive EN High
          set(x,8)                 .side(0)     [0] # Loop (8+1) times
          label('chardelay')
          nop()                    .side(0)     [7] # 7+1 cycles
          jmp(x_dec,'chardelay')   .side(0)     [0] # 1 cycle
          jmp(y_dec,'charout')     .side(0)     [0] # 1 cycle
          wrap()                                    # 0 cycles
      

       

      I chose to clock the PIO state machine at 2 MHz, so each cycle is 500 ns. The ".side(x)" suffix defines the enable pin state, and the [n] suffix adds an additional n-cycle delay to the instruction execution time.
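
      As a rough check of those numbers, here is a small sketch that converts a [n] delay into wall-clock time. The names `SM_FREQ_HZ` and `instr_time_ns` are illustrative only; the sketch assumes the 2 MHz clock stated above:

```python
SM_FREQ_HZ = 2_000_000  # state machine clock chosen in the post

def instr_time_ns(delay=0, freq_hz=SM_FREQ_HZ):
    """Duration of one PIO instruction that carries a [delay] suffix."""
    cycles = 1 + delay  # one cycle to execute, plus `delay` idle cycles
    return cycles * 1_000_000_000 // freq_hz

print(instr_time_ns(0))  # plain instruction: 500 ns
print(instr_time_ns(1))  # .side(1) [1]: EN stays high for 1000 ns per nibble
```

      A 1 µs enable pulse should be comfortably longer than the minimum enable pulse width in the HD44780 datasheet, but check the figure for your supply voltage.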

       

      I chose to make full use of the FIFO by taking in 4 characters in each 32-bit word, but one could just as easily accept only the low 8 bits of the FIFO word as a single character.
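
      A sketch of how text could be packed to match that layout. It assumes that with out_shiftdir=PIO.SHIFT_RIGHT the least significant bits leave the OSR first, so the first character sits in bits 7..0 of each word. The helper `pack_chars` and its space padding are illustrative choices, not something from the original code:

```python
def pack_chars(text, pad=b" "):
    """Pack ASCII text into 32-bit words, first character in bits 7..0."""
    data = text.encode("ascii")
    data += pad * (-len(data) % 4)  # pad up to a multiple of 4 bytes
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(0, len(data), 4)]

words = pack_chars("Hello")
print([hex(w) for w in words])  # ['0x6c6c6548', '0x2020206f']
```

      Because the RP2040 is little-endian, a plain byte buffer padded to a multiple of 4 already has this layout when it is read as 32-bit words.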

       

      Even the character-to-character delay timing is taken care of by the PIO state machine, with an 81-cycle delay loop.
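
      The 81 cycles can be tallied from the program listing. The sketch below only counts cycles as written in the listing (function names are illustrative) and assumes the 2 MHz clock:

```python
SM_FREQ_HZ = 2_000_000

def delay_loop_cycles(x_init=8, nop_delay=7):
    """Cycles spent in the 'chardelay' loop."""
    iterations = x_init + 1                      # jmp(x_dec) falls through once x is 0
    cycles_per_iteration = (1 + nop_delay) + 1   # nop [7] plus the jmp
    return iterations * cycles_per_iteration

def char_cycles():
    """Cycles for one full pass of the 'charout' loop."""
    # out(x,4) + out(pins,4)[1] + nop[1] + mov(pins,x)[1] + set(x,8)
    strobe = 1 + 2 + 2 + 2 + 1
    return strobe + delay_loop_cycles() + 1      # + jmp(y_dec,'charout')

us_per_cycle = 1_000_000 / SM_FREQ_HZ
print(delay_loop_cycles(), delay_loop_cycles() * us_per_cycle)  # 81 40.5
print(char_cycles(), char_cycles() * us_per_cycle)              # 90 45.0
```

      So by this count the delay loop lasts 40.5 µs, and each character takes 45 µs in total.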

       

      There are a few other neat tricks for a transmit-only state machine, like using the input shift register as an additional scratch register or as an input parameter, as seen in the PWM example found at: https://github.com/raspberrypi/pico-micropython-examples/tree/master/pio

       

        • Re: The RP2040 PIO Module is Pretty Nifty
          Fred27

          scottiebabe  wrote:

          Although this may or may not be the greatest example...

          I disagree. This is exactly the sort of thing that the PIO is for, and a nice write up.

          • Re: The RP2040 PIO Module is Pretty Nifty
            michaelkellett

            Thanks for posting.

            This is the first post I've seen where someone makes the PIO do something with their own code.

            Pretty much everything else seems to use it like an AVR on an Arduino.

            Which toolchain/development host are you using?

             

            MK

              • Re: The RP2040 PIO Module is Pretty Nifty
                scottiebabe

                All of my explorations with the RP2040 have been in uPython.

                 

                After reading your post I searched to see if there is a PIO simulator and there does appear to be one: https://rp2040pio-docs.readthedocs.io/en/latest/

                 

                I haven't tried compiling uPython modules (.mpy) in C yet; that is still on my todo list. Another interesting feature of the chip is that you can keep an eye on program cache hits:

                 

                 

                On many microcontrollers the flash memory is on die, so the CPU access time is guaranteed. The RP2040 instead executes code from external flash through a cache, so access time depends on cache hits. Running uPython, the cache hits have been almost 100%, but it's something I am starting to think about. I suppose in C you may be able to define certain program elements, like ISRs, to always execute from SRAM.

              • Re: The RP2040 PIO Module is Pretty Nifty
                scottiebabe

                I probably spent more time soldering the wires to the LCD than I spent porting some of my old C code to MicroPython. So, I give RPI & MicroPython high marks here. However, what led me to try the PIO module is that bit-banging in MicroPython is really slow. If I were developing in C, I don't know that I would have jumped to the PIO module straight away, but in this case I was forced to try it and am really glad I did.

                 

                This was the ported code I ended up with:

                 

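                def LCDData(c):
                    LCD_RS.value(1)      # RS high: this byte is character data
                    LCD.DAT = c >> 4     # upper nibble first
                    LCD_EN.value(1)      # strobe EN
                    LCD_EN.value(0)
                    LCD.DAT = c          # then the lower nibble (bit-field masks to 4 bits)
                    LCD_EN.value(1)      # strobe EN again
                    LCD_EN.value(0)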
                

                 

                The nop delays were omitted, as Python is slow enough, and the LCD.DAT bit-field takes care of masking the value down to the 4 GPIO bits.

                 

                Here is a scope shot of the MicroPython code above executing. It's slow...

                 

                Here is the PIO module with lovely timing:

                 

                The character to character delay is also handled by the PIO module:

                 

                 

                Now, in Python, the only software overhead is re-arming the DMA channel for each "frame" update.

                  • Re: The RP2040 PIO Module is Pretty Nifty
                    javagoza

                    I don't know about the RP2040 or micropython, but I have a question. In those examples are you using threads for the LCD driver or is everything running from the main thread?

                    If everything is being executed from the main thread, it would make it much faster to execute the LCD driver from another thread.

                      • Re: The RP2040 PIO Module is Pretty Nifty
                        scottiebabe

                        In the first example, it took almost 200 us to execute the 7 lines of python.

                         

                        Your idea of offloading to another thread is, I believe, exactly what the RP2040 designers had in mind! In the second example, a second "thread" is running on the PIO state machine, which is essentially a very simple processor. It pulls data off the TX FIFO, which the main processor writes into.

                         

                        With the PIO module doing all of the bit fiddling, the only software overhead of the main thread is re-arming a DMA channel on a timer event:

                         

                        def framewrite(t):
                            mem32[CH0_CTRL_TRIG] = (1<<4)|(DREQ_PIO0_TX0<<15)|(2<<2) # 32bit transfers
                            mem32[CH0_TRANS_COUNT] = len(buf)>>2 # num words = (num bytes)/4
                            mem32[CH0_READ_ADDR] = ctypes.addressof(buf)
                            mem32[CH0_WRITE_ADDR] = 0x50200000+0x010 # PIO0 TXFIFO 0
                            mem32[CH0_CTRL_TRIG] |= 1 #enable DMA
                        
                        tdisp = Timer(period=50, callback=framewrite)
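
                        For reference, the control word written to CH0_CTRL_TRIG above can be spelled out with named fields. This is only a sketch: the constant names below are illustrative labels for bit positions in the RP2040 DMA control register, and `dma_ctrl_word` is a hypothetical helper, not part of MicroPython:

```python
# Bit positions in the RP2040 DMA CHx_CTRL_TRIG register
CTRL_EN = 1 << 0          # channel enable
DATA_SIZE_SHIFT = 2       # 0 = byte, 1 = halfword, 2 = 32-bit word
CTRL_INCR_READ = 1 << 4   # advance the read address after each transfer
TREQ_SEL_SHIFT = 15       # selects which DREQ paces the transfers
DREQ_PIO0_TX0 = 0         # request line of the PIO0 state machine 0 TX FIFO

def dma_ctrl_word(enable=False):
    """Build the same control word as framewrite(), field by field."""
    ctrl = (CTRL_INCR_READ
            | (DREQ_PIO0_TX0 << TREQ_SEL_SHIFT)
            | (2 << DATA_SIZE_SHIFT))
    if enable:
        ctrl |= CTRL_EN
    return ctrl

print(hex(dma_ctrl_word()), hex(dma_ctrl_word(enable=True)))  # 0x18 0x19
```

                        The write-increment bit is left clear, so every word goes to the same TX FIFO address while the read address walks through the buffer.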
                        
                    • Re: The RP2040 PIO Module is Pretty Nifty
                      neuromodulator

                      Great post, you convinced me to buy one