We know SPI as a 4*-wire protocol. But it doesn't have to be.

I'm checking high speed SPI data transfer with buffers, DMA and parallel data lines.

In this blog, I finally got DMA working.


Attention: prior knowledge is required. This blog expects that you have decent Hercules skills and that you can successfully replicate the mibSPI DMA example from HALCoGen RM57Lx Help.

Don't start with this project if that example isn't fully familiar to you. It would be a sure path to frustration.


Configuration of the SPI module in HALCoGen


This is identical to the buffered SPI setup, so use all the settings from part 2 - SPI with Buffers.

I'm using transfer group 3, a 16-bit, 64-position buffer. Baud rate is 27 MHz.
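To get a feel for what those numbers mean, here is a small sketch that works out how many bits one full buffer holds and the shortest time it can take on the wire at 27 MHz. The macro and function names are my own for illustration, not HALCoGen identifiers, and the result is a lower bound: it ignores chip-select timing, inter-word delays and DMA overhead.

```c
#include <stdint.h>

/* Values taken from the transfer group settings described above. */
#define BUFFER_POSITIONS   64u          /* positions in the transfer group */
#define BITS_PER_POSITION  16u          /* word length per position        */
#define SPI_CLOCK_HZ       27000000.0   /* configured baud rate            */

/* Total number of bits shifted out for one full buffer. */
uint32_t bits_per_buffer(void) {
    return BUFFER_POSITIONS * BITS_PER_POSITION;
}

/* Lower bound for one buffer transfer, in microseconds.
   Assumption: one bit per clock, no gaps between words. */
double min_transfer_time_us(void) {
    return (double)bits_per_buffer() / SPI_CLOCK_HZ * 1e6;
}
```

One buffer is 1024 bits, so the wire time per buffer is just under 38 microseconds in the ideal case.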

There are a number of changes to make to the memory settings and the linker file so that our RAM behaves correctly for DMA (shared mode).

These are documented in the mibSPI DMA example from HALCoGen RM57Lx Help (and this blog: https://www.hackster.io/jancumps/hercules-configure-memory-cache-for-dma-0945cd ).


/* buffer memory in special location for DMA / MibSPI */
#define D_SIZE      128

#pragma SET_DATA_SECTION(".sharedRAM")
#pragma DATA_ALIGN(TXDATA, 8); /* Note: It is very important to keep the data alignment on 64-bit for the DMA */
uint16_t TXDATA[D_SIZE/2];         /* transmit buffer in sys ram */
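If you want to convince yourself that a buffer really sits on a 64-bit boundary, a check like the one below can help. This is a host-side sketch, not the Hercules code: it uses C11 `_Alignas` as a stand-in for the TI-specific `DATA_ALIGN` pragma, and the names `TXDATA_SKETCH`, `is_aligned_64bit` and `tx_buffer` are hypothetical.

```c
#include <stdint.h>

#define D_SIZE 128   /* buffer size in bytes, as in the post */

/* Assumption: C11 _Alignas replaces the TI DATA_ALIGN pragma here,
   so the sketch also compiles with a desktop compiler. */
static _Alignas(8) uint16_t TXDATA_SKETCH[D_SIZE / 2];

/* Returns 1 when ptr sits on an 8-byte (64-bit) boundary, 0 otherwise. */
int is_aligned_64bit(const void *ptr) {
    return ((uintptr_t)ptr % 8u) == 0u;
}

/* Hypothetical accessor so the alignment can be checked from elsewhere. */
const uint16_t *tx_buffer(void) {
    return TXDATA_SKETCH;
}
```

On the target you could call the same check on `TXDATA` once at start-up and stop early if it fails.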


To make DMA work with the SPI module 3 of the launchpad, you need to know the request line to be used with this device.

The info can be found in the controller's technical reference document. We're using DMA channel 0, so the request line is 15.


void initDMA() {
       dmaReqAssign(0, 15 );


Next is to configure the DMA control structure. Here are a few highlights that differ from the HALCoGen example:


      g_dmaCTRLPKT.ELCNT     = dsize;             /* element count              */
      g_dmaCTRLPKT.ELDOFFSET = 4;                 /* element destination offset */
      g_dmaCTRLPKT.WRSIZE    = ACCESS_16_BIT;     /* write size                 */
      g_dmaCTRLPKT.ADDMODEWR = ADDR_OFFSET;       /* address mode write         */


dsize is the size of the block we're transferring. In this case that's 64 16-bit values, which is 128 bytes.

ELDOFFSET is 4 because we already have transfer groups 1 and 2 defined. They consume the start of SPI3's buffer.


We pass the structure to the DMA registers and prime the system:


       /* - setting dma control packets */

       /* - setting the dma channel to trigger on h/w request */
       dmaSetChEnable(DMA_CH0, DMA_HW);

       /* - configuring the mibspi dma , channel 0 , tx line -0 , rxline -1     */
       /* - refer to the device data sheet dma request source for mibspi tx/rx  */

       // first set the read mode
       TXDATA[0] = 0x03;
       // - enabling dma module


Then our program prepares the memory and hands things over to the SPI / DMA tandem:


void flashBitmapLineDMA(bitmap_t *bmp, uint32_t line) {

    _setWindow(0, line, bmp->width/2, line );
    loadBitmapInDMABuffer(bmp, line, 0);

    _setWindow(bmp->width/2, line, bmp->width, line );
    loadBitmapInDMABuffer(bmp, line, D_SIZE / 2);
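The body of loadBitmapInDMABuffer() isn't shown here. As a rough idea of what such a helper could do, the sketch below copies one buffer's worth of 16-bit pixels out of a bitmap row, starting at a given pixel offset. Everything in it is an assumption for illustration: the names `load_row_chunk`, `CHUNK_PIXELS` and the demo function are made up, and the real function takes a `bitmap_t` rather than a raw row pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define D_SIZE        128            /* buffer size in bytes, as in the post */
#define CHUNK_PIXELS  (D_SIZE / 2)   /* 64 16-bit pixels per transfer        */

/* Hypothetical helper: copy up to CHUNK_PIXELS pixels of one row into
   dma_buf, starting at pixel_offset. Returns the number of pixels copied
   and never reads past the end of the row. */
size_t load_row_chunk(const uint16_t *row, size_t row_width,
                      size_t pixel_offset, uint16_t *dma_buf) {
    if (pixel_offset >= row_width) {
        return 0;
    }
    size_t n = row_width - pixel_offset;
    if (n > CHUNK_PIXELS) {
        n = CHUNK_PIXELS;
    }
    memcpy(dma_buf, row + pixel_offset, n * sizeof(uint16_t));
    return n;
}

/* Demo: a 128-pixel row where pixel i holds the value i. Loads the
   second half and returns the first value that landed in the buffer. */
uint16_t demo_second_chunk_first_pixel(void) {
    uint16_t row[2 * CHUNK_PIXELS];
    uint16_t buf[CHUNK_PIXELS];
    for (size_t i = 0; i < 2 * CHUNK_PIXELS; i++) {
        row[i] = (uint16_t)i;
    }
    (void)load_row_chunk(row, 2 * CHUNK_PIXELS, CHUNK_PIXELS, buf);
    return buf[0];
}
```

With a 128-pixel row, the two calls in flashBitmapLineDMA() would then correspond to offsets 0 and 64, one buffer each.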



The function that handles the notification to the SPI module is simple:


void _writeData64DMA() {
    gioSetBit(_portDataCommand, _pinDataCommand, 1);
    mibspiTransfer(mibspiREG3, 2 );
    while(!(mibspiIsTransferComplete(mibspiREG3, 2))) {


You'll see that the function waits until the DMA transfer is complete. That makes things easy for my program, but it also takes away a possible advantage: doing something else while the transfer happens.
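One possible way to use that waiting time is to run small slices of other work between polls instead of spinning in an empty loop. The sketch below shows the pattern only. It does not touch the hardware: `sim_start_transfer` and `sim_transfer_complete` are hypothetical stand-ins for starting the transfer and for mibspiIsTransferComplete(), and the simulated transfer simply finishes after a fixed number of polls.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated transfer state: number of polls left until "complete". */
static uint32_t polls_remaining = 0;

/* Hypothetical stand-in for kicking off the SPI / DMA transfer. */
void sim_start_transfer(uint32_t polls_until_done) {
    polls_remaining = polls_until_done;
}

/* Hypothetical stand-in for the completion check. */
bool sim_transfer_complete(void) {
    if (polls_remaining > 0u) {
        polls_remaining--;
        return false;
    }
    return true;
}

/* Start a transfer and call do_work() (if given) once per polling round
   until the transfer reports complete. Returns the number of rounds
   that passed while the transfer was in flight. */
uint32_t transfer_with_background_work(uint32_t polls_until_done,
                                       void (*do_work)(void)) {
    uint32_t rounds = 0;
    sim_start_transfer(polls_until_done);
    while (!sim_transfer_complete()) {
        if (do_work != NULL) {
            do_work();
        }
        rounds++;
    }
    return rounds;
}
```

On the real device, the work slice would need to be short enough that the next buffer is ready by the time the transfer completes, otherwise the display update stalls anyway.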

So there's still room for improvement.

All in all, it's fast already:



The Series
0 - Buffers and Parallel Data Lines
1a - Buffers and DMA
1b - SPI without Buffers
2 - SPI with Buffers
3a - SPI with DMA
3b - SPI with DMA works
4a - SPI Master with DMA and Parallel Data Lines
Hercules microcontrollers, DMA and Memory Cache