How a microcontroller can read data at 1.6 Gbps

Good day! It had never happened before, and here we go again. Enough time has passed since my last article, and time brings new challenges. Where I used to transmit data at 100 Mbps, now I had to aim for 1600 Mbps...



In the title picture is the hero of our story: it was able to read data at that speed!







So, my next project required reading a 32-bit data stream at 50 MHz (which, by the way, is exactly that 1.6 Gbps) in an amount known in advance; let it be 10000 samples. It would be ideal to read it directly from a single port using DMA, but unfortunately I found no suitable processors (I hope someone corrects me in the comments): for some reason, all ports that are fast enough are 16-bit.
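The arithmetic behind the headline figure can be written down as a small sketch. The function names here are illustrative and not part of the project code.

```c
#include <stdint.h>

/* Raw throughput of a parallel bus in Mbit/s:
 * bus width in bits multiplied by the sample rate in MHz. */
uint32_t bus_throughput_mbps(uint32_t width_bits, uint32_t rate_mhz)
{
    return width_bits * rate_mhz;
}

/* Bytes needed to store `samples` words of `width_bits` bits each
 * (width_bits is assumed to be a multiple of 8). */
uint32_t capture_size_bytes(uint32_t samples, uint32_t width_bits)
{
    return samples * (width_bits / 8u);
}
```

For a 32-bit stream at 50 MHz the first function gives 1600, and 10000 such samples occupy 40000 bytes.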



But such a trifle will not stop us: we will read from two ports at once! True, in the general case this cannot always be done with the necessary degree of control and synchronism, but in our case things are not so bad: there is a signal after which the data is held on the port for 20 ns.



And since our processor is an STM32H750 running at 400 MHz, with the bus and timers at 200 MHz, everything should work out.



It would seem a simple case: trigger a single DMA transfer on a signal. But DMA has no such capability: a port can raise an interrupt, but it cannot control DMA. Our processor does have a nice feature, though: DMAMUX, which contains a request generator for a DMA channel. This generator offers only two suitable trigger options: the EXTIT0 interrupt line or the signal from the TIM12 timer (a strange choice on the part of the chip designers).



An interrupt will not make it in time: even an empty handler takes about 47 clock cycles, and our clock cycle is 2.5 ns...
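To put numbers on that, here is a rough sketch that compares the interrupt cost with the 20 ns window during which the data stays valid. The 47 cycles and the 400 MHz clock are the figures from the text; the helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Duration of `cycles` clock cycles in picoseconds for a clock given in MHz.
 * Picoseconds keep the 2.5 ns period exact in integer arithmetic. */
uint64_t cycles_to_ps(uint32_t cycles, uint32_t clock_mhz)
{
    return ((uint64_t)cycles * 1000000u) / clock_mhz;
}

/* True if `cycles` at `clock_mhz` fit inside a window of `window_ns`. */
bool fits_in_window(uint32_t cycles, uint32_t clock_mhz, uint32_t window_ns)
{
    return cycles_to_ps(cycles, clock_mhz) <= (uint64_t)window_ns * 1000u;
}
```

At 400 MHz, 47 cycles come to 117.5 ns, almost six times the 20 ns window, while the window itself is only 8 CPU cycles long.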



But the timer will make it. All that remains is to clock the timer from the external 100 MHz signal and set the timer period to 1. The TRGO output then triggers the DMAMUX generator, which issues a DMA request, and the DMA reads the port and writes the data to memory.
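Why a 100 MHz clock for 50 MHz data? A sketch, under the assumption that a timer length of 1 means an auto-reload value (ARR) of 1 with a prescaler (PSC) of 0: an STM32 timer generates an update event every (PSC + 1) × (ARR + 1) input clocks. The function name is illustrative.

```c
#include <stdint.h>

/* Update-event rate of an up-counting STM32 timer.
 * The counter runs from 0 to ARR inclusive, so one period is ARR + 1 ticks,
 * and each tick takes PSC + 1 input clocks. */
uint32_t timer_update_rate_hz(uint32_t input_clock_hz, uint32_t psc, uint32_t arr)
{
    return input_clock_hz / ((psc + 1u) * (arr + 1u));
}
```

Under that assumption, a 100 MHz input with PSC = 0 and ARR = 1 gives 50 MHz update events, one per data sample.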



But wait! The port is 16-bit, and we have 32 bits... Well, we can try to read a second port as well. Only that requires a second DMA channel, and it will occupy the same bus: we will manage to read in time, but we may not manage to write the data to memory. In theory, this processor has several types of memory, and the large block diagram of the processor shows that both the DMA and the RAM_D1 memory sit on the same bus, clocked at 200 MHz. It remains to verify this in practice.



    /* Clear all pending flags for DMA1 streams 0..3 */
    DMA1->LIFCR |= ~0;

    /* Stream 0: GPIOE -> data, 16-bit transfers, memory increment, highest priority */
    DMA1_Stream0->CR = (0b11 << DMA_SxCR_PL_Pos) | (0b01 << DMA_SxCR_MSIZE_Pos)
                     | (0b01 << DMA_SxCR_PSIZE_Pos) | DMA_SxCR_MINC;
    DMA1_Stream0->M0AR = (uint32_t) data;
    DMA1_Stream0->PAR = (uint32_t) &(GPIOE->IDR);
    DMA1_Stream0->NDTR = 10000;

    /* Stream 1: GPIOD -> data2, same settings */
    DMA1_Stream1->CR = (0b11 << DMA_SxCR_PL_Pos) | (0b01 << DMA_SxCR_MSIZE_Pos)
                     | (0b01 << DMA_SxCR_PSIZE_Pos) | DMA_SxCR_MINC;
    DMA1_Stream1->M0AR = (uint32_t) data2;
    DMA1_Stream1->PAR = (uint32_t) &(GPIOD->IDR);
    DMA1_Stream1->NDTR = 10000;

    /* DMAMUX channels 0 and 1 take their requests from generators 0 and 1 */
    DMAMUX1_Channel0->CCR = DMAMUX_CxCR_EGE | (1);
    DMAMUX1_Channel1->CCR = DMAMUX_CxCR_EGE | (2);

    /* Both generators fire on the rising edge of trigger input 7 (the TIM12 signal) */
    DMAMUX1_RequestGenerator0->RGCR = DMAMUX_RGxCR_GE | (0b01 << DMAMUX_RGxCR_GPOL_Pos) | (7);
    DMAMUX1_RequestGenerator1->RGCR = DMAMUX_RGxCR_GE | (0b01 << DMAMUX_RGxCR_GPOL_Pos) | (7);

    DMA1_Stream0->CR |= DMA_SxCR_EN;
    DMA1_Stream1->CR |= DMA_SxCR_EN;

    /* TIM12: channel 2 as input, update event drives TRGO */
    TIM12->CNT = 0;
    TIM12->CCMR1 |= TIM_CCMR1_CC2S_0;
    TIM12->CR2 = (0b010 << TIM_CR2_MMS_Pos);
    TIM12->CR1 |= TIM_CR1_CEN;

    /* Wait until stream 0 has transferred everything, then stop the timer */
    while (DMA1_Stream0->NDTR) i++;
    TIM12->CR1 &= ~TIM_CR1_CEN;
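Once both streams have finished, the two 16-bit captures still have to be combined into 32-bit words. A minimal sketch follows. Which port carries the upper half of the word is an assumption here (it depends on the wiring), and merge_halves is a made-up helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine two 16-bit captures into 32-bit words.
 * Assumption: `low` holds bits 0..15 and `high` holds bits 16..31
 * of each sample; swap the arguments if the wiring is the other way round. */
void merge_halves(const uint16_t *low, const uint16_t *high,
                  uint32_t *out, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        out[i] = ((uint32_t)high[i] << 16) | low[i];
    }
}
```

With the buffers from the code above, a call could look like merge_halves(data, data2, words, 10000), where words is a uint32_t array of at least 10000 elements.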
      
      





And of course, the data and data2 arrays must be placed in the right memory segment. This is done like this:



    __attribute__((section(".dma_buffer"))) uint16_t data[10240], data2[10240];
      
      





and in the linker script specify:



 .dma_buffer : { *(.dma_buffer) } >RAM_D1
      
      





As a check, and as a first attempt, plain copying with the CPU (400 MHz, after all) was implemented:



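    uint16_t *ptr = cpudata;
    /* IDR is a 32-bit register; only its lower 16 bits are read */
    volatile uint16_t *src = (volatile uint16_t *) &(GPIOE->IDR);
    volatile uint16_t *src2 = (volatile uint16_t *) &(GPIOD->IDR);
    /* Interleave the two ports: even elements from GPIOE, odd from GPIOD */
    for (register int i = 0; i < 10000; i++) {
        *ptr++ = *src;
        *ptr++ = *src2;
    }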
      
      





For the test, cpudata was placed in different memory regions. The fastest turned out to be DTCMRAM, which also runs at 400 MHz (though, admittedly, it is only 64K).



Results



The tests showed that the CPU can read from two ports at 12.5 MHz, and from one port at 25 MHz. So that option does not work...
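These figures translate into CPU cycles per sample with a back-of-the-envelope calculation. The sketch below only restates the measured rates; the function name is illustrative, and the rates are given in kHz so that 12.5 MHz stays an integer.

```c
#include <stdint.h>

/* CPU cycles per sample at a given sample rate.
 * Both arguments are in kHz; the result is rounded down. */
uint32_t cycles_per_sample(uint32_t cpu_khz, uint32_t sample_khz)
{
    return cpu_khz / sample_khz;
}
```

At 400 MHz, the measured 12.5 MHz corresponds to about 32 cycles per loop iteration for two ports and 25 MHz to about 16 cycles for one port, while the required 50 MHz leaves only 8 cycles per sample.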



With DMA, TIM12, and a few strong words, I managed to read successfully at 50 MHz, and several hours of testing did not produce a single error. Both ports were read, but I have not yet been able to measure how far the read on the second DMA stream lags behind...



So in my (slightly degenerate) case, I managed to transfer data into the STM32H750 at 32 x 50 = 1600 Mbps.


