In many microcontroller projects, you need to read and write data. This may mean reading data from a peripheral such as the ADC and writing the values to RAM. In another case, you may need to send chunks of data over SPI, which again means reading from RAM and continuously writing to the SPI data register. When the processor does this itself, you lose a significant amount of processing time. To avoid occupying the CPU, most advanced microcontrollers have a Direct Memory Access (DMA) controller. As its name says, DMA transfers data between memory locations without involving the CPU.
Low and medium density STM32 microcontrollers have a single 7-channel DMA unit, while high-density devices have two DMA controllers with 12 independent channels in total. The STM32VLDiscovery board carries an STM32F100RB microcontroller, which has a single DMA unit with 7 channels.
The DMA controller can perform automated memory-to-memory transfers, as well as peripheral-to-memory and peripheral-to-peripheral transfers. Each DMA channel can be assigned one of four priority levels: very high, high, medium, or low. If two channels with the same priority request service simultaneously, the channel with the lower number gets priority. A DMA channel can also be configured to transfer data into a circular buffer. All of this makes DMA an ideal solution for any peripheral data stream.
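The arbitration rule described above can be sketched in plain C. This is a host-side illustration only, not hardware code: the enum, the function name `dma_arbitrate`, and the 0-based channel indexing are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical priority levels, ordered so that a larger value wins. */
enum dma_priority {
    DMA_PRIO_LOW = 0,
    DMA_PRIO_MEDIUM,
    DMA_PRIO_HIGH,
    DMA_PRIO_VERY_HIGH
};

/* Pick the channel to serve next.
 * pending[ch] is non-zero when channel ch has a request waiting.
 * Returns the 0-based channel index, or -1 if nothing is pending. */
static int dma_arbitrate(const int pending[], const enum dma_priority prio[], int n)
{
    int winner = -1;
    for (int ch = 0; ch < n; ch++) {
        if (!pending[ch]) {
            continue;
        }
        /* Strict '>' keeps the lower-numbered channel when priorities tie. */
        if (winner < 0 || prio[ch] > prio[winner]) {
            winner = ch;
        }
    }
    return winner;
}
```

Because the loop walks channels from the lowest number upward and only replaces the winner on a strictly higher priority, ties resolve in favour of the lower channel number, as the text states.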
Speaking of physical DMA bus access, it is important to note that DMA only occupies the bus for the actual data transfer. The DMA request phase, address computation, and Ack pulse are performed while other DMA channels are using the bus, so when one channel finishes its bus transfer, another channel is already prepared to start immediately. This ensures minimal bus occupation and fast transfers. Another useful feature of DMA bus access is that it never occupies 100% of bus time. A single word transfer between memory locations takes 5 AHB bus cycles, and three of them are left free for CPU access. This means that DMA takes at most 2 of every 5 cycles, or 40% of bus time. So even if DMA is doing intense data transfers, the CPU can still access any memory area or peripheral. If you look at the block diagram, you will see that the CPU has a separate Ibus for Flash access, so program fetch isn’t affected by DMA.
Programming DMA controller
Simply speaking, programming DMA is relatively easy. Each channel is controlled using four registers: memory address, peripheral address, number of data, and configuration. In addition, all channels share two registers: the DMA interrupt status register and the interrupt flag clear register. Once set up, DMA takes care of memory address increments without disturbing the CPU. Each DMA channel can generate three interrupts: transfer complete, half transfer, and transfer error.
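To make the register model concrete, here is a small host-side simulation of one channel. It is only a sketch: the struct and function names are hypothetical, the fields loosely mirror the address and count registers described above, and the two flags stand in for the half-transfer and transfer-complete status bits. It does not touch real hardware.

```c
#include <stdint.h>

/* Hypothetical software model of one DMA channel (not real register access). */
typedef struct {
    const uint32_t *cpar;   /* "peripheral" address, used here as the source */
    uint32_t       *cmar;   /* memory address, used here as the destination  */
    uint16_t        cndtr;  /* number of data items still to transfer        */
    uint16_t        total;  /* programmed length, kept for the half flag     */
    int             half_flag;     /* models the half-transfer flag          */
    int             complete_flag; /* models the transfer-complete flag      */
} dma_channel_model;

/* Load the channel with source, destination, and item count. */
static void dma_model_setup(dma_channel_model *ch, const uint32_t *src,
                            uint32_t *dst, uint16_t count)
{
    ch->cpar = src;
    ch->cmar = dst;
    ch->cndtr = count;
    ch->total = count;
    ch->half_flag = 0;
    ch->complete_flag = 0;
}

/* Move one word, as the controller would on one granted request. */
static void dma_model_step(dma_channel_model *ch)
{
    if (ch->cndtr == 0) {
        return;                      /* nothing left to do */
    }
    *ch->cmar++ = *ch->cpar++;       /* both addresses auto-increment */
    ch->cndtr--;
    if (ch->cndtr == ch->total / 2) {
        ch->half_flag = 1;
    }
    if (ch->cndtr == 0) {
        ch->complete_flag = 1;
    }
}
```

On the real device the CPU only performs the setup step; the stepping happens in hardware, and the flags are read from the shared interrupt status register.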
As an example, let’s write a simple program that transfers data between two arrays. To make it more exciting, let’s do the same task using DMA and without it. Then we can compare the time taken in both cases.
Here is the code for a DMA memory-to-memory transfer:
First of all, we create two arrays: source and destination. Their length is set by ARRAYSIZE, which is 800 in our example.
We use the LED library from the previous tutorial; the LEDs indicate the start and end of the transfer in both modes, DMA and CPU. As we see in the code, we must turn on the DMA1 clock to make the controller functional. Then we start loading settings into DMA_InitStructure. For this example, we selected DMA1 Channel1, so first of all we call the DMA_DeInit(DMA1_Channel1) function, which makes sure the channel is reset to its default values. Then we turn on memory-to-memory mode and select normal DMA mode (we could also select circular buffer mode). As the priority, we assign Medium. Then we choose the size of the data to be transferred (32-bit word). This needs to be done for both the peripheral and the memory side.
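NOTE! The two data sizes do not have to match, but a mismatch does not repack the data. If, say, the source is 32-bit and the destination is 8-bit, the DMA on this family still performs one transfer per item and, according to the data-width table in the reference manual, writes only the least significant byte of each source word. Check that table before mixing sizes.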
Then we load the destination and source start addresses and the amount of data to be sent. After that, we apply these values using DMA_Init(DMA1_Channel1, &DMA_InitStructure). After this operation, DMA is prepared to do transfers. The transfer can be started at any time with the DMA_Cmd(DMA1_Channel1, ENABLE) command.
To catch the end of the DMA transfer, we enable the transfer complete interrupt on channel 1.
In the interrupt handler, we toggle the LED and set a status flag that signals the start of the CPU transfer test.
The CPU-based memory copy routine is simple:
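A minimal sketch of such a routine is shown here, assuming a plain word-by-word loop; the function name `cpu_copy_words` is chosen for illustration, while ARRAYSIZE follows the value given in the text.

```c
#include <stdint.h>
#include <stddef.h>

#define ARRAYSIZE 800  /* array length used in this example */

/* Copy `count` 32-bit words one at a time.
 * The CPU is busy for the whole loop, which is what DMA avoids.
 * `volatile` keeps the compiler from replacing the loop with a
 * library call, so the timing reflects the loop itself. */
static void cpu_copy_words(volatile uint32_t *dst,
                           const volatile uint32_t *src,
                           size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i];
    }
}
```

In the timing test, the LED would be switched on just before the call and off right after it, so the pulse width on the scope equals the copy time.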
Measuring DMA and CPU based transfer speeds
Since LEDG is connected to GPIOC pin 9 and LEDB is connected to GPIOC pin 8, we can track the start and end pulses using a scope:
Transferring 800 32-bit words using DMA took 214 μs:
The CPU memory copy routine took 544 μs:
This shows a significant increase in data transfer speed: DMA is about 2.5 times faster (544 μs / 214 μs ≈ 2.5). The most considerable benefit of DMA, though, is that the CPU is free during the transfer and can do other intensive tasks or go into sleep mode.
I hope this example gives an idea of DMA’s importance. With DMA, a lot of work can be done purely in hardware. We will get back to it when we get to other STM32 features like the ADC.
Download Sourcery G++ Lite Eclipse project files here: STM32DiscoveryDMA.zip
Previously we did a single conversion of one ADC channel. We waited for the ADC result in a loop, which isn’t an effective way of using processor resources. It is better to trigger a conversion and wait for the conversion complete interrupt. This way, the processor can do other tasks rather than wait for the ADC conversion to finish. This time we will go through another example that sets up more than one channel and reads ADC values using an interrupt service routine.
How does multichannel ADC conversion work?
If we need to convert several channels continuously, we need to set up the sequence registers (ADC_SQRx). There are three sequence registers, ADC_SQR1, ADC_SQR2, and ADC_SQR3, in which we can list a maximum of 16 conversions in any order. The conversion sequence starts with the SQ1[4:0] field in the ADC_SQR3 register. These five bits hold the number of the ADC channel to be converted first.
All 16 sequence slots are set up the same way across the SQR registers. In addition, the ADC_SQR1 register contains four bits marked L[3:0], which set the length of the sequence, that is, how many conversions it contains.
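The layout of the sequence registers can be illustrated with a small packing function. This is a host-side sketch under stated assumptions: slots SQ1 to SQ6 sit in ADC_SQR3, SQ7 to SQ12 in ADC_SQR2, and SQ13 to SQ16 in ADC_SQR1, each five bits wide, with L[3:0] at bits 23:20 of ADC_SQR1 holding the sequence length minus one, as laid out in the STM32F1 reference manual. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the three sequence register values. */
typedef struct {
    uint32_t sqr1;  /* SQ13..SQ16 plus the L[3:0] length field */
    uint32_t sqr2;  /* SQ7..SQ12 */
    uint32_t sqr3;  /* SQ1..SQ6  */
} adc_sqr_values;

/* Pack `count` channel numbers (1..16 entries, each 0..17) into
 * register values. Returns 0 on success, -1 on invalid input. */
static int adc_pack_sequence(const uint8_t *channels, unsigned count,
                             adc_sqr_values *out)
{
    if (count < 1 || count > 16) {
        return -1;
    }
    out->sqr1 = 0;
    out->sqr2 = 0;
    out->sqr3 = 0;
    for (unsigned rank = 0; rank < count; rank++) {
        if (channels[rank] > 17) {
            return -1;
        }
        /* Six 5-bit slots per register, lowest slot at bit 0. */
        uint32_t field = (uint32_t)channels[rank] << (5 * (rank % 6));
        if (rank < 6) {
            out->sqr3 |= field;
        } else if (rank < 12) {
            out->sqr2 |= field;
        } else {
            out->sqr1 |= field;
        }
    }
    /* L[3:0] is assumed to store the number of conversions minus one. */
    out->sqr1 |= (uint32_t)(count - 1) << 20;
    return 0;
}
```

On the target, the three values would be written to the corresponding ADC registers; the peripheral library does the equivalent packing internally when a channel is assigned to a rank.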
Another thing we have to take care of is the sample time for each channel. Each channel can be given a different sampling time. The sampling times are set in two registers, ADC_SMPR1 and ADC_SMPR2, with three bits for each channel.
If you use the standard peripheral library, setting up a multichannel ADC becomes an easy task.
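The three-bits-per-channel layout can be sketched the same way. The example assumes that ADC_SMPR2 covers channels 0 to 9 and ADC_SMPR1 covers channels 10 to 17, with the lowest channel of each register at bit 0, as in the STM32F1 reference manual. The function name is hypothetical, and `code` is the raw 3-bit value; the mapping from code to ADC clock cycles should be taken from the reference manual.

```c
#include <stdint.h>

/* Set the 3-bit sample-time code for one channel.
 * smpr2 holds channels 0..9, smpr1 holds channels 10..17 (assumed layout).
 * Returns 0 on success, -1 on invalid input. */
static int adc_set_sample_time(uint32_t *smpr1, uint32_t *smpr2,
                               unsigned channel, unsigned code)
{
    if (channel > 17 || code > 7) {
        return -1;
    }
    uint32_t *reg = (channel < 10) ? smpr2 : smpr1;
    unsigned shift = 3 * (channel % 10);

    /* Clear the old 3-bit field, then write the new code. */
    *reg = (*reg & ~(UINT32_C(7) << shift)) | ((uint32_t)code << shift);
    return 0;
}
```

Clearing the field before writing means the function can be called repeatedly on the same channel without leaving stale bits behind.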
Setting up multichannel ADC conversion with DMA write
Let’s write an example where we will read the first 8 ADC channels four times using scan mode. Then we calculate an average value of each channel and later print results on a terminal screen using UART.
We will write ADC values to memory by using a DMA channel. Once all data is stored in memory, a DMA transfer complete interrupt will be generated to trigger averaging and output. In the STM32F100x datasheet, we find that ADC pins are assigned alternate functions as follows:
- ADC1_IN0 – PA0
- ADC1_IN1 – PA1
- ADC1_IN2 – PA2
- ADC1_IN3 – PA3
- ADC1_IN4 – PA4
- ADC1_IN5 – PA5
- ADC1_IN6 – PA6
- ADC1_IN7 – PA7
- ADC1_IN8 – PB0
- ADC1_IN9 – PB1
- ADC1_IN10 – PC0
- ADC1_IN11 – PC1
- ADC1_IN12 – PC2
- ADC1_IN13 – PC3
- ADC1_IN14 – PC4
- ADC1_IN15 – PC5
We will need to set up pins PA0 to PA7 as analog inputs for the first eight channels. Then we can set up the ADC conversion mode. We also need to enable Scan Conversion Mode so the ADC goes through all channels selected in the ADC1_SQRx registers. In the peripheral library, this looks like:
Then we must enable continuous conversion mode, as we want to cycle through the channel list several times:
Then we indicate the number of channels to be converted in scan mode:
The next thing is to indicate which channels we need to convert and in what order. For this, we set up each channel individually with commands:
I’ve chosen to go through all eight channels in a row from 0 to 7, but you can reorder the channels as you like. What remains is to set up DMA so that it copies ADC values to memory on each EOC event. After DMA has copied a predefined number of values, it generates an interrupt. Then we can process the data as we like. In our example, we average the multiple readings of each channel.
This is the result on the terminal screen.
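The averaging step can be sketched as follows. The example assumes that DMA fills the buffer in conversion order, so the readings are interleaved: channels 0 to 7 of the first scan, then channels 0 to 7 of the second scan, and so on. The names `NUM_CHANNELS`, `NUM_SCANS`, and `average_channels` are chosen for illustration.

```c
#include <stdint.h>

#define NUM_CHANNELS 8  /* channels 0..7, as in this example     */
#define NUM_SCANS    4  /* each channel is converted four times  */

/* Average the interleaved DMA buffer into one value per channel.
 * Assumed layout: buffer[scan * NUM_CHANNELS + channel]. */
static void average_channels(const uint16_t buffer[NUM_CHANNELS * NUM_SCANS],
                             uint16_t averages[NUM_CHANNELS])
{
    for (unsigned ch = 0; ch < NUM_CHANNELS; ch++) {
        uint32_t sum = 0;  /* wide enough for four 12-bit readings */
        for (unsigned scan = 0; scan < NUM_SCANS; scan++) {
            sum += buffer[scan * NUM_CHANNELS + ch];
        }
        averages[ch] = (uint16_t)(sum / NUM_SCANS);  /* integer division */
    }
}
```

This function would be called once the DMA transfer complete interrupt has signalled that all 32 readings are in memory.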
You can hook up a potentiometer or any other analog sensor to each channel to see its ADC value.
Working C code of multichannel ADC
Here is the complete main source code if you would like to analyze or use fragments for your purposes:
Also, you can download project files [STM32DiscoveryADC_DMAmultiple.zip] that compile with Codebench GCC and Eclipse.