Implementing STM32 UART DMA receive turns serial reception from a blocking, CPU-intensive operation into a non-blocking background process with minimal CPU overhead. With the standard HAL_UART_Receive(), the CPU stalls until the data arrives or the timeout expires. With DMA-based UART receive, data lands in a buffer while the CPU handles other tasks, which is essential for robust embedded firmware. This tutorial covers a complete STM32 UART DMA receive implementation.
Table of Contents
- Why DMA for UART Receive?
- STM32CubeMX DMA Configuration
- Fixed-Length DMA Receive
- UART Idle Line Detection for Variable Data
- Circular Buffer DMA for Continuous Reception
- HAL Callbacks Explained
- Parsing Received Data
- Frequently Asked Questions
Why DMA for UART Receive?
Standard UART receive in STM32 can be done three ways:
- Polling (HAL_UART_Receive): CPU blocks and waits — worst approach for real firmware
- Interrupt (HAL_UART_Receive_IT): CPU interrupted per byte — overhead at high baud rates
- DMA (HAL_UART_Receive_DMA): DMA controller receives data into RAM with zero CPU involvement — best approach
At 115200 baud with 8N1 framing (10 bits on the wire per byte), data arrives at up to ~11,520 bytes/second. Without DMA, that is up to 11,520 interrupts per second stealing CPU cycles. With DMA, you get one interrupt when the buffer is half full and one when it is full, which is far more efficient.
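The arithmetic behind those figures can be sketched in a few lines of host-side C. This assumes 8N1 framing (1 start bit, 8 data bits, 1 stop bit) and a fully saturated line; the function names are illustrative and not part of the HAL.

```c
#include <stdint.h>

/* Bits on the wire per byte for 8N1 framing: start + 8 data + stop.
 * This is an assumption; parity or 2 stop bits would change it. */
#define BITS_PER_FRAME_8N1 10u

/* Bytes per second a UART delivers when the line is fully saturated. */
uint32_t uart_bytes_per_second(uint32_t baud, uint32_t bits_per_frame)
{
    return baud / bits_per_frame;
}

/* Interrupts per second in per-byte interrupt mode: one per byte. */
uint32_t irq_rate_per_byte_mode(uint32_t baud)
{
    return uart_bytes_per_second(baud, BITS_PER_FRAME_8N1);
}

/* Interrupts per second with DMA when both the half-transfer and
 * transfer-complete interrupts are enabled: two per buffer fill.
 * Integer division rounds down. */
uint32_t irq_rate_dma_mode(uint32_t baud, uint32_t buffer_size)
{
    return (2u * uart_bytes_per_second(baud, BITS_PER_FRAME_8N1)) / buffer_size;
}
```

With a 256-byte buffer at 115200 baud, irq_rate_dma_mode(115200, 256) works out to 90 interrupts per second, compared with 11,520 in per-byte interrupt mode.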
STM32CubeMX DMA Configuration
- Open your .ioc file in STM32CubeIDE
- Go to Connectivity → USART2 (or your UART)
- Enable Mode: Asynchronous, set baud rate
- Go to DMA Settings tab under USART2
- Click Add → select USART2_RX
- Direction: Peripheral To Memory
- Mode: Normal (for fixed-length) or Circular (for continuous)
- Data Width: Byte for both Peripheral and Memory
- Enable USART2 global interrupt in NVIC
- Generate Code
Fixed-Length DMA Receive
/* main.c - Receive exactly 10 bytes via DMA */
#include "main.h"
#include <string.h>

#define RX_BUFFER_SIZE 10

uint8_t rxBuffer[RX_BUFFER_SIZE];
volatile uint8_t rxComplete = 0;

/* Application-defined handler; this signature is an assumption */
void process_received_data(const uint8_t *data);

int main(void) {
    HAL_Init();
    SystemClock_Config();
    /* DMA must be initialized before the UART so the DMA clock is
       running when the UART init configures its RX stream */
    MX_DMA_Init();
    MX_USART2_UART_Init();

    /* Start receiving 10 bytes via DMA */
    HAL_UART_Receive_DMA(&huart2, rxBuffer, RX_BUFFER_SIZE);

    while (1) {
        if (rxComplete) {
            rxComplete = 0;
            /* Process rxBuffer here */
            process_received_data(rxBuffer);
            /* Re-arm DMA for next reception (Normal mode stops after one transfer) */
            HAL_UART_Receive_DMA(&huart2, rxBuffer, RX_BUFFER_SIZE);
        }
    }
}

/* Receive-complete callback - called from the DMA IRQ via HAL_DMA_IRQHandler() */
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart) {
    if (huart->Instance == USART2) {
        rxComplete = 1;
    }
}
UART Idle Line Detection for Variable Data
Fixed-length DMA works for known packet sizes. For variable-length data (such as GPS NMEA sentences or AT commands), combine DMA with the UART Idle Line interrupt:
#define RX_BUFFER_SIZE 256

uint8_t rxBuffer[RX_BUFFER_SIZE];
volatile uint16_t rxLength = 0;
volatile uint8_t rxDataReady = 0; /* Set in the ISR, cleared by the main loop */

/* Replaces the generated USART2_IRQHandler */
void USART2_IRQHandler(void) {
    /* Check for IDLE line interrupt */
    if (__HAL_UART_GET_FLAG(&huart2, UART_FLAG_IDLE)) {
        __HAL_UART_CLEAR_IDLEFLAG(&huart2);
        /* Calculate received bytes, then stop DMA */
        rxLength = RX_BUFFER_SIZE - __HAL_DMA_GET_COUNTER(huart2.hdmarx);
        HAL_UART_DMAStop(&huart2);
        /* Signal main loop */
        rxDataReady = 1;
        /* Restart DMA; rxBuffer is reused, so the main loop must
           consume the data before the next message arrives */
        HAL_UART_Receive_DMA(&huart2, rxBuffer, RX_BUFFER_SIZE);
    }
    HAL_UART_IRQHandler(&huart2);
}

/* Enable IDLE interrupt in setup (after UART init) */
void enable_uart_idle_interrupt(void) {
    __HAL_UART_ENABLE_IT(&huart2, UART_IT_IDLE);
}
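The length calculation in the handler above is simply "buffer size minus the DMA's remaining-transfer counter". The sketch below isolates that arithmetic in host-side C and copies the message into a separate buffer, so the main loop is not reading rxBuffer while a restarted DMA transfer writes into it. The parameter `remaining` stands in for the value __HAL_DMA_GET_COUNTER would return on target; the function name and the second buffer are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define RX_BUFFER_SIZE 256u

/* Copy the bytes DMA has written so far into a separate message buffer.
 * `remaining` is the DMA transfer counter: it starts at RX_BUFFER_SIZE
 * and counts down by one per received byte.
 * Returns the number of bytes copied. */
uint16_t snapshot_rx(const uint8_t *dma_buf, uint16_t remaining,
                     uint8_t *msg, uint16_t msg_size)
{
    if (remaining > RX_BUFFER_SIZE) {
        return 0; /* Counter out of range: treat as nothing received */
    }
    uint16_t received = (uint16_t)(RX_BUFFER_SIZE - remaining);
    if (received > msg_size) {
        received = msg_size; /* Truncate rather than overflow msg */
    }
    memcpy(msg, dma_buf, received);
    return received;
}
```

One way to use this would be to call it in the IDLE branch before restarting DMA, then hand `msg` and the returned length to the main loop.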
Circular Buffer DMA for Continuous Reception
For streaming data (GPS, IMU sensors), circular DMA is most efficient:
#define RX_RING_BUFFER_SIZE 512

uint8_t rxRingBuffer[RX_RING_BUFFER_SIZE];
uint16_t rxTail = 0; /* CPU read position */
/* The DMA write position (head) is derived from the DMA counter below */

/* Configure DMA as Circular in CubeMX, then: */
void start_circular_dma_receive(void) {
    HAL_UART_Receive_DMA(&huart2, rxRingBuffer, RX_RING_BUFFER_SIZE);
    /* DMA will loop indefinitely, overwriting old data */
}

/* Get current DMA write position (head) */
uint16_t get_rx_head(void) {
    return RX_RING_BUFFER_SIZE - __HAL_DMA_GET_COUNTER(huart2.hdmarx);
}

/* Read one byte from ring buffer; returns 1 on success, 0 if empty */
int read_byte(uint8_t *byte) {
    uint16_t head = get_rx_head();
    if (rxTail == head) return 0; /* No new data */
    *byte = rxRingBuffer[rxTail];
    rxTail = (rxTail + 1) % RX_RING_BUFFER_SIZE;
    return 1;
}

/* HAL_UART_RxHalfCpltCallback fires at 50% full */
/* HAL_UART_RxCpltCallback fires at 100% full */
/* Process data in main loop using read_byte() */
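Reading one byte at a time works, but a bulk read that handles the wrap-around in at most two copies is often more convenient. The following is a minimal host-side sketch of that idea, not part of the HAL: `head` is passed in as a parameter (on target it would come from get_rx_head()), and the function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define RX_RING_BUFFER_SIZE 512u

/* Copy up to `max` unread bytes from the ring buffer into `out`.
 * `head` is the DMA write index, `*tail` is the CPU read index and is
 * advanced by the number of bytes copied. Returns that number. */
uint16_t ring_read_bulk(const uint8_t *ring, uint16_t head, uint16_t *tail,
                        uint8_t *out, uint16_t max)
{
    /* Unread byte count, accounting for head having wrapped past the end */
    uint16_t avail = (uint16_t)((head + RX_RING_BUFFER_SIZE - *tail)
                                % RX_RING_BUFFER_SIZE);
    uint16_t n = (avail < max) ? avail : max;

    /* First segment: from tail up to the end of the buffer at most */
    uint16_t first = (uint16_t)(RX_RING_BUFFER_SIZE - *tail);
    if (first > n) {
        first = n;
    }
    memcpy(out, &ring[*tail], first);

    /* Second segment: the remainder, starting again at index 0 */
    memcpy(out + first, ring, (size_t)(n - first));

    *tail = (uint16_t)((*tail + n) % RX_RING_BUFFER_SIZE);
    return n;
}
```

Like read_byte(), this treats head equal to tail as "empty", so it cannot tell an empty buffer from one that DMA has lapped by exactly one full turn.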
HAL Callbacks Explained
- HAL_UART_RxCpltCallback: Called when entire DMA buffer is filled (Normal mode) or when DMA wraps around (Circular mode)
- HAL_UART_RxHalfCpltCallback: Called when first half of DMA buffer is filled — use this for double-buffering to process first half while DMA fills second half
- HAL_UART_ErrorCallback: Called on framing, overrun, or noise error — always implement this to re-arm DMA after errors
void HAL_UART_ErrorCallback(UART_HandleTypeDef *huart) {
    if (huart->Instance == USART2) {
        /* Clear error flags */
        __HAL_UART_CLEAR_OREFLAG(huart);
        __HAL_UART_CLEAR_NEFLAG(huart);
        __HAL_UART_CLEAR_FEFLAG(huart);
        /* Re-arm DMA */
        HAL_UART_Receive_DMA(huart, rxBuffer, RX_BUFFER_SIZE);
    }
}
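The double-buffering idea behind the half-complete and complete callbacks comes down to knowing which half of the buffer DMA is not currently writing. Here is a small host-side sketch of that mapping, assuming Circular mode; the enum, struct, and function names are hypothetical and not HAL types.

```c
#include <stdint.h>
#include <stddef.h>

#define RX_BUFFER_SIZE 256u

/* Which DMA event fired (hypothetical enum, not a HAL type) */
typedef enum { RX_EVENT_HALF, RX_EVENT_FULL } rx_event_t;

/* A contiguous region of the DMA buffer that is safe to read */
typedef struct {
    size_t offset;
    size_t length;
} rx_span_t;

/* Half-complete: DMA has moved on to the second half, so the first half
 * is stable. Complete: DMA has wrapped to the first half (Circular mode),
 * so the second half is stable. */
rx_span_t stable_span(rx_event_t event)
{
    rx_span_t span;
    span.length = RX_BUFFER_SIZE / 2u;
    span.offset = (event == RX_EVENT_HALF) ? 0u : RX_BUFFER_SIZE / 2u;
    return span;
}
```

On target, HAL_UART_RxHalfCpltCallback would record RX_EVENT_HALF and HAL_UART_RxCpltCallback would record RX_EVENT_FULL; the main loop then processes `span.length` bytes starting at `&rxBuffer[span.offset]`. Each half has to be consumed before DMA comes back around to it.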
Parsing Received Data
/* Example: Parse GPS NMEA sentences from circular DMA buffer */
char nmeaLine[128];
uint8_t lineIdx = 0;

void process_uart_data(void) {
    uint8_t byte;
    while (read_byte(&byte)) {
        if (byte == '\n') {
            nmeaLine[lineIdx] = 0; /* Null-terminate */
            if (nmeaLine[0] == '$') {
                parse_nmea(nmeaLine);
            }
            lineIdx = 0;
        } else if (byte != '\r' && lineIdx < sizeof(nmeaLine) - 1) {
            /* Skip CR: NMEA lines end in CR LF */
            nmeaLine[lineIdx++] = (char)byte;
        }
    }
}
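The parse_nmea() function is left undefined here. One step it would typically start with is checksum validation: an NMEA sentence ends in `*HH`, where HH is the hexadecimal XOR of every character between `$` and `*`. A minimal sketch, with a hypothetical function name:

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if `line` looks like "$<payload>*HH" and HH (two hex digits)
 * equals the XOR of every payload character; return 0 otherwise. */
int nmea_checksum_ok(const char *line)
{
    if (line[0] != '$') {
        return 0;
    }
    const char *star = strchr(line, '*');
    if (star == NULL) {
        return 0; /* No checksum field */
    }
    /* Require exactly two hex digits after '*'; the && short-circuits,
     * so star[2] is only read when star[1] is a valid digit. */
    if (!isxdigit((unsigned char)star[1]) ||
        !isxdigit((unsigned char)star[2])) {
        return 0;
    }

    /* XOR all characters strictly between '$' and '*' */
    uint8_t sum = 0;
    for (const char *p = line + 1; p < star; p++) {
        sum ^= (uint8_t)*p;
    }

    char hex[3] = { star[1], star[2], '\0' };
    unsigned long expected = strtoul(hex, NULL, 16);
    return sum == (uint8_t)expected;
}
```

Anything after the two hex digits, such as a trailing carriage return, is ignored.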
Frequently Asked Questions
Does STM32 UART DMA work with all UART peripherals?
Most STM32 UART/USART peripherals support DMA, but the specific DMA channels/streams connected to each UART depend on the STM32 variant. Check the DMA request mapping table in your STM32 reference manual. STM32CubeMX shows available DMA options automatically for each peripheral.
What happens if DMA buffer overflows before I read it?
In Normal mode, DMA stops when the buffer is full, so any further incoming bytes are lost. In Circular mode, DMA wraps around and overwrites unread data. Use a buffer large enough for your largest expected burst, and in Circular mode compare rxTail against the DMA write position to confirm you are reading fast enough.
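To put rough numbers on "large enough" and "fast enough", the sketch below estimates how long a saturated line takes to fill a buffer and how many bytes are currently unread. It assumes 10 bits per byte (8N1) when called as shown; both function names are hypothetical.

```c
#include <stdint.h>

/* Microseconds for a saturated UART to fill `buffer_size` bytes.
 * `bits_per_frame` is 10 for 8N1 (start + 8 data + stop). */
uint32_t buffer_fill_time_us(uint32_t buffer_size, uint32_t baud,
                             uint32_t bits_per_frame)
{
    /* 64-bit intermediate avoids overflow in the multiplication */
    uint64_t bits_us = (uint64_t)buffer_size * bits_per_frame * 1000000u;
    return (uint32_t)(bits_us / baud);
}

/* Unread bytes in a circular buffer given DMA write index `head`
 * and CPU read index `tail`, both in the range 0..size-1. */
uint16_t ring_unread(uint16_t head, uint16_t tail, uint16_t size)
{
    return (uint16_t)((head + size - tail) % size);
}
```

A 512-byte buffer at 115200 baud fills in about 44 ms, so the main loop needs to drain it at least that often. If the unread count regularly approaches the buffer size, the reader is close to being lapped.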
Why is UART Idle Line detection better than a fixed timeout for variable data?
Idle Line detection fires as soon as the UART line has stayed idle for one frame time after the last byte, which is about 87 microseconds at 115200 baud with 10 bits per frame. A software timeout of even 1 millisecond adds unnecessary latency. Idle Line detection is implemented in hardware, costs no CPU time while waiting, and is precise.
Can I use UART DMA with FreeRTOS on STM32?
Yes — use a FreeRTOS notification or semaphore in the DMA callback instead of a global volatile flag. The DMA callback runs in ISR context, so use FromISR variants: xTaskNotifyFromISR() or xSemaphoreGiveFromISR(). The task waiting for UART data blocks efficiently without wasting CPU.
How do I debug UART DMA issues on STM32?
Check that the DMA stream/channel interrupt is enabled in NVIC, that the DMA request mapping is correct for your UART, that __HAL_UART_ENABLE_IT is called before the DMA start (for Idle Line detection), and that the UART error callback is implemented. Use the Live Expressions view in STM32CubeIDE to monitor the DMA counter and buffer contents while debugging.