I2S Audio Protocol Explained: Digital Sound for ESP32

Pairing the I2S audio protocol with the ESP32 enables high-quality digital audio in maker projects, far superior to the noisy PWM audio that beginners often start with. I2S (Inter-IC Sound) is a standard serial bus designed specifically to transfer digital audio data between ICs. The ESP32 has two dedicated hardware I2S peripherals that handle the timing in hardware, leaving your application code free to focus on the audio content rather than bit-banging. This guide explains I2S from signal-level fundamentals to working ESP32 code.

I2S Signal Lines Explained
I2S Timing and Data Format
Master vs Slave Configuration
ESP32 I2S Hardware Overview
I2S Audio Output: ESP32 to DAC
I2S Audio Input: Microphone to ESP32
Frequently Asked Questions

I2S Signal Lines Explained

I2S uses three signal lines (plus power and ground) to transfer stereo audio data between chips:

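BCLK (Bit Clock / Serial Clock / SCK): This clock signal runs at a frequency equal to sample_rate × bits_per_sample × channels, where bits_per_sample is the width of each channel slot on the wire. For CD-quality audio (44.1kHz, 16-bit, stereo): BCLK = 44,100 × 16 × 2 = 1.4112 MHz. One data bit is transferred per BCLK cycle: the transmitter changes the data line on the falling edge and the receiver samples it on the rising edge.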
WS (Word Select / LRCLK / Frame Select / LRC): This signal switches at exactly the sample rate (44.1kHz) to indicate which channel is being transmitted. WS = LOW during the left channel frame; WS = HIGH during the right channel frame (in standard I2S format).
SD/DATA (Serial Data): The actual audio samples as binary data, MSB (most significant bit) first. Left channel samples appear when WS = LOW, right channel when WS = HIGH.

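Some I2S implementations use a separate MCLK (Master Clock), an additional clock running at 256× or 512× the sample rate. Some codecs (WM8960, ES8388) require MCLK for their internal clocking. Others, such as the PCM5102, can derive their system clock from BCLK with an internal PLL and work without it. The original ESP32 can output MCLK only on GPIO0, GPIO1, or GPIO3.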
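The clock arithmetic above can be sketched as a pair of helper functions. This is a minimal illustration rather than part of any ESP32 API: the names bclkHz and mclkHz are hypothetical, and the MCLK multiplier should come from your DAC's datasheet (256 is used here only as an example default).

```cpp
#include <cstdint>

// Hypothetical helper: bit clock in Hz for a given sample rate,
// slot width (bits per channel on the wire), and channel count.
constexpr uint32_t bclkHz(uint32_t sampleRate, uint32_t bitsPerSlot, uint32_t channels = 2) {
  return sampleRate * bitsPerSlot * channels;
}

// Hypothetical helper: master clock in Hz as a multiple of the sample rate
// (commonly 256 or 512, depending on the codec).
constexpr uint32_t mclkHz(uint32_t sampleRate, uint32_t multiplier = 256) {
  return sampleRate * multiplier;
}
```

For CD-quality audio, bclkHz(44100, 16) gives 1,411,200 Hz and mclkHz(44100) gives 11,289,600 Hz. A 16 kHz microphone stream in 32-bit slots needs bclkHz(16000, 32) = 1,024,000 Hz.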

Recommended: INMP441 MEMS Omnidirectional Microphone Module for ESP32 — I2S digital microphone that connects directly to ESP32 I2S input pins for high-quality audio capture.

I2S Timing and Data Format

The standard I2S frame format (Philips I2S standard) works as follows:

Left channel data is transmitted during the WS LOW phase
The MSB is sent on the second BCLK pulse after the WS transition (1 BCLK delay)
After all data bits are sent, remaining bit positions are zero-padded

For a 16-bit sample at 44.1kHz:

// I2S Frame Timing (16-bit stereo, 44.1kHz)
// BCLK frequency = 44100 Hz × 32 = 1,411,200 Hz (1.41 MHz)
// (32 BCLK cycles per WS period: 16 bits left + 16 bits right)
//
// WS:    ‾‾‾\________________/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\___
//             Left channel     Right channel
//             (WS LOW)         (WS HIGH)
//
// DATA:       MSB → LSB        MSB → LSB
//             (16 bits)        (16 bits)
//
// With 32-bit slots (64 BCLK cycles per WS period), each 16-bit
// sample is followed by 16 zero bits before the next WS transition.

// Common I2S format variations:
// Standard I2S (Philips): 1 BCLK delay after WS change
// Left Justified: MSB immediately after WS change
// Right Justified (Sony): LSB on last BCLK before WS change
// DSP Mode / PCM Mode: Used for TDM multi-channel
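To make the one-BCLK delay of the Philips format concrete, the sketch below models a single stereo frame as a list of BCLK cycles, each holding the WS level and the SD bit. It is a host-side illustration only: I2SCycle and philipsFrame are hypothetical names, samples are assumed to be 16-bit, and slotBits is assumed to be 16 or larger.

```cpp
#include <cstdint>
#include <vector>

// One BCLK cycle of the bus: the WS level and the SD bit during that cycle.
struct I2SCycle {
  int ws;  // 0 = left channel, 1 = right channel
  int sd;  // data bit
};

// Hypothetical model of one Philips-format stereo frame with 16-bit samples.
// slotBits is the number of BCLK cycles per channel (assumed >= 16).
// Index 0 is the cycle that carries the left channel's MSB.
std::vector<I2SCycle> philipsFrame(int16_t left, int16_t right, int slotBits) {
  const int total = 2 * slotBits;
  std::vector<I2SCycle> frame(total);
  const uint16_t words[2] = {static_cast<uint16_t>(left), static_cast<uint16_t>(right)};
  for (int k = 0; k < total; ++k) {
    const int channel = k / slotBits;  // slot this data bit belongs to
    const int bit = k % slotBits;      // position within the slot, 0 = MSB
    // MSB first; positions past the 16 data bits are zero padding.
    frame[k].sd = (bit < 16) ? ((words[channel] >> (15 - bit)) & 1) : 0;
    // WS leads the data by one BCLK, so it already shows the channel
    // of the next cycle.
    frame[k].ws = ((k + 1) % total) / slotBits;
  }
  return frame;
}
```

With slotBits = 16, the entry at index 15 carries the left channel's LSB while WS has already gone HIGH for the right channel. In the left-justified format, WS and the MSB would change in the same cycle.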

Master vs Slave Configuration

In an I2S connection, one side generates the clocks (BCLK and WS) — this is the master. The other side (the slave) receives the clocks and synchronises its data transmission/reception to them.

ESP32 as master (most common): The ESP32 generates BCLK and WS and drives them to the DAC/ADC chip. The DAC (MAX98357A, PCM5102) receives the clocks and data and outputs analog audio. Use this for audio playback applications.
ESP32 as slave: An external clock source (an audio codec or DSP) drives BCLK and WS to the ESP32, which sends or receives audio data synchronised to that clock. Use this when integrating the ESP32 into an existing audio system.
Microphone (INMP441) as slave: The INMP441 always operates as an I2S slave. The ESP32 master provides BCLK and WS, and the INMP441 sends audio data. For INMP441 input, configure the ESP32 as the master.

ESP32 I2S Hardware Overview

The ESP32 has two hardware I2S controllers (I2S0 and I2S1), each capable of:

Simultaneous transmit (to speaker DAC) and receive (from microphone)
Sample rates from 1 kHz to 96 kHz
8, 16, 24, or 32 bits per sample
DMA (Direct Memory Access) transfer — audio data moves directly between RAM and I2S peripheral without CPU intervention, enabling real-time audio processing
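Configurable pin mapping: BCLK, WS, and DATA can be routed to almost any GPIO through the GPIO matrix. On the original ESP32, GPIO34–39 are input-only, so they can carry data in but not a clock or data out, and GPIO6–11 are normally reserved for the SPI flash.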
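The pin restrictions can be checked before a configuration is applied. The helpers below are a hypothetical sketch for the original ESP32 only (not the S3 or C3). They encode three facts: some GPIO numbers are not bonded out, GPIO6–11 are wired to the SPI flash on most modules, and GPIO34–39 have no output driver. Strapping-pin caveats are deliberately left out.

```cpp
// Hypothetical pin checks for the original ESP32 (not ESP32-S3 or ESP32-C3).
constexpr bool gpioExists(int gpio) {
  // Available GPIO numbers: 0-19, 21-23, 25-27, 32-39.
  return (gpio >= 0 && gpio <= 19) || (gpio >= 21 && gpio <= 23) ||
         (gpio >= 25 && gpio <= 27) || (gpio >= 32 && gpio <= 39);
}
constexpr bool isFlashPin(int gpio) { return gpio >= 6 && gpio <= 11; }
constexpr bool isInputOnly(int gpio) { return gpio >= 34 && gpio <= 39; }

// BCLK and WS (in master mode) and data out must be driven by the ESP32.
constexpr bool canUseAsI2SOutput(int gpio) {
  return gpioExists(gpio) && !isFlashPin(gpio) && !isInputOnly(gpio);
}

// Data in only needs an input buffer, so the input-only pins are acceptable.
constexpr bool canUseAsI2SInput(int gpio) {
  return gpioExists(gpio) && !isFlashPin(gpio);
}
```

For example, canUseAsI2SOutput(26) is true, while canUseAsI2SOutput(34) is false even though canUseAsI2SInput(34) is true.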

The ESP32-S3 (newer variant) has an enhanced I2S controller that adds TDM (Time Division Multiplex) for multi-microphone arrays. It also supports PDM (Pulse Density Modulation) for direct connection to PDM microphones, which the original ESP32 offers on I2S0 only.

I2S Audio Output: ESP32 to DAC

// ESP32 I2S Output Example: Generate a 1kHz sine wave
// Uses the legacy driver/i2s.h API, which ESP-IDF 5.x deprecates in favour of driver/i2s_std.h
#include <driver/i2s.h>
#include <math.h>

#define I2S_NUM         I2S_NUM_0
#define SAMPLE_RATE     44100
#define SAMPLE_BITS     16
#define I2S_BCLK_PIN    26
#define I2S_WS_PIN      25
#define I2S_DOUT_PIN    22

void i2s_init() {
  i2s_config_t config = {
    .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_TX),
    .sample_rate = SAMPLE_RATE,
    .bits_per_sample = (i2s_bits_per_sample_t)SAMPLE_BITS,
    .channel_format = I2S_CHANNEL_FMT_RIGHT_LEFT,
    .communication_format = I2S_COMM_FORMAT_STAND_I2S,
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
    .dma_buf_count = 8,
    .dma_buf_len = 64,
    .use_apll = true,            // Use Audio PLL for accurate clock
    .tx_desc_auto_clear = true
  };
  i2s_pin_config_t pins = {
    .bck_io_num   = I2S_BCLK_PIN,
    .ws_io_num    = I2S_WS_PIN,
    .data_out_num = I2S_DOUT_PIN,
    .data_in_num  = I2S_PIN_NO_CHANGE
  };
  i2s_driver_install(I2S_NUM, &config, 0, NULL);
  i2s_set_pin(I2S_NUM, &pins);
}

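// 441 frames hold exactly 10 cycles of a 1 kHz tone at 44100 Hz,
// so the buffer repeats without a phase jump
// (44 frames per cycle would give about 1002 Hz instead)
const int SINE_FRAMES = 441;
const int SINE_CYCLES = 10;
int16_t sine_wave[SINE_FRAMES * 2];  // Stereo: L + R pairs

void setup() {
  // Build the table once instead of recomputing it on every loop() pass
  for (int i = 0; i < SINE_FRAMES; i++) {
    int16_t sample = (int16_t)(32767.0 * sin(2.0 * M_PI * SINE_CYCLES * i / SINE_FRAMES));
    sine_wave[i * 2]     = sample;  // Left
    sine_wave[i * 2 + 1] = sample;  // Right
  }
  i2s_init();
}

void loop() {
  // i2s_write blocks until the DMA buffers have room,
  // which paces the loop at the sample rate
  size_t bytesWritten;
  i2s_write(I2S_NUM, sine_wave, sizeof(sine_wave), &bytesWritten, portMAX_DELAY);
}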
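A fixed lookup table only loops cleanly when a whole number of cycles fits in it. For arbitrary frequencies, a phase accumulator is a common alternative. The function below is a hypothetical helper and not part of the I2S driver. It fills an interleaved stereo buffer and carries the phase across calls, and it assumes the amplitude stays at or below 32767.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: fills an interleaved stereo buffer (L, R, L, R, ...)
// with a sine tone. `phase` is carried between calls so consecutive buffers
// join without a discontinuity.
void fillSineStereo(int16_t* buf, size_t frames, double freqHz, double sampleRate,
                    double amplitude, double& phase) {
  const double twoPi = 6.283185307179586;
  const double step = twoPi * freqHz / sampleRate;  // phase advance per frame
  for (size_t i = 0; i < frames; ++i) {
    const int16_t s = static_cast<int16_t>(std::lround(amplitude * std::sin(phase)));
    buf[2 * i] = s;      // left
    buf[2 * i + 1] = s;  // right
    phase += step;
    if (phase >= twoPi) phase -= twoPi;  // keep the accumulator bounded
  }
}
```

In a sketch, loop() would keep the buffer and the phase variable static, call fillSineStereo once per pass, and hand the buffer to i2s_write. Calling sin() per sample costs more CPU time than a table lookup, so this trades speed for flexibility.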

Recommended: Ai Thinker ESP32-A1S WiFi+BT Audio Development Board — Complete ESP32 audio development board with built-in ES8388 codec, headphone jack, and microphone — all I2S routing handled on-board.

I2S Audio Input: Microphone to ESP32

// ESP32 I2S Input: INMP441 Microphone
#include <driver/i2s.h>

#define I2S_NUM         I2S_NUM_0
#define I2S_SCK         14    // Bit clock → INMP441 SCK
#define I2S_WS          15    // Word select → INMP441 WS
#define I2S_SD          32    // Serial data ← INMP441 SD

void i2s_mic_init() {
  i2s_config_t config = {
    .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
    .sample_rate = 16000,          // 16kHz for voice applications
    .bits_per_sample = I2S_BITS_PER_SAMPLE_32BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
    .communication_format = I2S_COMM_FORMAT_STAND_I2S,
    .intr_alloc_flags = 0,
    .dma_buf_count = 8,
    .dma_buf_len = 256
  };
  i2s_pin_config_t pins = {
    .bck_io_num   = I2S_SCK,
    .ws_io_num    = I2S_WS,
    .data_out_num = I2S_PIN_NO_CHANGE,
    .data_in_num  = I2S_SD
  };
  i2s_driver_install(I2S_NUM, &config, 0, NULL);
  i2s_set_pin(I2S_NUM, &pins);
}

void setup() {
  Serial.begin(115200);
  i2s_mic_init();
  Serial.println("INMP441 ready");
}

// Note: the INMP441 sends 24-bit samples, left-aligned in each 32-bit slot
// Shift right by 8 to get the 24-bit signed value, or by 16 for 16-bit
int32_t raw_samples[256];
void loop() {
  size_t bytesRead = 0;
  i2s_read(I2S_NUM, raw_samples, sizeof(raw_samples), &bytesRead, portMAX_DELAY);
  int samplesRead = bytesRead / sizeof(int32_t);
  if (samplesRead == 0) return;  // Nothing read: avoid dividing by zero
  int64_t sum = 0;               // 64-bit: 256 full-scale samples overflow 32 bits
  for (int i = 0; i < samplesRead; i++) sum += abs(raw_samples[i] >> 8);
  float level = (float)sum / samplesRead;  // Mean absolute amplitude (not a true RMS)
  Serial.println(level);  // Higher values = louder sound
}
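For a level reading that is easier to compare between setups, the raw slots can be converted to an RMS value and then to dB relative to full scale. The two functions below are a hypothetical sketch. They assume samples are left-aligned in 32-bit slots, that the microphone delivers validBits meaningful bits (24 is assumed as the default), and that 0 dBFS corresponds to the largest representable peak value.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: RMS of left-aligned samples stored in 32-bit I2S slots.
// validBits is the sample width the microphone delivers (assumed 24 by default).
double rmsFromSlots(const int32_t* raw, size_t count, int validBits = 24) {
  if (count == 0) return 0.0;
  double sumSquares = 0.0;
  for (size_t i = 0; i < count; ++i) {
    // Arithmetic right shift drops the padding bits and keeps the sign.
    const double s = static_cast<double>(raw[i] >> (32 - validBits));
    sumSquares += s * s;
  }
  return std::sqrt(sumSquares / static_cast<double>(count));
}

// Level in dB relative to digital full scale (peak = 2^(validBits-1)).
double toDbfs(double rms, int validBits = 24) {
  if (rms <= 0.0) return -INFINITY;  // silence has no finite dB value
  const double fullScale = std::ldexp(1.0, validBits - 1);
  return 20.0 * std::log10(rms / fullScale);
}
```

In the microphone sketch, toDbfs(rmsFromSlots(raw_samples, samplesRead)) would replace the plain average. Halving the signal amplitude lowers the result by about 6 dB.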

Recommended: Waveshare ESP32-S3 Round Display with Onboard Speaker and Microphone — Includes both speaker and microphone with I2S connections, ideal for smart display projects with voice interaction.

Frequently Asked Questions

What is the difference between I2S and I2C?

I2C (Inter-Integrated Circuit) is a low-speed, addressed multi-device bus (typically 100 kHz or 400 kHz) for control registers and sensor data: sensors, displays, IMUs. I2S (Inter-IC Sound) is a point-to-point link designed for continuous audio streaming, with bit clocks typically in the low-MHz range (1.4112 MHz for 16-bit stereo at 44.1 kHz). The names are similar, but the two buses serve completely different purposes. An audio project might use I2C to configure the codec’s registers (volume, equaliser settings) and I2S to stream the actual audio data.

Can I use I2S audio and WiFi simultaneously on ESP32?

Yes, with care. WiFi does not use either I2S peripheral, so I2S0 or I2S1 can carry audio while WiFi is active. On the original ESP32, only I2S0 connects to the built-in ADC/DAC and supports PDM, so keep it free if you need those features. In practice, WiFi and I2S audio coexist well at low WiFi data rates, but audio dropouts can occur during high-throughput WiFi operations. Use DMA-based I2S with sufficient DMA buffer depth (8–16 buffers of 256 samples) to absorb WiFi interrupt latency without dropouts.
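How much latency a given buffer setup can absorb is simple arithmetic. The helpers below are a hypothetical sketch. They assume dma_buf_len counts frames (one sample per channel), which is how the legacy driver treats it; check the documentation for your core version.

```cpp
#include <cstdint>

// Hypothetical helper: milliseconds of audio the DMA buffers can hold.
// Assumes bufLenFrames counts frames (one sample per channel).
constexpr double dmaBufferMs(uint32_t bufCount, uint32_t bufLenFrames, uint32_t sampleRate) {
  return 1000.0 * bufCount * bufLenFrames / sampleRate;
}

// Hypothetical helper: RAM consumed by those buffers, in bytes.
constexpr uint32_t dmaBufferBytes(uint32_t bufCount, uint32_t bufLenFrames,
                                  uint32_t channels, uint32_t bytesPerSample) {
  return bufCount * bufLenFrames * channels * bytesPerSample;
}
```

At 44.1 kHz, 8 buffers of 256 frames hold about 46 ms of audio and cost 8192 bytes for 16-bit stereo. The 8 × 64 configuration used in the output example holds only about 11.6 ms, so it tolerates much shorter stalls.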

Why does my I2S audio have pops and clicks?

The usual cause is DMA buffer underruns: your code is not feeding the I2S DMA fast enough, so the hardware runs out of fresh samples and plays zeros (or stale data), which you hear as pops. Increase the DMA buffer count (from 4 to 8 or 16). Also check whether your loop() function is blocked by delays, Serial.println() (which is slow), or other operations that prevent audio data from being written to the DMA in time.

Does ESP32-C3 (RISC-V) support I2S?

Yes, the ESP32-C3 has one I2S controller, so it cannot run two independent I2S buses the way the original ESP32 can. The ESP32-S3 is often the preferred choice for audio applications: it has two I2S controllers with TDM and PDM support, it is supported by ESP-ADF (Audio Development Framework), and its vector instructions help with voice recognition at the edge.
