INMP441 Microphone ESP32: Voice Recognition Project Guide

Pairing the INMP441 microphone with an ESP32 is one of the most popular voice recognition setups in Indian IoT maker projects, and for good reason. The INMP441 is a high-quality omnidirectional MEMS microphone with an I2S output that connects directly to the ESP32’s I2S peripheral, providing clean digital audio for voice processing. Whether you are building a smart home voice controller, an attendance system, or a language learning device, this guide gives you a complete working setup with code.

INMP441 Specifications and I2S Interface
Wiring INMP441 to ESP32
Audio Capture and Level Detection Code
Voice Recognition with Edge Impulse
Simple Wake Word Detection
Smart Home Voice Control Example
Frequently Asked Questions

INMP441 Specifications and I2S Interface

The InvenSense (now TDK) INMP441 is a high-performance, omnidirectional MEMS microphone in a bottom-ported LGA package. The breakout module adds pin headers, which makes it breadboard-compatible. Key specifications:

Frequency response: 60 Hz – 15 kHz (±3 dB), adequate for voice frequency range (100 Hz – 8 kHz)
Signal-to-Noise Ratio (SNR): 61 dB(A) — better than most electret microphone modules
Sensitivity: -26 dBFS at 94 dB SPL — captures speech at normal conversational distance (0.5–1 metre)
Power consumption: 1.4mA active, 25μA power-down mode — suitable for battery-powered devices
Supply voltage: 1.8–3.3V — directly compatible with ESP32 (3.3V logic)
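Digital output: I2S, 24-bit PCM carried in a 32-bit slot. The sample rate is set by the clocks the ESP32 supplies; the part is specified for audio rates up to roughly 48 kHz, and 16 kHz is typical for voice work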
Direction: Omnidirectional — picks up sound equally from all directions

The INMP441 module has six pins: VCC (3.3V), GND, SCK (BCLK), WS (LRCLK), SD (data output), and L/R (channel select: GND for left, 3.3V for right).
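
The sensitivity figure in the specification list gives a rough way to turn digital levels into sound pressure levels: a 94 dB SPL tone reads -26 dBFS, so dB SPL is approximately dBFS + 120. The sketch below applies that arithmetic in plain C++. The helper names (sampleFromSlot, rmsToDbfs, dbfsToSpl) are illustrative and not part of any library. It assumes the 24-bit sample sits in the upper bits of each 32-bit I2S slot and that 0 dBFS refers to a full-scale sine wave.

```cpp
#include <cmath>
#include <cstdint>

// Assumption: the 24-bit sample is left-justified in the 32-bit slot,
// so an arithmetic shift by 8 recovers the signed 24-bit value.
inline int32_t sampleFromSlot(int32_t slot) {
  return slot >> 8;
}

constexpr double kFullScalePeak   = 8388608.0;  // 2^23, peak of a 24-bit sample
constexpr double kSensitivityDbfs = -26.0;      // level at the reference SPL (from the spec list)
constexpr double kReferenceSpl    = 94.0;       // dB SPL used for the sensitivity figure

// RMS of 24-bit samples -> dBFS, where 0 dBFS is a full-scale sine wave.
inline double rmsToDbfs(double rms) {
  const double fullScaleSineRms = kFullScalePeak / std::sqrt(2.0);
  if (rms < 1e-9) rms = 1e-9;  // avoid log10(0) on digital silence
  return 20.0 * std::log10(rms / fullScaleSineRms);
}

// dBFS -> approximate dB SPL: SPL = dBFS - (-26) + 94 = dBFS + 120.
inline double dbfsToSpl(double dbfs) {
  return dbfs - kSensitivityDbfs + kReferenceSpl;
}
```

If you keep only the top 16 bits of each slot, as the capture code in this guide does, use 32768 as the peak value instead of 2^23. Treat the result as an estimate: part-to-part sensitivity tolerance and the enclosure both shift the real figure.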


Wiring INMP441 to ESP32

// INMP441 → ESP32 Wiring
// INMP441 VCC → ESP32 3.3V
// INMP441 GND → ESP32 GND
// INMP441 SCK → ESP32 GPIO 14 (I2S Bit Clock)
// INMP441 WS  → ESP32 GPIO 15 (I2S Word Select)
// INMP441 SD  → ESP32 GPIO 32 (I2S Data Input)
// INMP441 L/R → GND (left channel = address 0)
//             → 3.3V (right channel = address 1) if using two microphones
//
// For stereo (two INMP441 modules):
// Mic 1 L/R → GND  (left channel)
// Mic 2 L/R → 3.3V (right channel)
// Both share SCK, WS, and SD lines
// Use I2S_CHANNEL_FMT_RIGHT_LEFT in config
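
With two microphones on one data line, a stereo read returns interleaved slots, so the buffer has to be split into one array per microphone before further processing. The following is a minimal sketch of that step in plain C++. The function name deinterleaveStereo is hypothetical, and the slot order (A, B, A, B) is an assumption: which microphone appears first can differ between I2S driver versions, so check it by tapping one microphone and watching which channel responds.

```cpp
#include <cstddef>
#include <cstdint>

// Split an interleaved stereo buffer of 32-bit I2S slots into two 16-bit channels.
// Assumption: slots alternate A, B, A, B...; which mic lands in A depends on the
// I2S driver version, so verify on your own hardware.
// Returns the number of frames written to each output array.
size_t deinterleaveStereo(const int32_t* slots, size_t slotCount,
                          int16_t* chanA, int16_t* chanB, size_t maxFrames) {
  size_t frames = slotCount / 2;  // a trailing unpaired slot is ignored
  if (frames > maxFrames) frames = maxFrames;
  for (size_t i = 0; i < frames; i++) {
    // Keep the top 16 bits of each slot, matching the mono capture code.
    chanA[i] = static_cast<int16_t>(slots[2 * i] >> 16);
    chanB[i] = static_cast<int16_t>(slots[2 * i + 1] >> 16);
  }
  return frames;
}
```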

Audio Capture and Level Detection Code

#include <driver/i2s.h>
#include <math.h>

#define I2S_PORT    I2S_NUM_0
#define I2S_SCK     14
#define I2S_WS      15
#define I2S_SD_PIN  32
#define SAMPLE_RATE 16000
#define BUFFER_SIZE 512

int32_t rawBuffer[BUFFER_SIZE];

void setupI2S() {
  i2s_config_t config = {
    .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
    .sample_rate = SAMPLE_RATE,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_32BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
    .communication_format = I2S_COMM_FORMAT_STAND_I2S,
    .intr_alloc_flags = 0,
    .dma_buf_count = 8,
    .dma_buf_len = BUFFER_SIZE,
    .use_apll = false,
    .tx_desc_auto_clear = false,
    .fixed_mclk = 0
  };
  i2s_pin_config_t pins = {
    .bck_io_num   = I2S_SCK,
    .ws_io_num    = I2S_WS,
    .data_out_num = I2S_PIN_NO_CHANGE,
    .data_in_num  = I2S_SD_PIN
  };
  i2s_driver_install(I2S_PORT, &config, 0, NULL);
  i2s_set_pin(I2S_PORT, &pins);
  i2s_zero_dma_buffer(I2S_PORT);
}

float calculateRMS() {
  size_t bytesRead = 0;
  i2s_read(I2S_PORT, rawBuffer, sizeof(rawBuffer), &bytesRead, portMAX_DELAY);
  int samples = bytesRead / 4;
  long long sumSquares = 0;
  for (int i = 0; i < samples; i++) {
    // INMP441 data is in MSBs of 32-bit word — shift to get useful range
    int16_t sample = (int16_t)(rawBuffer[i] >> 16);
    sumSquares += (long long)sample * sample;
  }
  if (samples == 0) return 0.0f;  // nothing was read: avoid dividing by zero
  return sqrt((float)sumSquares / samples);
}

void setup() {
  Serial.begin(115200);
  setupI2S();
  Serial.println("INMP441 ready.");
}

void loop() {
  float rms = calculateRMS();
  // Roughly convert to dB (calibrate for your environment)
  float dB = 20.0 * log10(rms + 1) - 30;
  Serial.printf("RMS: %.1f | ~dB: %.1f\n", rms, dB);

  if (rms > 500) {  // Threshold for speech detection
    Serial.println(">>> Voice activity detected! <<<");
  }
  delay(50);
}
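
The fixed threshold of 500 has to be retuned for every room. One alternative, sketched here as an assumption rather than as part of the original design, is to track the ambient noise floor with an exponential moving average and flag frames that rise well above it. The class name AdaptiveThreshold and its default constants are illustrative.

```cpp
// Tracks ambient noise with an exponential moving average (EMA) and flags
// frames whose RMS exceeds the floor by a chosen ratio.
class AdaptiveThreshold {
 public:
  // initialFloor: RMS measured in a quiet moment.
  // alpha: EMA weight given to each new quiet frame.
  // ratio: how far above the floor a frame must be to count as active.
  AdaptiveThreshold(float initialFloor, float alpha = 0.05f, float ratio = 3.0f)
      : floor_(initialFloor), alpha_(alpha), ratio_(ratio) {}

  // Returns true when rms is above ratio * noise floor.
  // The floor only adapts on quiet frames, so speech does not raise it.
  bool update(float rms) {
    bool active = rms > floor_ * ratio_;
    if (!active) {
      floor_ += alpha_ * (rms - floor_);
    }
    return active;
  }

  float noiseFloor() const { return floor_; }

 private:
  float floor_;
  float alpha_;
  float ratio_;
};
```

In loop(), the check would become something like vad.update(rms) in place of rms > 500, with the tracker created from an RMS reading taken at start-up. One limitation of this design: a new steady noise source louder than ratio times the floor is treated as activity indefinitely, so a production version would need a timeout that forces the floor to re-adapt.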

Voice Recognition with Edge Impulse

Edge Impulse (edgeimpulse.com) is a free machine learning platform that lets you train keyword detection models and deploy them on ESP32 without deep ML expertise:

Create a project on Edge Impulse. Select "Keywords" as the project type.
Collect training data: Use your INMP441 + ESP32 with the Edge Impulse data forwarder (or use the browser microphone). Record 50–100 samples each of: your target keywords ("lights on", "fan off"), background noise, and unknown words.
Design and train: Edge Impulse extracts MFCC (Mel-Frequency Cepstral Coefficients) features from the audio and trains a neural network classifier on them. Training accuracy for simple two-keyword models typically reaches 95–99%.
Export and deploy: Export as Arduino library. Include the library in your sketch. The exported model runs entirely on the ESP32 at the edge — no internet connection needed for inference.
Inference time: Typically 50–200ms per 1-second audio window on ESP32 — fast enough for real-time keyword detection.
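
A keyword classifier produces one confidence score per label for every audio window, and a single window can misfire. A common post-processing step is to act only when the same label wins several windows in a row. The sketch below shows that idea in plain C++. It does not call the Edge Impulse library: the scores array is assumed to be copied from whatever result structure the exported library returns, and the class name KeywordVoter is hypothetical.

```cpp
#include <cstddef>

// Debounces per-window classifier scores: a label is reported only after it
// wins `needed` consecutive windows with a score at or above `minScore`.
class KeywordVoter {
 public:
  KeywordVoter(float minScore, int needed) : minScore_(minScore), needed_(needed) {}

  // scores: one confidence value per label for the current window.
  // Returns the index of the confirmed label, or -1 if nothing is confirmed yet.
  int update(const float* scores, size_t labelCount) {
    // Find the highest-scoring label that clears the confidence threshold.
    int best = -1;
    for (size_t i = 0; i < labelCount; i++) {
      if (scores[i] >= minScore_ && (best < 0 || scores[i] > scores[best])) {
        best = static_cast<int>(i);
      }
    }
    // Extend the streak if the winner repeats, otherwise restart it.
    if (best >= 0 && best == last_) {
      streak_++;
    } else {
      streak_ = (best >= 0) ? 1 : 0;
    }
    last_ = best;
    if (best >= 0 && streak_ >= needed_) {
      streak_ = 0;  // reset so the next report needs a fresh streak
      last_ = -1;
      return best;
    }
    return -1;
  }

 private:
  float minScore_;
  int needed_;
  int last_ = -1;
  int streak_ = 0;
};
```

The caller still has to ignore confirmed labels that stand for background noise or unknown words. Requiring more windows lowers false triggers at the cost of added latency.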

Simple Wake Word Detection

Without a full ML model, you can implement a simple threshold-based voice activity detector (VAD). It cannot tell one word from another: it wakes the IoT device on any sufficiently loud, sustained sound and then sends a trigger over WiFi or MQTT so that another system can process the command:

#include <driver/i2s.h>
#include <WiFi.h>
#include <PubSubClient.h>

// ... (setupI2S() and calculateRMS() from above) ...

const char* SSID     = "YourWiFi";
const char* PASSWORD = "YourPassword";
const char* MQTT_SERVER = "192.168.1.100";

bool wakeWordDetected = false;
unsigned long lastDetectionTime = 0;

void setup() {
  Serial.begin(115200);
  WiFi.begin(SSID, PASSWORD);
  while (WiFi.status() != WL_CONNECTED) delay(500);
  setupI2S();
}

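void loop() {
  static bool loud = false;            // true while RMS stays above the threshold
  static unsigned long loudSince = 0;  // when the current loud stretch began

  float rms = calculateRMS();
  unsigned long now = millis();

  // Voice Activity Detection: continuous sound above threshold for 200ms
  if (rms > 800) {
    if (!loud) {
      loud = true;
      loudSince = now;
    }
    // Require 200 ms of sustained sound and at least 2 s since the last trigger
    if (!wakeWordDetected && (now - loudSince >= 200) &&
        (now - lastDetectionTime > 2000)) {
      wakeWordDetected = true;
      lastDetectionTime = now;
      // Publish to MQTT: send a notification to process the command
      Serial.println("Voice activity detected, sending trigger!");
      // mqttClient.publish("home/voice/trigger", "1");
    }
  } else {
    loud = false;  // a quiet frame ends the loud stretch
    if (now - lastDetectionTime > 500) {
      wakeWordDetected = false;
    }
  }
  delay(20);
}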


Smart Home Voice Control Example

A practical voice-controlled home automation setup for Indian homes:

Hardware: ESP32 + INMP441 (this guide) + relay module for appliances
Commands for Indian context: “Lights on/off” in Hindi (“batti on/batti off”), fan speed control, geyser timer
Cloud option: Dialogflow (Google) or LUIS (Microsoft) for natural language understanding — send the recorded audio over WiFi to cloud NLU, receive structured command back
Privacy option: Run the full TensorFlow Lite model on ESP32 locally (Edge Impulse export) — no audio leaves the device, suitable for privacy-conscious households
Integration: Use Home Assistant (Raspberry Pi) with MQTT integration — the ESP32 publishes recognised commands as MQTT messages, Home Assistant executes them through smart plugs, Zigbee lights, or directly via relay modules
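
The integration step comes down to a lookup from a recognised command label to an MQTT topic and payload. A minimal sketch in plain C++ follows. The labels, topics and payloads in the table are placeholders chosen for illustration, not values defined by Home Assistant or by this guide, and lookupAction is a hypothetical helper.

```cpp
#include <cstring>

struct MqttAction {
  const char* topic;
  const char* payload;
};

struct CommandRule {
  const char* label;  // keyword label as produced by the recogniser
  MqttAction action;
};

// Hypothetical label-to-action table; topics and payloads are placeholders.
static const CommandRule kRules[] = {
  {"batti on",  {"home/light/set", "ON"}},
  {"batti off", {"home/light/set", "OFF"}},
  {"fan on",    {"home/fan/set",   "ON"}},
  {"fan off",   {"home/fan/set",   "OFF"}},
};

// Returns the matching action, or nullptr for labels with no rule (e.g. noise).
const MqttAction* lookupAction(const char* label) {
  for (const CommandRule& rule : kRules) {
    if (std::strcmp(rule.label, label) == 0) return &rule.action;
  }
  return nullptr;
}
```

On the ESP32, a non-null result would be handed to the MQTT client, for example mqttClient.publish(action->topic, action->payload) with PubSubClient. Keeping the table separate from the recognition code means new commands only need a new row.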

Frequently Asked Questions

How far away can the INMP441 detect voice?

In a quiet room, the INMP441 reliably detects conversational speech (60–70 dB SPL) at up to 2–3 metres. For smart speaker applications with the device on a table, 1–2 metres is a realistic working range. Background noise (TV, fans, AC) reduces the effective range significantly. In a typical Indian home with a ceiling fan running continuously, reliable detection drops to about 0.5–1 metre.
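
The effect of distance can be estimated with the free-field rule that level falls by 20·log10(r/r0) dB, which is about 6 dB for every doubling of distance. The sketch below applies that rule. It assumes the 60–70 dB SPL speech level is measured at 1 metre, which the text does not state, and the function names are illustrative. Real rooms add reflections, so treat the numbers as a rough guide only.

```cpp
#include <cmath>

// Free-field estimate of level at `metres`, given the level at `refMetres`.
// The level falls by 20*log10(metres / refMetres) dB.
double splAtDistance(double splAtRef, double refMetres, double metres) {
  return splAtRef - 20.0 * std::log10(metres / refMetres);
}

// Margin in dB between speech arriving at the microphone and the ambient noise.
// Assumption: speechSplAt1m is the talker's level measured at 1 metre.
double speechMarginDb(double speechSplAt1m, double metres, double ambientSpl) {
  return splAtDistance(speechSplAt1m, 1.0, metres) - ambientSpl;
}
```

For example, speech at 65 dB SPL at 1 metre arrives at about 59 dB SPL at 2 metres. If fan noise at the microphone is around 55 dB SPL (a hypothetical figure), only about 4 dB of margin remains, which is consistent with the shorter working range quoted for noisy rooms.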

Can I use two INMP441 microphones for voice direction finding?

Yes. Connect two INMP441 modules to the same I2S bus (both share the SCK, WS, and SD lines). Set one module’s L/R pin to GND (left channel) and the other to 3.3V (right channel). Configure the ESP32 I2S peripheral for stereo input (I2S_CHANNEL_FMT_RIGHT_LEFT). By calculating the difference in arrival time between the two microphones (TDOA, Time Difference of Arrival), you can determine the approximate direction of a sound source.
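
One simple way to estimate TDOA is to find the lag that maximises the cross-correlation of the two channels, then convert it to an angle with sin(θ) = c·τ/d, where c is the speed of sound, τ the delay and d the microphone spacing. The following is a minimal sketch under those assumptions. The function names are hypothetical, the inputs are two already separated 16-bit channel buffers, and the brute-force search is meant for short buffers and small lags.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Returns the lag (in samples) within [-maxLag, +maxLag] that maximises the
// cross-correlation sum of a[i] * b[i + lag]. A positive lag means channel B
// is a delayed copy of channel A, i.e. the sound reached microphone A first.
int estimateLagSamples(const int16_t* a, const int16_t* b, size_t n, int maxLag) {
  int bestLag = 0;
  long long bestScore = 0;
  bool first = true;
  for (int lag = -maxLag; lag <= maxLag; lag++) {
    long long score = 0;
    for (size_t i = 0; i < n; i++) {
      long long j = static_cast<long long>(i) + lag;
      if (j < 0 || j >= static_cast<long long>(n)) continue;  // outside the buffer
      score += static_cast<long long>(a[i]) * b[j];
    }
    if (first || score > bestScore) {
      bestScore = score;
      bestLag = lag;
      first = false;
    }
  }
  return bestLag;
}

// Converts a lag to an arrival angle in degrees from broadside (0 = straight ahead).
double lagToAngleDegrees(int lagSamples, double sampleRateHz, double micSpacingMetres) {
  const double kSpeedOfSound = 343.0;  // m/s at roughly 20 °C
  const double kPi = 3.14159265358979323846;
  double s = kSpeedOfSound * (lagSamples / sampleRateHz) / micSpacingMetres;
  if (s > 1.0) s = 1.0;    // clamp: noise can push the ratio past the physical limit
  if (s < -1.0) s = -1.0;
  return std::asin(s) * 180.0 / kPi;
}
```

At 16 kHz one sample corresponds to about 2.1 cm of path difference, so two microphones 10 cm apart give only a handful of distinguishable lags. A higher sample rate, wider spacing, or interpolating around the correlation peak improves the angular resolution.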

Will the INMP441 work in India’s humid climate?

Yes. The INMP441 is rated for operation from −40°C to +85°C, which comfortably covers Indian ambient temperatures, and it tolerates the humidity of a normal indoor environment. However, the breakout module’s PCB should be conformal coated if deployed in high-humidity environments (coastal areas, industrial humid zones), taking care to keep the coating away from the microphone’s sound port. PCB traces and connectors are the parts most vulnerable to condensation in extreme humidity.

Is Edge Impulse free for Indian students and hobbyists?

Yes. Edge Impulse has a free Developer tier that covers data collection, model training, and deployment, which is more than sufficient for students and hobbyists. Project limits and the terms for commercial use change over time, so check the current plan details on edgeimpulse.com before building a product on the free tier.
