Image Classification on Edge Devices: MobileNet and Pi 5

Running image classification on edge devices with MobileNet and Raspberry Pi 5 enables real-time AI inference without cloud connectivity. MobileNet’s depthwise separable convolutions are designed for resource-constrained devices, which makes the architecture well suited to embedded vision applications. This tutorial covers deploying MobileNetV2 and MobileNetV3 on Raspberry Pi 5 with TensorFlow Lite, targeting image classification at 30+ FPS.

MobileNet Architecture for Edge AI
Hardware: Raspberry Pi 5 Advantages
TensorFlow Lite Setup
Downloading MobileNet Models
Real-Time Inference Code
Custom Classification with Transfer Learning
Performance Benchmarks
FAQ

MobileNet Architecture for Edge AI

MobileNet uses depthwise separable convolutions, which cut computation by roughly 8-9x compared to standard 3x3 convolutions. MobileNetV2 introduced inverted residual blocks with linear bottlenecks. MobileNetV3 adds hard-swish activations and squeeze-and-excitation blocks for further efficiency gains. On Raspberry Pi 5, MobileNetV3-Small achieves ~200 FPS with TFLite, while MobileNetV3-Large runs at ~80 FPS with roughly 75% ImageNet top-1 accuracy.
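The 8-9x figure can be checked with a quick multiply-accumulate (MAC) count. The sketch below compares a standard 3x3 convolution with its depthwise separable equivalent. The layer dimensions (a 14x14 feature map with 256 input and 256 output channels) are illustrative assumptions, not taken from a specific MobileNet layer.

```python
def standard_conv_macs(h, w, c_in, c_out, k=3):
    """MACs for a standard k x k convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k=3):
    """MACs for a k x k depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


def reduction_factor(h, w, c_in, c_out, k=3):
    """How many times fewer MACs the separable version needs."""
    return standard_conv_macs(h, w, c_in, c_out, k) / depthwise_separable_macs(h, w, c_in, c_out, k)


# Assumed example layer: 14x14 feature map, 256 -> 256 channels, 3x3 kernel
print(f"{reduction_factor(14, 14, 256, 256):.2f}x fewer MACs")  # prints 8.69x fewer MACs
```

The ratio simplifies to 1 / (1/c_out + 1/k²), so for 3x3 kernels it approaches 9x as the number of output channels grows.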

Hardware: Raspberry Pi 5 Advantages

Raspberry Pi 5 features the Broadcom BCM2712 with four ARM Cortex-A76 cores running at 2.4 GHz, approximately 2-3x faster than Pi 4 for AI inference tasks. Key advantages:

LPDDR4X RAM with higher bandwidth than Pi 4’s LPDDR4
PCIe 2.0 interface for an AI accelerator HAT (Hailo-8L), adding 13 TOPS
No built-in NPU, but the Hailo AI HAT provides dedicated neural network acceleration when needed

Arducam IMX477 12MP HQ Camera

12MP Sony IMX477 High Quality Camera for Raspberry Pi 5. Ideal input for MobileNet classification – high resolution crop-and-resize before inference. C/CS mount for interchangeable lenses. Available at Zbotic.

Waveshare IMX219 8MP Camera for Pi 5

IMX219 8MP camera fully compatible with Raspberry Pi 5 CSI port. Cost-effective option for edge AI projects. Good dynamic range for indoor classification tasks.

TensorFlow Lite Setup

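sudo apt update
sudo apt install -y python3-pip python3-venv python3-picamera2 python3-opencv

# Raspberry Pi OS Bookworm blocks system-wide pip installs, so work in a virtual
# environment that can still see the apt-installed picamera2 and OpenCV
# (the name tflite-env is arbitrary)
python3 -m venv --system-site-packages ~/tflite-env
source ~/tflite-env/bin/activate

# Install TFLite runtime (optimised for ARM)
pip3 install tflite-runtime

# Or full TensorFlow (larger but includes training tools)
pip3 install tensorflow

# Verify
python3 -c "import tflite_runtime.interpreter as tflite; print('TFLite OK')"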
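Which package actually installs varies with the OS release and Python version, so a script can try several import paths before giving up. This is a minimal sketch. It assumes the candidates on your system are tflite_runtime, the newer ai_edge_litert package, and full TensorFlow. The helper name find_interpreter is hypothetical.

```python
import importlib
from functools import reduce

# (module to import, dotted attribute path to the Interpreter class).
# Assumed candidates; adjust to whatever is installed on your system.
CANDIDATES = (
    ("tflite_runtime.interpreter", "Interpreter"),
    ("ai_edge_litert.interpreter", "Interpreter"),
    ("tensorflow", "lite.Interpreter"),
)


def find_interpreter(candidates=CANDIDATES):
    """Return (module_name, cls) for the first importable candidate, else (None, None)."""
    for module_name, attr_path in candidates:
        try:
            module = importlib.import_module(module_name)
            cls = reduce(getattr, attr_path.split("."), module)
        except (ImportError, AttributeError):
            continue  # not installed, or the attribute is missing: try the next one
        return module_name, cls
    return None, None


if __name__ == "__main__":
    name, interpreter_cls = find_interpreter()
    print(f"Using interpreter from: {name}" if name else "No TFLite interpreter found")
```

The returned class can then be used in place of tflite.Interpreter in the inference script.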

Downloading MobileNet Models

mkdir -p ~/models && cd ~/models

# MobileNetV3-Small (fastest - recommended for Pi)
wget https://storage.googleapis.com/download.tensorflow.org/models/tflite/task_library/image_classification/android/mobilenet_v3_small_100_224_float_quant.tflite

# MobileNetV2 (balanced accuracy/speed)
wget https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v2_1.0_224_quant.tflite

# ImageNet labels (saved as imagenet_labels.txt, the name the inference script expects)
wget -O imagenet_labels.txt https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tflite.labels
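Several TFLite-hosted ImageNet models emit 1001 scores, with index 0 reserved for a background class, while some label files list only the 1000 real classes. A mismatch shifts every prediction by one. The sketch below shows one way to detect this before running inference. The function names label_offset and lookup_label, and the policy of raising on any other mismatch, are assumptions for illustration and not part of TFLite.

```python
def label_offset(num_outputs, labels):
    """Return how far model output indices are shifted relative to the label list.

    0 -> outputs and labels line up one-to-one
    1 -> the model has an extra leading 'background' class
    """
    n = len(labels)
    if num_outputs == n:
        return 0
    if num_outputs == n + 1:
        return 1
    raise ValueError(f"model has {num_outputs} outputs but label file has {n} entries")


def lookup_label(labels, class_index, offset):
    """Map a model output index to a label, accounting for the offset."""
    if offset == 1 and class_index == 0:
        return "background"
    return labels[class_index - offset]


# Example with a tiny, made-up label list
labels = ["cat", "dog", "bird"]
offset = label_offset(4, labels)         # the model emits 4 scores
print(lookup_label(labels, 2, offset))   # prints dog
```

On the Pi, num_outputs would come from the last dimension of the interpreter's output tensor shape.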

Real-Time Inference Code

import time
from pathlib import Path

import cv2
import numpy as np
import tflite_runtime.interpreter as tflite
from picamera2 import Picamera2

# Load model and labels from ~/models (the directory used in the download step)
MODEL_DIR = Path.home() / 'models'
INTERPRETER = tflite.Interpreter(model_path=str(MODEL_DIR / 'mobilenet_v2_1.0_224_quant.tflite'))
INTERPRETER.allocate_tensors()

input_details = INTERPRETER.get_input_details()
output_details = INTERPRETER.get_output_details()
INPUT_SHAPE = input_details[0]['shape'][1:3]  # (height, width), here (224, 224)

with open(MODEL_DIR / 'imagenet_labels.txt') as f:
    LABELS = [line.strip() for line in f]

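def classify(frame):
    # Resize to model input size; cv2.resize takes (width, height)
    img = cv2.resize(frame, (int(INPUT_SHAPE[1]), int(INPUT_SHAPE[0])))
    # Convert BGR (OpenCV order) to the RGB order the model expects
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = np.expand_dims(img, axis=0).astype(np.uint8)
    INTERPRETER.set_tensor(input_details[0]['index'], img)
    INTERPRETER.invoke()
    output = INTERPRETER.get_tensor(output_details[0]['index'])[0]
    top_idx = int(np.argmax(output))
    # Dequantize with the model's own output scale and zero point
    # (scale is 0 for float models, in which case the score is used as-is)
    scale, zero_point = output_details[0]['quantization']
    raw = float(output[top_idx])
    confidence = scale * (raw - zero_point) if scale else raw
    return LABELS[top_idx], confidence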

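# Camera setup
# Picamera2 format names read in reverse of the array order: 'RGB888'
# delivers pixels as [B, G, R], which is the order OpenCV works in
picam2 = Picamera2()
picam2.configure(picam2.create_preview_configuration(main={'size': (640, 480), 'format': 'RGB888'}))
picam2.start()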

prev_time = time.perf_counter()
try:
    while True:
        frame = picam2.capture_array()
        label, conf = classify(frame)
        # Measure the full loop time (capture + inference + display)
        now = time.perf_counter()
        fps = 1 / (now - prev_time)
        prev_time = now
        cv2.putText(frame, f'{label}: {conf:.1%}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        cv2.putText(frame, f'FPS: {fps:.1f}', (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 0), 2)
        cv2.imshow('MobileNet Classification', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    # Release the camera and close the window even if the loop raises
    picam2.stop()
    cv2.destroyAllWindows()
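The script above reports only the single best class. A top-k view is often more useful when several labels are plausible. The sketch below dequantizes a raw uint8 score vector and returns the k best entries. The function name top_k is hypothetical, and scale and zero_point stand for the values found in output_details[0]['quantization'].

```python
import numpy as np


def top_k(raw_scores, scale, zero_point, k=5):
    """Return [(class_index, score), ...] for the k highest dequantized scores."""
    # Cast to float BEFORE subtracting, otherwise uint8 arithmetic can wrap around
    scores = scale * (raw_scores.astype(np.float32) - zero_point)
    order = np.argsort(scores)[::-1][:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]


# Made-up quantized output for a 4-class model, with scale = 1/256 and zero point 0
raw = np.array([0, 128, 255, 64], dtype=np.uint8)
print(top_k(raw, scale=1 / 256, zero_point=0, k=2))
# prints [(2, 0.99609375), (1, 0.5)]
```

Inside classify(), the returned indices would be mapped to names through LABELS, just as the single top index is.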

Custom Classification with Transfer Learning

For India-specific classification tasks (sorting electronic components by part type, identifying Indian currency notes, quality inspection of manufactured parts), fine-tune MobileNetV2 on your own dataset:

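import tensorflow as tf

# Assumed layout: data/train/<class_name>/*.jpg and data/val/<class_name>/*.jpg
# (hypothetical paths; point these at your own dataset)
train_dataset = tf.keras.utils.image_dataset_from_directory(
    'data/train', image_size=(224, 224), batch_size=32, label_mode='categorical')
val_dataset = tf.keras.utils.image_dataset_from_directory(
    'data/val', image_size=(224, 224), batch_size=32, label_mode='categorical')
NUM_CLASSES = len(train_dataset.class_names)

base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,
    weights='imagenet'
)
base_model.trainable = False  # Freeze base

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    # MobileNetV2 expects inputs in [-1, 1]; map pixel values from [0, 255]
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax')
])

# categorical_crossentropy matches the one-hot labels from label_mode='categorical'
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=20, validation_data=val_dataset)

# Convert to TFLite
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Without a representative dataset this quantizes the weights only;
# the model's inputs and outputs stay float32
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open('custom_mobilenet.tflite', 'wb') as f:
    f.write(tflite_model)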
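Setting Optimize.DEFAULT on its own typically gives dynamic-range quantization. For a fully INT8 model, like those in the benchmark figures, the converter also needs a representative dataset to calibrate activation ranges. The following is a sketch under assumptions: calibration_images is a hypothetical NumPy array of shape (N, 224, 224, 3), holding images in the same value range the Keras model was trained on. Both function names are illustrative.

```python
import numpy as np


def make_representative_dataset(calibration_images, max_samples=100):
    """Build the generator function the TFLite converter calls during calibration."""
    def generator():
        for image in calibration_images[:max_samples]:
            # The converter expects a list with one array per model input,
            # each including the batch dimension, as float32
            yield [np.expand_dims(image, axis=0).astype(np.float32)]
    return generator


def convert_full_int8(model, calibration_images):
    """Convert a Keras model to a fully integer-quantized TFLite flatbuffer."""
    import tensorflow as tf  # imported lazily; only needed on the training machine

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = make_representative_dataset(calibration_images)
    # Restrict to INT8 kernels and use uint8 tensors at the model boundary
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    return converter.convert()
```

A few hundred calibration images drawn from the training set are usually enough. They only need to cover the range of inputs the model will see, and labels are not used.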

Performance Benchmarks

Approximate inference times on Raspberry Pi 5 (INT8 quantised models); actual figures vary with thread count, cooling, and TFLite version:

MobileNetV3-Small: ~5ms per frame (200 FPS)
MobileNetV2 1.0: ~12ms per frame (80 FPS)
MobileNetV3-Large: ~13ms per frame (77 FPS)
EfficientNet-Lite0: ~18ms per frame (55 FPS)

With the Hailo-8L AI HAT on Pi 5, MobileNetV2 inference drops to under 2ms (500+ FPS). The HAT is available from distributors in India for approximately Rs 8,000-10,000.
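Figures like these are worth re-measuring on your own board. The sketch below is a minimal timing harness that reports median latency and the equivalent FPS. The names benchmark and ms_to_fps are hypothetical, and the warm-up and run counts are arbitrary defaults.

```python
import statistics
import time


def ms_to_fps(latency_ms):
    """Convert per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def benchmark(fn, warmup=5, runs=50):
    """Time fn() and return the median latency in ms plus the equivalent FPS."""
    for _ in range(warmup):
        fn()  # warm caches and lazy initialisation before timing
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    median_ms = statistics.median(timings_ms)
    fps = ms_to_fps(median_ms) if median_ms > 0 else float("inf")
    return {"median_ms": median_ms, "fps": fps}


if __name__ == "__main__":
    # Stand-in workload; on the Pi, pass INTERPRETER.invoke after setting the input tensor
    result = benchmark(lambda: sum(range(100_000)))
    print(f"{result['median_ms']:.2f} ms per call, {result['fps']:.0f} FPS")
```

Timing only the invoke call isolates model latency. The FPS shown by the live camera loop also includes capture and display, so it will be lower.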

FAQ

Can I run YOLOv8 instead of MobileNet on Pi 5?

Yes. YOLOv8n (nano) runs at ~10 FPS on Pi 5 with TFLite. For classification-only tasks (not detection), MobileNet is 5-10x faster. Use YOLO when you need bounding boxes.

What accuracy does MobileNetV2 achieve?

MobileNetV2 1.0 achieves 71.8% top-1 accuracy on ImageNet. For custom datasets with good training data, fine-tuned models typically achieve 90-95% accuracy.

How many training images do I need for transfer learning?

As few as 50-100 images per class can work for a first prototype with MobileNet transfer learning. Collect images under lighting conditions similar to your deployment environment. For Indian factory settings, aim for at least 200 images per class, captured under the actual lighting conditions.

Does TFLite support GPU acceleration on Pi 5?

TFLite has only limited GPU delegate support for the Pi’s VideoCore GPU. On Pi 5 the main speedup comes from the XNNPACK delegate (enabled by default in recent TFLite builds), which uses NEON SIMD instructions on the Cortex-A76 cores.
