Running image classification on a Raspberry Pi 5 with MobileNet enables real-time AI inference without cloud connectivity. MobileNet’s depthwise separable convolutions are designed for resource-constrained devices, which makes the architecture well suited to embedded vision applications. This tutorial covers deploying MobileNetV2 and MobileNetV3 on Raspberry Pi 5 with TensorFlow Lite, targeting real-time classification at 30+ FPS.
Table of Contents
- MobileNet Architecture for Edge AI
- Hardware: Raspberry Pi 5 Advantages
- TensorFlow Lite Setup
- Downloading MobileNet Models
- Real-Time Inference Code
- Custom Classification with Transfer Learning
- Performance Benchmarks
- FAQ
MobileNet Architecture for Edge AI
MobileNet uses depthwise separable convolutions, which cut computation by roughly 8-9x compared to standard 3x3 convolutions. MobileNetV2 introduced inverted residual blocks with linear bottlenecks. MobileNetV3 adds hard-swish activations and squeeze-excitation blocks for further efficiency gains. On Raspberry Pi 5, MobileNetV3-Small achieves ~200 FPS with TFLite, while MobileNetV3-Large runs at ~80 FPS with 75.8% ImageNet top-1 accuracy.
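The saving can be checked with simple arithmetic. The sketch below counts multiply-accumulate operations (MACs) for one standard convolution and for the depthwise-plus-pointwise pair that replaces it. The layer dimensions are illustrative assumptions, not taken from a specific MobileNet layer.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a k x k standard convolution (stride 1, same padding)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Illustrative layer: 56 x 56 feature map, 128 -> 128 channels, 3 x 3 kernel
std = standard_conv_macs(56, 56, 128, 128, 3)
sep = depthwise_separable_macs(56, 56, 128, 128, 3)
reduction = std / sep
print(f"standard: {std:,} MACs, separable: {sep:,} MACs, reduction: {reduction:.1f}x")
```

The ratio works out to 1 / (1/c_out + 1/k²), so with 3x3 kernels it approaches 9x as the number of output channels grows.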
Hardware: Raspberry Pi 5 Advantages
Raspberry Pi 5 features the Broadcom BCM2712 with ARM Cortex-A76 cores running at 2.4 GHz – approximately 2-3x faster than Pi 4 for AI inference tasks. Key advantages:
- LPDDR4X RAM at higher bandwidth than Pi 4’s LPDDR4
- PCIe 2.0 interface for AI accelerator HAT (Hailo-8L) adding 13 TOPS
- No built-in NPU: dedicated neural network acceleration comes from the optional Hailo AI HAT
Arducam IMX477 12MP HQ Camera
12MP Sony IMX477 High Quality Camera for Raspberry Pi 5. Ideal input for MobileNet classification – high resolution crop-and-resize before inference. C/CS mount for interchangeable lenses. Available at Zbotic.
Waveshare IMX219 8MP Camera for Pi 5
IMX219 8MP camera fully compatible with Raspberry Pi 5 CSI port. Cost-effective option for edge AI projects. Good dynamic range for indoor classification tasks.
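Both cameras deliver frames that are much larger and wider than MobileNet's square 224x224 input, so resizing the full frame directly distorts the aspect ratio. A minimal sketch of the crop step, assuming frames arrive as NumPy arrays in height x width x channels layout (the helper name `center_crop_square` is hypothetical):

```python
import numpy as np


def center_crop_square(frame):
    """Return the largest centred square region of an H x W x C frame."""
    h, w = frame.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return frame[top:top + side, left:left + side]


# Dummy 640 x 480 frame standing in for a camera capture
frame = np.zeros((480, 640, 3), dtype=np.uint8)
square = center_crop_square(frame)
print(square.shape)
```

The cropped square can then be resized to the model input size without stretching the subject.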
TensorFlow Lite Setup
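sudo apt update
sudo apt install -y python3-pip python3-venv python3-picamera2 python3-opencv
# Raspberry Pi OS Bookworm blocks system-wide pip installs, so work in a virtual
# environment that can still see the apt-installed picamera2 and OpenCV
# (~/tflite-env is just an example name)
python3 -m venv --system-site-packages ~/tflite-env
source ~/tflite-env/bin/activate
# Install TFLite runtime (optimised for ARM)
pip3 install tflite-runtime
# Or full TensorFlow (larger but includes training tools)
pip3 install tensorflow
# Verify
python3 -c "import tflite_runtime.interpreter as tflite; print('TFLite OK')"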
Downloading MobileNet Models
mkdir -p ~/models && cd ~/models
# MobileNetV3-Small (fastest - recommended for Pi)
wget https://storage.googleapis.com/download.tensorflow.org/models/tflite/task_library/image_classification/android/mobilenet_v3_small_100_224_float_quant.tflite
# MobileNetV2 (balanced accuracy/speed)
wget https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v2_1.0_224_quant.tflite
# ImageNet labels (saved under the file name the inference script opens)
wget -O imagenet_labels.txt https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tflite.labels
Real-Time Inference Code
import time

import cv2
import numpy as np
import tflite_runtime.interpreter as tflite
from picamera2 import Picamera2

# Load model and labels
INTERPRETER = tflite.Interpreter(
    model_path='/home/pi/models/mobilenet_v2_1.0_224_quant.tflite',
    num_threads=4,  # one thread per Cortex-A76 core
)
INTERPRETER.allocate_tensors()
input_details = INTERPRETER.get_input_details()
output_details = INTERPRETER.get_output_details()
INPUT_SHAPE = input_details[0]['shape'][1:3]  # (224, 224)
# Quantisation parameters of the uint8 output tensor
OUT_SCALE, OUT_ZERO_POINT = output_details[0]['quantization']

with open('/home/pi/models/imagenet_labels.txt') as f:
    LABELS = [line.strip() for line in f]


def classify(frame):
    """Return (label, confidence) for a frame in OpenCV's BGR channel order."""
    # Resize to model input size (plain ints keep cv2.resize happy)
    img = cv2.resize(frame, (int(INPUT_SHAPE[1]), int(INPUT_SHAPE[0])))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = np.expand_dims(img, axis=0).astype(np.uint8)
    INTERPRETER.set_tensor(input_details[0]['index'], img)
    INTERPRETER.invoke()
    output = INTERPRETER.get_tensor(output_details[0]['index'])
    top_idx = int(np.argmax(output[0]))
    # Dequantise with the model's own scale and zero point
    confidence = OUT_SCALE * (float(output[0][top_idx]) - OUT_ZERO_POINT)
    return LABELS[top_idx], confidence


# Camera setup. Picamera2's 'RGB888' format stores pixels in B, G, R order,
# which is the layout OpenCV expects; 'BGR888' would give R, G, B instead.
picam2 = Picamera2()
picam2.configure(picam2.create_preview_configuration(main={'size': (640, 480), 'format': 'RGB888'}))
picam2.start()

prev_time = time.perf_counter()
try:
    while True:
        frame = picam2.capture_array()
        label, conf = classify(frame)
        now = time.perf_counter()
        fps = 1 / (now - prev_time)
        prev_time = now
        cv2.putText(frame, f'{label}: {conf:.1%}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        cv2.putText(frame, f'FPS: {fps:.1f}', (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 0), 2)
        cv2.imshow('MobileNet Classification', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    # Release the camera and close the window even if the loop raises
    picam2.stop()
    cv2.destroyAllWindows()
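The loop above reports only the single best class. When several classes look alike, it can help to show the top few candidates instead. The following is a minimal sketch of a top-k helper for a quantised output tensor. It assumes the scale and zero point are read from `output_details[0]['quantization']`; the helper name `top_k` and the sample values are made up for illustration.

```python
import numpy as np


def top_k(raw_scores, scale, zero_point, labels, k=3):
    """Dequantise a 1-D score tensor and return the k best (label, score) pairs."""
    # Cast before subtracting so uint8 values cannot wrap around
    scores = scale * (raw_scores.astype(np.float32) - zero_point)
    best = np.argsort(scores)[::-1][:k]
    return [(labels[i], float(scores[i])) for i in best]


# Made-up raw uint8 scores and labels, standing in for real model output
raw = np.array([3, 200, 40, 12], dtype=np.uint8)
labels = ["background", "resistor", "capacitor", "diode"]
results = top_k(raw, scale=1 / 256, zero_point=0, labels=labels, k=2)
for name, score in results:
    print(f"{name}: {score:.1%}")
```

In the inference script, `raw_scores` would be `output[0]` and `labels` the `LABELS` list.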
Custom Classification with Transfer Learning
For India-specific classification tasks (sorting electronic components by part type, identifying Indian currency notes, quality inspection of manufactured parts), fine-tune MobileNetV2 on your own dataset:
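import tensorflow as tf

NUM_CLASSES = 5  # placeholder: set to the number of classes in your dataset
# train_dataset and val_dataset are assumed to be tf.data.Dataset objects that
# yield (image, one_hot_label) batches of 224x224 RGB images with values 0-255

base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,
    weights='imagenet'
)
base_model.trainable = False  # Freeze base

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    # MobileNetV2 expects inputs scaled to [-1, 1]
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0),
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax')
])
# categorical_crossentropy requires one-hot labels
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=20, validation_data=val_dataset)

# Convert to TFLite
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Without a representative dataset this applies dynamic-range quantisation:
# weights become 8-bit, while the model input and output stay float32
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open('custom_mobilenet.tflite', 'wb') as f:
    f.write(tflite_model)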
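The `Optimize.DEFAULT` setting shrinks the model by storing weights as 8-bit integers. The sketch below illustrates the underlying idea with symmetric quantisation in plain NumPy. It is a simplified illustration, not TFLite's actual implementation, and the weight values are made up.

```python
import numpy as np


def quantize(x, scale):
    """Map float values to int8 using a symmetric scale (zero point = 0)."""
    q = np.round(x / scale)
    return np.clip(q, -127, 127).astype(np.int8)


def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return scale * q.astype(np.float32)


# Made-up weights standing in for one layer's parameters
weights = np.array([-0.5, 0.0, 0.2, 0.5], dtype=np.float32)
scale = float(np.abs(weights).max()) / 127.0   # largest magnitude maps to 127
q_weights = quantize(weights, scale)
restored = dequantize(q_weights, scale)
max_error = float(np.abs(weights - restored).max())
print(q_weights, restored, max_error)
```

Each weight now takes one byte instead of four, and the rounding error per weight is bounded by half the scale.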
Performance Benchmarks
Inference times on Raspberry Pi 5 (INT8 quantised models):
- MobileNetV3-Small: ~5ms per frame (200 FPS)
- MobileNetV2 1.0: ~12ms per frame (80 FPS)
- MobileNetV3-Large: ~13ms per frame (77 FPS)
- EfficientNet-Lite0: ~18ms per frame (55 FPS)
With the Hailo-8L AI HAT on Pi 5, MobileNetV2 inference accelerates to under 2ms (500+ FPS). The AI HAT is available from distributors in India for approximately Rs 8,000-10,000.
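Figures like these vary with thermal conditions, thread count and model build, so it is worth measuring on your own board. A minimal timing harness might look like the sketch below. The `benchmark` helper is hypothetical, and the workload is a stand-in so the snippet runs anywhere; on the Pi the callable would be something like `INTERPRETER.invoke`.

```python
import statistics
import time


def benchmark(fn, warmup=5, runs=50):
    """Return (median latency in ms, equivalent FPS) for a zero-argument callable."""
    for _ in range(warmup):      # warm-up calls are not timed
        fn()
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    median_ms = statistics.median(samples_ms)
    return median_ms, 1000.0 / median_ms


# Stand-in workload so the example runs without a model
median_ms, fps = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"{median_ms:.2f} ms per call ({fps:.0f} FPS)")
```

The median is used rather than the mean so that an occasional slow call, for example from a background process, does not skew the result. FPS here is simply 1000 divided by the latency in milliseconds, which is how 5ms maps to 200 FPS in the list above.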
FAQ
Can I run YOLOv8 instead of MobileNet on Pi 5?
Yes. YOLOv8n (nano) runs at ~10 FPS on Pi 5 with TFLite. For classification-only tasks (not detection), MobileNet is 5-10x faster. Use YOLO when you need bounding boxes.
What accuracy does MobileNetV2 achieve?
MobileNetV2 1.0 achieves 71.8% top-1 accuracy on ImageNet. For custom datasets with good training data, fine-tuned models typically achieve 90-95% accuracy.
How many training images do I need for transfer learning?
As few as 50-100 images per class can work well with MobileNet transfer learning, provided the images are collected under lighting conditions similar to your deployment environment. For production use, such as Indian factory settings, aim for at least 200 images per class captured under the actual lighting conditions.
Does TFLite support GPU acceleration on Pi 5?
TFLite has limited GPU delegate support for the Pi’s VideoCore GPU. The main speedup comes from the XNNPACK delegate (enabled by default), which uses NEON SIMD instructions on the Cortex-A76 cores.