OpenCV Object Detection: Raspberry Pi Guide

Running OpenCV on a Raspberry Pi for object detection opens up countless applications, from industrial quality inspection to smart doorbell cameras. With the processing power of the Raspberry Pi 4 or 5 and OpenCV’s optimised libraries, you can run detection at roughly 5-30 fps depending on the method: colour tracking and Haar cascades sit at the fast end, deep learning models at the slow end. This guide covers everything from installation to deploying face detection, colour tracking, and TensorFlow Lite-based object detection on your Pi.

Getting Started with OpenCV on Pi
Installation and Setup
Face Detection with Haar Cascades
Colour Object Tracking
TensorFlow Lite Object Detection
Performance Optimisation Tips
Frequently Asked Questions
Conclusion

Getting Started with OpenCV on Pi

OpenCV (Open Source Computer Vision Library) provides over 2500 algorithms for image processing, feature detection, and machine learning. On the Raspberry Pi, OpenCV combined with a camera module creates a portable, affordable computer vision platform that rivals systems costing 10x more.

Installation and Setup

# Update system
sudo apt update && sudo apt upgrade -y

# Install OpenCV (pre-built package, much faster than compiling)
sudo apt install python3-opencv -y

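# Install additional Python packages in a virtual environment
# (recent Raspberry Pi OS releases block system-wide pip installs;
# --system-site-packages keeps the apt-installed cv2 and numpy visible).
# ~/cv-env is just an example location.
python3 -m venv --system-site-packages ~/cv-env
source ~/cv-env/bin/activate
pip install imutils tflite-runtime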

# Verify installation
python3 -c "import cv2; print(cv2.__version__)"
# Should print 4.x.x

# Test camera
python3 -c "
import cv2
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
print('Camera working:', ret, 'Shape:', frame.shape if ret else None)
cap.release()
"

Face Detection with Haar Cascades

Haar Cascade classifiers are among the fastest face detection methods on Raspberry Pi, typically reaching 15-25 fps at 640×480 resolution. OpenCV includes pre-trained cascades for faces, eyes, and smiles.

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

while True:
    ret, frame = cap.read()
    if not ret:
        break  # camera disconnected or no frame available
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    faces = face_cascade.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30)
    )
    
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
        cv2.putText(frame, 'Face', (x, y-10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    
    cv2.imshow('Face Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Colour Object Tracking

HSV colour space-based tracking is extremely fast (30+ fps) and ideal for industrial sorting, robotics, and sports analysis. Convert the frame to HSV, apply a colour range mask, find contours, and track the largest contour.

import cv2
import numpy as np

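# Track red objects (adjust the HSV ranges for your target).
# Red wraps around the hue axis (OpenCV hue runs 0-179), so two ranges are needed.
lower_red1 = np.array([0, 120, 70])
upper_red1 = np.array([10, 255, 255])
lower_red2 = np.array([170, 120, 70])
upper_red2 = np.array([179, 255, 255])

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break  # camera disconnected or no frame available
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.bitwise_or(cv2.inRange(hsv, lower_red1, upper_red1),
                          cv2.inRange(hsv, lower_red2, upper_red2))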
    
    # Clean up mask
    mask = cv2.erode(mask, None, iterations=2)
    mask = cv2.dilate(mask, None, iterations=2)
    
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    if contours:
        largest = max(contours, key=cv2.contourArea)
        if cv2.contourArea(largest) > 500:
            (x, y), radius = cv2.minEnclosingCircle(largest)
            cv2.circle(frame, (int(x), int(y)), int(radius), (0, 255, 0), 2)
    
    cv2.imshow('Colour Tracking', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
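
Choosing the HSV range is usually the fiddly part. As a rough starting point, the sketch below converts a sample colour to OpenCV's 8-bit HSV scale (hue 0-179, saturation and value 0-255) using only the standard library, then suggests hue intervals around it. The names bgr_to_opencv_hsv and hue_ranges and the tolerance of 10 are illustrative assumptions, not part of OpenCV, and the values may differ by a unit or so from what cv2.cvtColor produces.

```python
import colorsys


def bgr_to_opencv_hsv(b, g, r):
    """Convert an 8-bit BGR colour to OpenCV's 8-bit HSV scale.

    OpenCV stores hue as degrees / 2 (0-179) and scales saturation
    and value to 0-255 for 8-bit images.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))


def hue_ranges(hue, tolerance=10):
    """Return one or two (low, high) hue intervals around `hue`.

    Hues near 0 or 179 (reds) wrap around, so they need two intervals.
    The tolerance is an assumed starting value; tune it for your lighting.
    """
    low, high = hue - tolerance, hue + tolerance
    if low < 0:
        return [(0, high), (180 + low, 179)]
    if high > 179:
        return [(low, 179), (0, high - 180)]
    return [(low, high)]


if __name__ == "__main__":
    hue, sat, val = bgr_to_opencv_hsv(0, 0, 255)  # pure red in BGR order
    print("HSV:", (hue, sat, val))
    print("Hue intervals:", hue_ranges(hue))
```

Each interval becomes one cv2.inRange call. When two intervals come back, the two masks can be combined with cv2.bitwise_or.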

TensorFlow Lite Object Detection

For detecting multiple object types (people, cars, animals), TensorFlow Lite with a pre-trained SSD MobileNet model provides the best balance of accuracy and speed on Raspberry Pi. Expect 5-10 fps on Pi 4 at 300×300 input resolution.

import cv2
import numpy as np
import tflite_runtime.interpreter as tflite

# Load TFLite model
interpreter = tflite.Interpreter(model_path='ssd_mobilenet_v2.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Load labels
with open('coco_labels.txt', 'r') as f:
    labels = [line.strip() for line in f.readlines()]

cap = cv2.VideoCapture(0)

while True:
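    ret, frame = cap.read()
    if not ret:
        break  # camera disconnected or no frame available
    # The model expects RGB input, but OpenCV captures frames in BGR order
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    input_data = cv2.resize(rgb, (300, 300))
    # uint8 input assumes a quantised model; a float model needs normalised float32 input
    input_data = np.expand_dims(input_data, axis=0).astype(np.uint8)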
    
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    
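    # This output order (boxes, classes, scores) is common for SSD MobileNet models
    # but varies between exports; inspect output_details if results look wrong
    boxes = interpreter.get_tensor(output_details[0]['index'])[0]
    classes = interpreter.get_tensor(output_details[1]['index'])[0]
    scores = interpreter.get_tensor(output_details[2]['index'])[0]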
    
    h, w = frame.shape[:2]
    for i in range(len(scores)):
        if scores[i] > 0.5:
            ymin, xmin, ymax, xmax = boxes[i]
            x1, y1 = int(xmin * w), int(ymin * h)
            x2, y2 = int(xmax * w), int(ymax * h)
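            class_id = int(classes[i])
            # Fall back to the numeric ID if the label file is shorter than expected
            label = labels[class_id] if class_id < len(labels) else str(class_id)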
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, f'{label}: {scores[i]:.0%}',
                       (x1, y1-10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0,255,0), 2)
    
    cv2.imshow('Object Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
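
The listing above casts the input to uint8, which suits a quantised model. A float model expects normalised float32 input instead. The sketch below picks the conversion from the dtype reported by get_input_details(). The helper name prepare_input is hypothetical, and the (x - 127.5) / 127.5 scaling is an assumption that matches many MobileNet-based models, so check it against the documentation for the model you actually use.

```python
import numpy as np


def prepare_input(rgb_image, input_detail):
    """Shape and scale an already-resized RGB image for a TFLite input tensor.

    `input_detail` is one entry from interpreter.get_input_details().
    Only its 'dtype' and 'shape' keys are used here.
    """
    height, width, channels = (int(d) for d in input_detail['shape'][1:])
    if rgb_image.shape != (height, width, channels):
        raise ValueError(
            f'expected image of shape {(height, width, channels)}, '
            f'got {rgb_image.shape}'
        )

    batch = np.expand_dims(rgb_image, axis=0)  # add the batch dimension
    if input_detail['dtype'] == np.float32:
        # Assumed normalisation to [-1, 1]; verify for your model.
        return (batch.astype(np.float32) - 127.5) / 127.5
    # Quantised models take the raw 0-255 pixel values.
    return batch.astype(np.uint8)


if __name__ == '__main__':
    # Stand-in for a real get_input_details() entry so the sketch runs anywhere.
    fake_detail = {'dtype': np.float32, 'shape': np.array([1, 300, 300, 3])}
    image = np.full((300, 300, 3), 255, dtype=np.uint8)
    tensor = prepare_input(image, fake_detail)
    print(tensor.shape, tensor.dtype)
```

In the detection loop, the result would be passed to interpreter.set_tensor() in place of the hard-coded uint8 cast.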

Performance Optimisation Tips

Reduce resolution: Process at 320×240 instead of 640×480. That is a quarter of the pixels, so detection runs considerably faster, at the cost of missing small or distant objects
Use threading: Separate camera capture and processing into different threads so that detection always works on the latest frame rather than a stale buffered one
Skip frames: Run detection on every 2nd or 3rd frame and reuse the last result in between, so the displayed video stays smooth
ROI processing: Define a region of interest and only run detection within that area
Use Pi 5: The Pi 5’s improved CPU provides 40-60% faster inference than Pi 4
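
Two of the tips above, frame skipping and ROI processing, can be sketched without touching the camera code. The names detect_in_roi and FrameSkipper are hypothetical, and the detector is assumed to be any callable that takes an image and returns a list of (x, y, w, h) boxes, such as a thin wrapper around a cascade's detectMultiScale.

```python
import numpy as np


def detect_in_roi(frame, roi, detector):
    """Run `detector` on a sub-region and map boxes back to full-frame coordinates.

    roi is (x, y, w, h) in pixels.
    """
    rx, ry, rw, rh = roi
    crop = frame[ry:ry + rh, rx:rx + rw]  # NumPy slicing gives a view, not a copy
    return [(x + rx, y + ry, w, h) for (x, y, w, h) in detector(crop)]


class FrameSkipper:
    """Run detection on every n-th frame and reuse the last result in between."""

    def __init__(self, every_n=3):
        self.every_n = every_n
        self.count = 0
        self.last_result = []

    def update(self, frame, detect):
        if self.count % self.every_n == 0:
            self.last_result = detect(frame)
        self.count += 1
        return self.last_result


if __name__ == '__main__':
    # Stand-in detector so the sketch runs without a camera: it reports one
    # box in the top-left corner of whatever image it receives.
    calls = []

    def fake_detector(image):
        calls.append(image.shape)
        return [(0, 0, 10, 10)]

    frame = np.zeros((240, 320, 3), dtype=np.uint8)
    roi = (100, 50, 120, 80)
    skipper = FrameSkipper(every_n=3)
    for _ in range(6):
        boxes = skipper.update(frame, lambda f: detect_in_roi(f, roi, fake_detector))
    print(boxes, 'detector calls:', len(calls))
```

Reusing stale boxes works well for slow-moving subjects. For fast motion, a smaller every_n keeps the overlay from lagging behind the object.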

Frequently Asked Questions

Can Raspberry Pi handle real-time object detection?

Yes. With TFLite and SSD MobileNet, a Raspberry Pi 4 reaches roughly 5-12 fps and a Pi 5 15-20 fps at 300×300 input, depending on the model variant and quantisation. For faster inference, add a Coral USB Accelerator (Edge TPU), which can push performance to 30+ fps.
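
Frame rates vary with the model, resolution, and camera, so it is worth measuring your own setup. The following is a minimal sketch of a rolling FPS counter that uses only the standard library. The class name FPSCounter and the 30-frame window are illustrative choices.

```python
import time
from collections import deque


class FPSCounter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.timestamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one processed frame and return the current FPS estimate."""
        self.timestamps.append(time.perf_counter() if now is None else now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / elapsed if elapsed > 0 else 0.0


if __name__ == '__main__':
    fps = FPSCounter(window=30)
    for _ in range(10):
        time.sleep(0.01)  # stand-in for capture + detection work
        rate = fps.tick()
    print(f'{rate:.1f} fps')
```

Calling tick() once per loop iteration, after detection, gives the end-to-end rate including capture and drawing.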

Which Pi model is best for OpenCV?

Raspberry Pi 5 (8GB) is the best choice for serious OpenCV work. The Pi 4 (4GB) is adequate for lighter tasks. Avoid Pi 3 and Zero for real-time video processing as they lack the necessary CPU and RAM.

Can I use a USB webcam instead of the Pi Camera?

Yes. Any V4L2-compatible USB camera works with OpenCV via cv2.VideoCapture(0). However, the CSI camera has lower latency and CPU overhead because it uses the Pi’s dedicated camera interface and hardware ISP.

How do I run OpenCV headless (without display)?

Remove the cv2.imshow() and cv2.waitKey() calls, then either write frames to disk with cv2.imwrite() or encode them in memory with cv2.imencode() and stream them via Flask/FastAPI to a web browser. This is common for deployed systems where the Pi runs as a headless server.

Conclusion

OpenCV on Raspberry Pi is a powerful and affordable platform for machine vision projects. From simple face detection to advanced TensorFlow Lite object detection, the Pi handles a remarkable range of computer vision tasks. Start with the simpler Haar Cascade examples, progress to colour tracking, and then tackle deep learning-based detection as you gain confidence. Find Raspberry Pi boards, cameras, and accessories at Zbotic to start your vision project today.