Track Objects in Drone Video
Write a Python script that follows moving objects through aerial footage automatically.
Last reviewed: March 2026
Overview
Drone cameras generate enormous amounts of video every day—for infrastructure inspection, wildlife surveys, precision agriculture, and border surveillance. Manually reviewing hours of footage to find moving objects is slow and error-prone. Computer vision algorithms can process video automatically, flagging and tracking objects of interest in real time. In this project you will build exactly that kind of system using OpenCV, the world's most widely used computer vision library.
You will start with publicly available aerial drone footage (several research datasets are freely downloadable) and work through progressively more sophisticated techniques. First you will use background subtraction to isolate moving objects from a static background. Then you will apply contour detection to draw bounding boxes around detected objects. Finally you will use OpenCV's CSRT tracker, a robust single-object tracker (you run one instance per object to follow several at once), to follow selected objects across frames and to detect when tracking fails because an object is occluded or leaves the field of view.
The techniques in this project underpin real aerospace applications: the U.S. Air Force uses similar algorithms in wide-area surveillance systems, wildlife biologists use them to count animal populations from drone surveys, and search-and-rescue teams use them to spot survivors in disaster footage. You do not need any prior programming experience—the guide walks you through every line of code.
What You'll Learn
- ✓ Load, display, and write video files using OpenCV's VideoCapture and VideoWriter.
- ✓ Apply background subtraction (MOG2) to isolate moving objects in aerial video.
- ✓ Use contour detection and bounding-box filtering to identify candidate objects.
- ✓ Initialize and update an OpenCV object tracker (CSRT) across video frames.
- ✓ Evaluate tracker performance by computing Intersection over Union (IoU) on annotated frames.
Step-by-Step Guide
Set up OpenCV and acquire test footage
Install OpenCV with pip install opencv-contrib-python numpy matplotlib (the contrib build includes the CSRT tracker used later; on most versions the plain opencv-python package does not). Download a sample aerial drone video from the VisDrone dataset (github.com/VisDrone) or film your own short clip from a park. Load the video with cap = cv2.VideoCapture("drone_video.mp4") and display the first frame with cv2.imshow. Print the frame dimensions, frame rate, and total frame count using cap.get() with the CAP_PROP_* constants to understand what you are working with.
Apply background subtraction to find motion
Create a background subtractor: fgbg = cv2.createBackgroundSubtractorMOG2(history=100, varThreshold=50). In a loop, read each frame and apply the subtractor: fgmask = fgbg.apply(frame). Display the foreground mask alongside the original frame. Moving objects appear white against a black background. Apply morphological operations to clean up noise: kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5,5)) followed by cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel).
Detect objects with contour analysis
Find contours in the cleaned mask with contours, _ = cv2.findContours(fgmask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE). Filter contours by area to remove tiny noise blobs—only keep those with area > 500 pixels. For each surviving contour, compute the bounding rectangle with cv2.boundingRect(contour) and draw it on the original frame in green. Count detections per frame and plot this count over time to see when activity peaks in your footage.
Initialize a tracker on a selected object
Pause video on frame 1 and let the user draw a bounding box around one object using roi = cv2.selectROI("Select Object", frame). Create a CSRT tracker: tracker = cv2.TrackerCSRT_create() and initialize it: tracker.init(frame, roi). In subsequent frames, call success, box = tracker.update(frame). If success is True, draw the bounding box at the new position. Print a message if tracking fails (object left frame or became occluded).
Track multiple objects simultaneously
Use OpenCV's MultiTracker API (cv2.legacy.MultiTracker_create in OpenCV 4.5 and later) or, more portably, maintain a Python list of CSRT trackers yourself to track 3–5 objects at once. Let the user click to select each object in turn on the first frame. Update all trackers each frame and draw each bounding box in a different color. Save the output as a new video file using VideoWriter. This multi-object tracking output is what a real surveillance system would hand off to a human analyst.
Evaluate tracker accuracy with IoU
Manually annotate the ground-truth bounding box for your tracked object in 20 evenly spaced frames by pausing the video and recording coordinates. Compute Intersection over Union (IoU) for each frame: IoU = area of overlap / area of union. A value above 0.5 is typically considered a successful track. Plot IoU over time and note where it drops—these are usually moments of occlusion, fast motion, or lighting change. Discuss what improvements (e.g., deep learning trackers) could address these weaknesses.
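The IoU computation for two (x, y, w, h) boxes is a small standalone function:

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x, y, w, h) bounding boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Intersection rectangle (empty if the boxes do not overlap)
    x1 = max(ax, bx)
    y1 = max(ay, by)
    x2 = min(ax + aw, bx + bw)
    y2 = min(ay + ah, by + bh)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes -> 1.0
print(iou((0, 0, 10, 10), (5, 0, 10, 10)))  # half-overlapping boxes
```

Apply it to each (tracker box, ground-truth box) pair across your 20 annotated frames and plot the results.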
Career Connection
See how this project connects to real aerospace careers.
Drone & UAV Ops →
UAV operators running surveillance, inspection, or wildlife surveys use object tracking software extensively; building one from scratch makes you a far more effective operator.
Aerospace Engineer →
Computer vision is increasingly embedded in aerospace sensor systems, from runway foreign object detection to in-space proximity operations cameras on the ISS.
Air Traffic Control →
Surface movement radar at major airports increasingly incorporates computer vision to track vehicles on taxiways—the same background subtraction and tracking methods used here.
Space Operations →
Object detection and tracking algorithms are used to identify space debris in telescope images and track spacecraft during rendezvous operations.
Go Further
- Replace CSRT with a deep learning tracker like SiamRPN++ (available through the MMTracking library) and compare accuracy on fast-moving objects.
- Add a speed estimator: using known altitude, camera focal length, and pixel displacement per frame, estimate vehicle speed in km/h.
- Implement automatic tracker re-initialization when IoU drops below 0.3 by using the background subtraction detections as a re-detection mechanism.
- Run your tracking pipeline on a live webcam feed to see performance in real time—hold up objects and move them around to test edge cases.