YOLOv5 Object Tracking: A How-To Guide

Object tracking algorithms like ByteTrack can be applied to object detection models to track the same instance of an object throughout a video. This is useful for a range of use cases, such as tracking players on a football field to calculate statistics.

With a tracking algorithm, you can also count the unique instances of an object of interest in a video.

In this guide, we will show how to use ByteTrack to track objects with a YOLOv5 model. Here are the steps to follow:

1. Load supervision, ByteTrack, and an object detection model
2. Create a callback to process a target video
3. Process the target video

Without further ado, let's get started!

Step #1: Install supervision and Load the Model

In this guide, we'll be using supervision, an open source Python package with a range of utilities for building computer vision projects. The code below also imports the ultralytics package to load the model. You can install both using the following command:

pip install supervision ultralytics

Next, we need to load a model for use in inference and initialize ByteTrack, the object tracking algorithm we will use. We also create an annotator to draw predictions on each frame. Create a new Python file and add the following code:

import numpy as np
import supervision as sv
from ultralytics import YOLO

# Load the detection model, the ByteTrack tracker, and a box annotator.
model = YOLO(...)
byte_tracker = sv.ByteTrack()
annotator = sv.BoxAnnotator()

Replace the ... with the name of your model weights file.

Step #2: Create a Video Processing Callback

Next, we need to write a callback that runs inference and applies our logic to the predictions. In the example below, we run inference with our model, pass the predictions through ByteTrack so each detection receives a tracker ID, and then draw the tracked predictions on the frame.

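def callback(frame: np.ndarray, index: int) -> np.ndarray:
    # Run inference on the frame and take the first (only) result.
    results = model(frame)[0]
    detections = sv.Detections.from_ultralytics(results)
    # Assign a persistent tracker ID to each detection.
    detections = byte_tracker.update_with_detections(detections)
    # Build one label per detection: tracker ID, class name, confidence.
    labels = [
        f"#{tracker_id} {model.model.names[class_id]} {confidence:0.2f}"
        for tracker_id, class_id, confidence in zip(
            detections.tracker_id, detections.class_id, detections.confidence
        )
    ]
    # Depending on your supervision version, labels may need to be drawn
    # with sv.LabelAnnotator instead of the labels argument used here.
    return annotator.annotate(scene=frame.copy(), detections=detections, labels=labels)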
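
To make the label format concrete, here is a small sketch that builds the same kind of label strings from plain Python lists. The class-name mapping, the helper name build_labels, and the sample values are made up for illustration; in the callback, these values come from model.model.names and the tracked detections.

```python
# Hypothetical sample values standing in for tracked detections.
names = {0: "person", 32: "sports ball"}
tracker_ids = [1, 2, 7]
class_ids = [0, 0, 32]
confidences = [0.912, 0.87, 0.5]


def build_labels(tracker_ids, class_ids, confidences, names):
    """Format one '#<id> <class> <confidence>' label per detection."""
    return [
        f"#{tracker_id} {names[class_id]} {confidence:0.2f}"
        for tracker_id, class_id, confidence in zip(tracker_ids, class_ids, confidences)
    ]


labels = build_labels(tracker_ids, class_ids, confidences, names)
print(labels)  # ['#1 person 0.91', '#2 person 0.87', '#7 sports ball 0.50']
```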

You can also apply filters to only show predictions that meet certain criteria. To learn more about filtering detections, refer to the supervision Detections() documentation.
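
As a rough illustration of what such a filter does, the sketch below keeps only detections above a confidence threshold and, optionally, within a set of classes. It uses a hypothetical TrackedDetection dataclass and plain Python lists rather than supervision objects, so the names here are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TrackedDetection:
    """Hypothetical stand-in for one tracked detection."""
    tracker_id: int
    class_name: str
    confidence: float


def filter_detections(detections, min_confidence=0.5, allowed_classes=None):
    """Keep detections above a confidence threshold and, optionally, in a class set."""
    return [
        d for d in detections
        if d.confidence >= min_confidence
        and (allowed_classes is None or d.class_name in allowed_classes)
    ]


sample = [
    TrackedDetection(1, "person", 0.91),
    TrackedDetection(2, "person", 0.32),
    TrackedDetection(3, "sports ball", 0.74),
]
kept = filter_detections(sample, min_confidence=0.5, allowed_classes={"person"})
print([d.tracker_id for d in kept])  # [1]
```

With supervision itself, the same kind of filter is usually written by indexing a Detections object with a boolean mask, for example detections[detections.confidence > 0.5].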

Step #3: Process the Video

Finally, we need to run our callback on every frame in our video. We can do so using the following code, where VIDEO_PATH is the path to your source video:

sv.process_video(source_path=VIDEO_PATH, target_path="result.mp4", callback=callback)
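
Conceptually, process_video reads the source video frame by frame, passes each frame and its index to the callback, and writes the returned frame to the target file. The sketch below mimics that loop with plain Python values instead of real video I/O. The names process_frames and tag_frame, and the string "frames", are hypothetical and only illustrate the callback contract.

```python
from typing import Callable, Iterable, List


def process_frames(frames: Iterable[str], callback: Callable[[str, int], str]) -> List[str]:
    """Call callback(frame, index) for every frame and collect the results."""
    return [callback(frame, index) for index, frame in enumerate(frames)]


def tag_frame(frame: str, index: int) -> str:
    # Stand-in for the real callback, which would run detection and tracking.
    return f"{index}:{frame.upper()}"


processed = process_frames(["a", "b", "c"], tag_frame)
print(processed)  # ['0:A', '1:B', '2:C']
```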

You have now set up object tracking with ByteTrack!
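
As mentioned in the introduction, tracker IDs also make it possible to count unique objects in a video. One simple approach, sketched below, assumes you collect the tracker IDs seen in each frame (for example from detections.tracker_id inside the callback) and add them to a set. The helper name and the per-frame ID lists are made up for illustration.

```python
def count_unique_objects(ids_per_frame):
    """Count distinct tracker IDs across all frames."""
    seen = set()
    for frame_ids in ids_per_frame:
        seen.update(frame_ids)
    return len(seen)


# Hypothetical tracker IDs observed in four consecutive frames.
ids_per_frame = [[1, 2], [1, 2, 3], [2, 3], [3, 4]]
unique_count = count_unique_objects(ids_per_frame)
print(unique_count)  # 4
```

Keep in mind that a tracker may assign a new ID to an object it loses and later finds again, so a count like this is an estimate.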

Next steps

supervision provides an extensive range of utilities for working with computer vision models. With supervision, you can:

1. Process and filter detections and segmentation masks from a range of popular models (YOLOv5, Ultralytics YOLOv8, MMDetection, and more).
2. Process and filter classifications.
3. Plot bounding boxes and segmentation masks.

And more! To learn about the full range of functionality in supervision, check out the supervision documentation.
