How to merge COCO Datasets

You may have two or more datasets that you want to combine so you can train a model on more data. You can use the supervision DetectionDataset.merge() method to merge them into one. supervision can load datasets in several formats, including COCO JSON and YOLO.

In this guide, we are going to show you how to merge two COCO datasets.

We will:

1. Install supervision
2. Load data into supervision DetectionDataset objects
3. Merge the datasets with the merge() method
4. Export the data into a folder

Without further ado, let's get started!

Step #1: Install supervision

First, install the supervision pip package:

pip install supervision


Once you have installed supervision, you are ready to load your data and merge your datasets.

Step #2: Load and Merge Data

First, we are going to load our datasets into two supervision DetectionDataset objects. Each object holds the images in a dataset along with their annotations and the list of class names. You can load datasets from several annotation formats, including YOLO and COCO JSON. For this guide, we will use the COCO data loader, DetectionDataset.from_coco().

We will then merge our datasets with the merge() method.

You can load and merge two datasets using the following code:


import supervision as sv

# Load the first dataset from a COCO JSON annotation file
ds_1 = sv.DetectionDataset.from_coco(
    images_directory_path="dataset1/train",
    annotations_path="dataset1/train/_annotations.coco.json",
)
len(ds_1)
# 100
ds_1.classes
# ['dog', 'person']

# Load the second dataset the same way
ds_2 = sv.DetectionDataset.from_coco(
    images_directory_path="dataset2/train",
    annotations_path="dataset2/train/_annotations.coco.json",
)
len(ds_2)
# 200
ds_2.classes
# ['cat']

# Merge both datasets; the class lists are combined into one
ds_merged = sv.DetectionDataset.merge([ds_1, ds_2])
len(ds_merged)
# 300
ds_merged.classes
# ['cat', 'dog', 'person']
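
The merged class list above is the sorted union of the two input class lists, which means class IDs from each source dataset have to be remapped to their position in the new list. The plain Python sketch below illustrates that idea. It is not supervision's implementation, and the helper names union_class_names and remap_class_ids are hypothetical, chosen only for this example.

```python
def union_class_names(class_lists):
    """Return the sorted union of class names across several datasets."""
    return sorted(set().union(*class_lists))


def remap_class_ids(source_classes, merged_classes):
    """Map each class id in a source dataset to its id in the merged list."""
    return {
        old_id: merged_classes.index(name)
        for old_id, name in enumerate(source_classes)
    }


# Class lists taken from the example datasets in this guide
classes_1 = ["dog", "person"]
classes_2 = ["cat"]

merged = union_class_names([classes_1, classes_2])
mapping_1 = remap_class_ids(classes_1, merged)
mapping_2 = remap_class_ids(classes_2, merged)

print(merged)     # ['cat', 'dog', 'person']
print(mapping_1)  # {0: 1, 1: 2}
print(mapping_2)  # {0: 0}
```

In this sketch, "dog" moves from ID 0 in the first dataset to ID 1 in the merged dataset, because "cat" now sorts ahead of it.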

You can then save your dataset in the following formats:

1. COCO JSON
2. Pascal VOC
3. YOLO
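
As a rough sketch of the export step, the helper below writes a merged dataset to a folder in COCO JSON format by calling the dataset's as_coco() method. The helper name export_coco, the output folder layout, and the file name annotations.coco.json are assumptions made for this example; adjust them to match your project.

```python
from pathlib import Path


def export_coco(dataset, output_dir):
    """Save a dataset's images and COCO annotations under output_dir.

    `dataset` is assumed to be a supervision DetectionDataset, such as
    the ds_merged object created earlier in this guide.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Hypothetical layout: images in a subfolder, one annotation file
    images_dir = out / "images"
    annotations_path = out / "annotations.coco.json"

    dataset.as_coco(
        images_directory_path=str(images_dir),
        annotations_path=str(annotations_path),
    )
    return images_dir, annotations_path
```

With the merged dataset from Step #2, you would call export_coco(ds_merged, "merged_dataset"). For the other formats, DetectionDataset also provides as_pascal_voc() and as_yolo(), which take an annotations_directory_path argument instead of annotations_path (as_yolo() additionally accepts data_yaml_path).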

Next steps

supervision provides a wide range of utilities for working with computer vision models. With supervision, you can:

1. Process and filter detections and segmentation masks from a range of popular models (YOLOv5, Ultralytics YOLOv8, MMDetection, and more).
2. Process and filter classifications.
3. Compute confusion matrices.

And more! To learn about the full range of functionality in supervision, check out the supervision documentation.
