How to count objects in Transformers predictions

Once you know what objects are in an image, you can count them, either in aggregate or by class. This is useful for a range of use cases. For example, you can check whether every class in a required list is present (which you could use in a quality assurance checklist), count the number of scratches on a product, and more.



In this guide, we are going to walk through how to count objects in Transformers detections.

We will:

1. Install supervision
2. Load data from a model
3. Count objects with Python

Without further ado, let's get started!

Step #1: Install supervision

First, install the supervision pip package:

pip install supervision


Once you have installed supervision, you are ready to load your data and start writing logic to filter detections.

Step #2: Load Data

First, we are going to load our model predictions into a supervision.Detections object. This object will contain information about all the detections in an image. You can load predictions from many different model types, from YOLO to MMDetection. For this guide, we will use the Transformers data loader.

You can load data using the following code:


import supervision as sv

detections = sv.Detections.from_transformers(...)

Replace the ... with the response object from your model.

Step #3: Count Objects

Count Objects in a Single Class

To count all of the objects that match a single class, we can use the following code:


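# class_names is the list of class names for your model, ordered by class ID
detections = detections[detections.class_id == class_names.index("class name")]
print(len(detections))

Replace "class name" with the name of the class with which you are working. Note that this overwrites detections with the filtered result, so keep a copy of the original if you need it later.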
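If you need a count for every class at once, rather than one class at a time, you can tally the class IDs instead of filtering. The following is a minimal sketch, not part of supervision itself: class_ids is a plain Python list standing in for detections.class_id, and the class_names values and the count_by_class helper are hypothetical names introduced for illustration.

```python
from collections import Counter


def count_by_class(class_ids, class_names):
    """Return a {class name: count} mapping for a sequence of class IDs."""
    # int() lets the helper accept NumPy integers as well as plain ints.
    counts = Counter(int(class_id) for class_id in class_ids)
    return {class_names[class_id]: n for class_id, n in counts.items()}


# Hypothetical values: in practice, pass detections.class_id and your
# model's own list of class names, ordered by class ID.
class_names = ["scratch", "dent", "stain"]
class_ids = [0, 2, 0, 1, 0]

per_class = count_by_class(class_ids, class_names)
print(per_class)
```

Classes with no detections do not appear in the result, so use per_class.get("class name", 0) if you need an explicit zero.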

Count Objects in Multiple Classes

To count all of the objects that match one of multiple classes, we can use the following code:


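import numpy as np

selected_names = ["class name 1", "class name 2"]  # placeholder class names
selected_classes = [class_names.index(name) for name in selected_names]
detections = detections[np.isin(detections.class_id, selected_classes)]
print(len(detections))

Replace the placeholder names in selected_names with the names of the classes with which you are working. The selected_classes list then holds the matching class IDs, and class_names is the list of class names for your model, ordered by class ID.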
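The same class IDs can also drive a presence check, such as the quality assurance checklist use case from the introduction. The sketch below is one possible approach under stated assumptions: class_ids is a plain list standing in for detections.class_id, and the class names, the required list, and the missing_classes helper are all hypothetical.

```python
def missing_classes(class_ids, class_names, required):
    """Return the required class names that have no detections."""
    # Build the set of class names that were detected at least once.
    present = {class_names[int(class_id)] for class_id in class_ids}
    return [name for name in required if name not in present]


# Hypothetical values: in practice, pass detections.class_id and your
# model's own list of class names, ordered by class ID.
class_names = ["label", "cap", "seal", "barcode"]
class_ids = [0, 1, 1, 3]
required = ["label", "cap", "seal"]

missing = missing_classes(class_ids, class_names, required)
print(missing)  # an empty list means every required class was detected
```

Returning the missing names, rather than a single True or False, makes it easier to report which checklist item failed.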


Next steps

supervision provides a wide range of features for working with computer vision models. With supervision, you can:

1. Process and filter detections and segmentation masks from a range of popular models (YOLOv5, Ultralytics YOLOv8, MMDetection, and more).
2. Process and filter classifications.
3. Compute confusion matrices.

And more! To learn about the full range of functionality in supervision, check out the supervision documentation.