How to plot PaliGemma predictions

Learn how to plot and visualize model predictions with the open-source supervision Python package.

Overview

A common task when working with computer vision models is visualizing model predictions. Being able to inspect predictions qualitatively is useful in model development, in testing, and when preparing a model for production.

To visualize PaliGemma predictions, we will:

  1. Install supervision
  2. Load data
  3. Plot predictions with a supervision annotator

Let's get started!

Install Supervision

First, install the supervision pip package:

pip install supervision


Once you have installed supervision, you are ready to load your data and plot predictions.

Load Data

First, we are going to load the model's output into a supervision.Detections object. This object holds the bounding boxes, class IDs, and related data for each prediction. You can load predictions from many different model types, from YOLO to MMDetection. For this guide, we will use the PaliGemma data loader.

You can load data using the following code:


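import cv2
import supervision as sv

# Assumption: "image.jpg" and the class list are placeholders for your own data.
image = cv2.imread("image.jpg")
classes = ["cat", "dog"]

# Parse the raw PaliGemma text response; the image size scales the boxes to pixels.
detections = sv.Detections.from_lmm(
	sv.LMM.PALIGEMMA,
	result="",
	resolution_wh=(image.shape[1], image.shape[0]),
	classes=classes,
)

box_annotator = sv.BoundingBoxAnnotator()
# PaliGemma's text output has no confidence scores, so labels are class names only.
labels = [classes[class_id] for class_id in detections.class_id]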

Replace the empty string passed as result with the text response from your model.
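
To make the loading step more concrete, here is a minimal sketch of how a PaliGemma detection string can be turned into pixel boxes. It is not supervision's implementation. It assumes the response uses four <locNNNN> tokens per object in y_min, x_min, y_max, x_max order, each a bin from 0 to 1023, followed by a label, with objects separated by semicolons. The function name parse_paligemma and the sample string are hypothetical.

```python
import re

# Assumed format: <locYMIN><locXMIN><locYMAX><locXMAX> label ; ...
LOC_PATTERN = re.compile(
    r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^;<]+)"
)


def parse_paligemma(result, resolution_wh):
    """Return a list of (x_min, y_min, x_max, y_max, label) tuples in pixels."""
    width, height = resolution_wh
    boxes = []
    for y1, x1, y2, x2, label in LOC_PATTERN.findall(result):
        boxes.append((
            int(x1) / 1024 * width,   # scale the x bins by image width
            int(y1) / 1024 * height,  # scale the y bins by image height
            int(x2) / 1024 * width,
            int(y2) / 1024 * height,
            label.strip(),
        ))
    return boxes


# Hypothetical response containing two objects.
example = (
    "<loc0256><loc0256><loc0768><loc0768> cat ; "
    "<loc0000><loc0512><loc0512><loc1023> dog"
)
boxes = parse_paligemma(example, resolution_wh=(1000, 500))
for box in boxes:
    print(box)
```

The location tokens are relative positions, so they only become pixel coordinates once they are scaled by the width and height of the image the model was run on.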

Plot Detections

supervision provides annotators that let you visualize detections from a computer vision model. Two of them are used in this guide:

1. BoundingBoxAnnotator, to plot bounding boxes.
2. MaskAnnotator, to plot segmentation masks.

You can use the following code to plot bounding boxes:


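# Draw the boxes; BoundingBoxAnnotator does not take a labels argument.
annotated_frame = box_annotator.annotate(
	scene=image.copy(),
	detections=detections
)
# Draw the text labels with a separate LabelAnnotator.
annotated_frame = sv.LabelAnnotator().annotate(
	scene=annotated_frame,
	detections=detections,
	labels=labels
)

sv.plot_image(image=annotated_frame, size=(16, 16))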
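
Labels are plain strings, one per detection, so they can be built however suits your model's output. The sketch below uses a hypothetical helper, build_labels, that appends a confidence score when one is available and falls back to the class name alone otherwise. The fallback is the likely case for PaliGemma, since its text response does not appear to carry scores.

```python
def build_labels(class_names, confidences=None):
    """Build one label string per detection.

    confidences may be None, or contain None entries, when the model
    reports no score for a detection.
    """
    if confidences is None:
        confidences = [None] * len(class_names)
    labels = []
    for name, confidence in zip(class_names, confidences):
        if confidence is None:
            labels.append(name)  # no score available: class name only
        else:
            labels.append(f"{name} {confidence:0.2f}")
    return labels


# Hypothetical inputs for illustration.
labels_without_scores = build_labels(["cat", "dog"])
labels_with_scores = build_labels(["cat", "dog"], [0.912, None])
print(labels_without_scores)
print(labels_with_scores)
```

The resulting list can be passed wherever the plotting code expects labels, as long as it has one entry per detection.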

To plot segmentation masks, replace BoundingBoxAnnotator() with MaskAnnotator(). The mask annotator draws each detection's segmentation mask, so it only works when the detections include masks.

Next Steps

supervision provides a wide range of functionality for working with computer vision models. With supervision, you can:

1. Process and filter detections and segmentation masks from a range of popular models (YOLOv5, Ultralytics YOLOv8, MMDetection, and more).
2. Process and filter classifications.
3. Compute confusion matrices.

And more! To learn about the full range of functionality in supervision, check out the supervision documentation.