Segment Anything 2 (SAM 2) is a real-time image and video segmentation model. Unlike the previous version of SAM, which was built explicitly for use with images, SAM 2 works on both images and videos. You can use SAM 2 to identify the location of specific objects in an image or video.
There are two ways you can run SAM 2: with the automatic mask generator, or with a prompt. The automatic mask generator is ideal if you want to segment every object in an image. Using a prompt, on the other hand, allows you to be more specific about what you want to segment.
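If you want to run the automatic mask generator locally, the official sam2 package provides a SAM2AutomaticMaskGenerator class. Below is a minimal sketch; the checkpoint name ("facebook/sam2-hiera-large") and the image filename are assumptions for illustration:

import numpy as np
from PIL import Image
from sam2.automatic_mask_generator import SAM2AutomaticMaskGenerator

# load a SAM 2 checkpoint from the Hugging Face Hub (assumed checkpoint name)
mask_generator = SAM2AutomaticMaskGenerator.from_pretrained("facebook/sam2-hiera-large")

# read an image as an RGB numpy array
image = np.array(Image.open("image.jpeg").convert("RGB"))

# generate a mask for every object the model finds; each result is a
# dictionary containing a "segmentation" mask and associated metadata
masks = mask_generator.generate(image)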
To identify the location of an object, you need to provide a “prompt”. A prompt can be:

- A point, or a series of points; or
- A bounding box.
You can provide these prompts for both images and videos.
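For reference, here is a minimal sketch of prompting SAM 2 with a single point, using the SAM2ImagePredictor class from the official sam2 package. The checkpoint name, image filename, and point coordinates below are placeholder assumptions:

import numpy as np
from PIL import Image
from sam2.sam2_image_predictor import SAM2ImagePredictor

# load a SAM 2 checkpoint from the Hugging Face Hub (assumed checkpoint name)
predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")

# set the image you want to segment
image = np.array(Image.open("image.jpeg").convert("RGB"))
predictor.set_image(image)

# prompt the model with a single point (label 1 = foreground)
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
)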
Segment Anything 2 is licensed under an Apache 2.0 license.
You can use Roboflow Inference to deploy a Segment Anything 2 API on your own hardware. You can deploy the model on CPU devices (e.g., Raspberry Pi, AI PCs) and GPU devices (e.g., NVIDIA Jetson, NVIDIA T4). Below are instructions on how to deploy your own model API.
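For example, one way to start a local Inference server is with the inference-cli package. This is a minimal sketch; refer to the Roboflow Inference documentation for the full setup steps for your device:

pip install inference-cli
inference server start

Once the server is running, you can send segmentation requests to it over HTTP.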
You can use Segment Anything 2 with Florence 2 as a grounded segmentation model. With this combination, you can provide a text prompt to Florence 2 to retrieve bounding boxes that correspond to objects. You can then use Segment Anything 2 to generate segmentation masks that correspond to the objects in each bounding box.
To get started, install the Autodistill Grounded SAM 2 module:

pip install autodistill-grounded-sam-2
To generate a segmentation mask for objects in an image, you can use the following code:
from autodistill_grounded_sam_2 import GroundedSAM2
from autodistill.detection import CaptionOntology
from autodistill.utils import plot
import cv2
# define an ontology to map class names to our Grounded SAM 2 prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = GroundedSAM2(
    ontology=CaptionOntology(
        {
            "person": "person",
            "shipping container": "shipping container",
        }
    )
)
# run inference on a single image
results = base_model.predict("logistics.jpeg")
plot(
    image=cv2.imread("logistics.jpeg"),
    classes=base_model.ontology.classes(),
    detections=results
)
# label all images in a folder called `context_images`
base_model.label("./context_images", extension=".jpeg")
Above, replace "person" and "shipping container" with the text prompts that correspond to the objects you want to identify.