SAM-CLIP segments objects in an image from both visual and textual inputs, making it well suited to tasks that require precise object identification and separation. Because it incorporates CLIP’s zero-shot capabilities, the model can classify and segment objects it has not explicitly been trained on. SAM-CLIP also supports semantic segmentation, labeling regions of an image according to their semantic meaning.
Use Grounding DINO, Segment Anything, and CLIP to label objects in images.
Below is an image with segmentation masks over all of the McDonald's logos it contains.
This demo was created by sending the prompt `logo` to Grounding DINO and SAM, then classifying each prediction using CLIP with two prompts: `McDonalds` and `Burger King`.
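
The sketch below shows one way to wire this pipeline together, assuming the reference `groundingdino`, `segment-anything`, and OpenAI `clip` packages are installed. The checkpoint and config filenames, the image path, and the detection thresholds are placeholders, not the exact values used in the demo.

```python
# Sketch of the Grounding DINO -> SAM -> CLIP pipeline described above.
# Checkpoint/config paths and "image.jpg" are placeholder assumptions.
import torch
from PIL import Image
from torchvision.ops import box_convert
from groundingdino.util.inference import load_model, load_image, predict
from segment_anything import sam_model_registry, SamPredictor
import clip

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# 1. Detect candidate regions with Grounding DINO using the prompt "logo".
dino = load_model(
    "GroundingDINO_SwinT_OGC.py",          # assumed config path
    "groundingdino_swint_ogc.pth",         # assumed checkpoint path
    device=DEVICE,
)
image_source, image = load_image("image.jpg")
boxes, logits, phrases = predict(
    model=dino, image=image, caption="logo",
    box_threshold=0.35, text_threshold=0.25, device=DEVICE,
)

# Grounding DINO returns normalized cxcywh boxes; convert to pixel xyxy.
h, w, _ = image_source.shape
boxes_xyxy = box_convert(
    boxes * torch.tensor([w, h, w, h]), in_fmt="cxcywh", out_fmt="xyxy"
)

# 2. Turn each detected box into a segmentation mask with SAM.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").to(DEVICE)
predictor = SamPredictor(sam)
predictor.set_image(image_source)

# 3. Classify each detection with CLIP against the two candidate prompts.
labels = ["McDonalds", "Burger King"]
clip_model, preprocess = clip.load("ViT-B/32", device=DEVICE)
text = clip.tokenize(labels).to(DEVICE)

for box in boxes_xyxy:
    # One mask per box; multimask_output=False keeps the single best mask.
    masks, _, _ = predictor.predict(box=box.numpy(), multimask_output=False)

    # Crop the box region and score it against both text prompts.
    x0, y0, x1, y1 = box.int().tolist()
    crop = preprocess(Image.fromarray(image_source[y0:y1, x0:x1]))
    with torch.no_grad():
        logits_per_image, _ = clip_model(crop.unsqueeze(0).to(DEVICE), text)
        probs = logits_per_image.softmax(dim=-1).cpu().numpy()[0]

    print(labels[int(probs.argmax())], probs)
```

Running a script like this on the image above would print one label per detected logo; masks for predictions classified as `McDonalds` can then be drawn onto the image.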