MobileCLIP is a family of efficient image-text embedding models developed by Apple and introduced in the paper "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training".
You can use MobileCLIP to calculate image and text embeddings. These embeddings can be used for tasks such as zero-shot image classification and automated data labeling.
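To illustrate the principle, an image embedding can be compared against the embeddings of a set of text prompts with cosine similarity, and the closest prompt wins. A minimal, self-contained sketch in pure Python (the vectors below are toy values standing in for real MobileCLIP embeddings, not actual model outputs):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# toy embeddings standing in for MobileCLIP outputs
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "person": [0.88, 0.15, 0.25],
    "a forklift": [0.1, 0.9, 0.3],
}

# pick the prompt whose embedding is closest to the image embedding
best_label = max(
    text_embeddings,
    key=lambda k: cosine_similarity(image_embedding, text_embeddings[k]),
)
print(best_label)  # -> person
```

In practice, Autodistill handles this comparison for you; the sketch only shows why a set of text prompts behaves like a classifier.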
First, install Autodistill and Autodistill MobileCLIP:
pip install autodistill autodistill-mobileclip
Then, run:
from autodistill_mobileclip import MobileCLIP
from autodistill.detection import CaptionOntology
# define an ontology to map class names to our MobileCLIP prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = MobileCLIP(
    ontology=CaptionOntology(
        {
            "person": "person",
            "a forklift": "forklift"
        }
    )
)
# run inference on a single image
result = base_model.predict("image.jpeg")
print(result)
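Conceptually, CLIP-style models turn the per-class similarity scores into confidences with a softmax, and the highest-confidence class becomes the prediction. An illustrative sketch of that step in plain Python (toy scores; the actual object returned by `predict` may represent this differently):

```python
import math

def softmax(scores):
    # numerically stable softmax over a list of similarity scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# toy similarity scores for the two ontology classes
classes = ["person", "forklift"]
similarities = [0.31, 0.12]

confidences = softmax(similarities)
prediction = classes[confidences.index(max(confidences))]
print(prediction)  # -> person
```

The confidences always sum to 1, which is why thresholding them is a common way to filter low-quality auto-labels.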