SigLIP is an image embedding model introduced in the paper "Sigmoid Loss for Language Image Pre-Training". Released in March 2023, SigLIP follows CLIP's framework with one twist: its loss function. Instead of CLIP's softmax-based contrastive loss, SigLIP uses a pairwise sigmoid loss that treats every image-text pair in a batch as an independent binary classification problem. Through this change, SigLIP achieves significant improvements in zero-shot classification.
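To make the change concrete, below is a minimal PyTorch sketch of the pairwise sigmoid loss described in the paper. The function name and the t (temperature) and b (bias) arguments are illustrative; in SigLIP, both are learned parameters.

import torch
import torch.nn.functional as F

def siglip_loss(img_emb, txt_emb, t, b):
    # l2-normalize so the dot products are cosine similarities
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    # pairwise logits: scaled similarity plus a learned bias
    logits = img_emb @ txt_emb.t() * t + b
    # labels are +1 on the diagonal (matched pairs), -1 everywhere else
    n = logits.size(0)
    labels = 2 * torch.eye(n, device=logits.device) - 1
    # each pair is an independent binary classification;
    # no batch-wide softmax normalization as in CLIP
    return -F.logsigmoid(labels * logits).sum() / n

Because no softmax runs across the batch, the loss never needs a global view of all pairwise similarities, which is part of what makes SigLIP efficient to train at large batch sizes.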
You can use SigLIP to calculate image embeddings. These embeddings can be used for tasks such as zero-shot image classification, image search, and identifying duplicate or near-duplicate images.
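As a sketch of how you might compute an embedding, the example below uses the SigLIP implementation in the Hugging Face transformers library; the checkpoint name is one of the publicly released SigLIP checkpoints, and the image path is a placeholder.

import torch
from PIL import Image
from transformers import AutoProcessor, SiglipModel

checkpoint = "google/siglip-base-patch16-224"
model = SiglipModel.from_pretrained(checkpoint)
processor = AutoProcessor.from_pretrained(checkpoint)

image = Image.open("image.jpeg")
inputs = processor(images=image, return_tensors="pt")

# compute a single embedding vector for the image
with torch.no_grad():
    embedding = model.get_image_features(**inputs)

print(embedding.shape)  # (1, 768) for the base checkpoint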
SigLIP is licensed under an Apache 2.0 license.
SigLIP achieves superior zero-shot performance on benchmarks like ImageNet, outperforming models like CLIP. The model's design and training methodology make it particularly effective at generalizing to unseen data, a key advantage in many real-world applications.
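For instance, you can run zero-shot classification with a SigLIP checkpoint through the transformers zero-shot image classification pipeline; the image path and candidate labels below are arbitrary examples.

from transformers import pipeline

classifier = pipeline(
    "zero-shot-image-classification",
    model="google/siglip-base-patch16-224",
)

# candidate labels are free-form text prompts
results = classifier("image.jpeg", candidate_labels=["a person", "a forklift"])
print(results)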
You can use Roboflow Inference to deploy a SigLIP API on your hardware. You can deploy the model on CPU devices (e.g., Raspberry Pi, AI PCs) and GPU devices (e.g., NVIDIA Jetson, NVIDIA T4).
Below are instructions on how to deploy your own model API.
First, install Autodistill and Autodistill SigLIP:
pip install autodistill autodistill-siglip
Then, run:
from autodistill_siglip import SigLIP
from autodistill.detection import CaptionOntology
# define an ontology to map class names to our SigLIP prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
labels = ["person", "a forklift"]
base_model = SigLIP(
    ontology=CaptionOntology({item: item for item in labels})
)
results = base_model.predict("image.jpeg", confidence=0.1)
top_1 = results.get_top_k(1)
# show top label
print(labels[top_1[0][0]])
# label folder of images
base_model.label("./context_images", extension=".jpeg")
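The final label call auto-labels every .jpeg image in ./context_images and saves the results as a dataset (by default, Autodistill writes the labeled dataset to a new folder alongside the input folder). You can then use that dataset to train a smaller, faster model for your use case.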