Open Vocabulary Object Detection

Open vocabulary object detection models locate objects in an image using arbitrary text prompts, rather than a fixed list of classes chosen at training time.
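As a rough illustration of that input/output contract, the sketch below shows text prompts going in and labeled boxes coming out, followed by a simple post-filter. It runs no real model. The `Detection` type, the `keep_prompted` helper, and the sample values are all hypothetical and are not taken from any Roboflow or Florence-2 API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object (hypothetical structure, for illustration only)."""
    label: str    # the text prompt this box was matched to
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels
    score: float  # confidence in [0, 1]


def keep_prompted(detections, prompts, min_score=0.5):
    """Keep detections whose label matches a prompt and whose score passes the threshold."""
    wanted = {p.strip().lower() for p in prompts}
    return [
        d for d in detections
        if d.label.strip().lower() in wanted and d.score >= min_score
    ]


# Made-up detector output standing in for a real model's response.
raw = [
    Detection("forklift", (34.0, 120.0, 410.0, 388.0), 0.91),
    Detection("hard hat", (502.0, 60.0, 548.0, 101.0), 0.42),
    Detection("pallet", (10.0, 300.0, 200.0, 400.0), 0.88),
]

# The prompts are free text; matching here is case-insensitive.
kept = keep_prompted(raw, ["Forklift", "hard hat"])
for d in kept:
    print(d.label, d.box, d.score)
```

The point is that the set of labels is supplied at query time as plain text, so changing what the detector looks for means changing the prompt list, not retraining.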

Deploy select models (e.g., YOLOv8, CLIP) using the Roboflow Hosted API, or on your own hardware using Roboflow Inference.

Florence-2

Florence-2 is a lightweight vision-language model open-sourced by Microsoft under the MIT license.

Visual Question Answering

Image Similarity

Image Captioning

Zero-shot Detection

Real-Time Vision

Image Embedding

LLMs with Vision Capabilities

Multimodal Vision

Foundation Vision
