Use the widget below to experiment with YOLOE. You can detect COCO classes such as people, vehicles, animals, and household items.
Object detection and segmentation are often constrained by predefined categories or by heavy open-set methods. YOLOE consolidates detection and segmentation under text prompts, visual prompts, and prompt-free operation in a single efficient model. It re-parameterizes textual embeddings, employs a semantic-activated visual prompt encoder, and leverages a built-in vocabulary for prompt-free detection. Extensive tests show real-time performance, strong zero-shot transferability, and lower training cost. On LVIS, YOLOE outperforms YOLO-Worldv2 at 3× lower training cost and with faster inference. On COCO, YOLOE exceeds closed-set YOLOv8 with nearly 4× fewer training hours.
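If you would rather experiment in code than in the widget, the Ultralytics package ships YOLOE weights. Below is a minimal sketch of text-prompted detection; the `yoloe-11s-seg.pt` checkpoint name and the image path are assumptions, so adjust them to your setup.

```python
# Minimal sketch: text-prompted YOLOE detection with the Ultralytics package.
# Assumptions: `pip install ultralytics` is done, and the yoloe-11s-seg.pt
# checkpoint name matches the release you want; swap in your own image path.
from ultralytics import YOLOE

model = YOLOE("yoloe-11s-seg.pt")

# Restrict the open vocabulary to the classes you care about via text prompts.
names = ["person", "bus", "dog"]
model.set_classes(names, model.get_text_pe(names))

# Run inference on an image and visualize the detections and masks.
results = model.predict("path/to/image.jpg")
results[0].show()
```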
Here is an example of the model used to identify a variety of objects:
YOLOE is licensed under an AGPL-3.0 license.
You can use Roboflow Inference to deploy a YOLOE API on your own hardware. You can deploy the model on CPU devices (e.g., Raspberry Pi, AI PCs) and GPU devices (e.g., NVIDIA Jetson, NVIDIA T4).
Below are instructions on how to deploy your own model API.
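As a starting point, here is a hedged sketch of querying a locally running Inference server over HTTP. The model ID shown is a placeholder, not the actual YOLOE identifier (consult the Roboflow Inference docs for the correct value), and the API key and image path are assumptions for your environment.

```python
# Minimal sketch: querying a locally running Roboflow Inference server.
# Assumptions: the server was started (e.g. with `inference server start`),
# the model ID below is a placeholder for the real YOLOE identifier, and
# the ROBOFLOW_API_KEY environment variable is set.
import os

from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="http://localhost:9001",          # default local server address
    api_key=os.environ["ROBOFLOW_API_KEY"],   # your Roboflow API key
)

# Run inference against the deployed model and print the raw predictions.
result = client.infer("path/to/image.jpg", model_id="yoloe-placeholder/1")
print(result)
```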