Use the widget below to experiment with GroundingDINO. You can detect COCO classes such as people, vehicles, animals, and household items.
Grounding DINO is a zero-shot object detection model that combines the Transformer-based DINO detector with grounded pre-training. It performs strongly in zero-shot object detection, achieving competitive results on COCO and LVIS without being trained on these datasets directly.
According to the Grounding DINO paper abstract, the model achieves "a 52.5 AP on the COCO detection zero-shot transfer benchmark" as well as SOTA performance on several other important benchmarks.
Overall, GroundingDINO is a robust and flexible model for various object detection tasks, especially those requiring open-set detection capabilities.
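Because the model is open-set, the classes to detect are supplied as text rather than fixed at training time. The sketch below shows one way to turn a list of class names into a single text prompt. It assumes the convention used in the project's public demos, where categories are lowercased and separated by periods (for example `chair . person . dog .`). The helper name `build_caption` is hypothetical and is not part of any GroundingDINO or Roboflow API.

```python
def build_caption(class_names):
    """Join class names into one period-separated, lowercase text prompt.

    Assumption: categories are lowercased and separated by " . ",
    with a trailing period, as in the project's demo prompts.
    """
    cleaned = []
    for name in class_names:
        # Lowercase the name and drop surrounding whitespace and periods.
        token = name.strip().strip(".").strip().lower()
        # Skip empty entries and duplicates while keeping the input order.
        if token and token not in cleaned:
            cleaned.append(token)
    if not cleaned:
        raise ValueError("at least one non-empty class name is required")
    return " . ".join(cleaned) + " ."


if __name__ == "__main__":
    # prints: person . car . dog . chair .
    print(build_caption(["Person", "car", "dog", "Chair"]))
```

The resulting string can then be passed as the text prompt to whichever GroundingDINO interface you use. Check that interface's documentation for the exact prompt format it expects.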
GroundingDINO is licensed under an Apache-2.0 license.
# | Name | Backbone | Data | Box AP on COCO | Checkpoint | Config |
---|---|---|---|---|---|---|
1 | GroundingDINO-T | Swin-T | O365, GoldG, Cap4M | 48.4 (zero-shot) / 57.2 (fine-tune) | Github link, HF link | link |
You can use Roboflow Inference to deploy a GroundingDINO API on your own hardware. You can deploy the model on CPU devices (e.g. Raspberry Pi, AI PCs) and GPU devices (e.g. NVIDIA Jetson, NVIDIA T4).
Below are instructions on how to deploy your own model API.