Video
Grounding DINO: Automated Dataset Annotation and Evaluation | SOTA Zero-Shot Object Detector
Learn how to use Grounding DINO on your own images in this YouTube guide.
Grounding DINO is a zero-shot object detection model that combines the Transformer-based DINO detector with grounded pre-training. It performs strongly on the COCO and LVIS benchmarks without being trained on either dataset directly.
According to the Grounding DINO paper abstract, the model achieves "a 52.5 AP on the COCO detection zero-shot transfer benchmark" as well as SOTA performance on several other important benchmarks.
Overall, Grounding DINO is a robust and flexible model for a range of object detection tasks, especially those that require open-set detection.
# | name | backbone | Data | box AP on COCO | Checkpoint | Config
---|---|---|---|---|---|---
1 | GroundingDINO-T | Swin-T | O365,GoldG,Cap4M | 48.4 (zero-shot) / 57.2 (fine-tune) | GitHub link, HF link | link
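
Because the model is open-set, the classes to detect are supplied as a text prompt rather than fixed at training time. The sketch below illustrates the pre- and post-processing that typically surrounds a zero-shot detection call. It does not load the model itself. It assumes the prompt is a lowercase list of class names separated by periods and that predicted boxes are normalized `(cx, cy, w, h)` values, as in the reference implementation. The helper names (`build_caption`, `cxcywh_to_xyxy`, `filter_detections`) and the `0.35` default threshold are illustrative choices, not part of any library API.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def build_caption(class_names: Sequence[str]) -> str:
    """Join class names into one lowercase, period-separated prompt.

    Hypothetical helper: assumes the model expects prompts such as
    "chair . person . dog ." (lowercase, trailing period).
    """
    cleaned = [name.strip().lower() for name in class_names if name.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty class name is required")
    return " . ".join(cleaned) + " ."


def cxcywh_to_xyxy(box: Box, image_width: int, image_height: int) -> Box:
    """Convert a normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    x_min = (cx - w / 2) * image_width
    y_min = (cy - h / 2) * image_height
    x_max = (cx + w / 2) * image_width
    y_max = (cy + h / 2) * image_height
    return (x_min, y_min, x_max, y_max)


def filter_detections(
    boxes: Sequence[Box],
    scores: Sequence[float],
    phrases: Sequence[str],
    box_threshold: float = 0.35,  # assumed default; tune for your data
) -> List[Tuple[Box, float, str]]:
    """Keep only detections whose confidence reaches box_threshold."""
    return [
        (box, score, phrase)
        for box, score, phrase in zip(boxes, scores, phrases)
        if score >= box_threshold
    ]
```

In use, `build_caption(["chair", "person", "dog"])` produces the prompt string passed to the model alongside the image. The raw predictions are then passed through `filter_detections` and `cxcywh_to_xyxy` before drawing boxes or exporting annotations. Raising the threshold trades recall for precision, which matters when the output is used for automated dataset labeling.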