Use the widget below to experiment with RT-DETR. You can detect COCO classes such as people, vehicles, animals, and household items.
RT-DETR (Real-Time Detection Transformer) is an object detection model developed by Baidu that integrates a Transformer-based architecture to achieve high accuracy while maintaining real-time performance. Unlike conventional detection models that rely on separate object proposal and classification stages, RT-DETR employs an end-to-end design, streamlining the detection process for greater efficiency.
One of RT-DETR's key strengths is its ability to balance speed and accuracy, making it well-suited for applications that demand rapid processing, such as autonomous driving and video surveillance. By leveraging the Transformer architecture, the model effectively captures complex spatial relationships in images while ensuring low-latency inference. This combination of efficiency and precision makes RT-DETR an attractive option for developers and researchers working in real-time object detection scenarios.
Feature | Specification |
---|---|
Launch date | 2024 |
Company | Baidu |
License | Apache License 2.0 |
Model family | RT-DETR-S, RT-DETR-M, RT-DETR-L, RT-DETR-X |
Parameter size | RT-DETR-S: 20M, RT-DETR-M: 36M, RT-DETR-L: 45M, RT-DETR-X: 86M |
mAP (Mean Average Precision) on COCO | RT-DETR-S: 48.1%, RT-DETR-M: 51.9%, RT-DETR-L: 53.0%, RT-DETR-X: 54.8% |
Inference speed (T4 GPU) | RT-DETR-S: 199 FPS, RT-DETR-M: 133 FPS, RT-DETR-L: 114 FPS, RT-DETR-X: 74 FPS |
Architecture type | Transformer-based |
Unique features | Efficient Hybrid Encoder, IoU-aware Query Selection |
Paper | RT-DETR Paper |
Try this model | How to train RT-DETR on a custom dataset |
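
To make the speed column easier to reason about, the FPS figures in the table can be converted into a per-frame latency budget. The following is a minimal sketch that uses only the T4 numbers listed above. The helper names `latency_ms` and `keeps_up` are illustrative, and the calculation assumes the published FPS reflects model inference only, so image decoding, pre-processing, and network overhead would add to these values in practice.

```python
# FPS figures copied from the "Inference speed (T4 GPU)" row of the table.
FPS_ON_T4 = {
    "RT-DETR-S": 199,
    "RT-DETR-M": 133,
    "RT-DETR-L": 114,
    "RT-DETR-X": 74,
}


def latency_ms(fps: float) -> float:
    """Convert throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


def keeps_up(model_fps: float, stream_fps: float) -> bool:
    """Return True if the model processes frames at least as fast as they arrive."""
    return model_fps >= stream_fps


if __name__ == "__main__":
    for name, fps in FPS_ON_T4.items():
        # Example check against a hypothetical 60 FPS camera stream.
        print(f"{name}: {latency_ms(fps):.2f} ms/frame, "
              f"keeps up with 60 FPS stream: {keeps_up(fps, 60)}")
```

By this estimate, all four variants stay under the roughly 16.7 ms per-frame budget of a 60 FPS stream on a T4, with RT-DETR-X the closest to the limit at about 13.5 ms.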
RT-DETR is licensed under the Apache License 2.0.
Based on reported COCO results, RT-DETR outperforms YOLOv8.
You can use Roboflow Inference to deploy an RT-DETR API on your hardware. You can deploy the model on CPU devices (e.g. Raspberry Pi, AI PCs) and GPU devices (e.g. NVIDIA Jetson, NVIDIA T4).
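
As a rough illustration of what calling such an API could look like from Python, here is a hedged sketch using the `inference_sdk` HTTP client. Several things are assumptions rather than facts from this page: that an Inference server is already running at `http://localhost:9001`, that the `model_id` and `api_key` values are placeholders you would replace with your own, and that the response contains a `predictions` list with center-format `x`, `y`, `width`, `height`, `confidence`, and `class` fields. The helper names `to_corner_boxes` and `detect` are hypothetical.

```python
from typing import Any


def to_corner_boxes(
    predictions: list[dict[str, Any]], min_confidence: float = 0.5
) -> list[tuple]:
    """Convert center-format predictions to (class, x0, y0, x1, y1) tuples.

    Assumes each prediction has x/y as the box center plus width/height,
    and drops predictions below min_confidence.
    """
    boxes = []
    for p in predictions:
        if p["confidence"] < min_confidence:
            continue
        x0 = p["x"] - p["width"] / 2
        y0 = p["y"] - p["height"] / 2
        x1 = p["x"] + p["width"] / 2
        y1 = p["y"] + p["height"] / 2
        boxes.append((p["class"], x0, y0, x1, y1))
    return boxes


def detect(image_path: str, model_id: str, api_key: str,
           api_url: str = "http://localhost:9001") -> list[tuple]:
    """Send one image to a running Inference server and return corner boxes.

    model_id, api_key, and api_url are placeholders (assumptions).
    """
    # Imported lazily so the helper above works without the SDK installed.
    from inference_sdk import InferenceHTTPClient

    client = InferenceHTTPClient(api_url=api_url, api_key=api_key)
    result = client.infer(image_path, model_id=model_id)
    return to_corner_boxes(result["predictions"])
```

A call might look like `detect("street.jpg", model_id="your-project/1", api_key="YOUR_API_KEY")`, where both identifiers are placeholders and not real model names.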
Below are instructions on how to deploy your own model API.