Florence-2 vs. RTMDet: Compared and Contrasted

Models

Florence-2

Florence-2 is a lightweight vision-language model open-sourced by Microsoft under the MIT license.

RTMDet

RTMDet is an efficient real-time object detector whose self-reported metrics outperform the YOLO series. It reports 52.8% AP on COCO at 300+ FPS on an NVIDIA 3090 GPU, which places it among the fastest and most accurate object detectors available at the time of writing.
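To put the FPS figure in context, here is a minimal sketch of how throughput is commonly measured and how it relates to per-image latency. The helper names `fps_to_latency_ms` and `measure_fps` are hypothetical and are not part of RTMDet or any library; the `infer` callable is a stand-in for whatever model call you want to time.

```python
import time
from typing import Callable, Sequence


def fps_to_latency_ms(fps: float) -> float:
    """Convert throughput (frames per second) to average per-frame latency in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def measure_fps(infer: Callable[[object], object],
                frames: Sequence[object],
                warmup: int = 3) -> float:
    """Run `infer` on every frame and return frames processed per second.

    `infer` is a hypothetical placeholder for a model's inference call.
    When timing a GPU model, synchronize the device before reading the
    clock; otherwise asynchronous kernel launches make the result look
    faster than it really is.
    """
    if len(frames) == 0:
        raise ValueError("frames must be non-empty")

    # Warm-up runs are excluded from timing (caches, lazy initialization).
    for frame in frames[:warmup]:
        infer(frame)

    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start

    # Guard against a zero reading on very coarse clocks.
    return float("inf") if elapsed == 0 else len(frames) / elapsed


if __name__ == "__main__":
    # 300 FPS corresponds to roughly 3.3 ms of compute per image.
    print(f"{fps_to_latency_ms(300):.2f} ms per frame at 300 FPS")

    # Dummy workload standing in for a real detector.
    dummy_frames = list(range(1000))
    print(f"dummy throughput: {measure_fps(lambda f: f * 2, dummy_frames):.0f} FPS")
```

Published FPS numbers depend heavily on hardware, batch size, input resolution, and precision, so a figure measured this way on your own machine is only comparable to the reported one under matching settings.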

Model Type

Object Detection

Model Features

Architecture

Annotation Format

Instance Segmentation

Framework

GitHub
