YOLOv12 is a newly proposed attention-centric variant of the YOLO family that focuses on incorporating efficient attention mechanisms into the backbone while preserving real-time performance. Unlike its predecessors, which rely heavily on CNN-based architectures, YOLOv12 introduces a simple yet powerful “area attention” module, which partitions the feature map into areas to reduce the quadratic cost of full self-attention.
The model uses the same format as YOLOv8: the YOLO PyTorch TXT format. You can import data from and export data to the YOLO PyTorch TXT format with Roboflow.
Below, learn the structure of YOLOv12 PyTorch TXT.
Each image has one .txt file with a single line for each bounding box. The format of each row is:
class_id center_x center_y width height
where fields are space-delimited and the coordinates are normalized to the range 0 to 1 relative to the image dimensions.
Note: To convert pixel values to normalized xywh, divide center_x and width by the image's width, and divide center_y and height by the image's height.
1 0.617 0.3594420600858369 0.114 0.17381974248927037
1 0.094 0.38626609442060084 0.156 0.23605150214592274
1 0.295 0.3959227467811159 0.13 0.19527896995708155
1 0.785 0.398068669527897 0.07 0.14377682403433475
1 0.886 0.40879828326180256 0.124 0.18240343347639484
1 0.723 0.398068669527897 0.102 0.1609442060085837
1 0.541 0.35085836909871243 0.094 0.16952789699570817
1 0.428 0.4334763948497854 0.068 0.1072961373390558
1 0.375 0.40236051502145925 0.054 0.1351931330472103
1 0.976 0.3927038626609442 0.044 0.17167381974248927
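The conversion described in the note can be sketched in Python as follows. This is a minimal illustration, not part of any library: the function names `to_yolo_row` and `parse_yolo_row` are hypothetical, and the input is assumed to be a pixel-space box given by its top-left and bottom-right corners.

```python
def to_yolo_row(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to one normalized YOLO TXT row.

    Hypothetical helper: assumes (x_min, y_min) is the top-left corner
    and (x_max, y_max) is the bottom-right corner, both in pixels.
    """
    center_x = (x_min + x_max) / 2 / img_w   # box center, divided by image width
    center_y = (y_min + y_max) / 2 / img_h   # box center, divided by image height
    width = (x_max - x_min) / img_w          # box width, divided by image width
    height = (y_max - y_min) / img_h         # box height, divided by image height
    return f"{class_id} {center_x} {center_y} {width} {height}"


def parse_yolo_row(row, img_w, img_h):
    """Parse one row back into (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, cx, cy, w, h = row.split()     # fields are space-delimited
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    return int(class_id), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


# A 200x100 px box with top-left corner (100, 50) in a 400x200 px image.
row = to_yolo_row(1, 100, 50, 300, 150, img_w=400, img_h=200)
print(row)
print(parse_yolo_row(row, img_w=400, img_h=200))
```

Writing one such row per bounding box, one per line, yields a label file in the layout shown above.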
The `data.yaml` file contains configuration values used by the model to locate images and map class names to `class_id` values.
train: ../train/images
val: ../valid/images
nc: 3
names: ['head', 'helmet', 'person']
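To illustrate how `names` relates to the label files, the sketch below resolves the `class_id` of an annotation row to its class name. It assumes `class_id` is a zero-based index into the `names` list; the values are copied by hand from the `data.yaml` example rather than loaded from disk, and the helper `class_name` is hypothetical.

```python
# Values copied from the data.yaml example (not read from a file here).
nc = 3
names = ['head', 'helmet', 'person']


def class_name(row, names):
    """Return the class name for one annotation row (hypothetical helper).

    Assumes class_id is a zero-based index into `names`.
    """
    class_id = int(row.split()[0])           # first field of the row
    if not 0 <= class_id < len(names):
        raise ValueError(f"class_id {class_id} is outside 0..{len(names) - 1}")
    return names[class_id]


# nc should match the number of entries in names.
assert nc == len(names)

print(class_name("1 0.617 0.3594420600858369 0.114 0.17381974248927037", names))
```

Under that assumption, every row in the example annotation file, which all use `class_id` 1, refers to the `helmet` class.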
With Roboflow supervision, an open-source Python package with utilities for completing computer vision tasks, you can merge and split detections in YOLOv12 PyTorch TXT. Read our dedicated guides to learn how to merge and split YOLOv12 PyTorch TXT detections.
Below, see model architectures that require data in the YOLOv12 PyTorch TXT format when training a new model.
On each page below, you can find links to our guides that show how to plot predictions from the model, and complete other common tasks like detecting small objects with the model.