This format uses one text file per image, containing that image's annotations with each label stored as a numeric class ID, plus a labelmap that maps the numeric IDs to human-readable strings. The annotation coordinates are normalized to lie within the range [0, 1], which makes them easier to work with even after scaling or stretching images. The format has become quite popular because it is the one used by the Darknet framework's implementations of the various YOLO models.
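To illustrate why normalized coordinates survive resizing, here is a minimal sketch. The helper name `normalize_box` and the image sizes are assumptions made for illustration, not part of any Roboflow or Darknet API. It converts pixel corners to fractions of the image size and shows that the same box yields the same values on a half-size copy of the image.

```python
import math


def normalize_box(x_min, y_min, x_max, y_max, image_width, image_height):
    """Return (center_x, center_y, width, height) as fractions of the image size."""
    center_x = (x_min + x_max) / 2 / image_width
    center_y = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return center_x, center_y, width, height


# A hypothetical box, in pixels, on an assumed 1000x750 image.
original = normalize_box(222, 288, 268, 348, image_width=1000, image_height=750)

# The same box after the image is resized to half size (500x375):
# every pixel coordinate halves, but the normalized values do not change.
resized = normalize_box(111, 144, 134, 174, image_width=500, image_height=375)

same = all(math.isclose(a, b) for a, b in zip(original, resized))
print(same)
```

Because only the ratios are stored, the annotation file does not need to be rewritten when the image is scaled or stretched.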
Roboflow can read and write YOLO Darknet files, so you can easily convert them to or from any other object detection annotation format. Once you're ready, use your converted annotations with our tutorial on training YOLO v4 with a custom dataset.
Below, learn the structure of YOLO Darknet TXT.
Each image has one txt file with a single line for each bounding box. The format of each row is:
class_id center_x center_y width height
For example, a file describing two bounding boxes, both with class ID 1:
1 0.408 0.30266666666666664 0.104 0.15733333333333333
1 0.245 0.424 0.046 0.08
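To draw or crop a box, the normalized values have to be scaled back to pixels. The sketch below parses one row and returns pixel corner coordinates. The names `PixelBox` and `parse_yolo_row` are hypothetical, and the 1000x750 image size is an assumption, since the annotation file itself does not record the image dimensions.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_row(row: str, image_width: int, image_height: int) -> PixelBox:
    """Convert one 'class_id center_x center_y width height' row to pixel corners."""
    class_id, center_x, center_y, width, height = row.split()
    # Scale the normalized values by the image size to get pixels.
    cx = float(center_x) * image_width
    cy = float(center_y) * image_height
    w = float(width) * image_width
    h = float(height) * image_height
    # The row stores the box center, so step out half the size in each direction.
    return PixelBox(int(class_id), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Image size is assumed here; read it from the actual image in practice.
box = parse_yolo_row("1 0.245 0.424 0.046 0.08", image_width=1000, image_height=750)
print(box)
```

With the assumed image size, this row corresponds to a box spanning roughly (222, 288) to (268, 348) in pixels.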
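The labelmap lists one class name per line. The zero-based line index is the class ID, so class ID 1 in the rows above refers to helmet:
head
helmet
person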
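Looking up a class name is then a list index. The following sketch uses a hypothetical `load_labelmap` helper and inline strings in place of real files; it relies on class IDs being zero-based line indices into the labelmap, which is the Darknet convention.

```python
def load_labelmap(text: str) -> list[str]:
    """Return class names in file order; the list index is the class ID."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Inline stand-ins for the labelmap file and one annotation file.
labelmap = load_labelmap("head\nhelmet\nperson\n")
rows = [
    "1 0.408 0.30266666666666664 0.104 0.15733333333333333",
    "1 0.245 0.424 0.046 0.08",
]

# The first field of each row is the class ID.
names = [labelmap[int(row.split()[0])] for row in rows]
print(names)  # ['helmet', 'helmet']
```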
With Roboflow supervision, an open-source Python package of computer vision utilities, you can merge and split detections in YOLO Darknet TXT. Our dedicated guides walk through both tasks.
Below, see model architectures that require data in the YOLO Darknet TXT format when training a new model.
On each page below, you can find links to our guides that show how to plot predictions from the model, and complete other common tasks like detecting small objects with the model.