How to augment data for YOLOv5

To build an accurate computer vision model, your training dataset must include a vast range of images representative of both the objects you want to identify and the environment in which you want to identify those objects.

Data augmentation for computer vision is a technique in which new training images are generated from images already in your dataset. Augmented data is created by applying changes such as brightness adjustments, contrast changes, and added noise.
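
To make the three changes named above concrete, here is a minimal sketch in plain Python. It is an illustration only, not how Roboflow implements augmentations: the function names are hypothetical, and images are represented as nested lists of 8-bit grayscale values to avoid any dependencies (in practice you would likely use an image library).

```python
import random


def clamp(value):
    """Round a value and keep it inside the valid 8-bit pixel range."""
    return max(0, min(255, round(value)))


def adjust_brightness(image, factor):
    """Scale every pixel; factor > 1 brightens, factor < 1 darkens."""
    return [[clamp(p * factor) for p in row] for row in image]


def adjust_contrast(image, factor):
    """Push pixels away from (factor > 1) or toward (factor < 1) the mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [[clamp((p - mean) * factor + mean) for p in row] for row in image]


def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [[clamp(p + rng.gauss(0, sigma)) for p in row] for row in image]


# A tiny 2x2 "image" stands in for a real photo (hypothetical data).
image = [[50, 100], [150, 200]]
rng = random.Random(0)  # seeded so the noise is reproducible

augmented_images = [
    adjust_brightness(image, 1.2),
    adjust_contrast(image, 2.0),
    add_noise(image, 10, rng),
]
```

Each function returns a new image and leaves the original untouched, so one source image can yield several training examples.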

Adding augmented data helps your model generalize and thus learn to identify objects of interest more effectively.

In this guide, we are going to show how to augment data for YOLOv5 models using Roboflow. Roboflow offers:

Access to a dozen augmentation options

SOC II Type 2 Compliant

Trusted by 250,000+ developers

To generate augmentations for a YOLOv5 model, you will:

1. Import data into Roboflow
2. Open the Versions tab
3. Select the augmentations you want to apply
4. Generate your dataset with augmentations
5. (Optional) Train a model or export your data

Let's get started!

Step #1: Import data into Roboflow Annotate

First, create a free Roboflow account. Then, create a new project from the Roboflow dashboard.

Once you have created a project, you will be taken to a page where you can upload your images. Drag and drop any images into the upload box.

You can also drag in annotation files if you want to view or amend annotations in Roboflow Annotate.

When you have uploaded your files, click "Save and Continue".

Your images will be uploaded to Roboflow.

After you have uploaded all of your images, you can label them using Roboflow Annotate.

Step #2: Prepare to Generate a New Dataset Version

Once you have labeled all of your images, you are ready to generate a dataset version. A dataset version is a frozen-in-time snapshot of a dataset. You can use versions to track different changes to your dataset over time. If you train models using Roboflow Train, you can compare performance with different augmentations all from the Versions tab.

To create a new dataset version, click "Versions" in the left sidebar of your project.

This will show a page that lets you apply preprocessing and augmentation steps to your dataset.

Step #3: Select Augmentations to Apply

We recommend training the first version of your model without any augmentations. This lets you develop a performance baseline to which you can compare model versions that use augmentations.

When you are ready to add augmentations to your dataset, click the "Add Augmentation Step" button. A pop-up will appear showing all of the available options. You can add as many augmentations as you would like.

There are two types of augmentations available:

  1. Image-level augmentations, which are applied to an entire image; and
  2. Bounding box-level augmentations, which are applied to a bounding box.
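
The difference between the two types can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Roboflow's implementation: the function names are hypothetical, the image is a nested list of grayscale values, and a box is given as `(x_min, y_min, x_max, y_max)` in pixels with the maximum edges exclusive.

```python
def brighten(pixel, factor):
    """Scale one pixel and keep it in the 0-255 range."""
    return max(0, min(255, round(pixel * factor)))


def augment_image_level(image, factor):
    """Image-level: every pixel in the image is changed."""
    return [[brighten(p, factor) for p in row] for row in image]


def augment_box_level(image, box, factor):
    """Box-level: only pixels inside the bounding box are changed."""
    x_min, y_min, x_max, y_max = box
    out = [row[:] for row in image]  # copy so the source image is untouched
    for y in range(y_min, y_max):
        for x in range(x_min, x_max):
            out[y][x] = brighten(image[y][x], factor)
    return out


# Hypothetical 3x3 image with one annotated object in the lower-right corner.
image = [[100, 100, 100] for _ in range(3)]
box = (1, 1, 3, 3)

whole = augment_image_level(image, 1.5)
inside_only = augment_box_level(image, box, 1.5)
```

In `whole`, all nine pixels are brightened. In `inside_only`, the four pixels inside the box change and the background keeps its original values.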

To add an augmentation, click on an option. You will then be asked to configure the augmentation. Let's apply a brightness augmentation, which is ideal when a model will be deployed in different lighting conditions: it helps the model learn to identify objects in environments with varying degrees of brightness.

In the configuration panel, we can choose whether we want brighter or darker images, and by what percentage brightness should be increased or decreased.
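
One way to think about these two settings (direction and percentage) is as bounds on a randomly sampled brightness factor. The sketch below is an assumption for illustration: the guide does not say how Roboflow samples values internally, and the function name, the `mode` strings, and the uniform sampling are all hypothetical.

```python
import random


def sample_brightness_factor(percent, mode, rng):
    """Pick a multiplier for pixel values, bounded by the chosen percentage.

    percent: maximum change, e.g. 25 means up to 25% brighter or darker.
    mode: "brighten", "darken", or "both" (hypothetical option names).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    amount = rng.uniform(0, percent) / 100  # assumed: uniform sampling
    if mode == "brighten":
        return 1 + amount
    if mode == "darken":
        return 1 - amount
    if mode == "both":
        return 1 + rng.choice((-1, 1)) * amount
    raise ValueError(f"unknown mode: {mode}")


rng = random.Random(42)  # seeded for reproducibility
factors = [sample_brightness_factor(25, "both", rng) for _ in range(5)]
```

With `percent=25` and `mode="both"`, every sampled factor falls between 0.75 and 1.25, so each augmented copy is at most 25% darker or brighter than its source image.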

To add the augmentation, click "Apply". The change will appear in the list of augmentations to apply when generating your dataset version.

Step #4: Generate Your Dataset with Augmentations

Once you have added all of the augmentations you want to apply, click "Generate" at the bottom of the page to generate your dataset.
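
Conceptually, generating a dataset with augmentations means running each source image through the list of selected augmentation steps to produce extra training examples. The following is a rough sketch of that idea, not a description of what Roboflow does internally: `generate_version`, `copies_per_image`, the step functions, and the choice to keep the originals alongside the augmented copies are all assumptions made for illustration.

```python
import random


def clamp(value):
    """Round a value and keep it inside the 0-255 pixel range."""
    return max(0, min(255, round(value)))


def random_brightness(image, rng):
    """Brighten or darken by up to 25% (assumed setting)."""
    factor = rng.uniform(0.75, 1.25)
    return [[clamp(p * factor) for p in row] for row in image]


def random_noise(image, rng):
    """Add Gaussian noise with an assumed standard deviation of 5."""
    return [[clamp(p + rng.gauss(0, 5)) for p in row] for row in image]


def generate_version(images, steps, copies_per_image, seed=0):
    """Return the originals plus augmented copies of each image."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    generated = []
    for image in images:
        generated.append(image)  # keep the unmodified source image
        for _ in range(copies_per_image):
            augmented = image
            for step in steps:  # apply every selected step in order
                augmented = step(augmented, rng)
            generated.append(augmented)
    return generated


# Two tiny hypothetical grayscale images.
images = [[[50, 100], [150, 200]], [[10, 20], [30, 40]]]
dataset = generate_version(images, [random_brightness, random_noise], 2)
```

Here two source images with two augmented copies each give a dataset of six images.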

You can then use your dataset for training a model on Roboflow. Or, you can export your dataset for use in a custom training process.

Step #5: Train a Model or Export Data

With your augmented dataset generated, you are now ready to train a model on Roboflow or export your data elsewhere. To train a model in Roboflow with your data, follow our Roboflow Train guide.

Alternatively, you can export your data into over 30 different formats, depending on the needs of your project.
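
If you export for a custom YOLOv5 training process, the labels are plain text files with one line per object: a class index followed by the box center and size, all normalized to the image dimensions. As a small sketch of what such a line contains, the hypothetical helper below converts a pixel-coordinate box into that form. The function name and the six-decimal formatting are choices made for this example.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    The result has the form: class x_center y_center width height,
    with the four box values normalized to the 0-1 range.
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical object of class 0 in a 640x480 image.
line = to_yolo_line(0, (100, 200, 300, 400), 640, 480)
print(line)  # 0 0.312500 0.625000 0.312500 0.416667
```

Because the values are normalized, the same label line stays valid if the image is resized without changing its aspect ratio.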

Bonus: Automatically Label Data with Autodistill

Need more labeled data for your training dataset? Using Autodistill, you can automatically label data for object detection, segmentation, and classification. To learn more about using Autodistill, check out our Autodistill guide.

