Before you train a model, you may want to apply various preprocessing steps to your dataset. For example, you may want to resize your images to a specific resolution, or apply tiling.
Adding preprocessing steps ensures your data is consistent before it is used in training.
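To make the resize example concrete, here is a minimal Python sketch of what a stretch-style resize does to a bounding box. It is an illustration under stated assumptions, not Roboflow's implementation: the function name `resize_box` and the image sizes are hypothetical, and boxes are assumed to be stored as (x_min, y_min, x_max, y_max) in pixels.

```python
def resize_box(box, src_size, dst_size):
    """Scale a pixel-space box when an image is stretched from src_size to dst_size.

    box: (x_min, y_min, x_max, y_max) in pixels (assumed layout).
    src_size, dst_size: (width, height) in pixels.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx = dst_w / src_w  # horizontal scale factor
    sy = dst_h / src_h  # vertical scale factor
    x_min, y_min, x_max, y_max = box
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)


# Hypothetical example: an 832x416 image stretched to 416x416.
resized = resize_box((208, 104, 624, 312), (832, 416), (416, 416))
print(resized)
```

Because the two axes are scaled independently, a stretch changes the aspect ratio of both the image and its boxes. The annotations have to be transformed together with the pixels so that the two stay aligned.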
In this guide, we are going to show how to preprocess data for YOLOv4 models using Roboflow. Roboflow offers:
Access to popular preprocessing steps
SOC 2 Type 2 compliance
Trusted by 500,000+ developers
To apply preprocessing steps to a dataset for a YOLOv4 model, you will:
1. Import data into Roboflow
2. Open the Versions tab
3. Select the preprocessing steps you want to apply
4. Generate your dataset
5. (Optional) Train a model or export your data
Let's get started!
First, create a free Roboflow account. Then, create a new project from the Roboflow dashboard:
Once you have created a project, you will be taken to a page where you can upload your images. Drag and drop your images into the box:
You can also drag in annotation files if you want to view or amend annotations in Roboflow Annotate.
When you have uploaded your files, click "Save and Continue".
Your images will be uploaded to Roboflow.
After you have uploaded all of your images, you can label them using Roboflow Annotate.
Once you have labeled all of your images, you are ready to generate a dataset version. A dataset version is a frozen-in-time snapshot of a dataset. You can use versions to track different changes to your dataset over time. If you train models using Roboflow Train, you can compare performance with different augmentations all from the Versions tab.
To create a new dataset version, click "Versions" in the sidebar at the left side of your project.
This will show a page that lets you apply preprocessing and augmentation steps to your dataset.
Roboflow automatically selects a few preprocessing steps that we recommend for most projects. If you need to, however, you can remove the default steps.
To add more preprocessing steps to your dataset, click on the "Preprocessing" section of the dataset generation page. Then, click the "Add Preprocessing Step" button. A pop-up will appear showing all of the options available. You can add as many preprocessing steps as you would like.
Several preprocessing steps are available, including resizing and tiling.
You can also filter your dataset to only include images that are not marked as null in your Roboflow dataset, or include only images that match specific tags that you have added in Roboflow.
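The filtering described above can be sketched in a few lines of Python. This is a hypothetical illustration, not how Roboflow implements its filters: the record layout, the `filter_images` function, and the choice to treat an image with no annotations as null and to require all listed tags are assumptions made for the example.

```python
# Hypothetical records: each image has a list of annotations and a set of tags.
images = [
    {"name": "a.jpg", "annotations": [{"label": "car"}], "tags": {"daytime"}},
    {"name": "b.jpg", "annotations": [], "tags": {"night"}},
    {"name": "c.jpg", "annotations": [{"label": "bus"}], "tags": {"night"}},
]


def filter_images(images, drop_null=False, required_tags=None):
    """Keep images that pass the null filter and carry all required tags."""
    required = set(required_tags or [])
    kept = []
    for image in images:
        if drop_null and not image["annotations"]:
            continue  # assumed: an image with no annotations counts as null
        if not required.issubset(image["tags"]):
            continue  # image is missing at least one required tag
        kept.append(image)
    return kept


non_null = [img["name"] for img in filter_images(images, drop_null=True)]
night_only = [img["name"] for img in filter_images(images, required_tags=["night"])]
print(non_null)
print(night_only)
```

Filtering changes which images end up in the generated version. It does not modify the images themselves.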
Here is an example showing how to apply a Tile preprocessing step to a dataset:
To add the preprocessing step, click "Apply". The change will appear in the list of preprocessing steps to apply when generating your dataset version:
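To show what tiling means for your annotations, here is a rough Python sketch that splits an image into a grid and clips a bounding box to each tile. It is a simplified illustration under stated assumptions, not Roboflow's implementation: the helper names `tile_boxes` and `clip_box_to_tile`, the 2x2 grid, and the (x_min, y_min, x_max, y_max) pixel box layout are all hypothetical.

```python
def tile_boxes(width, height, rows, cols):
    """Return (left, top, right, bottom) pixel bounds for each tile in a rows x cols grid."""
    tiles = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            top = r * height // rows
            bottom = (r + 1) * height // rows
            tiles.append((left, top, right, bottom))
    return tiles


def clip_box_to_tile(box, tile):
    """Clip a pixel-space box to a tile and shift it into the tile's coordinates.

    Returns None when the box does not overlap the tile.
    """
    x_min, y_min, x_max, y_max = box
    left, top, right, bottom = tile
    cx_min, cy_min = max(x_min, left), max(y_min, top)
    cx_max, cy_max = min(x_max, right), min(y_max, bottom)
    if cx_min >= cx_max or cy_min >= cy_max:
        return None
    return (cx_min - left, cy_min - top, cx_max - left, cy_max - top)


# Hypothetical example: a 1000x800 image split into a 2x2 grid.
tiles = tile_boxes(1000, 800, rows=2, cols=2)
box = (400, 300, 600, 500)  # a box that straddles all four tiles
clipped = [clip_box_to_tile(box, t) for t in tiles]
print(tiles)
print(clipped)
```

Each tile becomes its own, smaller image, so an object that is small in the full frame takes up a larger share of the pixels in its tile. An object that crosses a tile boundary is split into partial boxes, as the example box is here.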
Once you have added all of the preprocessing and augmentation steps you want to apply, click "Generate" at the bottom of the page to generate your dataset.
You can then use your dataset for training a model on Roboflow. Or, you can export your dataset for use in a custom training process.
With all of your data labeled, you are now ready to train a model on Roboflow or export your data elsewhere. To train a model in Roboflow with your data, follow our Roboflow Train guide.
Alternatively, you can export your data into over 30 different formats, depending on the needs for your project.
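YOLOv4 is commonly trained with the Darknet framework, whose label files contain one line per object: a class index followed by the box center, width, and height, each normalized by the image size. The Python sketch below shows that conversion for a single box. The function name `to_darknet_line` and the input box layout are hypothetical, and the right export format for your project depends on the training code you use.

```python
def to_darknet_line(class_id, box, image_size):
    """Format one pixel-space box as a Darknet-style label line.

    box: (x_min, y_min, x_max, y_max) in pixels (assumed layout).
    image_size: (width, height) in pixels.
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w  # normalized box center, x
    y_center = (y_min + y_max) / 2 / img_h  # normalized box center, y
    width = (x_max - x_min) / img_w         # normalized box width
    height = (y_max - y_min) / img_h        # normalized box height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: one object of class 0 in a 416x416 image.
line = to_darknet_line(0, (104, 104, 312, 312), (416, 416))
print(line)
```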
Need more labeled data for your training dataset? Using Autodistill, you can automatically label data for object detection, segmentation, and classification. To learn more about using Autodistill, check out our Autodistill guide.
Below, you can find our guides on how to preprocess data for other models.