How to preprocess data for YOLOv8 Oriented Bounding Boxes

Before you train a model, you may want to apply various preprocessing steps to your dataset. For example, you may want to resize your images to a specific resolution, or apply tiling.

Adding preprocessing steps ensures your data is consistent before it is used in training.

In this guide, we are going to show how to preprocess data for YOLOv8 Oriented Bounding Boxes models using Roboflow. Roboflow offers:

  • Access to popular preprocessing steps
  • SOC II Type 2 Compliant
  • Trusted by 500,000+ developers

To preprocess data for a YOLOv8 Oriented Bounding Boxes model, you will:

1. Import data into Roboflow
2. Open the Versions tab
3. Select the preprocessing steps you want to apply
4. Generate your dataset
5. (Optional) Train a model or export your data

Let's get started!

Step #1: Import Data into Roboflow Annotate

First, create a free Roboflow account. Then, create a new project from the Roboflow dashboard.

Once you have created a project, you will be taken to a page where you can upload your images. Drag and drop your images into the upload box.

You can also drag in annotation files if you want to view or amend annotations in Roboflow Annotate.

When you have uploaded your files, click "Save and Continue".

Your images will be uploaded to Roboflow.

After you have uploaded all of your images, you can label them using Roboflow Annotate.

Step #2: Prepare to Generate a New Dataset Version

Once you have labeled all of your images, you are ready to generate a dataset version. A dataset version is a frozen-in-time snapshot of a dataset. You can use versions to track different changes to your dataset over time. If you train models using Roboflow Train, you can compare performance with different augmentations all from the Versions tab.
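
To make the "frozen-in-time snapshot" idea concrete, here is a minimal sketch of a version modeled as an immutable record. This is an illustration only and is not how Roboflow stores versions: the `DatasetVersion` class, its fields, and the `fingerprint` method are all hypothetical names invented for this example.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)  # frozen=True blocks attribute changes after creation
class DatasetVersion:
    """Hypothetical model of a dataset version (not a Roboflow API)."""

    image_names: Tuple[str, ...]
    preprocessing: Tuple[Tuple[str, str], ...]

    def fingerprint(self) -> str:
        # Hash the image list together with the preprocessing settings so
        # that two versions with different settings get different IDs.
        payload = json.dumps(
            {"images": self.image_names, "preprocessing": self.preprocessing},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


v1 = DatasetVersion(("a.jpg", "b.jpg"), (("resize", "640x640"),))
v2 = DatasetVersion(("a.jpg", "b.jpg"), (("resize", "640x640"), ("tile", "2x2")))
```

Because each version is immutable, changing a preprocessing step means creating a new version, which is what makes versions comparable over time.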

To create a new dataset version, click "Versions" in the sidebar at the left side of your project.

This will show a page that lets you apply preprocessing and augmentation steps to your dataset.

Step #3: Select Preprocessing Steps to Apply

Roboflow automatically selects a few preprocessing steps that we recommend for most projects. If you need to, however, you can remove the default steps.

To add more preprocessing steps to your dataset, click on the "Preprocessing" section of the dataset generation page. Then, click the "Add Preprocessing Step" button. A pop-up will appear showing all of the options available. You can add as many preprocessing steps as you would like.

There are several preprocessing steps available, including:

  • Auto-orient
  • Resize images
  • Isolate objects
  • Static crop
  • Dynamic crop
  • Grayscale
  • Auto-adjust contrast
  • Tile
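
As one illustration of what a step from the list above has to do for oriented bounding boxes, here is a minimal sketch of a stretch resize applied to the four corner points of a box. It assumes the corners are in pixel coordinates; the function name `resize_obb_stretch` and the sample values are hypothetical, and this is not Roboflow's implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def resize_obb_stretch(
    corners: List[Point],
    src_size: Tuple[int, int],
    dst_size: Tuple[int, int],
) -> List[Point]:
    """Scale pixel-space corner points to match a stretch resize.

    Sizes are (width, height). Each axis is scaled independently.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale_x = dst_w / src_w
    scale_y = dst_h / src_h
    return [(x * scale_x, y * scale_y) for x, y in corners]


# Hypothetical box on a 1280x720 image, resized to 640x640.
corners = [(100.0, 50.0), (300.0, 50.0), (300.0, 150.0), (100.0, 150.0)]
resized = resize_obb_stretch(corners, src_size=(1280, 720), dst_size=(640, 640))
```

Note that when the horizontal and vertical scale factors differ, a rotated rectangle generally becomes a parallelogram, which is one reason to check a few labels visually after resizing.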

You can also filter your dataset to only include images that are not marked as null in your Roboflow dataset, or include only images that match specific tags that you have added in Roboflow.
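
The filtering described above can be sketched in a few lines. This is a hedged illustration, not Roboflow's code: the record layout (`name`, `annotations`, `tags`) and the `filter_images` function are hypothetical, and we assume an image passes the tag filter if it has at least one of the requested tags.

```python
from typing import Dict, List, Optional, Set


def filter_images(
    records: List[Dict],
    drop_null: bool = True,
    required_tags: Optional[Set[str]] = None,
) -> List[Dict]:
    """Keep images that pass the null filter and the tag filter."""
    kept = []
    for record in records:
        # A "null" image is modeled here as one with no annotations.
        if drop_null and not record["annotations"]:
            continue
        # Assumption: any one matching tag is enough to keep the image.
        if required_tags and not (required_tags & set(record["tags"])):
            continue
        kept.append(record)
    return kept


records = [
    {"name": "a.jpg", "annotations": ["ship"], "tags": ["harbor"]},
    {"name": "b.jpg", "annotations": [], "tags": ["harbor"]},
    {"name": "c.jpg", "annotations": ["plane"], "tags": ["airport"]},
]
kept = filter_images(records, drop_null=True, required_tags={"harbor"})
names = [record["name"] for record in kept]
```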

For example, you can add a Tile preprocessing step to your dataset.

To add the preprocessing step, click "Apply". The change will appear in the list of preprocessing steps to apply when generating your dataset version.
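
To give a sense of what tiling does, here is a minimal sketch that computes the crop regions for a grid of tiles. It assumes tiling means splitting each image into a grid of rows and columns; the `tile_boxes` function is a hypothetical helper written for this guide, not part of Roboflow.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, rows: int, columns: int) -> List[Box]:
    """Return crop regions that split an image into a rows x columns grid."""
    boxes = []
    for row in range(rows):
        # Integer division keeps edges aligned even when the image size
        # is not an exact multiple of the grid size.
        top = row * height // rows
        bottom = (row + 1) * height // rows
        for column in range(columns):
            left = column * width // columns
            right = (column + 1) * width // columns
            boxes.append((left, top, right, bottom))
    return boxes


# A 1280x720 image split into a 2x2 grid gives four 640x360 tiles.
tiles = tile_boxes(width=1280, height=720, rows=2, columns=2)
```

Tiling can help when objects are small relative to the full image, because each tile keeps more pixels per object once it is resized for training.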

Step #4: Generate Your Dataset

Once you have added all of the preprocessing and augmentation steps you want to apply, click "Generate" at the bottom of the page to generate your dataset.

You can then use your dataset for training a model on Roboflow. Or, you can export your dataset for use in a custom training process.

Step #5: Train a Model or Export Data

With all of your data labeled, you are now ready to train a model on Roboflow or export your data elsewhere. To train a model in Roboflow with your data, follow our Roboflow Train guide.

Alternatively, you can export your data into over 30 different formats, depending on the needs for your project.
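
If you export in the YOLOv8 Oriented Bounding Boxes text format, each label line holds a class index followed by four corner points with coordinates normalized between 0 and 1. Below is a minimal sketch for reading one such line back into pixel coordinates. The `parse_obb_line` function and the sample line are made up for illustration, so check them against your own exported files.

```python
from typing import List, Tuple


def parse_obb_line(
    line: str, img_w: int, img_h: int
) -> Tuple[int, List[Tuple[float, float]]]:
    """Parse 'class x1 y1 x2 y2 x3 y3 x4 y4' into pixel-space corners."""
    parts = line.split()
    if len(parts) != 9:
        raise ValueError(f"expected 9 values, got {len(parts)}")
    class_id = int(parts[0])
    values = [float(value) for value in parts[1:]]
    # Pair up the values as (x, y) and convert from normalized to pixels.
    corners = [
        (values[i] * img_w, values[i + 1] * img_h) for i in range(0, 8, 2)
    ]
    return class_id, corners


# Hypothetical label line for a 640x640 image.
class_id, corners = parse_obb_line(
    "0 0.25 0.25 0.75 0.25 0.75 0.5 0.25 0.5", img_w=640, img_h=640
)
```

A quick check like this is a handy way to confirm that your preprocessing steps produced the labels you expect before you start a custom training run.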

Bonus: Automatically Label Data with Autodistill

Need more labeled data for your training dataset? Using Autodistill, you can automatically label data for object detection, segmentation, and classification. To learn more about using Autodistill, check out our Autodistill guide.

Enterprise-grade security and compliance

We take security seriously and have implemented comprehensive measures to keep your sensitive data safe:

  • Compliant with SOC2 Type 1 requirements
  • All data is encrypted in transit and at rest, with SSL transport receiving a grade A+ rating from Qualys
  • Strict row-level permissions ensure users cannot access sensitive data outside of their organizations
  • Roboflow is hosted on the Google Cloud Platform and Amazon Web Services, best-in-class infrastructure-as-a-service providers
  • Authentication, database, and file storage mechanisms are ISO 27001, ISO 27017, ISO 27018, SOC 1, SOC 2, and SOC 3 compliant
  • PCI compliant with Self-Assessment Questionnaire A and Attestation of Compliance
  • Card numbers and bank account details never touch our servers; they are stored by Stripe, a PCI Service Provider Level 1, the highest available security certification in the payments industry
  • Access to production data is heavily restricted within Roboflow and only available via SSO login
  • All Roboflow employees sign nondisclosure agreements restricting them from sharing information learned while handling customer data
