In this guide, we are going to show how to deploy a YOLOv5 Classification model to Azure Virtual Machines using Roboflow Inference. Inference is a high-performance inference server with which you can run a range of vision models, from YOLOv8 to CLIP to CogVLM.
To deploy a YOLOv5 Classification model to Azure Virtual Machines, we will:
1. Set up our computing environment
2. Download the Roboflow Inference Server
3. Try out our model on an example image
Let's get started!
In this guide, we are going to show how to deploy a YOLOv5 Classification model to Azure Virtual Machines using the Roboflow Inference Server. The Inference Server works with YOLOv5 Classification models trained on Roboflow as well as models trained in custom processes outside of Roboflow.
To deploy a YOLOv5 Classification model to Azure Virtual Machines, we will:
1. Train a model on (or upload a model to) Roboflow
2. Download the Roboflow Inference Server
3. Install the Python SDK to run inference on images
4. Try out the model on an example image
Let's get started!
If you want to upload your own model weights, first create a Roboflow account and create a new project. When you have created a new project, upload your project data, then generate a new dataset version. With that version ready, you can upload your model weights to Roboflow.
Download the Roboflow Python SDK:
pip install roboflow
Then, use the following script to upload your model weights:
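import os
from roboflow import Roboflow
# Folder that contains your yolov5 training run
home = "/path/to/project/folder"
# Read the API key from the environment rather than hard-coding it
rf = Roboflow(api_key=os.environ["ROBOFLOW_API_KEY"])
project = rf.workspace().project("PROJECT_ID")
# PROJECT_VERSION is a placeholder for your dataset version number
project.version(PROJECT_VERSION).deploy(model_type="yolov5", model_path=f"{home}/yolov5/runs/train/")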
You will need your project name, version, API key, and model weights. The following documentation shows how to retrieve your API key and project information:
- Retrieve your Roboflow project name and version
- Retrieve your API key
Change the path in the script above to the path where your model weights are stored.
When you have configured the script above, run the code to upload your weights to Roboflow.
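Before running the upload, it can help to confirm that the API key and weights are where the script expects them. The following is a minimal sketch, not part of the Roboflow SDK: `check_upload_inputs` is a hypothetical helper, and it assumes your YOLOv5 weights are saved as `.pt` files somewhere under the training folder.

```python
import os
from pathlib import Path


def check_upload_inputs(model_dir, env=os.environ):
    """Return a list of problems; an empty list means the inputs look ready."""
    problems = []
    # The upload script reads the key from this environment variable.
    if not env.get("ROBOFLOW_API_KEY"):
        problems.append("ROBOFLOW_API_KEY is not set")
    path = Path(model_dir)
    if not path.is_dir():
        problems.append(f"model directory not found: {path}")
    elif not any(path.rglob("*.pt")):
        # Assumption: YOLOv5 training runs store weights as .pt files.
        problems.append(f"no .pt weight files under: {path}")
    return problems


if __name__ == "__main__":
    for problem in check_upload_inputs("/path/to/project/folder/yolov5/runs/train/"):
        print("Problem:", problem)
```

If the function returns an empty list, the script above has what it needs to locate your weights and authenticate.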
Now you are ready to start deploying your model.
Go to your Azure Virtual Machines homepage and create a virtual machine.
How you configure the virtual machine depends on how you plan to use it, so we will not cover specifics in this tutorial.
Roboflow Inference can run on both CPU (x86 and ARM) and NVIDIA GPU devices. For the best performance in production, we recommend deploying a machine with an NVIDIA GPU. For testing, use the system that makes sense given your needs. If you are running a GPU device, choose an Azure deep learning operating system. These operating systems often come with pre-built drivers for use in training a model.
When your virtual machine is ready, a pop-up will appear. Click “View resource” to view the virtual machine. Alternatively, go back to the Virtual Machines homepage and select your newly deployed virtual machine.
To sign into your virtual machine, first click “Connect”. Choose the authentication method that you prefer to log into your server. When you have logged in, you are ready to move on to the next step.
The Roboflow Inference Server allows you to deploy computer vision models to a range of devices, including Azure Virtual Machines.
The Inference Server relies on Docker to run. If you don't already have Docker installed on the device(s) on which you want to run inference, install it by following the official Docker installation instructions.
Once you have Docker installed, run the following commands to download and start the Roboflow Inference Server on your Azure virtual machine:
docker pull roboflow/roboflow-inference-server-cpu
docker run --net=host roboflow/roboflow-inference-server-cpu:latest
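The commands above start the CPU image. If your virtual machine has an NVIDIA GPU, the sketch below shows one way to pick the command automatically. It rests on assumptions: the GPU image name `roboflow/roboflow-inference-server-gpu` and the `--gpus all` Docker flag are not taken from this guide, and the presence of `nvidia-smi` is used as a rough proxy for a working GPU driver.

```python
import shutil

CPU_IMAGE = "roboflow/roboflow-inference-server-cpu"  # image used in this guide
GPU_IMAGE = "roboflow/roboflow-inference-server-gpu"  # assumption: GPU counterpart


def has_nvidia_gpu():
    """Rough check: nvidia-smi is on PATH when NVIDIA drivers are installed."""
    return shutil.which("nvidia-smi") is not None


def docker_run_command(use_gpu):
    """Build the docker run command as a list of arguments."""
    if use_gpu:
        # --gpus all exposes the host GPUs to the container.
        return ["docker", "run", "--gpus", "all", "--net=host", f"{GPU_IMAGE}:latest"]
    return ["docker", "run", "--net=host", f"{CPU_IMAGE}:latest"]


if __name__ == "__main__":
    print(" ".join(docker_run_command(has_nvidia_gpu())))
```

The script only prints the command, so you can review it before running it on the virtual machine.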
Now that you have the Roboflow Inference Server running, you can use your model on Azure Virtual Machines.
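The container can take a moment to start accepting connections. As a minimal sketch, the hypothetical helper below polls the server port until it opens; it assumes the server listens on `localhost:9001`, the address used by the inference script later in this guide.

```python
import socket
import time


def wait_for_server(host="localhost", port=9001, timeout=30.0, interval=1.0):
    """Return True once a TCP connection succeeds, False after the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connection means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_server()
    print("Inference Server reachable" if ready else "Inference Server not reachable")
```

This only confirms that the port is open. It does not verify that a model has been loaded.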
The Roboflow Inference Server provides an HTTP API with a range of methods you can use to query your model and various popular models (e.g. SAM, CLIP). You can read more about all of the API methods available on the Roboflow Inference Server in the Inference Server documentation.
The Roboflow Python SDK provides abstract convenience methods for interacting with the HTTP API. In this guide, we will use the Python SDK to run inference on a model. You can also query the HTTP API itself.
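For readers who prefer to query the HTTP API directly, here is a rough sketch using only the Python standard library. The route shape (`/PROJECT_ID/VERSION?api_key=...` with a base64-encoded image as the request body) is an assumption modeled on Roboflow's hosted API, and `build_infer_url` and `infer` are hypothetical helpers. Check the Inference Server documentation for the exact endpoints before relying on it.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_infer_url(project_id, version, api_key, base_url="http://localhost:9001"):
    """Build the request URL (assumed route: /<project>/<version>?api_key=...)."""
    query = urllib.parse.urlencode({"api_key": api_key})
    return f"{base_url}/{project_id}/{version}?{query}"


def infer(image_path, project_id, version, api_key):
    """POST a base64-encoded image to the local server and return parsed JSON."""
    with open(image_path, "rb") as f:
        body = base64.b64encode(f.read())
    request = urllib.request.Request(
        build_infer_url(project_id, version, api_key),
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `infer("your_image.jpg", "PROJECT_ID", 1, "API_KEY")` would send the image to the server running on the virtual machine, with the placeholders replaced by your own values.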
To install the Python SDK, run the following command:
pip install roboflow
With the Python SDK installed, you can run inference on your model in a few lines of Python code.
The following code runs inference on your Roboflow model through the local Inference Server at http://localhost:9001 and returns a JSON object with predictions:
from roboflow import Roboflow
rf = Roboflow(api_key="API_KEY")
project = rf.workspace().project("PROJECT_NAME")
model = project.version(MODEL_VERSION, local="http://localhost:9001").model
# infer on a local image
print(model.predict("your_image.jpg", confidence=40, overlap=30).json())
# visualize your prediction
# model.predict("your_image.jpg", confidence=40, overlap=30).save("prediction.jpg")
# infer on an image hosted elsewhere
# print(model.predict("URL_OF_YOUR_IMAGE", hosted=True, confidence=40, overlap=30).json())
Substitute the project name, model version, and API key with the values associated with your Roboflow account and project, then run the script.
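Once you have the JSON result, you will usually want the most likely class. The sketch below assumes the result contains a `predictions` list whose items have `class` and `confidence` keys. That shape is an assumption and the sample data is made up for illustration, so print the actual response from your model and adjust the keys as needed.

```python
def top_prediction(result):
    """Return (class_name, confidence) for the most confident prediction, or None."""
    predictions = result.get("predictions", [])
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["confidence"])
    return best["class"], best["confidence"]


# Hypothetical response, for illustration only.
sample = {
    "predictions": [
        {"class": "cat", "confidence": 0.91},
        {"class": "dog", "confidence": 0.09},
    ]
}

print(top_prediction(sample))
```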