In this guide, we are going to show how to deploy a YOLOv9 model to Azure Virtual Machines using Roboflow Inference. Inference is a high-performance inference server with which you can run a range of vision models, from YOLOv8 to CLIP to CogVLM.
To deploy a YOLOv9 model to Azure Virtual Machines, we will:
1. Set up our computing environment
2. Download the Roboflow Inference Server
3. Try out our model on an example image
The Roboflow Inference Server works with YOLOv9 models trained on Roboflow and with models trained in custom training processes outside of Roboflow.
To deploy a YOLOv9 model to Azure Virtual Machines, we will:
1. Train a model on (or upload a model to) Roboflow
2. Download the Roboflow Inference Server
3. Install the Python SDK to run inference on images
4. Try out the model on an example image
Let's get started!
First, create a Roboflow account and create a new project. When you have created a new project, upload your project data, then generate a new dataset version. With that version ready, you can upload your model weights to Roboflow.
Install the Roboflow Python SDK:
pip install roboflow
Then, use the following script to upload your model weights:
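import os

from roboflow import Roboflow

# Folder that contains your training run; replace with your own path
home = "/path/to/project/folder"
# Replace with the version number of your dataset
PROJECT_VERSION = 1

rf = Roboflow(api_key=os.environ["ROBOFLOW_API_KEY"])
project = rf.workspace().project("PROJECT_ID")
# model_type should match the architecture you trained (YOLOv9 in this guide)
project.version(PROJECT_VERSION).deploy(model_type="yolov9", model_path=f"{home}/yolov9/runs/train/")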
Read the Roboflow model weight upload documentation for more information about uploading model weights.
You will need your project name, version, API key, and model weights. The following documentation shows how to retrieve your API key and project information:
- Retrieve your Roboflow project name and version
- Retrieve your API key
Change the path in the script above to the path where your model weights are stored.
When you have configured the script above, run the code to upload your weights to Roboflow.
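Before running the upload, it can help to confirm that the inputs the script relies on are in place. The sketch below is illustrative only: `find_upload_problems` is a hypothetical helper, not part of the Roboflow SDK. It checks just two things taken from the script above: that the `ROBOFLOW_API_KEY` environment variable is set, and that the weights path is an existing directory.

```python
import os
from pathlib import Path


def find_upload_problems(model_path, env=None):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper for illustration, not part of the Roboflow SDK.
    """
    env = os.environ if env is None else env
    problems = []
    # The upload script reads the API key from this environment variable
    if not env.get("ROBOFLOW_API_KEY"):
        problems.append("ROBOFLOW_API_KEY is not set")
    # The upload script passes a directory as model_path
    if not Path(model_path).is_dir():
        problems.append(f"model path is not a directory: {model_path}")
    return problems


if __name__ == "__main__":
    # Placeholder path; substitute the path to your own weights
    for problem in find_upload_problems("/path/to/project/folder"):
        print("Problem:", problem)
```

This check does not validate the contents of the directory or the API key itself; it only catches the two most basic configuration mistakes before the upload starts.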
Now you are ready to start deploying your model.
Go to your Azure Virtual Machines homepage and create a virtual machine.
How you configure the virtual machine depends on how you plan to use it, so we will not cover specifics in this tutorial.
Roboflow Inference can run on both CPU (x86 and ARM) and NVIDIA GPU devices. However, you will need to deploy a system with a GPU to run CogVLM. Choose an Azure deep learning operating system when you deploy your system. These operating systems often come with pre-built drivers for use in deep learning tasks like running multimodal models.
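Once the virtual machine is running, you may want to confirm whether an NVIDIA GPU is actually visible to the operating system. The following is a minimal sketch, assuming the NVIDIA driver installs the `nvidia-smi` command-line tool on the machine. The function name `nvidia_gpu_visible` is hypothetical, and a `False` result only means the tool was not found or did not run successfully.

```python
import shutil
import subprocess


def nvidia_gpu_visible(tool="nvidia-smi"):
    """Return True if the NVIDIA driver tool is installed and exits successfully."""
    path = shutil.which(tool)
    if path is None:
        # Tool not on PATH: no driver installed, or a CPU-only machine
        return False
    try:
        # "-L" asks nvidia-smi to list the GPUs it can see
        result = subprocess.run([path, "-L"], capture_output=True, timeout=10)
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("NVIDIA GPU visible:", nvidia_gpu_visible())
```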
When your virtual machine is ready, a pop-up will appear. Click “View resource” to view the virtual machine. Or, go back to the Virtual Machines homepage and select your newly deployed virtual machine.
To sign into your virtual machine, first click “Connect”. Choose the authentication method that you prefer to log into your server. When you have logged in, you are ready to move on to the next step.
The Roboflow Inference Server allows you to deploy computer vision models to a range of devices, including Azure Virtual Machines.
You can run Roboflow Inference in Docker, or via the Python SDK.
For this guide, we will run Inference with Docker and use the Python SDK to interface with our Docker deployment. We will deploy our model on an Azure Virtual Machine.
To install Inference and set up an Inference server in Docker, run:
pip install inference
inference server start
Now that the Roboflow Inference Server is running, you can use your model on Azure Virtual Machines.
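If you want to confirm that the server started, one simple option is to check that something is accepting TCP connections on the server's port. The sketch below assumes the server listens on `localhost:9001`, the address used by the client code in this guide. The helper `server_is_listening` is hypothetical. It only tests that the port is open, not that the process behind it is healthy or that a model has loaded.

```python
import socket


def server_is_listening(host="localhost", port=9001, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures
        return False


if __name__ == "__main__":
    print("Inference server reachable:", server_is_listening())
```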
For a Jetson deployment, using Docker is highly recommended because the packages you need to install depend on your JetPack version. Without Docker, it is easy to perform these installs incorrectly and end up having to reflash the device.
You can run inference on images with Roboflow Inference.
Create a new Python file and add the following code:
# import os for getting the API key from the environment
import os

# import the client from the Inference SDK
from inference_sdk import InferenceHTTPClient
# import PIL for loading the image
from PIL import Image

# set the project_id, model_version, and image filename
project_id = "soccer-players-5fuqs"
model_version = 1
filename = "path/to/local/image.jpg"

# create a client object that talks to the local Inference server
# (the API key is read from the same variable as in the upload script)
client = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key=os.environ["ROBOFLOW_API_KEY"],
)

# load the image
pil_image = Image.open(filename)

# run inference
results = client.infer(pil_image, model_id=f"{project_id}/{model_version}")
print(results)
Substitute the model name and version with the values associated with your Roboflow account and project, then run the script.
This code will return a Python object with results from your model.
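To work with the results directly, you can walk through the predictions yourself. The sketch below assumes an object detection response shaped as a dictionary with a `predictions` list, where each prediction has `class`, `confidence`, `width`, `height`, and a box center given by `x` and `y`. Check the output of `print(results)` for your own model before relying on these keys. The helper `summarize_predictions` and the sample data are made up for illustration.

```python
def summarize_predictions(results, min_confidence=0.5):
    """Return (class, confidence, x_min, y_min, x_max, y_max) tuples.

    Predictions below min_confidence are skipped. Boxes are converted
    from center/size format to corner coordinates.
    """
    rows = []
    for pred in results.get("predictions", []):
        if pred["confidence"] < min_confidence:
            continue
        half_w = pred["width"] / 2
        half_h = pred["height"] / 2
        rows.append((
            pred["class"],
            pred["confidence"],
            pred["x"] - half_w,
            pred["y"] - half_h,
            pred["x"] + half_w,
            pred["y"] + half_h,
        ))
    return rows


# Made-up response used only to illustrate the assumed shape
example = {
    "predictions": [
        {"x": 100.0, "y": 50.0, "width": 40.0, "height": 20.0,
         "class": "player", "confidence": 0.9},
        {"x": 10.0, "y": 10.0, "width": 4.0, "height": 4.0,
         "class": "ball", "confidence": 0.2},
    ]
}
print(summarize_predictions(example))
# prints [('player', 0.9, 80.0, 40.0, 120.0, 60.0)]
```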
You can process these results and plot them on an image with the supervision Python package:
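import supervision as sv

# the HTTP client returns a single result for a single image, so pass it directly
detections = sv.Detections.from_inference(results)
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
# annotate a copy of the image loaded in the previous script
annotated_image = box_annotator.annotate(
    scene=pil_image.copy(), detections=detections)
annotated_image = label_annotator.annotate(
    scene=annotated_image, detections=detections)
sv.plot_image(image=annotated_image, size=(16, 16))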
You can run inference on videos with Roboflow Inference and the InferencePipeline feature.
Create a new Python file and add the following code:
# Import the InferencePipeline object
from inference import InferencePipeline
# Import the built-in render_boxes sink for visualizing results
from inference.core.interfaces.stream.sinks import render_boxes

# Initialize a pipeline object
pipeline = InferencePipeline.init(
    model_id="rock-paper-scissors-sxsw/11",  # Roboflow model to use
    video_reference=0,  # Path to video, device id (int, usually 0 for built-in webcams), or RTSP stream URL
    on_prediction=render_boxes,  # Function to run after each prediction
)
pipeline.start()
pipeline.join()
Substitute the model name and version with the values associated with your Roboflow account and project, then run the script.
This code will run a model on frames from a webcam stream. To use RTSP, set the video_reference value to an RTSP stream URL. To use video, set the video_reference value to a video file path.
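Because video_reference can be either an integer device id or a string, a script that takes the source from the command line has to decide which one it was given. The sketch below shows one way to do that. The helper `parse_video_reference` is hypothetical and not part of Inference. It treats any all-digit value as a device id and leaves everything else (file paths, RTSP URLs) as a string.

```python
def parse_video_reference(value):
    """Return an int for a numeric device id, otherwise the string unchanged."""
    text = str(value).strip()
    # "0" -> 0 (webcam device id); paths and URLs stay as strings
    return int(text) if text.isdecimal() else text


if __name__ == "__main__":
    import sys

    # Example: python run_pipeline.py 0
    # Falls back to device 0 when no argument is given
    source = parse_video_reference(sys.argv[1]) if len(sys.argv) > 1 else 0
    print("video_reference =", repr(source))
```

The parsed value can then be passed as the video_reference argument when initializing the pipeline.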
Predictions are annotated using the render_boxes helper function. You can specify any function to process each prediction in the on_prediction parameter.
To learn how to define your own callback function with custom logic, refer to the Define Custom Prediction Logic documentation.
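As a rough illustration of custom logic, the sketch below defines a callback that counts how many objects of each class appear in a frame. It assumes the callback is called with two arguments, a predictions dictionary containing a `predictions` list and a video frame object, and that the frame object may expose a `frame_id` attribute. Confirm both against the documentation for your installed version. The function name `count_classes` is made up for this example.

```python
from collections import Counter


def count_classes(predictions, video_frame):
    """Print and return per-class object counts for a single frame."""
    counts = Counter(p["class"] for p in predictions.get("predictions", []))
    # frame_id is read defensively in case the frame object lacks it
    frame_id = getattr(video_frame, "frame_id", None)
    print(f"frame {frame_id}: {dict(counts)}")
    return counts
```

To try it, pass `on_prediction=count_classes` in place of `render_boxes` when initializing the pipeline.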