In this guide, we are going to show how to deploy a YOLO-World model to GCP Compute Engine using Roboflow Inference. Inference is a high-performance inference server with which you can run a range of vision models, from YOLOv8 to CLIP to CogVLM.
To deploy a YOLO-World model to GCP Compute Engine, we will:
1. Set up our computing environment
2. Download the Roboflow Inference Server
3. Try out our model on an example image
Let's get started!
In this guide, we are going to show how to deploy a YOLO-World model to GCP Compute Engine using the Roboflow Inference Server. This SDK works with YOLO-World models trained on both Roboflow and in custom training processes outside of Roboflow.
To deploy a YOLO-World model to GCP Compute Engine, we will:
1. Train a model on (or upload a model to) Roboflow
2. Download the Roboflow Inference Server
3. Install the Python SDK to run inference on images
4. Try out the model on an example image
You can deploy the above workflow using a default model trained on the Microsoft COCO dataset. To deploy the model, click "Fork Workflow" to bring it into your Roboflow account. From there, you can deploy the model in two ways:
1. On images (either in the cloud or on your device); and
2. On video streams (on your device, connected to a webcam or RTSP stream).
Once you have forked a Workflow, click "Deploy Workflow" to see instructions on how to run your model.
First, create a Roboflow account and create a new project. When you have created a new project, upload your project data, then generate a new dataset version. With that version ready, you can upload your model weights to Roboflow.
Install the Roboflow Python SDK with pip (`pip install roboflow`).
Then, use the following script to upload your model weights:
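The original upload script is not reproduced here, so the following is a minimal sketch rather than the exact script from the guide. It assumes the `roboflow` package is installed and uses the `deploy` method on a dataset version. The function name `upload_weights` and every value in the example call are hypothetical placeholders; in particular, check the model weight upload documentation for the `model_type` string that matches your model.

```python
import os


def upload_weights(api_key, project_id, version_number, model_type, model_path):
    """Upload trained weights to an existing Roboflow dataset version."""
    # Imported inside the function so this file can be loaded
    # before the Roboflow SDK is installed.
    from roboflow import Roboflow

    rf = Roboflow(api_key=api_key)
    project = rf.workspace().project(project_id)
    version = project.version(version_number)
    # model_path is the directory that contains your trained weights.
    version.deploy(model_type=model_type, model_path=model_path)
    return version


# Example call (all values are placeholders, replace them with your own):
# upload_weights(
#     api_key=os.environ["ROBOFLOW_API_KEY"],
#     project_id="your-project-id",
#     version_number=1,
#     model_type="yolov8",  # assumption: use the type listed in the upload docs
#     model_path="/path/to/training/run",
# )
```

Reading the API key from an environment variable keeps it out of the script itself, which is useful if the file ends up in version control.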
Read the Roboflow model weight upload documentation for more information about uploading model weights.
You will need your project name, version, API key, and model weights. The following documentation shows how to retrieve your API key and project information:
- Retrieve your Roboflow project name and version
- Retrieve your API key
Change the path in the script above to the path where your model weights are stored.
When you have configured the script above, run the code to upload your weights to Roboflow.
Now you are ready to start deploying your model.
Open GCP Compute Engine and click the “Create Instance” button to create a virtual machine.
Next, you need to configure your instance. The requirements for configuration depend on your use case. If you are deploying a server for production, you may opt for a more powerful machine configuration. If you are testing a model and plan to deploy on another machine in the future, you may instead opt to deploy a less powerful machine.
You must deploy on a system with an NVIDIA GPU to run CogVLM with Inference.
A cost panel will appear on the right of the screen that estimates the cost of the machine you are deploying.
Fill out the required fields to configure your virtual machine. Then, click the “Create” button to create a virtual machine. It will take a few moments before your machine is ready. You can view the status from the Compute Engine Instances page.
When your virtual machine has been deployed, click on the machine name in the list of virtual machines on the Compute Engine Instances page.
To sign in using SSH in a terminal, click the arrow next to the SSH button and click “View gcloud command”. If you have not already installed gcloud, follow the gcloud installation and configuration instructions to get started.
The Roboflow Inference Server allows you to deploy computer vision models to a range of devices, including GCP Compute Engine.
You can run Roboflow Inference in Docker, or via the Python SDK.
For this guide, we will run Inference with Docker and use the Python SDK to interface with our Docker deployment. We will deploy our model on a GCP Compute Engine instance.
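To install Inference and set up an Inference server in Docker, make sure Docker is installed on the instance, then run `pip install inference-cli` followed by `inference server start`.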
Now that you have the Roboflow Inference Server running, you can use your model on GCP Compute Engine.
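Before sending images, it can help to confirm that the server is accepting connections. The sketch below is an illustration and not part of the original guide. It assumes the server listens on port 9001 of the same machine, and the helper name `server_is_reachable` is hypothetical. It only checks that a TCP connection can be opened, not that a model is loaded.

```python
import socket


def server_is_reachable(host="localhost", port=9001, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is listening at the address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: print(server_is_reachable()) on the Compute Engine instance.
```

If the check returns False, the container may still be starting, or the server may be bound to a different port.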
For a Jetson deployment, using Docker is highly recommended because the required packages must be installed in a way that depends on your JetPack version. Without Docker, it is easy to get these installs wrong and have to reflash the device.
You can run inference on images with Roboflow Inference.
Create a new Python file and add the following code:
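The original script is not reproduced here. As a minimal sketch, assuming the `inference-sdk` package is installed and the Inference server from the previous step is listening at `http://localhost:9001`, a script along these lines sends one image to the server. The helper names `build_model_id` and `run_image_inference` and the values in the example call are hypothetical placeholders.

```python
import os


def build_model_id(project_name, version):
    """Combine a project name and version number into an ID such as 'my-project/1'."""
    return f"{project_name}/{version}"


def run_image_inference(image_path, project_name, version, api_key,
                        api_url="http://localhost:9001"):
    """Send one image to the Inference server and return the parsed response."""
    # Imported inside the function so this file can be loaded
    # before inference-sdk is installed.
    from inference_sdk import InferenceHTTPClient

    client = InferenceHTTPClient(api_url=api_url, api_key=api_key)
    return client.infer(image_path, model_id=build_model_id(project_name, version))


# Example call (placeholders, replace with your own values):
# result = run_image_inference(
#     "example.jpg", "your-project-name", 1, os.environ["ROBOFLOW_API_KEY"]
# )
# print(result)
```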
Substitute the model name and version with the values associated with your Roboflow account and project, then run the script.
This code will return a Python object with results from your model.
You can process these results and plot them on an image with the supervision Python package:
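The plotting code is missing from this copy of the guide, so the following is a sketch under stated assumptions: `supervision` and `opencv-python` are installed, `result` is the dictionary returned by an object detection model (a `predictions` list whose items carry `class` and `confidence`), and a recent supervision release is in use. The helper names `labels_from_result` and `annotate_image` are hypothetical.

```python
def labels_from_result(result):
    """Build one 'class confidence' label per prediction in the response."""
    return [f"{p['class']} {p['confidence']:.2f}" for p in result["predictions"]]


def annotate_image(image_path, result, output_path="annotated.jpg"):
    """Draw boxes and labels from an inference result and save the image."""
    # Imported inside the function so this file can be loaded
    # before the plotting dependencies are installed.
    import cv2
    import supervision as sv

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    detections = sv.Detections.from_inference(result)
    annotated = sv.BoxAnnotator().annotate(scene=image.copy(), detections=detections)
    annotated = sv.LabelAnnotator().annotate(
        scene=annotated, detections=detections, labels=labels_from_result(result)
    )
    cv2.imwrite(output_path, annotated)
    return output_path
```

Writing the annotated image to disk, rather than opening a window, suits a virtual machine reached over SSH, where no display is usually available.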
You can run inference on videos with Roboflow Inference and the InferencePipeline feature.
Create a new Python file and add the following code:
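The original script is not reproduced here. A minimal sketch, assuming the `inference` package is installed on the machine that has access to the video source, could look like the following. The helper names `resolve_video_reference` and `run_pipeline` are hypothetical, and the model ID in the example call is a placeholder.

```python
def resolve_video_reference(value):
    """Turn '0' into webcam index 0; leave RTSP URLs and file paths unchanged."""
    text = str(value)
    return int(text) if text.isdigit() else text


def run_pipeline(model_id, video_reference=0, api_key=None):
    """Run a model on every frame of a video source and draw the predictions."""
    # Imported inside the function so this file can be loaded
    # before the inference package is installed.
    from inference import InferencePipeline
    from inference.core.interfaces.stream.sinks import render_boxes

    pipeline = InferencePipeline.init(
        model_id=model_id,
        video_reference=resolve_video_reference(video_reference),
        on_prediction=render_boxes,  # draws boxes on each frame
        api_key=api_key,
    )
    pipeline.start()
    pipeline.join()  # block until the stream ends or is stopped


# Example call (placeholder model ID):
# run_pipeline("your-project-name/1", video_reference=0)
```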
Substitute the model name and version with the values associated with your Roboflow account and project, then run the script.
This code will run a model on frames from a webcam stream. To use RTSP, set the `video_reference` value to an RTSP stream URL. To use video, set the `video_reference` value to a video file path.
Predictions are annotated using the `render_boxes` helper function. You can specify any function to process each prediction in the `on_prediction` parameter.
To learn how to define your own callback function with custom logic, refer to the Define Custom Prediction Logic documentation.
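As an illustration of custom logic, the sketch below counts detections per class on each frame. It assumes a single video source, so that the callback receives one predictions dictionary with a `predictions` list plus the corresponding video frame. The function name `count_classes` is hypothetical.

```python
from collections import Counter


def count_classes(predictions, video_frame=None):
    """Print and return how many objects of each class were found in a frame."""
    counts = Counter(p["class"] for p in predictions.get("predictions", []))
    print(dict(counts))
    return counts


# Pass it in place of the default sink when building the pipeline:
# InferencePipeline.init(..., on_prediction=count_classes)
```

The `video_frame` argument is unused here, but it is kept in the signature because the pipeline passes it to every callback.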
Below, you can find our guides on how to deploy YOLO-World models to other devices.
The following resources are useful reference material for working with your model using Roboflow and the Roboflow Inference Server.