HOW TO GUIDE

Deploy PaliGemma to AWS EC2

Using Roboflow Inference, you can deploy computer vision models to the edge with a few lines of code. Learn more in our guide below.


Overview

In this guide, we are going to show how to deploy a PaliGemma model to AWS EC2 using the Roboflow Inference Server. This SDK works with PaliGemma models trained on both Roboflow and in custom training processes outside of Roboflow.

Run a Model on Your Device

You can deploy the above workflow using a default model trained on the Microsoft COCO dataset. To deploy the model, click "Fork Workflow" to bring it into your Roboflow account. From there, you can deploy the model in two ways:

1. On images (either in the cloud or on your device); and
2. On video streams (on your device, connected to a webcam or RTSP stream).

Once you have forked a Workflow, click "Deploy Workflow" to see instructions on how to run your model.

To deploy a PaliGemma model to AWS EC2, you will:

Deploy a Workflow
Upload custom model weights to Roboflow
Run a Workflow using your custom model weights on your hardware
Try out the model on an example image

Let's get started!


Step 0

Run COCO on Your Device

Let's start by getting a model trained using the Microsoft COCO dataset running on your device.

This is a great way to experiment with vision before you train your own model (which we'll also cover in more depth below).

To get started, install Inference, our open source computer vision Inference server:

pip install inference inference-sdk

Now that Inference is installed, we are ready to start using vision models on our system.

Create a new Python file and add the following code:

# import the InferencePipeline interface
from inference import InferencePipeline
# import a built-in sink called render_boxes (sinks are the logic that happens after inference)
from inference.core.interfaces.stream.sinks import render_boxes

api_key = "YOUR_ROBOFLOW_API_KEY"

# create an inference pipeline object
pipeline = InferencePipeline.init(
    model_id="yolov8n-640", # set the model id to a yolov8n model with input size 640
    video_reference=0, # set the video reference (source of video), it can be a link/path to a video file, an RTSP stream url, or an integer representing a device id (usually 0 for built in webcams)
    on_prediction=render_boxes, # tell the pipeline object what to do with each set of inference by passing a function
    api_key=api_key, # provide your roboflow api key for loading models from the roboflow api
)
# start the pipeline
pipeline.start()
# wait for the pipeline to finish
pipeline.join()

In the code above, replace YOUR_ROBOFLOW_API_KEY with your Roboflow API key.

When you run this code, you will have a model trained with the COCO dataset running on your webcam!

Step 1

Train a Model on or Upload a Model to Roboflow

Now that we have tried deployment with COCO, we can get started with running our own fine-tuned model.

First, create a Roboflow account and create a new project. Once the project exists, upload your project data, then generate a new dataset version. With that version ready, you can upload your model weights to Roboflow.

Install the Roboflow Python SDK:

pip install roboflow

Then, use the following script to upload your model weights:

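# import os for reading the API key from the environment
import os

from roboflow import Roboflow

# folder that contains your training run (change this to your own path)
home = "/path/to/project/folder"
# dataset version number to attach the weights to (change this to your own version)
PROJECT_VERSION = 1

rf = Roboflow(api_key=os.environ["ROBOFLOW_API_KEY"])
project = rf.workspace().project("PROJECT_ID")

project.version(PROJECT_VERSION).deploy(model_type="yolov5", model_path=f"{home}/yolov5/runs/train/")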

Read the Roboflow model weight upload documentation for more information about uploading model weights.

You will need your project name, version, API key, and model weights. The following documentation shows how to retrieve your API key and project information:

- Retrieve your Roboflow project name and version
- Retrieve your API key

Change the path in the script above to the path where your model weights are stored.

When you have configured the script above, run the code to upload your weights to Roboflow.

Now you are ready to start deploying your model.
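Before running the upload, it can help to confirm that the inputs listed above are actually in place. The following is a minimal sketch, not part of the Roboflow SDK: `check_upload_inputs` is a hypothetical helper name, and it assumes the API key lives in the `ROBOFLOW_API_KEY` environment variable, as in the upload script.

```python
import os
from pathlib import Path
from typing import List


def check_upload_inputs(weights_dir: str, env_var: str = "ROBOFLOW_API_KEY") -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # The upload script reads the API key from the environment.
    if not os.environ.get(env_var):
        problems.append(f"environment variable {env_var} is not set")
    # The weights path must point to an existing folder.
    if not Path(weights_dir).is_dir():
        problems.append(f"weights folder not found: {weights_dir}")
    return problems


if __name__ == "__main__":
    # Placeholder path from the upload script; replace it with your own.
    for problem in check_upload_inputs("/path/to/project/folder/yolov5/runs/train/"):
        print("Fix before uploading:", problem)
```

The helper only checks local state; it does not contact Roboflow or validate that the key itself is correct.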

Step 2

Download Roboflow Inference

The Roboflow Inference Server allows you to deploy computer vision models to a range of devices, including AWS EC2.

You can run Roboflow Inference in Docker, or via the Python SDK.

For this guide, we will run Inference with Docker and use the Python SDK to interface with our Docker deployment. We will deploy our model on an AWS EC2 instance.

To install Inference and set up an Inference server in Docker, run:

pip install inference
inference server start

Now that the Roboflow Inference Server is running, you can use your model on AWS EC2.
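To confirm that the server is listening before sending requests, a plain TCP check is enough. This is a minimal sketch under the assumption that the server listens on port 9001 on the same machine, which matches the `http://localhost:9001` URL used later in this guide; `is_port_open` is a hypothetical helper and does not call any Inference API.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 9001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Inference server reachable:", is_port_open())
```

A successful connection only shows that something is listening on the port; it does not prove that a model has loaded.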

For a Jetson deployment, using Docker is highly recommended, since the packages you need depend on your JetPack version. Without Docker, it is easy to install them incorrectly and end up having to reflash the device.

Step 3

Run Inference on an Image

You can run inference on images with Roboflow Inference.

Create a new Python file and add the following code:

# import client from inference sdk
from inference_sdk import InferenceHTTPClient
# import PIL for loading image
from PIL import Image
# import os for getting api key from environment
import os

# set the project_id, model_version, and path to a local image
project_id = "soccer-players-5fuqs"
model_version = 1
filename = "path/to/local/image.jpg"

# create a client object
client = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key=os.environ["API_KEY"],
)

# load the image
pil_image = Image.open(filename)

# run inference
results = client.infer(pil_image, model_id=f"{project_id}/{model_version}")

print(results)

Substitute the model name and version with the values associated with your Roboflow account and project, then run the script.

Retrieve your Roboflow project name and version

This code will return a Python object with results from your model.
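As an illustration of working with that object, the sketch below keeps only predictions above a confidence threshold. It assumes the response is a dictionary with a `predictions` list whose items carry `class` and `confidence` keys, as in Roboflow object detection responses; the response from a PaliGemma model may be shaped differently. `filter_by_confidence` and the sample data are hypothetical and hand-written for the example.

```python
from typing import Any, Dict, List


def filter_by_confidence(results: Dict[str, Any], threshold: float = 0.5) -> List[Dict[str, Any]]:
    """Keep predictions whose confidence is at least `threshold`."""
    return [
        p for p in results.get("predictions", [])
        if p.get("confidence", 0.0) >= threshold
    ]


# Hypothetical response, written by hand for illustration only.
sample_results = {
    "predictions": [
        {"class": "player", "confidence": 0.91, "x": 320.0, "y": 240.0, "width": 40.0, "height": 90.0},
        {"class": "ball", "confidence": 0.32, "x": 100.0, "y": 80.0, "width": 12.0, "height": 12.0},
    ]
}

for prediction in filter_by_confidence(sample_results, threshold=0.5):
    print(prediction["class"], prediction["confidence"])
```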

You can process these results and plot them on an image with the supervision Python package:


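# this snippet continues the script above and reuses `filename` and `results`
import cv2
import supervision as sv

# load the image as a NumPy array for annotation
image = cv2.imread(filename)

# a single-image client.infer() call returns one result dictionary
detections = sv.Detections.from_inference(results)

box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()

annotated_image = box_annotator.annotate(
    scene=image, detections=detections)
annotated_image = label_annotator.annotate(
    scene=annotated_image, detections=detections)

sv.plot_image(image=annotated_image, size=(16, 16))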

Step 4

Run Inference on a Video

You can run inference on videos with Roboflow Inference and the InferencePipeline feature.

Create a new Python file and add the following code:

# Import the InferencePipeline object
from inference import InferencePipeline
# Import the built in render_boxes sink for visualizing results
from inference.core.interfaces.stream.sinks import render_boxes

# initialize a pipeline object
pipeline = InferencePipeline.init(
    model_id="rock-paper-scissors-sxsw/11", # Roboflow model to use
    video_reference=0, # Path to video, device id (int, usually 0 for built in webcams), or RTSP stream url
    on_prediction=render_boxes, # Function to run after each prediction
)
pipeline.start()
pipeline.join()

Substitute the model name and version with the values associated with your Roboflow account and project, then run the script.

Retrieve your Roboflow project name and version

This code will run a model on frames from a webcam stream. To use RTSP, set the video_reference value to an RTSP stream URL. To use video, set the video_reference value to a video file path.
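If the video source comes from a command-line argument or a configuration file, it arrives as a string, while a webcam device id must be passed as an integer. The sketch below shows one way to handle that; `parse_video_reference` is a hypothetical helper, not part of Inference.

```python
from typing import Union


def parse_video_reference(value: str) -> Union[int, str]:
    """Turn "0" into the device id 0; leave file paths and RTSP URLs as strings."""
    value = value.strip()
    return int(value) if value.isdigit() else value


if __name__ == "__main__":
    # Example inputs; the RTSP URL and file name are placeholders.
    for raw in ["0", "rtsp://example.local/stream", "clip.mp4"]:
        print(repr(parse_video_reference(raw)))
```

The returned value can then be passed as the `video_reference` argument of `InferencePipeline.init`.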

Predictions are annotated using the render_boxes helper function. You can specify any function to process each prediction in the on_prediction parameter.

To learn how to define your own callback function with custom logic, refer to the Define Custom Prediction Logic documentation.
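As a rough sketch of what such a callback can look like, the function below counts predictions per class for each frame. It assumes the callback receives the prediction dictionary and the video frame as its two arguments, and that the dictionary has a `predictions` list with a `class` key per item. The name `count_classes` is hypothetical, and `video_frame` defaults to `None` only so the function can be tried without a running pipeline.

```python
from collections import Counter
from typing import Any, Dict


def count_classes(predictions: Dict[str, Any], video_frame: Any = None) -> Counter:
    """Count how many predictions of each class appear in one frame."""
    counts = Counter(p["class"] for p in predictions.get("predictions", []))
    print(dict(counts))
    return counts


if __name__ == "__main__":
    # Hypothetical predictions, written by hand for illustration only.
    fake = {"predictions": [{"class": "rock"}, {"class": "rock"}, {"class": "paper"}]}
    count_classes(fake)
```

To use it, pass `on_prediction=count_classes` in place of `render_boxes` when initializing the pipeline.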
