Deploy Computer Vision Models

Deploy Segment Anything to GCP Compute Engine

Once you have a computer vision model that achieves the requisite level of performance for your production use case, the next question to answer is "how can I deploy the model?"

Roboflow offers a range of SDKs with which you can deploy object detection, classification, and segmentation models.

In this guide, we are going to show how to deploy a Segment Anything Model (SAM) to GCP Compute Engine using the Roboflow Inference Server. The Inference Server works with Segment Anything Model (SAM) models trained both on Roboflow and in custom training processes outside of Roboflow.

To deploy a Segment Anything Model (SAM) to GCP Compute Engine, we will:

1. Train a model on (or upload a model to) Roboflow
2. Set up a GCP Compute Engine virtual machine
3. Download the Roboflow Inference Server
4. Install the Python SDK to run inference on images
5. Try out the model on an example image

Let's get started!

Train a Model on or Upload a Model to Roboflow

If you want to upload your own model weights, first create a Roboflow account and create a new project. When you have created a new project, upload your project data, then generate a new dataset version. With that version ready, you can upload your model weights to Roboflow.

Download the Roboflow Python SDK:

pip install roboflow

Then, use the following script to upload your model weights:

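import os

from roboflow import Roboflow

home = "/path/to/project/folder"
PROJECT_VERSION = 1  # replace with your dataset version number

# Authenticate with the API key stored in the ROBOFLOW_API_KEY environment variable
rf = Roboflow(api_key=os.environ["ROBOFLOW_API_KEY"])
project = rf.workspace().project("PROJECT_ID")

# Upload the weights stored under model_path to the chosen dataset version
project.version(PROJECT_VERSION).deploy(model_type="yolov5", model_path=f"{home}/yolov5/runs/train/")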

You will need your project name, version, API key, and model weights. The following documentation shows how to retrieve your API key and project information:

- Retrieve your Roboflow project name and version
- Retrieve your API key

Change the path in the script above to the path where your model weights are stored.

When you have configured the script above, run the code to upload your weights to Roboflow.
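Before running the upload, it can help to confirm that the inputs listed above are in place. The following is a minimal sketch, not part of the Roboflow SDK: `check_upload_inputs` is a hypothetical helper, and it assumes the API key is stored in the `ROBOFLOW_API_KEY` environment variable, as in the script above.

```python
import os
from pathlib import Path


def check_upload_inputs(weights_dir, env=os.environ):
    """Return a list of problems found; an empty list means the inputs look ready.

    Hypothetical helper: it only checks that the API key variable is set and
    that the weights directory exists. It does not contact Roboflow.
    """
    problems = []
    if not env.get("ROBOFLOW_API_KEY"):
        problems.append("ROBOFLOW_API_KEY is not set")
    if not Path(weights_dir).is_dir():
        problems.append(f"weights directory not found: {weights_dir}")
    return problems
```

Calling `check_upload_inputs("/path/to/project/folder/yolov5/runs/train/")` and printing the result gives a quick view of anything that still needs to be configured.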

Now you are ready to start deploying your model.

Set up a GCP Compute Engine Virtual Machine

Open GCP Compute Engine and click the “Create Instance” button to create a virtual machine.

Next, you need to configure your instance. The requirements for configuration depend on your use case. If you are deploying a server for production, you may opt for a more powerful machine configuration. If you are testing a model and plan to deploy on another machine in the future, you may instead opt to deploy a less powerful machine.

You can deploy Roboflow Inference on CPU and GPU devices. We recommend deploying on a GPU for the highest performance. However, GPU devices are more costly to run and require additional setup. For this guide, we will focus on CPUs.

A cost panel will appear on the right of the screen that estimates the cost of the machine you are deploying.

Fill out the required fields to configure your virtual machine. Then, click the “Create” button to create a virtual machine. It will take a few moments before your machine is ready. You can view the status from the Compute Engine Instances page.

When your virtual machine has been deployed, click on the machine name in the list of virtual machines on the Compute Engine Instances page.

To sign in using SSH in a terminal, click the arrow next to the SSH button and click “View gcloud command”. If you have not already installed gcloud, follow the gcloud installation and configuration instructions to get started.

Download the Roboflow Inference Server

The Roboflow Inference Server allows you to deploy computer vision models to a range of devices, including GCP Compute Engine.


The Inference Server relies on Docker to run. If you don't already have Docker installed on the device(s) on which you want to run inference, install it by following the official Docker installation instructions.

Once you have Docker installed, run the following command to download the Roboflow Inference Server on your GCP Compute Engine instance:


Now that you have the Roboflow Inference Server running, you can use your model on GCP Compute Engine.
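A quick way to confirm that the server is accepting connections is to open a TCP connection to its port. The sketch below uses only the Python standard library; `server_is_reachable` is a hypothetical helper, and the default of `localhost:9001` assumes the server listens on the address used for the Segment Anything API in this guide.

```python
import socket


def server_is_reachable(host="localhost", port=9001, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only checks that something is listening on the port; it does not
    verify that the listener is the Roboflow Inference Server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `server_is_reachable()` returns False on the virtual machine, check that the Docker container is still running and that the port is published to the host.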


Install the Roboflow Python SDK

The Roboflow Inference Server provides an HTTP API with a range of methods you can use to query your model and various popular models (e.g., SAM and CLIP). You can read more about all of the API methods available on the Roboflow Inference Server in the Inference Server documentation.

The Roboflow Python SDK provides abstract convenience methods for interacting with the HTTP API. In this guide, we will use the Python SDK to run inference on a model. You can also query the HTTP API itself.

To install the Python SDK, run the following command:

pip install roboflow

Run Inference on an Image

The Segment Anything API, available at localhost:9001, allows you to retrieve image embeddings and segmentation masks for an image.

To retrieve masks, make a request to localhost:9001/sam/segment_image with:

1. An image from which to retrieve masks; and

2. A prompt point. This point represents the object that you want to segment. The prompt point should have an ID and the x, y coordinates of the prompt.
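The two inputs above can be assembled into a request body in a few lines. This is a minimal sketch: `build_sam_payload` is a hypothetical helper, the field names follow the request example in this guide, and the label `1` assumes the usual SAM convention that 1 marks a foreground point.

```python
import base64


def build_sam_payload(image_bytes, x, y, image_id):
    """Build a request body with a base64-encoded image and one prompt point."""
    return {
        "image": {
            "type": "base64",
            # The API expects text, so encode the raw bytes and decode to str
            "value": base64.b64encode(image_bytes).decode("ascii"),
        },
        "point_coords": [[x, y]],  # x, y pixel coordinates of the prompt
        "point_labels": [1],       # assumed: 1 = point lies on the object
        "image_id": image_id,
    }
```

For an image on disk, read it with `open(path, "rb").read()` and pass the bytes as `image_bytes`.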

To make a request to the API, you will need your Roboflow API key. Learn how to retrieve your Roboflow API key.

Below is an example showing how to make a request to the API. Read the documentation for more information about the Inference Segment Anything API.

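import os
import requests

# Read the Roboflow API key from the environment
API_KEY = os.environ["ROBOFLOW_API_KEY"]

infer_payload = {
    "image": {
        "type": "base64",
        "value": "...",  # base64-encoded image data
    },
    "point_coords": [[380, 350]],  # x, y coordinates of the prompt point
    "point_labels": [1],
    "image_id": "example_image_id",
}

# Assumption: masks are served by the /sam/segment_image route and the
# API key is passed as an api_key query parameter.
res = requests.post(
    f"http://localhost:9001/sam/segment_image?api_key={API_KEY}",
    json=infer_payload,
)

masks = res.json()["masks"]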

