How to use YOLO11 with a Webcam Stream

Overview

Many computer vision models are deployed using a webcam as an input. The Roboflow Inference Python package enables you to access a webcam and start running inference with a model in a few lines of code.

In this guide, we will show you how to run YOLO11 on frames from a webcam stream.

We will:

1. Install supervision and Inference
2. Load the webcam stream and define an inference callback
3. Test the webcam stream

Without further ado, let's get started!

Install Dependencies

For this tutorial, we will be using supervision, Inference, and OpenCV. supervision provides a range of utilities you can use in computer vision projects. Inference provides a concise utility through which we can load webcam streams. OpenCV is helpful for annotating video frames when supervision does not have a tool we can use.

Run the following command to install the dependencies we will use in this guide:

pip install supervision inference opencv-python


Once you have installed supervision and Inference, you are ready to define an inference callback and configure a webcam stream.
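Before moving on, it can help to confirm that the packages are importable from the Python environment you plan to use. The snippet below is a minimal sketch of such a check; the missing_modules helper is a hypothetical name introduced here for illustration, and note that opencv-python is imported under the module name cv2.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Module names as they are imported in Python (opencv-python installs "cv2").
REQUIRED_MODULES = ["supervision", "inference", "cv2"]


def missing_modules(modules: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


# Report anything that still needs to be installed with pip.
print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

If the list is not empty, re-run the pip command above in the same environment you use to run Python.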

Define an Inference Callback

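The Inference Python package provides an InferencePipeline interface through which you can access a webcam and run logic on each frame. You pass the pipeline a callback (the on_prediction argument) that is called with the predictions for each frame. InferencePipeline works with webcams, video files, and RTSP streams, but for this guide we will focus on webcam streams.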

In the code below, we use InferencePipeline to read webcam frames and then run inference on each frame using a YOLO11 model.


# import the InferencePipeline interface
from inference import InferencePipeline
# import a built-in sink called render_boxes (sinks are the logic that happens after inference)
from inference.core.interfaces.stream.sinks import render_boxes

api_key = "YOUR_ROBOFLOW_API_KEY"

# create an inference pipeline object
pipeline = InferencePipeline.init(
    model_id="yolo11s-640", # set the model id
    video_reference=0, # set the video reference (source of video), it can be a link/path to a video file, an RTSP stream url, or an integer representing a device id (usually 0 for built in webcams)
    on_prediction=render_boxes, # tell the pipeline object what to do with each set of inference by passing a function
    api_key=api_key, # provide your roboflow api key for loading models from the roboflow api
)
# start the pipeline
pipeline.start()
# wait for the pipeline to finish
pipeline.join()

The above code is configured to use the base YOLO11 weights trained on the Microsoft COCO dataset.

To use a custom model, replace the model ID with the model ID of a YOLO11 model hosted on Roboflow.

To upload a model to Roboflow, first install the Roboflow Python package:

pip install roboflow

Then, create a new Python file and paste in the following code:


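import os
from roboflow import Roboflow

HOME = os.getcwd()  # assumed: the directory that contains your runs/ folder
DATASET_VERSION = 1  # placeholder: replace with your dataset version number

rf = Roboflow(api_key="API_KEY")
project = rf.workspace().project("PROJECT_ID")
project.version(DATASET_VERSION).deploy(model_type="yolov11", model_path=f"{HOME}/runs/detect/train/")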

In the code above, add your API key, your project ID, your dataset version, and the path to the model weights you want to upload. Learn how to retrieve your API key.

Your weights will be uploaded to Roboflow. Your model will shortly be accessible over an API, and available for use in Inference. To learn more about uploading model weights to Roboflow, check out our full guide to uploading weights to Roboflow.
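An upload can fail if the path passed as model_path does not contain the trained weights. The sketch below checks for the file before you call deploy(). It assumes the usual Ultralytics layout, where training writes weights/best.pt inside the run directory, and that the upload looks for that file by default; check_weights is a hypothetical helper name, not part of the Roboflow package.

```python
from pathlib import Path


def check_weights(model_path: str, filename: str = "weights/best.pt") -> Path:
    """Return the path to the weights file, or raise if it does not exist.

    The default filename is an assumption based on the standard
    Ultralytics training output layout.
    """
    weights = Path(model_path) / filename
    if not weights.is_file():
        raise FileNotFoundError(f"No weights file found at {weights}")
    return weights
```

For example, calling check_weights(f"{HOME}/runs/detect/train/") before the deploy() line surfaces a wrong path early, with a clear error message.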

In the webcam code earlier in this section, we:

1. Import the required dependencies: InferencePipeline and the built-in render_boxes sink.
2. Define the model we want to use with the model_id argument.
3. Pass render_boxes as the on_prediction callback. This sink takes in the predictions from the model and the frame they belong to, then draws the predicted bounding boxes on the frame and displays it.
4. Use InferencePipeline.init() to access the webcam and run our model, then start the pipeline with pipeline.start() and wait for it to finish with pipeline.join().

The render_boxes callback is run on each frame retrieved from the webcam stream.

In the webcam code, replace the YOUR_ROBOFLOW_API_KEY value with your Roboflow API key.
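If the built-in sink does not do what you need, you can pass your own function as the on_prediction callback. The sketch below is one possible custom callback, not the only way to write one. It assumes the predictions arrive as a dictionary with a "predictions" list whose items carry a "confidence" value, and that the frame object exposes the image as video_frame.image; the names filter_by_confidence and render are hypothetical and chosen for illustration.

```python
from typing import Any, Dict, List


def filter_by_confidence(predictions: Dict[str, Any], threshold: float = 0.5) -> List[Dict[str, Any]]:
    """Keep only detections whose confidence is at or above the threshold."""
    return [
        p for p in predictions.get("predictions", [])
        if p.get("confidence", 0.0) >= threshold
    ]


def render(predictions: Dict[str, Any], video_frame: Any) -> None:
    """Hypothetical custom sink: draw confident detections and show the frame."""
    # Imported here so the filtering helper above works without these packages.
    import cv2
    import supervision as sv

    # Copy the response, swapping in the filtered list of detections.
    kept = dict(predictions, predictions=filter_by_confidence(predictions))
    detections = sv.Detections.from_inference(kept)

    # Draw boxes on a copy so the original frame is left untouched.
    annotated = sv.BoxAnnotator().annotate(
        scene=video_frame.image.copy(), detections=detections
    )
    cv2.imshow("Predictions", annotated)
    cv2.waitKey(1)  # gives OpenCV a moment to refresh the window
```

To use it, pass on_prediction=render to InferencePipeline.init() in place of render_boxes.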

Test Webcam

Now that you have configured a model and webcam stream, you are ready to test your webcam.
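To test, save the pipeline code to a Python file and run it; a window showing the webcam feed with predicted boxes should open. The sketch below wraps the same pipeline in a function so you can switch between cameras or other sources. It assumes your API key is stored in an environment variable named ROBOFLOW_API_KEY; parse_video_reference and run_pipeline are hypothetical helper names.

```python
import os
from typing import Union


def parse_video_reference(raw: str) -> Union[int, str]:
    """Turn text into a video reference: digits become a device id,
    anything else (a file path or RTSP URL) is returned unchanged."""
    raw = raw.strip()
    return int(raw) if raw.isdecimal() else raw


def run_pipeline(video_reference: Union[int, str] = 0) -> None:
    """Run YOLO11 on the given source until the stream ends."""
    # Imported here so the parsing helper can be used without Inference installed.
    from inference import InferencePipeline
    from inference.core.interfaces.stream.sinks import render_boxes

    pipeline = InferencePipeline.init(
        model_id="yolo11s-640",
        video_reference=video_reference,
        on_prediction=render_boxes,
        api_key=os.environ.get("ROBOFLOW_API_KEY"),  # assumed environment variable
    )
    pipeline.start()
    pipeline.join()


# Example: run_pipeline(parse_video_reference("0")) opens the default webcam.
```

If no window appears, try a different device id (for example 1), since 0 is only usually the built-in webcam.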

Next Steps

supervision provides an extensive range of functionalities for working with computer vision models. With supervision, you can:

1. Process and filter detections and segmentation masks from a range of popular models (YOLOv5, Ultralytics YOLOv8, MMDetection, and more).
2. Display predictions (e.g. bounding boxes, segmentation masks).
3. Annotate images (e.g. trace predictions, draw heatmaps).
4. Compute confusion matrices.

And more! To learn about the full range of functionality in supervision, check out the supervision documentation.