Roboflow Inference: an open-source, scalable way to run models on-device, with or without an internet connection.
Run inference on-device without the headache of managing environments, dependencies, CUDA versions, and more.
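As a minimal sketch of that on-device flow, assuming the open-source inference Python package and a public model alias such as yolov8n-640 (swap in your own model ID and image):

Python
from inference import get_model

# Load a model locally; weights are downloaded and cached on first use
# (an API key may be required for your own fine-tuned models).
model = get_model(model_id="yolov8n-640")

# Run inference on a local image file (an image URL also works).
results = model.infer("YOUR_IMAGE.jpg")
print(results)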
HTTP interfaces for foundation models, like CLIP and SAM, which you can use directly in your application or as part of multi-stage inference processes.
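For example, a rough sketch of calling a locally running inference server's CLIP interface through the inference_sdk client; the embedding helper name and the local URL/port are assumptions and may differ by version:

Python
from inference_sdk import InferenceHTTPClient

# Assumes an inference server is already running locally on port 9001.
client = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="YOUR_API_KEY"
)

# Embed an image with CLIP over the server's HTTP interface
# (helper name assumed; check your inference_sdk version).
embedding = client.get_clip_image_embeddings(inference_input="YOUR_IMAGE.jpg")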
Advanced inference features, including automatic batching, multi-model containers, multithreading, and DMZ deployments.
Combine custom models, open-source models, LLM APIs, pre-built logic, and external applications.
ARM CPU
x86 CPU
Luxonis OAK
NVIDIA GPU
NVIDIA TRT
NVIDIA Jetson
Raspberry Pi
Python
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="https://serverless.roboflow.com",  # or a local inference server URL
    api_key="YOUR_API_KEY"
)

result = client.run_workflow(
    workspace_name="suvjg",
    workflow_id="abc",
    images={
        "image": "YOUR_IMAGE.jpg"
    }
)
Deploy on fully managed infrastructure with an API endpoint, or on-device where an internet connection is optional.
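The same client code can target either option; a sketch assuming the hosted serverless endpoint and a local server on the default port 9001 (URLs are illustrative):

Python
from inference_sdk import InferenceHTTPClient

# Fully managed: point the client at Roboflow's hosted endpoint.
hosted = InferenceHTTPClient(
    api_url="https://serverless.roboflow.com",
    api_key="YOUR_API_KEY"
)

# On-device: point the same client at a local inference server,
# which can run without an internet connection once models are cached.
local = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="YOUR_API_KEY"
)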
Model Monitoring
Insights into how your deployed vision models are performing
Monitor which devices are online or offline, track inference volume per device, confidence metrics for each model, and inference times, and review individual prediction results.