Try the Model

Use the widget below to experiment with GPT-4 with Vision. You can detect COCO classes such as people, vehicles, animals, household items.

Overview

In October 2023, OpenAI released an API for GPT-4 with vision, an extension to GPT-4 that enables you to ask questions about images. GPT-4 is now capable of performing tasks such as image classification, visual question answering, handwriting OCR, document OCR, and more. The GPT-4 with vision API opens up a new world of possibilities in building computer vision applications. Read our analysis of GPT-4 Vision’s capabilities.

The capabilities of GPT-4 are enhanced when matched with Roboflow’s object detection, classification, and segmentation models, as well as foundation models available through Roboflow Inference, an open source inference server that powers millions of inferences a month on production models.

GPT-4 with Vision License

GPT-4 with Vision

is licensed under a

license.

Performance

Deploy a GPT-4 with Vision API

You can use Roboflow Inference to deploy a

GPT-4 with Vision

API on your hardware. You can deploy the model on CPU (i.e. Raspberry Pi, AI PCs) and GPU devices (i.e. NVIDIA Jetson, NVIDIA T4).

Below are instructions on how to deploy your own model API.

Label Data Automatically with GPT-4 with Vision

You can automatically label a dataset using GPT-4 with Vision with help from Autodistill, an open source package for training computer vision models. You can label a folder of images automatically with only a few lines of code. Below, see our tutorials that demonstrate how to use GPT-4 with Vision to train a computer vision model.

No items found.