In January 2021 OpenAI released CLIP (Contrastive Language-Image Pre-Training), a zero-shot classifier that leverages knowledge of the English language to classify images without having to be trained on any specific dataset. It applies the recent advancements in large-scale transformers like GPT-3 to the vision arena.
The results are extremely impressive; we have put together a CLIP tutorial and a CLIP Colab notebook for you to experiment with the model on your own images. We've made slight modifications to make "prompt" engineering easier by extracting it into a configuration file and have automatically generated starter prompts for all of our public datasets. You can use Roboflow to generate this config file to try your own classification or object detection datasets with CLIP.
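Under the hood, CLIP's zero-shot classification reduces to embedding the image and each candidate text prompt into a shared space, then taking a softmax over the (scaled) cosine similarities. The sketch below illustrates just that scoring step; the embeddings and the `zero_shot_classify` helper are made-up stand-ins for CLIP's real image and text encoders, not the library's API.

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, labels, temperature=100.0):
    """Score an image embedding against one text embedding per label.

    CLIP normalizes both embeddings to unit length, so a dot product
    is a cosine similarity; a softmax over scaled similarities yields
    per-label probabilities.
    """
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = temperature * txt @ img            # one similarity per label
    probs = np.exp(logits - logits.max())       # numerically stable softmax
    probs /= probs.sum()
    return dict(zip(labels, probs))

# Toy embeddings standing in for CLIP encoder outputs.
labels = ["hard hat", "no hard hat"]
image_emb = np.array([0.9, 0.1, 0.2])
text_embs = np.array([[1.0, 0.0, 0.1],    # embedding of the "hard hat" prompt
                      [0.0, 1.0, 0.3]])   # embedding of the "no hard hat" prompt
scores = zero_shot_classify(image_emb, text_embs, labels)
print(max(scores, key=scores.get))
```

Because the classes are defined only by their text prompts, swapping in a new label set requires no retraining, which is what makes prompt engineering (and a prompt configuration file) worthwhile.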
With Roboflow, you can deploy a computer vision model without having to build your own infrastructure.
Below, we show how to convert data to and from the OpenAI CLIP Classification format. We also list popular models that use the OpenAI CLIP Classification data format. Our conversion tools are free to use.
Free data conversion
SOC 2 Type 2 Compliant
Trusted by 250,000+ developers
The OpenAI CLIP models all use the OpenAI CLIP Classification data format.
Example pictures from the Hard Hat dataset, showing individuals and groups of workers with and without hard hats.