Llama 3.2 Vision, a multimodal LLM developed by Meta AI, can now be used in Roboflow Workflows.
You can use the model to ask questions about the contents of images and retrieve a text response. For example, you could use the block to:

- ask questions about the contents of an image,
- generate captions for images, or
- read text from an image (OCR).
This response can then be returned by your Workflow, or processed further by other blocks (e.g., the Expression block).
Note: The Llama 3.2 Vision block is configured to use OpenRouter for inference, so you will need an OpenRouter API key to use it.
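To give a sense of how this block fits into a Workflow, here is a minimal sketch of a Workflow definition. The block type identifier (`roboflow_core/llama_3_2_vision@v1`), the field names (`images`, `prompt`, `api_key`), and the output selector are illustrative assumptions rather than confirmed values; check the block's reference page in the Roboflow Workflows documentation for the exact schema.

```json
{
  "version": "1.0",
  "inputs": [
    { "type": "WorkflowImage", "name": "image" },
    { "type": "WorkflowParameter", "name": "open_router_api_key" }
  ],
  "steps": [
    {
      "type": "roboflow_core/llama_3_2_vision@v1",
      "name": "llama_vision",
      "images": "$inputs.image",
      "prompt": "What objects are visible in this image?",
      "api_key": "$inputs.open_router_api_key"
    }
  ],
  "outputs": [
    {
      "type": "JsonField",
      "name": "answer",
      "selector": "$steps.llama_vision.output"
    }
  ]
}
```

In this sketch, the image and the OpenRouter API key are passed in as Workflow inputs, the model's text response is exposed as the `answer` output, and that output could instead be wired into a downstream block (such as an Expression block) for further processing.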