Llama 3.2 Vision, a multimodal LLM developed by Meta AI, can now be used in Roboflow Workflows.
You can use the model to ask questions about the contents of images and retrieve a text response. For example, you could use the block to:

- ask questions about the contents of an image,
- generate captions for images, or
- read text from an image (OCR).
This response can then be returned by your Workflow, or processed further by other blocks (e.g., the Expression block).
Note: The Llama 3.2 Vision block is configured to use OpenRouter for inference, so you will need an OpenRouter API key to use it.
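To give a sense of how this block fits into a Workflow, here is a minimal sketch of a Workflow definition. The block type identifier (`roboflow_core/llama_3_2_vision@v1`), the field names (`images`, `prompt`, `api_key`), and the output selector are illustrative assumptions rather than confirmed values; check the block's reference page in the Roboflow Workflows documentation for the exact schema.

```json
{
  "version": "1.0",
  "inputs": [
    { "type": "WorkflowImage", "name": "image" },
    { "type": "WorkflowParameter", "name": "open_router_api_key" }
  ],
  "steps": [
    {
      "type": "roboflow_core/llama_3_2_vision@v1",
      "name": "llama_vision",
      "images": "$inputs.image",
      "prompt": "What objects are visible in this image?",
      "api_key": "$inputs.open_router_api_key"
    }
  ],
  "outputs": [
    {
      "type": "JsonField",
      "name": "answer",
      "selector": "$steps.llama_vision.output"
    }
  ]
}
```

In this sketch, the image and the OpenRouter API key are passed in as Workflow inputs, the model's text response is exposed as the `answer` output, and that output could instead be wired into a downstream block (such as an Expression block) for further processing.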