Use LMM to Build Computer Vision Pipelines and Applications

Workflows lets you integrate an LMM with other models, logic, and applications.

Connect LMM to other blocks to build a custom workflow

Ask a question to a Large Multimodal Model (LMM) with an image and text. You can specify arbitrary text prompts to an LMMBlock.

The LMMBlock supports two LMMs:

- OpenAI's GPT-4 with Vision, and
- CogVLM.

You need to provide your OpenAI API key to use the GPT-4 with Vision model. You do not need to provide an API key to use CogVLM.

_If you want to classify an image into one or more categories, we recommend using the dedicated LMMForClassificationBlock._
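As a rough sketch, a Workflow that sends an image and a prompt to the LMM block can be expressed as a declarative definition. The exact field names below (`lmm_type`, `remote_api_key`, the `gpt_4v` value, and the `raw_output` selector) are assumptions based on common Workflows conventions; consult the block reference for the authoritative schema.

```python
# Hypothetical Workflow definition using the LMM block.
# Inputs: an image and an OpenAI API key (only needed for GPT-4 with Vision;
# CogVLM requires no API key, per the block description above).
workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "InferenceImage", "name": "image"},
        {"type": "InferenceParameter", "name": "open_ai_key"},
    ],
    "steps": [
        {
            "type": "LMM",                      # the LMM block
            "name": "step_1",
            "images": "$inputs.image",          # wire the input image into the block
            "prompt": "Describe the objects visible in this image.",
            "lmm_type": "gpt_4v",               # assumed value; swap for CogVLM to skip the key
            "remote_api_key": "$inputs.open_ai_key",
        }
    ],
    "outputs": [
        # Expose the model's raw text response as the workflow output.
        {"type": "JsonField", "name": "result", "selector": "$steps.step_1.raw_output"}
    ],
}
```

Because the definition is plain data, it can be stored, versioned, and passed to whichever execution engine runs the workflow.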
Model

LMM

Run a large multimodal model.

Build a Custom Vision AI Application

Connect pre-trained models, open source models, LLM APIs, advanced logic, and external applications. Deploy as an API endpoint, on-prem, or at the edge.
Get Started

Find Other Blocks in the Model Category



Customize Your Pipeline

Connect models from OpenAI or Meta AI, applications like Slack or PagerDuty, and logic like filtering or cropping.
View All Blocks