Connect LMM to other blocks to build a custom workflow
Ask a question to a Large Multimodal Model (LMM) with an image and text.
You can pass arbitrary text prompts to an LMMBlock.
The LMMBlock supports two LMMs:
- OpenAI's GPT-4 with Vision, and
- CogVLM.
You need to provide your OpenAI API key to use GPT-4 with Vision. You do not
need an API key to use CogVLM.
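As a rough illustration, an LMM step in a workflow definition might look like the sketch below. The field names (`type`, `name`, `images`, `prompt`, `lmm_type`, `remote_api_key`) and the placeholder selectors are assumptions for illustration only; check the block's schema in your installation before relying on them.

```python
import json

# Hypothetical LMM step in a workflow definition. Field names and
# selector syntax ("$inputs.…") are assumptions, not a confirmed schema.
lmm_step = {
    "type": "LMM",
    "name": "step_1",
    "images": "$inputs.image",            # image supplied at runtime
    "prompt": "What is in this picture?", # arbitrary text prompt
    "lmm_type": "gpt_4v",                 # or "cog_vlm" (no API key needed)
    "remote_api_key": "$inputs.open_ai_api_key",
}

print(json.dumps(lmm_step, indent=2))
```

Switching `lmm_type` to `"cog_vlm"` would let you drop the `remote_api_key` field, since CogVLM requires no API key.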
_If you want to classify an image into one or more categories, we recommend using the
dedicated LMMForClassificationBlock._