Qwen-VL is an LMM developed by Alibaba Cloud. Qwen-VL accepts images, text, and bounding boxes as inputs. The model can output text and bounding boxes. Qwen-VL naturally supports English, Chinese, and multilingual conversation.
Overview
Qwen-VL is an LMM developed by Alibaba Cloud. Qwen-VL accepts images, text, and bounding boxes as inputs. The model can output text and bounding boxes. Qwen-VL naturally supports English, Chinese, and multilingual conversation. Thus, this model may be worth exploring if you have a use case where you expect Chinese and English to be used in prompts or answers.
Performance
Use This Model
Label Data Automatically with QwenVL
You can automatically label a dataset using QwenVL with help from Autodistill, an open source package for training computer vision models. You can label a folder of images automatically with only a few lines of code. Below, see our tutorials that demonstrate how to use QwenVL to train a computer vision model.
No items found.
Deploy to Production
Roboflow offers a range of SDKs with which you can deploy your model to production.
YOLOv8 uses the uses the YOLOv8 PyTorch TXT annotation format. If your annotation is in a different format, you can use Roboflow's annotation conversion tools to get your data into the right format.