VLPart vs. LLaVA: Compared and Contrasted

Models

VLPart

VLPart, developed by Meta Research, is an open-vocabulary object detection and segmentation model.

LLaVA-1.5

LLaVA is an open-source multimodal language model that you can use for visual question answering. It has limited support for object detection.


Model Type: Object Detection
Annotation Format: Instance Segmentation
GitHub Stars: 16,000
License: MIT License (VLPart), Apache-2.0 (LLaVA-1.5)

Compare VLPart and LLaVA-1.5 with Autodistill

Using Autodistill, you can compare VLPart and LLaVA on your own images in a few lines of code.


To start a comparison, first install the required dependencies:


pip install autodistill autodistill-vlpart autodistill-llava
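Before writing the comparison script, it can help to confirm that the packages are importable. The following is a minimal sketch, assuming the import names used by the comparison script on this page (`autodistill`, `autodistill_vlpart`, `autodistill_llava`); the helper `missing_modules` is a hypothetical name introduced here for illustration, not part of Autodistill.

```python
from importlib.util import find_spec

# Import names taken from the comparison script on this page.
# Adjust the list if your environment uses different packages.
REQUIRED_MODULES = ["autodistill", "autodistill_vlpart", "autodistill_llava"]


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If any module is reported missing, re-run the `pip install` command above in the same environment you use to run the script.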

Next, create a new Python file and add the following code:


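from autodistill_vlpart import VLPart
from autodistill_llava import LLaVA

from autodistill.detection import CaptionOntology
from autodistill.utils import compare

# Map each text prompt (key) to the class label to assign (value).
ontology = CaptionOntology(
    {
        "solar panel": "solar panel",
    }
)

# Both models share the same ontology so their results are comparable.
models = [
    VLPart(ontology=ontology),
    LLaVA(ontology=ontology),
]

# Absolute paths to the images to compare.
images = [
    "/home/user/autodistill/solarpanel1.jpg",
    "/home/user/autodistill/solarpanel2.jpg",
]

# Run every model on every image.
compare(
    models=models,
    images=images,
)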

Above, replace the entries in the `images` list with the paths to the images you want to compare.

Each path must be absolute.
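Because the script expects absolute paths, one way to build the list is to resolve every matching file in a folder rather than typing each path by hand. This is a minimal sketch using only the standard library; `collect_image_paths` and the `./images` folder are hypothetical names chosen for illustration.

```python
from pathlib import Path


def collect_image_paths(folder, extension=".jpg"):
    """Return sorted absolute paths of files in `folder` with the given suffix.

    Hypothetical helper: the suffix match is case-insensitive and
    subfolders are not searched.
    """
    root = Path(folder).expanduser().resolve()
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() == extension.lower()
    )


if __name__ == "__main__":
    # "./images" is a placeholder; point this at your own folder.
    folder = Path("./images")
    images = collect_image_paths(folder) if folder.is_dir() else []
    print(images)
```

The resulting list can be passed as the `images` argument in the comparison script.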

Then, run the script.

You should then see a comparison of each model's output on your images.

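When you have chosen the model that works best for your use case, you can auto-label a folder of images using the following code, where `base_model` is an instance of the model you chose:


# Use whichever model performed best in the comparison, for example VLPart.
base_model = VLPart(ontology=ontology)

# Label the .jpg files in ./images and save the results to ./dataset.
base_model.label(
    input_folder="./images",
    output_folder="./dataset",
    extension=".jpg"
)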
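The `extension` argument suggests that only files with that suffix are labeled; that is an assumption about how `label` selects files, not something confirmed on this page. Under that assumption, a quick count of the suffixes in the input folder shows which value to pass and whether any images would be skipped. The helper `count_extensions` is a hypothetical name, and the sketch uses only the standard library.

```python
from collections import Counter
from pathlib import Path


def count_extensions(folder):
    """Count files directly inside `folder` by lowercase suffix (hypothetical helper)."""
    return Counter(
        path.suffix.lower()
        for path in Path(folder).iterdir()
        if path.is_file()
    )


if __name__ == "__main__":
    # "./images" mirrors the input_folder used in the labeling call above.
    folder = Path("./images")
    if folder.is_dir():
        for suffix, count in count_extensions(folder).most_common():
            print(f"{suffix or '(no extension)'}: {count}")
    else:
        print(f"{folder} does not exist")
```

If the folder mixes suffixes such as `.jpg` and `.png`, you may need to convert the files or run the labeling call once per extension.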

