LLaVA vs. Kosmos-2: Compared and Contrasted

Models

LLaVA-1.5

LLaVA is an open source multimodal language model that you can use for visual question answering. It also has limited support for object detection.


Kosmos-2

Kosmos-2 is a multimodal language model capable of object detection and grounding text in images.


Model Type: Object Detection

GitHub Stars: 16,000

License: Apache-2.0

Compare LLaVA-1.5 and Kosmos-2 with Autodistill

Using Autodistill, you can compare LLaVA and Kosmos-2 on your own images in a few lines of code.


To start a comparison, first install the required dependencies:


pip install autodistill autodistill-llava autodistill-kosmos-2

Next, create a new Python file and add the following code:


from autodistill_kosmos_2 import Kosmos2
from autodistill_llava import LLaVA

from autodistill.detection import CaptionOntology
from autodistill.utils import compare

ontology = CaptionOntology(
    {
        "solar panel": "solar panel",
    }
)

models = [
    Kosmos2(ontology=ontology),
    LLaVA(ontology=ontology)
]

images = [
    "/home/user/autodistill/solarpanel1.jpg",
    "/home/user/autodistill/solarpanel2.jpg"
]

compare(
    models=models,
    images=images
)

Above, replace the paths in the `images` list with the paths to the images you want to use.

The image paths must be absolute.
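If your images sit in a single folder, you can build the list of absolute paths instead of typing each one. The sketch below uses only the Python standard library; the helper name `absolute_image_paths` and the default extension list are illustrative assumptions, not part of Autodistill.

```python
from pathlib import Path


def absolute_image_paths(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted absolute paths of the image files directly inside `folder`.

    Hypothetical helper for illustration; it is not part of Autodistill.
    """
    # resolve() turns a relative folder such as "./images" into an absolute path
    root = Path(folder).expanduser().resolve()
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )
```

You could then write `images = absolute_image_paths("./images")` in place of the hard-coded list.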

Then, run the script.

You should see a side-by-side comparison of each model's predictions on your images.
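To complement the visual comparison, a rough numeric summary can help when you test many images. The sketch below counts how many detections each model returns per image. It assumes each model exposes a `predict(image_path)` method whose result supports `len()`, which is expected to hold for Autodistill detection models but is worth checking for your installed versions. The function name `count_detections` is hypothetical.

```python
def count_detections(models, images):
    """Return {model class name: {image path: number of detections}}.

    Assumes every model has a predict(image_path) method and that the
    returned object supports len(). Models that share a class name would
    overwrite each other's entry in the summary.
    """
    summary = {}
    for model in models:
        name = type(model).__name__
        summary[name] = {image: len(model.predict(image)) for image in images}
    return summary
```

With the script above, `count_detections(models, images)` would return one entry per model class, keyed by image path. A higher count is not necessarily better, so read the numbers alongside the visual output.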

When you have chosen the model that works best for your use case, you can auto-label a folder of images using the following code:


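# Pick the model that performed best in the comparison, e.g. the first entry in `models`
base_model = models[0]

# Auto-label every .jpg image in ./images and save the dataset to ./dataset
base_model.label(
    input_folder="./images",
    output_folder="./dataset",
    extension=".jpg"
)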
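After labeling, it can be useful to confirm that files were actually written to the output folder. The sketch below counts files by extension using only the standard library. It makes no assumption about how Autodistill lays out the dataset, and the helper name `summarize_folder` is hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(folder):
    """Count the files under `folder` (recursively), grouped by lowercase extension."""
    return Counter(
        path.suffix.lower()
        for path in Path(folder).rglob("*")
        if path.is_file()
    )
```

For example, `summarize_folder("./dataset")` returns a `Counter` that maps each file extension to the number of files found, so an empty result suggests that labeling did not produce any output.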

