Use the widget below to experiment with Moondream 2. You can detect COCO classes such as people, vehicles, animals, and household items.
Moondream 2 is the latest model in the Moondream series of “tiny vision language models”. The model, developed by vikhyat, is trained to perform a wide range of tasks, from VQA to image captioning to object detection and identifying the x-y coordinates of regions in an image. The model is licensed under an Apache 2.0 license.
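Point-style outputs from vision language models like Moondream are commonly returned as normalized coordinates in the 0–1 range, which you then scale to your image size. Below is a minimal sketch of that conversion; the `{"x": ..., "y": ...}` dict format is an assumption for illustration, so check the model's actual return value before relying on it:

```python
def to_pixel_points(points, image_width, image_height):
    """Convert normalized {'x', 'y'} points (0-1 range) to integer pixel coords.

    The dict keys are an assumed output format, not a documented contract.
    """
    return [
        (round(p["x"] * image_width), round(p["y"] * image_height))
        for p in points
    ]

# Example: two normalized points scaled onto a 640x480 image
print(to_pixel_points([{"x": 0.5, "y": 0.5}, {"x": 0.25, "y": 0.1}], 640, 480))
# → [(320, 240), (160, 48)]
```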
Moondream 2 can run on both CPUs and GPUs. You can run the model with the moondream Python package or through the Hugging Face Transformers Python package. According to the project repository, the moondream Python package does not support GPUs at the time of writing, although this may change in the future.
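For CPU inference with the moondream package, usage looks roughly like the sketch below. The `md.vl()` entry point and the quantized `.mf` weights file name are assumptions based on the package's README; verify both against the project repository before use.

```python
# pip install moondream
# Assumes a quantized Moondream weights file (.mf) has already been
# downloaded locally; the filename below is a placeholder.
import moondream as md
from PIL import Image

model = md.vl(model="moondream-0.5b-int8.mf")  # CPU-only at time of writing
image = Image.open("example.jpg")
print(model.caption(image)["caption"])
```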
The Moondream Transformers implementation has four modes of inference: image captioning, visual question answering (VQA), object detection, and pointing (returning x-y coordinates for a query).
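A sketch of invoking these modes through Transformers is below. The `caption`, `query`, `detect`, and `point` method names and their return keys reflect the Moondream repository at the time of writing; check the current model card before relying on them.

```python
from PIL import Image
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2",
    trust_remote_code=True,  # Moondream ships custom inference code
)

image = Image.open("example.jpg")

# 1. Image captioning
print(model.caption(image, length="short")["caption"])

# 2. Visual question answering (VQA)
print(model.query(image, "How many people are in this image?")["answer"])

# 3. Object detection: bounding boxes for the named class
print(model.detect(image, "person")["objects"])

# 4. Pointing: x-y coordinates for the named class
print(model.point(image, "person")["points"])
```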
Here is how Moondream performs when evaluated on various qualitative tests:
You can use Roboflow Inference to deploy a Moondream 2 API on your hardware. You can deploy the model on CPU devices (e.g. Raspberry Pi, AI PCs) and GPU devices (e.g. NVIDIA Jetson, NVIDIA T4).
Below are instructions on how to deploy your own model API.