Moondream is an open-source vision-language model that combines strong image understanding with a remarkably small footprint, enabling low-latency multimodal inference.
- Original paper / reference: Moondream project repository — https://github.com/vikhyat/moondream
Moondream2-2B
This model implements Moondream2-2B, optimized for efficient multimodal reasoning while maintaining a small compute footprint. It is well suited for applications such as visual question answering, image captioning, document understanding, and real-time multimodal assistants on edge or resource-constrained devices.
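For reference, below is a minimal visual question answering sketch using the upstream Moondream2 weights through Hugging Face transformers. The model id `vikhyatk/moondream2` and the `encode_image` / `answer_question` helpers are taken from the reference implementation linked above; the exact entry points can differ between model revisions, so check the repository for the current API.

```python
# Minimal VQA sketch with the upstream Moondream2 reference weights.
# Assumes the Hugging Face model id "vikhyatk/moondream2" and the
# encode_image / answer_question helpers from the reference implementation.
from transformers import AutoModelForCausalLM, AutoTokenizer
from PIL import Image

model_id = "vikhyatk/moondream2"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = Image.open("example.jpg")        # any RGB input image
encoded = model.encode_image(image)      # vision encoder forward pass

# Ask a question about the encoded image
print(model.answer_question(encoded, "What is in this image?", tokenizer))
```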
Model Configuration:
- Reference implementation: moondream
- Original weights: moondream2
- Resolution: 3x378x378 (see the preprocessing sketch below)
- Supported Cooper versions:
  - Cooper SDK: [2.5.3]
  - Cooper Foundry: [2.2]
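The resolution entry corresponds to a channels-first RGB tensor at 378x378 pixels. The sketch below is illustrative only; the resize strategy and normalization constants are assumptions, not values taken from the reference implementation or the Cooper toolchain.

```python
# Illustrative preprocessing to the 3x378x378 input shape (CHW, RGB).
# The normalization constants below are assumptions for illustration.
import numpy as np
from PIL import Image

def preprocess(path: str) -> np.ndarray:
    img = Image.open(path).convert("RGB").resize((378, 378))
    arr = np.asarray(img, dtype=np.float32) / 255.0   # HWC, values in [0, 1]
    arr = (arr - 0.5) / 0.5                           # assumed normalization
    return arr.transpose(2, 0, 1)                     # HWC -> CHW (3, 378, 378)

x = preprocess("example.jpg")
assert x.shape == (3, 378, 378)
```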
| Model | Device | Model Link |
|---|---|---|
| Moondream2-2B | N1-655 | Model_Link |
| Moondream2-2B | CV7 | Model_Link |
| Moondream2-2B | CV72 | Model_Link |
