- molmo
- molmo2
---
<img src="molmo_2_logo_RGB.png" alt="Logo for the Molmo2 Project" style="width: auto; height: 50px;">

# Molmo2 8B

Molmo2 is a family of open vision-language models developed by the Allen Institute for AI (Ai2). Molmo2 models are trained on Molmo2 data, a dataset of highly curated video-text pairs, and achieve state-of-the-art performance among multimodal models of similar size while being fully open-source. You can find all models in the Molmo2 family [here](https://huggingface.co/collections/allenai/molmo2).

**Learn more** about the Molmo2 family [in our announcement blog post](http://allenai.org/news/molmo2).

Molmo2 8B is based on Qwen3-8B and uses SigLIP 2 as its vision backbone. It outperforms other open-weight, open-data models of its class on short videos, counting, and captioning, and is competitive on long videos. On video grounding, Molmo2 outperforms larger proprietary models, including 32.9% (Molmo2) vs. 17% (Gemini 2.5 Pro) on video pointing.
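As a quick-start sketch, loading the model with Hugging Face `transformers` might look like the following. This assumes Molmo2 follows the same `AutoProcessor`/`AutoModelForCausalLM` pattern with `trust_remote_code` that the first Molmo release used; the repo id `allenai/Molmo2-8B` is a placeholder guess, so check this model card for the published id and exact inference API.

```python
# Minimal loading sketch for Molmo2 8B (assumptions noted below).
# trust_remote_code=True is assumed to be required because Molmo models
# ship custom modeling code; the repo id here is hypothetical.
from transformers import AutoModelForCausalLM, AutoProcessor

REPO_ID = "allenai/Molmo2-8B"  # hypothetical repo id


def load_molmo2(repo_id: str = REPO_ID):
    """Load the processor and model for Molmo2 inference."""
    processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        trust_remote_code=True,
        device_map="auto",  # place weights across available devices
    )
    return processor, model


if __name__ == "__main__":
    # Downloads ~8B parameters of weights; run only with GPU/disk to spare.
    processor, model = load_molmo2()
```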

[Try it here!](https://playground.allenai.org/?model=molmo2-8b)

Ai2 is committed to open science. All artifacts used in creating Molmo2 (the Molmo2 dataset, training code, evaluations, and intermediate checkpoints) will be made available at a later date, furthering our commitment to open-source AI development and reproducibility.

Quick links:
- 💬 [Demo](https://playground.allenai.org/?model=molmo2-8b)
- 📂 [All Models](https://huggingface.co/collections/allenai/molmo2)
- 📃 [Paper](UPDATE THIS LINK)
- 🎥 [Blog with Videos](http://allenai.org/news/molmo2)