Update README.md
README.md
CHANGED
@@ -1,10 +1,28 @@
 ---
 title: README
-emoji:
-colorFrom:
+emoji: π
+colorFrom: red
 colorTo: red
 sdk: static
 pinned: false
+license: apache-2.0
 ---
 
-
+🔥🔥 **Introducing MMR1**, a Multimodal Reasoning Model trained with **Variance-Aware Sampling (VAS)**
+
+💡 **Highlights**
+* **Variance-Aware Sampling (VAS)** for multimodal RL training:
+  - Establishes a theoretical link between reward variance and gradient signal strength;
+  - Proposes the **Variance Promotion Score (VPS)** integrating Outcome Variance and Trajectory Diversity;
+  - Enables more efficient and stable optimization under limited data conditions.
+* Open-sources **~1.6M Long-CoT cold-start samples**, annotated by Gemini 2.5 Pro/Flash and verified with GPT-4o.
+* Releases a suite of **SFT and RL checkpoints** at multiple scales: 3B, 7B, and 32B variants.
+
+📦 **Resources**
+* Paper: [MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources](https://huggingface.co/papers/2509.21268)
+* Model Checkpoints (SFT & RL):
+  - [MMR1-3B-SFT](https://huggingface.co/MMR1/MMR1-3B-SFT) | [MMR1-3B-RL](https://huggingface.co/MMR1/MMR1-3B-RL)
+  - [MMR1-7B-SFT](https://huggingface.co/MMR1/MMR1-7B-SFT) | [MMR1-7B-RL](https://huggingface.co/MMR1/MMR1-7B-RL)
+  - [MMR1-32B-SFT](https://huggingface.co/MMR1/MMR1-32B-SFT) | **MMR1-32B-RL coming soon!**
+* Datasets: [MMR1-SFT](https://huggingface.co/datasets/MMR1/MMR1-SFT), [MMR1-RL](https://huggingface.co/datasets/MMR1/MMR1-RL)
+* 💻 Code: [GitHub - MMR1](https://github.com/LengSicong/MMR1)
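
The README names the two ingredients of the Variance Promotion Score (Outcome Variance and Trajectory Diversity) but gives no formulas. The sketch below is only an illustration of how such a per-prompt score could be computed, not the authors' method. It assumes binary correctness rewards, uses Jaccard distance between token sets as a stand-in diversity measure, and combines the two terms with a hypothetical weight `alpha`. All function names are invented for this example.

```python
from itertools import combinations
from statistics import mean


def outcome_variance(rewards):
    """Bernoulli variance p * (1 - p) of binary rewards (assumed reward form)."""
    p = mean(rewards)
    return p * (1.0 - p)


def trajectory_diversity(responses):
    """Mean pairwise Jaccard distance between token sets (assumed measure)."""
    token_sets = [set(r.split()) for r in responses]
    distances = []
    for a, b in combinations(token_sets, 2):
        union = a | b
        # Two empty responses are treated as identical (distance 0).
        distances.append(1.0 - len(a & b) / len(union) if union else 0.0)
    return mean(distances) if distances else 0.0


def variance_promotion_score(rewards, responses, alpha=0.5):
    """Hypothetical weighted sum of the two terms; alpha is an assumption."""
    return (alpha * outcome_variance(rewards)
            + (1.0 - alpha) * trajectory_diversity(responses))
```

Under these assumptions, a prompt whose rollouts are all correct and all identical scores 0, while a prompt with mixed outcomes and varied responses scores higher, so a sampler that weights prompts by this score would favor the latter.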