26hzhang committed on
Commit 30debb6 · verified · 1 Parent(s): a66bec0

Update README.md

Files changed (1)
  1. README.md +21 -3
README.md CHANGED
@@ -1,10 +1,28 @@
  ---
  title: README
- emoji: 📉
- colorFrom: green
+ emoji: 🚀
+ colorFrom: red
  colorTo: red
  sdk: static
  pinned: false
+ license: apache-2.0
  ---
 
- Edit this `README.md` markdown file to author your organization card.
+ 🔥🔥 **Introducing MMR1**, a Multimodal Reasoning Model trained with **Variance-Aware Sampling (VAS)**
+
+ 💡 **Highlights**
+ * **Variance-Aware Sampling (VAS)** for multimodal RL training:
+   - Establishes a theoretical link between reward variance and gradient signal strength;
+   - Proposes the **Variance Promotion Score (VPS)** integrating Outcome Variance and Trajectory Diversity;
+   - Enables more efficient and stable optimization under limited data conditions.
+ * Open-sources **~1.6M Long-CoT cold-start samples**, annotated by Gemini 2.5 Pro/Flash and verified with GPT-4o.
+ * Releases a suite of **SFT and RL checkpoints** at multiple scales: 3B, 7B, and 32B variants.
+
+ 📦 **Resources**
+ * 📄 Paper: [MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources](https://huggingface.co/papers/2509.21268)
+ * 🚀 Model Checkpoints (SFT & RL):
+   - [MMR1-3B-SFT](https://huggingface.co/MMR1/MMR1-3B-SFT) | [MMR1-3B-RL](https://huggingface.co/MMR1/MMR1-3B-RL)
+   - [MMR1-7B-SFT](https://huggingface.co/MMR1/MMR1-7B-SFT) | [MMR1-7B-RL](https://huggingface.co/MMR1/MMR1-7B-RL)
+   - [MMR1-32B-SFT](https://huggingface.co/MMR1/MMR1-32B-SFT) | **MMR1-32B-RL coming soon!**
+ * 📊 Datasets: [MMR1-SFT](https://huggingface.co/datasets/MMR1/MMR1-SFT), [MMR1-RL](https://huggingface.co/datasets/MMR1/MMR1-RL)
+ * 💻 Code: [GitHub - MMR1](https://github.com/LengSicong/MMR1)
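
The Highlights in the added README name Variance-Aware Sampling and a Variance Promotion Score built from Outcome Variance and Trajectory Diversity, but this commit does not give the formulas. The sketch below is only an illustration of the general idea, not the MMR1 implementation. Everything in it is an assumption: binary 0/1 rewards, the Jaccard-distance diversity proxy, the `alpha`/`beta` weights, the score-proportional sampler, and all function names. The linked paper and repository are the authoritative reference.

```python
import random
from statistics import mean


def outcome_variance(rewards):
    """Variance of binary (0/1) rewards over rollouts: p * (1 - p).

    It is zero when every rollout succeeds or every rollout fails,
    and largest when the pass rate is 0.5.
    """
    p = mean(rewards)
    return p * (1.0 - p)


def trajectory_diversity(responses):
    """Hypothetical diversity proxy (an assumption, not from the source):
    mean pairwise Jaccard distance between the token sets of the responses.
    """
    token_sets = [set(r.split()) for r in responses]
    pairs = [(a, b) for i, a in enumerate(token_sets) for b in token_sets[i + 1:]]
    if not pairs:
        return 0.0
    return mean((1.0 - len(a & b) / len(a | b)) if (a | b) else 0.0
                for a, b in pairs)


def variance_promotion_score(rewards, responses, alpha=0.5, beta=0.5):
    """Assumed weighted sum of the two terms; alpha and beta are placeholders."""
    return alpha * outcome_variance(rewards) + beta * trajectory_diversity(responses)


def sample_prompts(scores, k, rng):
    """Draw k prompt ids with probability proportional to their score.

    A small epsilon keeps zero-score prompts reachable.
    Sampling is with replacement.
    """
    ids = list(scores)
    weights = [scores[i] + 1e-6 for i in ids]
    return rng.choices(ids, weights=weights, k=k)
```

Under these assumptions, a prompt whose rollouts all succeed or all fail gets an outcome variance of zero and is rarely drawn, while a prompt with mixed outcomes and varied responses is drawn more often. That is one way to read the README's claim that higher reward variance gives a stronger gradient signal.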