**Skywork-R1V3-38B** is the **latest and most powerful open-source multimodal reasoning model** in the Skywork series, pushing the boundaries of multimodal and cross-disciplinary intelligence. With an elaborate RL algorithm in the post-training stage, R1V3 significantly enhances multimodal reasoning ability and achieves **open-source state-of-the-art (SOTA)** performance across multiple multimodal reasoning benchmarks.

## 2. Technical Highlights
Skywork-R1V3 is an advanced, open-source Vision-Language Model (VLM) built on several core innovations:
- **Refined Post-Training RL**: Instead of relying on reasoning pre-training, our fine-grained cold-start finetuning effectively primes the model for Reinforcement Learning (RL), which dramatically enhances its reasoning ability.
- **Essential Connector Module**: We've uncovered the critical role of the connector module in achieving robust cross-modal alignment for strong multimodal reasoning. What's more, Connector-only Finetuning can further boost the model's performance post-RL.
- **Entropy of Critical Reasoning Tokens**: This unique indicator effectively gauges reasoning capability, guiding checkpoint selection during RL training.
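The entropy indicator above can be illustrated with a minimal sketch. Note that which positions count as "critical" reasoning tokens, and the function names, are illustrative assumptions for this example, not the report's exact definition:

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of the next-token distribution at one position."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # numerically stable softmax
    z = sum(exps)
    return -sum((e / z) * math.log(e / z) for e in exps if e > 0)

def critical_token_entropy(per_position_logits, critical_positions):
    """Average entropy over a chosen set of 'critical' reasoning positions,
    e.g. tokens that open a deduction step; low values suggest the model is
    confident at the decision points of its chain of thought."""
    ents = [token_entropy(per_position_logits[i]) for i in critical_positions]
    return sum(ents) / len(ents)
```

A uniform distribution over k candidate tokens yields the maximal entropy log k, while a sharply peaked distribution yields near zero; tracking this average across RL checkpoints is one way such an indicator could guide checkpoint selection.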
These innovations lead to Broad Reasoning Generalization, allowing our RL-powered approach to successfully extend mathematical reasoning to diverse subject areas. Additionally, our work delves into RL-specific explorations like curriculum learning and learning rate strategies, alongside a broader discussion on multimodal reasoning. For more details, refer to our [R1V3 Report].
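Connector-only finetuning can be illustrated with a minimal PyTorch sketch on a toy stand-in model. The module names and structure here are illustrative assumptions, not the released Skywork-R1V3 architecture:

```python
import torch
from torch import nn

class ToyVLM(nn.Module):
    """Toy stand-in for a VLM: vision encoder -> connector -> language model."""
    def __init__(self):
        super().__init__()
        self.vision = nn.Linear(16, 8)     # stand-in vision encoder
        self.connector = nn.Linear(8, 8)   # cross-modal connector module
        self.lm = nn.Linear(8, 32)         # stand-in language model head

model = ToyVLM()

# Connector-only finetuning: freeze every parameter except the connector's,
# so a post-RL finetuning pass updates only the cross-modal alignment.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith("connector")

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

Only `connector.weight` and `connector.bias` remain trainable; the optimizer then sees a tiny parameter set, which is what makes such a pass cheap relative to full finetuning.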
## 3. Evaluation
### Key Results
- **MMMU:** 76.0 – *Open-source SOTA, approaching human experts (76.2)*