**Skywork-R1V3-38B** is the **latest and most powerful open-source multimodal reasoning model** in the Skywork series, pushing the boundaries of multimodal and cross-disciplinary intelligence. With an elaborate RL algorithm in the post-training stage, R1V3 significantly enhances multimodal reasoning ability and achieves **open-source state-of-the-art (SOTA)** performance across multiple multimodal reasoning benchmarks.

## 2. Technical Highlights
Skywork-R1V3 is an advanced, open-source Vision-Language Model (VLM) built on several core innovations:
- **Refined Post-Training RL**: Instead of relying on reasoning pre-training, our fine-grained cold-start finetuning effectively primes the model for Reinforcement Learning (RL), which dramatically enhances its reasoning ability.
- **Essential Connector Module**: We've uncovered the critical role of the connector module in achieving robust cross-modal alignment for strong multimodal reasoning. What's more, Connector-only Finetuning can further boost the model's performance post-RL.
- **Entropy of Critical Reasoning Tokens**: This unique indicator effectively gauges reasoning capability, guiding checkpoint selection during RL training.
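The entropy indicator above can be illustrated with a minimal sketch. Note that which positions count as "critical" reasoning tokens, and the function names, are illustrative assumptions for this example, not the report's exact definition:

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of the next-token distribution at one position."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # numerically stable softmax
    z = sum(exps)
    return -sum((e / z) * math.log(e / z) for e in exps if e > 0)

def critical_token_entropy(per_position_logits, critical_positions):
    """Average entropy over a chosen set of 'critical' reasoning positions,
    e.g. tokens that open a deduction step; low values suggest the model is
    confident at the decision points of its chain of thought."""
    ents = [token_entropy(per_position_logits[i]) for i in critical_positions]
    return sum(ents) / len(ents)
```

A uniform distribution over k candidate tokens yields the maximal entropy log k, while a sharply peaked distribution yields near zero; tracking this average across RL checkpoints is one way such an indicator could guide checkpoint selection.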
These innovations lead to Broad Reasoning Generalization, allowing our RL-powered approach to successfully extend mathematical reasoning to diverse subject areas. Additionally, our work delves into RL-specific explorations like curriculum learning and learning rate strategies, alongside a broader discussion on multimodal reasoning. For more details, refer to our [R1V3 Report].
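Connector-only finetuning can be illustrated with a minimal PyTorch sketch on a toy stand-in model. The module names and structure here are illustrative assumptions, not the released Skywork-R1V3 architecture:

```python
import torch
from torch import nn

class ToyVLM(nn.Module):
    """Toy stand-in for a VLM: vision encoder -> connector -> language model."""
    def __init__(self):
        super().__init__()
        self.vision = nn.Linear(16, 8)     # stand-in vision encoder
        self.connector = nn.Linear(8, 8)   # cross-modal connector module
        self.lm = nn.Linear(8, 32)         # stand-in language model head

model = ToyVLM()

# Connector-only finetuning: freeze every parameter except the connector's,
# so a post-RL finetuning pass updates only the cross-modal alignment.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith("connector")

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

Only `connector.weight` and `connector.bias` remain trainable; the optimizer then sees a tiny parameter set, which is what makes such a pass cheap relative to full finetuning.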
## 3. Evaluation
### Key Results
- **MMMU:** 76.0 – *Open-source SOTA, approaching human experts (76.2)*