Intelligent-Internet/II-Thought-1.5B-Preview

Phu Nguyen committed on Mar 24, 2025

Commit 54cc0b9 (verified) · 1 parent: 612c8e0

Update README.md

Files changed (1)

README.md +2 -1

@@ -7,7 +7,8 @@
 ## Training Methodology
 - **Framework**: [ii_thought](https://github.com/Intelligent-Internet/ii-thought) / [verl](https://github.com/volcengine/verl)
-- **Algorithm**: GRPO
 - **Reward Modeling**
   - **Answer correctness reward**
   <img src="https://cdn-uploads.huggingface.co/production/uploads/67c563afa34e1ad5a3533ccf/X15GjihIRO9hkfL361Pfd.png" width="300">

 ## Training Methodology
 - **Framework**: [ii_thought](https://github.com/Intelligent-Internet/ii-thought) / [verl](https://github.com/volcengine/verl)
+- **Algorithm**: GRPO
+- **Base Model**: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
 - **Reward Modeling**
   - **Answer correctness reward**
   <img src="https://cdn-uploads.huggingface.co/production/uploads/67c563afa34e1ad5a3533ccf/X15GjihIRO9hkfL361Pfd.png" width="300">
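
For readers unfamiliar with GRPO, the sketch below illustrates the group-relative advantage step that gives the algorithm its name: each sampled completion's reward is normalized against the other completions drawn for the same prompt. It is a minimal illustration, not the ii_thought or verl implementation. The function name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the binary 1.0/0.0 correctness scores are all assumptions made for the example; the actual answer correctness reward is defined by the formula image in the README and is not reproduced here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampling group (GRPO-style sketch).

    `rewards` holds one scalar reward per completion sampled for the
    same prompt. `eps` (an assumed value) guards against division by
    zero when every completion in the group gets the same reward.
    """
    mu = mean(rewards)
    # Assumption: population standard deviation; real implementations
    # may use the sample standard deviation instead.
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of four completions for one prompt, scored 1.0 when
# the final answer matched the reference and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group where every completion is wrong yields zero advantage for all,
# so it contributes no learning signal.
uniform = group_relative_advantages([0.0, 0.0, 0.0, 0.0])
```

Completions that score above their group's mean receive a positive advantage and those below receive a negative one, which is why no separate value network is needed to form a baseline.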