Phu Nguyen committed on
Commit 54cc0b9 · verified · 1 Parent(s): 612c8e0

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -7,7 +7,8 @@
 ## Training Methodology
 
 - **Framework**: [ii_thought](https://github.com/Intelligent-Internet/ii-thought) / [verl](https://github.com/volcengine/verl)
-- **Algorithm**: GRPO
+- **Algorithm**: GRPO
+- **Base Model**: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
 - **Reward Modeling**
 - **Answer correctness reward**
 <img src="https://cdn-uploads.huggingface.co/production/uploads/67c563afa34e1ad5a3533ccf/X15GjihIRO9hkfL361Pfd.png" width="300">
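
The README names GRPO as the algorithm together with an answer correctness reward, but does not say how the two combine. As a rough illustration only, the sketch below shows the group-relative advantage step that GRPO is generally described as using: the rewards of several completions sampled for the same prompt are normalized against that group's mean and standard deviation. The function name `group_relative_advantages`, the binary 1.0/0.0 reward values, the `eps` constant, and the use of the population standard deviation are all assumptions made for this example. None of it is taken from ii_thought or verl.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one prompt's sampled completions.

    Hypothetical helper: each reward is centered on the group mean and
    scaled by the group's population standard deviation. `eps` avoids
    division by zero when every completion gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed example: four sampled answers to one prompt,
# scored 1.0 if the final answer is correct and 0.0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Under these assumptions, correct answers receive a positive advantage and incorrect ones a negative advantage of the same magnitude. A group whose answers are all correct or all incorrect yields zero advantage for every completion, so it contributes no learning signal.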