Add model card for checkpoint 4
README.md
CHANGED
@@ -1,3 +1,7 @@
-
-
-
+# DeepSeek R1 Checkpoint 4
+
+This is checkpoint 4 from training the DeepSeek R1 model on mathematical reasoning tasks using GRPO.
+
+- **Model**: deepseek-ai/deepseek-r1-distill-qwen-7b
+- **Checkpoint Step**: 4
+- **Training Details**: Trained on 8 NVIDIA B200 GPUs with batch size 6 per GPU, 6 epochs.
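
The training details above imply a global batch size, which can be sketched as simple arithmetic. This is a minimal illustration that assumes plain data parallelism and no gradient accumulation; the model card states neither, and GRPO setups may count prompts and sampled completions differently. The helper name `effective_batch_size` and the `grad_accum_steps` parameter are hypothetical and not part of the model card.

```python
# Hypothetical helper for illustration; not taken from the model card.
def effective_batch_size(num_gpus: int, per_gpu_batch: int, grad_accum_steps: int = 1) -> int:
    """Samples contributing to one optimizer step under plain data parallelism."""
    return num_gpus * per_gpu_batch * grad_accum_steps


# Values from the model card: 8 GPUs, batch size 6 per GPU.
# Assumption: no gradient accumulation (grad_accum_steps=1).
global_batch = effective_batch_size(num_gpus=8, per_gpu_batch=6)
print(global_batch)  # 48
```

If the run did use gradient accumulation, the same helper scales the result by that factor, for example `effective_batch_size(8, 6, grad_accum_steps=4)`.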