# DeepSeek R1 Checkpoint 4
This is checkpoint 4 from training a distilled DeepSeek R1 model on mathematical reasoning tasks with GRPO (Group Relative Policy Optimization).
- **Model**: deepseek-ai/deepseek-r1-distill-qwen-7b
- **Checkpoint Step**: 4
- **Training Details**: Trained on 8 NVIDIA B200 GPUs with a per-GPU batch size of 6, for 6 epochs.
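
The training details above give a per-GPU batch size but no global one. As a rough sketch, assuming plain data parallelism and no gradient accumulation (neither is stated in this card), the number of samples per optimizer step could be estimated as below. The helper name `global_batch_size` and its `grad_accum_steps` parameter are hypothetical and are not taken from the training code. Whether a "sample" here means a prompt or a generated completion is also not specified.

```python
# Hypothetical helper for illustration; not part of the actual training code.
def global_batch_size(num_gpus: int, per_gpu_batch: int, grad_accum_steps: int = 1) -> int:
    """Samples consumed per optimizer step under plain data parallelism."""
    if min(num_gpus, per_gpu_batch, grad_accum_steps) < 1:
        raise ValueError("all arguments must be positive integers")
    return num_gpus * per_gpu_batch * grad_accum_steps


if __name__ == "__main__":
    # 8 GPUs and batch size 6 per GPU come from the card above.
    # grad_accum_steps=1 is an assumption; the card does not state it.
    print(global_batch_size(num_gpus=8, per_gpu_batch=6))
```

If gradient accumulation was used during training, the true global batch size would be a multiple of this value.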