KokosDev/checkpoint
DeepSeek R1 Checkpoint 4
This is checkpoint 4 from training a DeepSeek R1 distilled model on mathematical reasoning tasks using GRPO (Group Relative Policy Optimization).
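A minimal sketch of how a checkpoint like this might be loaded with the `transformers` library. It assumes the repository id `KokosDev/checkpoint` resolves on the Hub and that the repository contains both tokenizer and model files, which this card does not confirm. The helper name `load_checkpoint` is hypothetical.

```python
# Assumption: repository id under which this checkpoint is published.
REPO_ID = "KokosDev/checkpoint"


def load_checkpoint(repo_id: str = REPO_ID):
    """Load tokenizer and model weights for the checkpoint (hypothetical helper).

    The import is deferred so that defining this function does not require
    transformers to be installed or trigger any download.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository ships tokenizer files alongside the weights.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
    return tokenizer, model
```

Calling `load_checkpoint()` downloads the full weights, so it is best run on a machine with enough disk space and memory for a 7B-parameter model.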
Model: deepseek-ai/deepseek-r1-distill-qwen-7b
Checkpoint Step: 4
Training Details: Trained on 8 NVIDIA B200 GPUs with a batch size of 6 per GPU for 6 epochs.
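The training details above imply a global batch size, which can be worked out as a short sketch. It assumes plain data parallelism with no gradient accumulation, which this card does not state. In GRPO setups the batch may count prompts or sampled completions, so the figure is indicative only. The function names and the dataset size in the usage note are hypothetical.

```python
import math

# Values stated in the training details.
NUM_GPUS = 8
PER_GPU_BATCH = 6
EPOCHS = 6


def global_batch_size(num_gpus: int, per_gpu_batch: int, grad_accum_steps: int = 1) -> int:
    """Samples consumed per optimizer step under data parallelism.

    grad_accum_steps=1 is an assumption; the card does not mention accumulation.
    """
    return num_gpus * per_gpu_batch * grad_accum_steps


def optimizer_steps(num_examples: int, epochs: int, global_batch: int) -> int:
    """Total optimizer steps, counting a partial final batch in each epoch."""
    return epochs * math.ceil(num_examples / global_batch)


GLOBAL_BATCH = global_batch_size(NUM_GPUS, PER_GPU_BATCH)  # 8 * 6 = 48
```

For a hypothetical dataset of 1,000 examples, `optimizer_steps(1000, EPOCHS, GLOBAL_BATCH)` gives 126 steps in total. The card does not state the real dataset size.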