Blancy/DeepSeek-R1-Distill-Qwen-0.5B-GRPO
Text Generation
Transformers
Safetensors
simplescaling/s1K-1.1
qwen2
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Commit History
End of training | 6be64b0 | verified | Blancy committed on Apr 18, 2025
Model save | a5f1ee4 | verified | Blancy committed on Apr 18, 2025
Training in progress, epoch 0 | 9e097a3 | verified | Blancy committed on Apr 18, 2025
End of training | 882671c | verified | Blancy committed on Apr 17, 2025
Model save | d75d467 | verified | Blancy committed on Apr 17, 2025
Training in progress, epoch 0 | 188c40f | verified | Blancy committed on Apr 17, 2025
End of training | 8932d6d | verified | Blancy committed on Apr 17, 2025
Model save | 8809c7b | verified | Blancy committed on Apr 17, 2025
Training in progress, epoch 0 | 52536a8 | verified | Blancy committed on Apr 17, 2025
End of training | 97726b8 | verified | Blancy committed on Apr 16, 2025
Model save | 0e9211a | verified | Blancy committed on Apr 16, 2025
Training in progress, epoch 0 | c4064c4 | verified | Blancy committed on Apr 16, 2025
initial commit | 7121af1 | verified | Blancy committed on Apr 16, 2025