EricLabile/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_generation_num_2_epoch_2
Text Generation
Transformers
Safetensors
DigitalLearningGmbH/MATH-lighteval
qwen2
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Commit History
End of training (caa0548, verified), EricLabile committed on Mar 24
Model save (5654f9e, verified), EricLabile committed on Mar 24
Training in progress, step 104 (85fbfed, verified), EricLabile committed on Mar 24
Training in progress, step 100 (82a2854, verified), EricLabile committed on Mar 24
Training in progress, step 90 (33843c7, verified), EricLabile committed on Mar 24
Training in progress, step 80 (80b2fb6, verified), EricLabile committed on Mar 24
Training in progress, step 70 (5f06c5a, verified), EricLabile committed on Mar 24
Training in progress, step 60 (4199136, verified), EricLabile committed on Mar 24
Training in progress, step 50 (946932e, verified), EricLabile committed on Mar 24
Training in progress, step 40 (e2b068d, verified), EricLabile committed on Mar 24
Training in progress, step 30 (bbc886d, verified), EricLabile committed on Mar 24
Training in progress, step 20 (4f5dca0, verified), EricLabile committed on Mar 24
Training in progress, step 10 (4ecb58a, verified), EricLabile committed on Mar 24
initial commit (1d65286, verified), EricLabile committed on Mar 24