EricLabile/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_generation_num_3
Text Generation
Transformers
Safetensors
DigitalLearningGmbH/MATH-lighteval
qwen2
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
Commit History (branch: main)
End of training (fc38b94, verified): EricLabile committed on Mar 24
Model save (cb2b116, verified): EricLabile committed on Mar 24
Training in progress, step 117 (a69f35d, verified): EricLabile committed on Mar 24
Training in progress, step 110 (c1eac68, verified): EricLabile committed on Mar 24
Training in progress, step 100 (ac49f16, verified): EricLabile committed on Mar 24
Training in progress, step 90 (6fd6b6e, verified): EricLabile committed on Mar 24
Training in progress, step 80 (fb8fdec, verified): EricLabile committed on Mar 24
Training in progress, step 70 (0fc08f9, verified): EricLabile committed on Mar 24
Training in progress, step 60 (e263ff7, verified): EricLabile committed on Mar 24
Training in progress, step 50 (57721d6, verified): EricLabile committed on Mar 24
Training in progress, step 40 (e533c2f, verified): EricLabile committed on Mar 24
Training in progress, step 30 (703bdaf, verified): EricLabile committed on Mar 24
Training in progress, step 20 (1f528bf, verified): EricLabile committed on Mar 24
Training in progress, step 10 (e26a86f, verified): EricLabile committed on Mar 24
initial commit (886248b, verified): EricLabile committed on Mar 24