EricLabile/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_generation_num_2
Text Generation
Transformers
Safetensors
DigitalLearningGmbH/MATH-lighteval
qwen2
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main
Commit History
End of training (eaa5ccf, verified): EricLabile committed on Mar 25
Model save (63ff1cc, verified): EricLabile committed on Mar 25
Training in progress, step 52 (2174921, verified): EricLabile committed on Mar 25
Training in progress, step 50 (799e2af, verified): EricLabile committed on Mar 25
Training in progress, step 40 (116ad90, verified): EricLabile committed on Mar 25
Training in progress, step 30 (833834a, verified): EricLabile committed on Mar 25
Training in progress, step 20 (c99b1f3, verified): EricLabile committed on Mar 25
Training in progress, step 10 (08f7a16, verified): EricLabile committed on Mar 25
initial commit (88aad4d, verified): EricLabile committed on Mar 25