TimaeusWorld/Qwen2-0.5B-GRPO-test
Transformers
TensorBoard
Safetensors
AI-MO/NuminaMath-TIR
Generated from Trainer
trl
grpo
arxiv:2402.03300
main: Qwen2-0.5B-GRPO-test/runs (52.9 kB)
1 contributor
History: 35 commits
TimaeusWorld
Training in progress, step 20
ca56961
verified
about 1 year ago
Apr04_17-55-17_27528ee7ef5d    Training in progress, step 90     about 1 year ago
Apr04_19-52-36_290cea10b5e7    Training in progress, step 113    about 1 year ago
Apr04_23-03-39_6ff748dd8aa7    Training in progress, step 60     about 1 year ago
Apr05_02-48-34_62af6bed405b    Training in progress, step 60     about 1 year ago
Apr05_17-45-31_a8e4496cb155    Training in progress, step 20     about 1 year ago