KRadim/custom-SmolGRPO-135M
Safetensors
llama
trl
grpo
GRPO
Reasoning-Course
License: apache-2.0
main
Commit History
Upload LlamaForCausalLM (c120293, verified)
KRadim committed on Oct 7, 2025
Upload LlamaForCausalLM (7a89922, verified)
KRadim committed on Oct 7, 2025
initial commit (e26fbff, verified)
KRadim committed on Oct 7, 2025