KRadim/custom-SmolGRPO-135M
Safetensors
llama
trl
grpo
GRPO
Reasoning-Course
License: apache-2.0
main: custom-SmolGRPO-135M/config.json
Commit History
Upload LlamaForCausalLM
7a89922
verified
KRadim
committed on
Oct 7, 2025