KRadim/custom-SmolGRPO-135M
Tags: Safetensors, llama, trl, grpo, GRPO, Reasoning-Course
License: apache-2.0
Branch: main
Size: 269 MB
1 contributor
History: 3 commits
Latest commit: KRadim, "Upload LlamaForCausalLM", c120293 (verified), 6 months ago
File                    Size       Notes  Last commit              Updated
.gitattributes          1.52 kB    Safe   initial commit           6 months ago
README.md               81 Bytes          Upload LlamaForCausalLM  6 months ago
config.json             687 Bytes         Upload LlamaForCausalLM  6 months ago
generation_config.json  166 Bytes         Upload LlamaForCausalLM  6 months ago
model.safetensors       269 MB     xet    Upload LlamaForCausalLM  6 months ago