leosaros/14bgrpo
Tags: PEFT, Safetensors, Transformers, AutoModel, grpo, lora, trl, unsloth, custom_code, arxiv:1910.09700
14bgrpo/config.json (branch: main)
3v324v23: Add minimal config.json (commit 166d114, 5 months ago)
179 Bytes
{
  "base_model": "Qwen/Qwen2.5-14B-Instruct",
  "model_type": "AutoModel",
  "auto_map": {
    "AutoModel": "AutoModel",
    "AutoModelForCausalLM": "AutoModelForCausalLM"
  }
}
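
The config above can be inspected with a few lines of standard-library Python. This is a minimal sketch, not part of the repository: the helper `summarize_config` and the key `undotted_auto_map_keys` are hypothetical names introduced here, and the config text is pasted inline rather than downloaded. It reads `base_model` and lists `auto_map` entries whose value contains no dot. That check rests on an assumption: in Transformers custom-code repos, `auto_map` values usually take the form `module_file.ClassName`, so an undotted value may not resolve to a class.

```python
import json

# Config text copied from the file shown above (inlined for illustration).
CONFIG_TEXT = """
{
  "base_model": "Qwen/Qwen2.5-14B-Instruct",
  "model_type": "AutoModel",
  "auto_map": {
    "AutoModel": "AutoModel",
    "AutoModelForCausalLM": "AutoModelForCausalLM"
  }
}
"""


def summarize_config(text):
    """Parse a config.json string and report the fields present in it.

    Hypothetical helper: it only reads keys that appear in the file above.
    """
    config = json.loads(text)
    auto_map = config.get("auto_map", {})
    # Assumption: auto_map values normally look like "module_file.ClassName".
    # Collect the keys whose value has no dot so they can be reviewed by hand.
    undotted = sorted(key for key, value in auto_map.items() if "." not in value)
    return {
        "base_model": config.get("base_model"),
        "model_type": config.get("model_type"),
        "undotted_auto_map_keys": undotted,
    }


summary = summarize_config(CONFIG_TEXT)
print(summary["base_model"])
print(summary["undotted_auto_map_keys"])
```

The `base_model` value is the piece most likely to be useful in practice: with a LoRA adapter repo, the base checkpoint named there is typically loaded first and the adapter weights attached afterwards.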