ilgee/Multiclass-Think-RM-8B
Safetensors
English
llama
reward-model
RLHF
reasoning
preference-learning
arxiv:2505.16265
License: llama3.1
Commit History (branch: main)
Upload README.md with huggingface_hub
4d25f89 (verified), ilgee committed on Nov 2, 2025

Upload README.md with huggingface_hub
a4cb3ef (verified), ilgee committed on Oct 23, 2025

Update model card
041d596 (verified), ilgee committed on Oct 12, 2025

Update model card
3bffae5 (verified), ilgee committed on Oct 12, 2025

Upload model with updated chat template
b47cba8 (verified), ilgee committed on May 8, 2025

initial commit
e8bf70e (verified), ilgee committed on May 8, 2025