WisdomShell/RewardAnything-8B-v1
Text Generation
Safetensors
English
Chinese
qwen3
reward-model
rlhf
principle-following
qwen
conversational
arxiv: 2506.03637
License: apache-2.0
Commit History
Update README.md (80a8a96, verified): zhuohaoyu committed on Jun 4, 2025
Update README.md (a742dfb, verified): zhuohaoyu committed on Jun 4, 2025
Update README.md (980c40c, verified): zhuohaoyu committed on Jun 4, 2025
Create README.md (6b55d1c, verified): zhuohaoyu committed on Jun 4, 2025
Duplicate from zhuohaoyu/RewardAnything-8B-v1 (243e876, verified): zhuohaoyu committed on Jun 4, 2025