Tongyi-ConvAI's Collections
RM-NLHF
Updated 4 days ago
Official collection for paper "Reward Modeling from Natural Language Human Feedback".
Tongyi-ConvAI/Baseline-Outcome-Reward-Qwen-7B • 8B • Updated 4 days ago • 14
Tongyi-ConvAI/RM-NLHF-Qwen-32B • 33B • Updated 4 days ago • 13
Tongyi-ConvAI/Final-MetaRM-RM-NLHF-Qwen-32B • 32B • Updated 4 days ago • 8
Tongyi-ConvAI/Final-MetaRM-RM-NLHF-Qwen-7B • 7B • Updated 4 days ago • 11
Tongyi-ConvAI/Cold-Start-MetaRM-RM-NLHF-Qwen-7B • 7B • Updated 4 days ago • 16
Tongyi-ConvAI/Cold-Start-MetaRM-RM-NLHF-Qwen-32B • 32B • Updated 3 days ago • 11 • 1
Tongyi-ConvAI/RM-NLHF-Qwen-7B • 8B • Updated 4 days ago • 24
Tongyi-ConvAI/RM-NLHF • Viewer • Updated 4 days ago • 49.5k • 14