Reward Models 10-2025 Collection A collection of great reward models for research and production • 7 items • Updated 1 day ago • 12
ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge Paper • 2510.18941 • Published Oct 21, 2025 • 10
nvidia/Llama-3.3-Nemotron-70B-Reward-Principle Text Generation • 71B • Updated Oct 30, 2025 • 319 • 6
RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards Paper • 2509.21319 • Published Sep 25, 2025 • 8