Reward Bench Leaderboard: Explore and compare model scores on RewardBench benchmarks
KTO: Model Alignment as Prospect Theoretic Optimization. Paper • 2402.01306 • Published Feb 2, 2024