\$OneMillion-Bench: How Far are Language Agents from Human Experts? Paper • 2603.07980 • Published 4 days ago • 25
PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives Paper • 2505.19558 • Published May 26, 2025
Adaptive Preference Optimization with Uncertainty-aware Utility Anchor Paper • 2509.10515 • Published Sep 3, 2025 • 1