ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents • Paper • 2601.12294 • Published 27 days ago • 17
Generalizing Test-time Compute-optimal Scaling as an Optimizable Graph • Paper • 2511.00086 • Published Oct 29, 2025 • 42
Who's Your Judge? On the Detectability of LLM-Generated Judgments • Paper • 2509.25154 • Published Sep 29, 2025 • 30
Representation & Optimization • Collection • Understanding about representation sheds light on optimization • 126 items • Updated 4 days ago • 7