DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints • Paper • 2601.18137 • Published 4 days ago • 19
Budget-aware Test-time Scaling via Discriminative Verification • Paper • 2510.14913 • Published Oct 16, 2025 • 5