UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios Paper • 2509.21766 • Published Sep 26, 2025 • 24