ResearchMath-14K: Scaling Research-Level Mathematics via Agents Paper • 2605.28003 • Published 30 days ago • 50
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs Paper • 2605.09063 • Published May 9 • 82
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published Apr 20 • 87
Article How to Ground a Korean AI Agent in Real Demographics with Synthetic Personas nvidia • Apr 21 • 26