From Runnable to Shippable: Multi-Agent Test-Driven Development for Generating Full-Stack Web Applications from Requirements Paper • 2605.17242 • Published 3 days ago • 9
Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification Paper • 2601.15808 • Published Jan 22 • 20