PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination Paper • 2605.03571 • Published 5 days ago • 6
InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation? Paper • 2604.27419 • Published 10 days ago • 13
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration Paper • 2603.29557 • Published Mar 31 • 17