AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks? Paper • 2606.05080 • Published 22 days ago • 30
Agent Skills Should Go Beyond Text: The Case for Visual Skills Paper • 2606.01414 • Published 25 days ago • 10
MementoGUI: Learning Agentic Multimodal Memory Control for Long-Horizon GUI Agents Paper • 2605.18652 • Published May 18 • 8
Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty? Paper • 2605.12684 • Published May 12 • 11
ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding Paper • 2603.27064 • Published Mar 28 • 29
SPARC: Separating Perception And Reasoning Circuits for Test-time Scaling of VLMs Paper • 2602.06566 • Published Feb 6 • 3