Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published 11 days ago • 154
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published 17 days ago • 161
Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation Paper • 2512.24271 • Published 27 days ago • 62
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners Paper • 2504.14239 • Published Apr 19, 2025 • 14
InfiGUI: Advanced Vision-Native Agent for GUI Interaction Collection 7 items • Updated Oct 15, 2025 • 1
Computer-Use Agents as Judges for Generative User Interface Paper • 2511.15567 • Published Nov 19, 2025 • 53