Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published 14 days ago • 139
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing Paper • 2603.00141 • Published 22 days ago • 135
MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios Paper • 2602.22638 • Published 20 days ago • 107
HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation Paper • 2602.18283 • Published 25 days ago • 56
HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation Paper • 2602.18283 • Published 25 days ago • 56
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4, 2025 • 103
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published Feb 10 • 201
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published Feb 10 • 201
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published Jan 28 • 120
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published Jan 15 • 156
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published Jan 8 • 169
Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation Paper • 2512.24271 • Published Dec 30, 2025 • 64
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners Paper • 2504.14239 • Published Apr 19, 2025 • 14