From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models • Paper • 2602.22859 • Published 1 day ago • 142
Decoupling Reasoning and Perception: An LLM-LMM Framework for Faithful Visual Reasoning • Paper • 2509.23322 • Published Sep 27, 2025 • 1
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management • Paper • 2512.12967 • Published Dec 15, 2025 • 108
OSWorld-MCP: Benchmarking MCP Tool Invocation In Computer-Use Agents • Paper • 2510.24563 • Published Oct 28, 2025 • 23
Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation • Paper • 2506.04614 • Published Jun 5, 2025 • 19
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization • Paper • 2411.11909 • Published Nov 17, 2024 • 22