GUICrafter: Weakly-Supervised GUI Agent Leveraging Massive Unannotated Screenshots • Paper • 2606.29705 • Published 5 days ago • 13
Bridging VideoQA and Video-Guided Agentic Tasks via Generalized Keyframe Extraction • Paper • 2606.29445 • Published 6 days ago • 26
EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments • Paper • 2606.13681 • Published 23 days ago • 142
AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding • Paper • 2606.06155 • Published 30 days ago • 10
WarriorMath: Enhancing the Mathematical Ability of Large Language Models with a Defect-aware Framework • Paper • 2508.01245 • Published Aug 2, 2025 • 1
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward • Paper • 2511.20561 • Published Nov 25, 2025 • 33
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs • Paper • 2508.16153 • Published Aug 22, 2025 • 162
A Survey of Context Engineering for Large Language Models • Paper • 2507.13334 • Published Jul 17, 2025 • 264
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology • Paper • 2507.07999 • Published Jul 10, 2025 • 51