OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks • Paper • 2508.05614 • Published Aug 7, 2025 • 20 • 2
Hierarchical Budget Policy Optimization for Adaptive Reasoning • Paper • 2507.15844 • Published Jul 21, 2025 • 17 • 2
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization • Paper • 2507.15758 • Published Jul 21, 2025 • 35 • 1
GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding • Paper • 2507.15846 • Published Jul 21, 2025 • 133 • 7
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation • Paper • 2506.03139 • Published Jun 3, 2025 • 17 • 2
ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models • Paper • 2505.21500 • Published May 27, 2025 • 13 • 2
Let LLMs Break Free from Overthinking via Self-Braking Tuning • Paper • 2505.14604 • Published May 20, 2025 • 23 • 2