SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks Paper • 2606.09669 • Published 20 days ago • 46
Efficient Agentic Reinforcement Learning with On-Policy Intrinsic Knowledge Boundary Enhancement Paper • 2605.26952 • Published May 26 • 16
Rethinking Cross-Layer Information Routing in Diffusion Transformers Paper • 2605.20708 • Published May 20 • 111
Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps Paper • 2605.16928 • Published May 16 • 97
MLLM-CL: Continual Learning for Multimodal Large Language Models Paper • 2506.05453 • Published Jun 5, 2025 • 4
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published Jan 23 • 34
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head Paper • 2601.07832 • Published Jan 12 • 53
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 215
AT^2PO: Agentic Turn-based Policy Optimization via Tree Search Paper • 2601.04767 • Published Jan 8 • 28
TongSIM: A General Platform for Simulating Intelligent Machines Paper • 2512.20206 • Published Dec 23, 2025 • 28
Kinematify: Open-Vocabulary Synthesis of High-DoF Articulated Objects Paper • 2511.01294 • Published Nov 3, 2025 • 14
TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models Paper • 2511.02802 • Published Nov 4, 2025 • 16
Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench Paper • 2510.26865 • Published Oct 30, 2025 • 12
Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning Paper • 2511.02818 • Published Nov 4, 2025 • 15
Value Drifts: Tracing Value Alignment During LLM Post-Training Paper • 2510.26707 • Published Oct 30, 2025 • 13