SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks Paper • 2606.09669 • Published 19 days ago • 45
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation Paper • 2503.19622 • Published Mar 25, 2025 • 31
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis Paper • 2411.07132 • Published Nov 11, 2024
Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21, 2025 • 69
Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think Paper • 2507.01467 • Published Jul 2, 2025
How Far Are LLMs from Professional Poker Players? Revisiting Game-Theoretic Reasoning with Agentic Tool Use Paper • 2602.00528 • Published Jan 31
Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks Paper • 2602.01630 • Published Feb 2 • 50
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning Paper • 2505.13426 • Published May 19, 2025 • 13
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning Paper • 2505.11049 • Published May 16, 2025 • 62
Efficient Inference for Large Reasoning Models: A Survey Paper • 2503.23077 • Published Mar 29, 2025 • 45
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models Paper • 2406.13233 • Published Jun 19, 2024 • 1
StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models Paper • 2409.10132 • Published Sep 16, 2024
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows Paper • 2411.07763 • Published Nov 12, 2024 • 2
GuardReasoner: Towards Reasoning-based LLM Safeguards Paper • 2501.18492 • Published Jan 30, 2025 • 89
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22, 2025 • 131
Exploring the Universal Vulnerability of Prompt-based Learning Paradigm Paper • 2204.05239 • Published Apr 11, 2022