LYNX: Learning Dynamic Exits for Confidence-Controlled Reasoning • Paper • 2512.05325 • Published Dec 5, 2025 • 4
X-Ego: Acquiring Team-Level Tactical Situational Awareness via Cross-Egocentric Contrastive Video Representation Learning • Paper • 2510.19150 • Published Oct 22, 2025 • 1
Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations • Paper • 2509.16457 • Published Sep 19, 2025
Localized Gaussian Splatting Editing with Contextual Awareness • Paper • 2408.00083 • Published Jul 31, 2024
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion • Paper • 2504.04010 • Published Apr 5, 2025 • 9
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding • Paper • 2406.09411 • Published Jun 13, 2024 • 19
Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries • Paper • 2502.20475 • Published Feb 27, 2025 • 3
X-Dancer: Expressive Music to Human Dance Video Generation • Paper • 2502.17414 • Published Feb 24, 2025 • 14
DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis • Paper • 2312.13016 • Published Dec 20, 2023 • 6
MagicPose4D: Crafting Articulated Models with Appearance and Motion Control • Paper • 2405.14017 • Published May 22, 2024 • 3
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems • Paper • 2210.15037 • Published Oct 26, 2022 • 1
TLDR: Token-Level Detective Reward Model for Large Vision Language Models • Paper • 2410.04734 • Published Oct 7, 2024 • 18
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks • Paper • 2410.10563 • Published Oct 14, 2024 • 37