Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published Nov 26, 2025 • 36
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory Paper • 2511.21678 • Published Nov 26, 2025 • 12
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction Paper • 2511.20937 • Published Nov 26, 2025 • 16
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation Paper • 2512.10949 • Published Dec 11, 2025 • 47
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Paper • 2512.10739 • Published Dec 11, 2025 • 47
OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification Paper • 2512.10756 • Published Dec 11, 2025 • 35
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published 26 days ago • 43
AVControl: Efficient Framework for Training Audio-Visual Controls Paper • 2603.24793 • Published 6 days ago • 24
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published 14 days ago • 106
RealMaster: Lifting Rendered Scenes into Photorealistic Video Paper • 2603.23462 • Published 8 days ago • 31
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published 1 day ago • 45