Motif-2-12.7B-Reasoning: A Practitioner's Guide to RL Training Recipes Paper • 2512.11463 • Published Dec 11, 2025 • 5
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer Paper • 2510.06590 • Published Oct 8, 2025 • 76
Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs Paper • 2510.01954 • Published Oct 2, 2025 • 14
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published Oct 13, 2025 • 102
Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs Paper • 2510.13251 • Published Oct 15, 2025 • 14
Music Flamingo: Scaling Music Understanding in Audio Language Models Paper • 2511.10289 • Published Nov 13, 2025 • 14
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data Paper • 2511.12609 • Published Nov 16, 2025 • 105
MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes Paper • 2509.24945 • Published Sep 29, 2025 • 5
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published 20 days ago • 134