Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models • Paper • 2603.25750 • Published 11 days ago • 12
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference • Paper • 2603.25730 • Published 5 days ago • 42
RealMaster: Lifting Rendered Scenes into Photorealistic Video • Paper • 2603.23462 • Published 7 days ago • 31
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG • Paper • 2603.23497 • Published 7 days ago • 88