LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation Paper • 2508.03694 • Published Aug 5 • 51
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30 • 119
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory Paper • 2508.09736 • Published Aug 13 • 57
Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies Article • Feb 17 • 28
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 72