Utonia: Toward One Encoder for All Point Clouds Paper • 2603.03283 • Published 10 days ago • 165
DreamOmni2 Gen • Running on Zero • Featured • 97 • Multimodal Instruction-based Editing and Generation
MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech Paper • 2509.25131 • Published Sep 29, 2025 • 16
Post 4859
Update: We have released the technical report of MGM-Omni. We also introduce Long-TTS-Eval, a benchmark for evaluating TTS on long-form and complex cases.
Arxiv: https://arxiv.org/abs/2509.25131
Benchmark: wcy1122/Long-TTS-Eval
-------------------------
Introducing MGM-Omni, an omni-chatbot that processes text, image, video, and speech inputs and generates both text and speech responses.
MGM-Omni supports hour-level audio understanding.
MGM-Omni supports 10-minute speech generation and voice cloning.
For more details, please check:
Blog: https://mgm-omni.notion.site/MGM-Omni-An-Open-source-Omni-Chatbot-2395728e0b0180149ac9f24683fc9907
Code: https://github.com/dvlab-research/MGM-Omni
Model: wcy1122/mgm-omni-6896075e97317a88825032e1
Demo: wcy1122/MGM-Omni
MGM-Omni Collection MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech • 13 items • Updated 11 days ago • 11
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition Paper • 2412.09501 • Published Dec 12, 2024 • 48