HeartMuLa: A Family of Open Sourced Music Foundation Models Paper • 2601.10547 • Published 11 days ago • 37
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head Paper • 2601.07832 • Published 14 days ago • 51
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published 20 days ago • 134
LTX-2 Collection LTX-2 base models and accompanying LoRAs and IC-LoRAs • 12 items • Updated 20 days ago • 44
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming Paper • 2512.21338 • Published Dec 24, 2025 • 22
SiD-DiT Collection Collection of Distilled Flow Matching Models with Score Identity Distillation • 17 items • Updated Nov 29, 2025 • 1
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling Paper • 2512.14614 • Published Dec 16, 2025 • 70
Chatterbox Turbo Collection Ultra-Fast, Open-Source Text-to-Speech for Real-Time Voice AI • 3 items • Updated Dec 15, 2025 • 16
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows Paper • 2512.05150 • Published Dec 3, 2025 • 75
Video Foundation Models Collection A list of all the (usable) video generation diffusion models. Models that are not up to current standards are skipped. • 10 items • Updated Dec 3, 2025 • 2
Ovis-Image Collection Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering under stringent computational constraints. • 7 items • Updated Dec 4, 2025 • 6