MOSS-Audio Collection An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 7 items • Updated 5 days ago • 55
Step-Audio-R1 Collection Step-Audio-R1 is the first audio language model to successfully unlock test-time compute scaling. • 4 items • Updated Jan 14 • 18
wan2.1 controlnets Collection See code on GitHub: https://github.com/TheDenk/wan2.1-dilated-controlnet • 6 items • Updated Oct 7, 2025 • 6
Zero-Shot Voice Cloning Collection TTS models that support zero-shot voice cloning • 8 items • Updated Dec 25, 2025 • 15
Flux tools in NF4 Collection Contains Flux Fill, Canny, and Dev checkpoints in NF4. • 3 items • Updated Nov 24, 2024 • 10