DeepVQE: Real Time Deep Voice Quality Enhancement for Joint Acoustic Echo Cancellation, Noise Suppression and Dereverberation Paper • 2306.03177 • Published Jun 5, 2023 • 1
ERNIE-Image Collection The series of image generation models, including text2img, img2img. • 2 items • Updated 23 days ago • 23
One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution Paper • 2511.17138 • Published Nov 21, 2025 • 2
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale Paper • 2504.16030 • Published Apr 22, 2025 • 37
AuraSR Collection Fastest super resolution model for AI generated images • 2 items • Updated Jul 30, 2024 • 7
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning Paper • 2509.24650 • Published Sep 29, 2025 • 11
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models Paper • 2604.00688 • Published Apr 1 • 12
Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST Paper • 2509.14128 • Published Sep 17, 2025 • 2
Demucs MLX — Music Source Separation Collection Demucs music stem separation for Apple Silicon. Float32 and float16 variants. • 2 items • Updated Mar 16 • 1