Good Audio Generation space, model, dataset collection
-
Audio-to-Audio • Updated • 448k • 108
-
KittenML/kitten-tts-nano-0.1
Updated • 39.1k • 515
-
FunAudioLLM/ThinkSound
Video-to-Video • Updated • 53
-
ThinkSound
🔊 320 • Generate audio for a silent video using text prompts
-
Higgs Audio Demo
🎤 399 • Higgs Audio Demo
-
bosonai/higgs-audio-v2-generation-3B-base
Text-to-Speech • 6B • Updated • 162k • 678
-
Song Generation
🎵 740 • Generate a song from your lyrics and prompts
-
Hibiki Samples
🤗 53 • Translate speech in real-time with high fidelity
-
kyutai/moshiko-pytorch-bf16
Updated • 154k • 242
-
kyutai/mimi
Feature Extraction • 96.2M • Updated • 2.32M • 304
-
maya-research/Veena
Text-to-Speech • 4B • Updated • 8.53k • 233
-
MiniMax Speech Tech Report
🎙 106 • Generate natural speech in any voice from text
-
google/magenta-realtime
Updated • 234 • 550
-
PlayDiffusion
🎨 120 • Generate modified audio from text and voice
-
Qwen2.5 Omni 7B Demo
🏆 372 • Chat with text, audio, images, and video, get spoken replies
-
Open ASR Leaderboard
🏆 1.37k • Compare speech-to-text models using benchmark scores
-
Open NotebookLM
🎙 143 • Generate a podcast to discuss the topic of your choice!
-
Voila Demo
💻 44 • Chat with a voice-clone AI
-
Voice Clone
🗣 2.65k • Clone a voice and generate speech from text
-
moonshotai/Kimi-Audio-7B-Instruct
Text-to-Speech • 10B • Updated • 92.4k • 401
-
moonshotai/Kimi-Audio-7B
Text-to-Speech • 10B • Updated • 140 • 83
-
Dia 1.6B
👯 1.78k • Generate realistic dialogue from a script, using Dia!
-
nari-labs/Dia-1.6B
Text-to-Speech • 2B • Updated • 5.6k • 2.87k
-
ByteDance/MegaTTS3
Text-to-Speech • Updated • 133 • 419
-
Di♪♪Rhythm
🎶 688 • Blazingly Fast and Embarrassingly Simple Song Generation
-
Gemini Audio Video
♊ 35 • Gemini understands audio and video!
-
nvidia/diar_sortformer_4spk-v1
Automatic Speech Recognition • 0.1B • Updated • 13.4k • 139
-
ACE Step
😻 661 • A Step Towards Music Generation Foundation Model
-
ACE-Step/ACE-Step-v1-3.5B
Text-to-Audio • Updated • 735
-
stepfun-ai/Step-Audio-2-mini
Any-to-Any • 8B • Updated • 1.26k • 258
-
neuphonic/neutts-air
Text-to-Speech • 0.7B • Updated • 13.7k • 873
-
NeuTTS-Air
☁ 318 • Clone a voice and generate custom speech
-
KaniTTS
😻 114 • Generate expressive speech from your text in seconds
-
microsoft/UserLM-8b
Text Generation • 8B • Updated • 606 • 367
-
pipecat-ai/smart-turn-v3
Voice Activity Detection • Updated • 165
-
meituan-longcat/LongCat-Audio-Codec
Updated • 42
-
Qwen3 TTS Voice Design
📈 112 • Generate custom speech from text and voice description
-
Qwen TTS Clone Demo
👀 64 • Create a custom voice and synthesize speech from text
-
ResembleAI/chatterbox-turbo
Text-to-Speech • Updated • 650
-
Chatterbox Turbo Demo
⚡ 502 • Chatterbox Turbo Demo
-
zai-org/GLM-TTS
Text-to-Speech • Updated • 390 • 338
-
Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice
Text-to-Speech • 2B • Updated • 1.99M • 1.57k
-
Qwen3-TTS Demo
🎙 1.94k • Generate custom speech from text, voice descriptions, or samples
-
Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice
Text-to-Speech • 0.9B • Updated • 973k • 153
-
FlashLabs/Chroma-4B
Any-to-Any • 6B • Updated • 180 • 382
-
FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning
Paper • 2601.11141 • Published • 23
-
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization
Paper • 2601.01554 • Published • 62
-
FunAudioLLM/Fun-Audio-Chat-8B
Any-to-Any • 9B • Updated • 2.02k • 185
-
OpenMOSS-Team/MOSS-TTS-Nano-100M
Text-to-Speech • Updated • 150k • 213
-
KittenTTS Demo
😻 85 • Generate natural-sounding speech from any text