Voice
Audio-to-Audio • Updated • 289k • 2.32k
Note: Speech-to-speech model for conversational AI. At 7 billion parameters, it is one of the larger, heavier models here. You'd need significant GPU power (cloud compute), so expect non-trivial running costs.
nineninesix/kani-tts-2-pt
Text-to-Speech • 0.4B • Updated • 699 • 41
Note: Text-to-speech model optimized specifically for real-time conversations. **Generally does not allow open commercial use without explicit permission. At 0.4B parameters, it is cheaper to run than the others.**
Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice
Text-to-Speech • 2B • Updated • 953k • 1.33k
Note: Text-to-speech model with multiple modes:
- Custom Voice: 9 pre-built voices
- Voice Design: Describe a voice in plain language (e.g., "warm female voice with a slight accent") and it creates it
- Voice Clone: Give it 3 seconds of someone's voice and it clones it
It supports 10 languages including English, Chinese, Japanese, Korean, French, German, Spanish, and more. It has been benchmarked against ElevenLabs and competes well. Free, commercial use allowed, no royalties. Cost to run: medium-sized model.
NAMAA-Space/NAMAA-Saudi-TTS
Text-to-Speech • 0.5B • Updated • 141 • 42
Note: A text-to-speech model built specifically for Saudi Arabic. Free for both research and commercial use, which makes it very friendly for app builders. Cost to run: small model (0.5B), so one of the cheaper ones to run.
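The "cost to run" remarks in these notes mostly follow from model size. As a rough sketch (not a benchmark), the memory needed just to hold a model's weights is about the parameter count times the bytes per parameter. The helper names below are hypothetical, the first entry is only a label because that model's name is not given above, and the 1.7B figure for the Qwen model is taken from its name (the listing rounds it to 2B). Real usage is higher once activations and caches are included.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Covers model weights only; activations, caches, and audio codecs add overhead.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

# Parameter counts (in billions) as stated in the notes above.
# Keys are labels for this sketch; the first one is a placeholder,
# since that model's name is not given in the listing.
MODELS = {
    "speech-to-speech model (7B)": 7.0,
    "nineninesix/kani-tts-2-pt": 0.4,
    "Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice": 1.7,
    "NAMAA-Space/NAMAA-Saudi-TTS": 0.5,
}


def weight_memory_gb(params_billions: float, dtype: str = "fp16") -> float:
    """Return approximate weight memory in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 params) * (bytes per param) / (1e9 bytes per GB)
    return params_billions * BYTES_PER_PARAM[dtype]


def rank_by_memory(models: dict, dtype: str = "fp16") -> list:
    """Return (label, GB) pairs sorted from smallest to largest."""
    sizes = {name: weight_memory_gb(p, dtype) for name, p in models.items()}
    return sorted(sizes.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    for name, gb in rank_by_memory(MODELS):
        print(f"{name}: ~{gb:.1f} GB of weights at fp16")
```

Under these assumptions, the 7B model needs roughly an order of magnitude more weight memory than the 0.4B and 0.5B models, which is consistent with the notes calling it the heavier option and the small models the cheaper ones.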