Voice

geleai's Collections

updated Feb 22

nvidia/personaplex-7b-v1

Audio-to-Audio • Updated 25 days ago • 289k • 2.32k

Note Speech-to-speech model for conversational AI. At 7 billion parameters, it is one of the heavier models in this collection. It needs significant GPU power (typically cloud compute), so running costs are non-trivial.
nineninesix/kani-tts-2-pt

Text-to-Speech • 0.4B • Updated Feb 19 • 699 • 41

Note Text-to-speech model optimized for real-time conversations. **Commercial use generally requires explicit permission. At 0.4B parameters, it is cheaper to run than the other models here.**
Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

Text-to-Speech • 2B • Updated Jan 29 • 953k • 1.33k

Note Text-to-speech model with multiple modes:
- Custom Voice: 9 pre-built voices
- Voice Design: describe a voice in plain language (e.g., "warm female voice with a slight accent") and the model creates it
- Voice Clone: give it 3 seconds of someone's voice and it clones it

It supports 10 languages, including English, Chinese, Japanese, Korean, French, German, Spanish, and more. It has been benchmarked against ElevenLabs and competes well. Free, with commercial use allowed and no royalties. Cost to run: medium-sized model.
NAMAA-Space/NAMAA-Saudi-TTS

Text-to-Speech • 0.5B • Updated Jan 29 • 141 • 42

Note Text-to-speech model built specifically for Saudi Arabic. Free for both research and commercial use. Friendly to app builders. Cost to run: small model (0.5B), so one of the cheaper ones here.
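
The "cost to run" remarks in these notes mostly track parameter count. As a rough sketch, the memory needed just to hold each model's weights can be estimated as parameters times bytes per parameter. The figures below assume 16-bit weights (2 bytes per parameter), which is an assumption rather than something the model pages state here, and they ignore activations, caches, and any audio codec overhead, so real requirements will be higher. The helper name `weight_memory_gb` is hypothetical.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: 16-bit weights (2 bytes each). Activations, caches, and
# audio codec overhead are NOT included, so treat these as lower bounds.

# Sizes in billions of parameters, as listed in the collection.
MODELS_BILLION_PARAMS = {
    "nvidia/personaplex-7b-v1": 7.0,
    "nineninesix/kani-tts-2-pt": 0.4,
    "Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice": 2.0,  # listed as 2B on the page
    "NAMAA-Space/NAMAA-Saudi-TTS": 0.5,
}


def weight_memory_gb(billion_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    # billions of params * bytes per param = billions of bytes = GB
    return billion_params * bytes_per_param


# Print the models from smallest to largest estimated footprint.
for name, size in sorted(MODELS_BILLION_PARAMS.items(), key=lambda kv: kv[1]):
    print(f"{name}: ~{weight_memory_gb(size):.1f} GB of weights")
```

Under these assumptions the 7B speech-to-speech model needs roughly 14 GB for weights alone, while the 0.4B and 0.5B text-to-speech models need about 1 GB or less, which is consistent with the notes calling them the cheaper options.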