title: Kokoro-82M TTS - 54 Premium Voices
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0

🎙️ Kokoro-82M Text-to-Speech

World-Class TTS with 54 Premium Voices

✨ Features

🎭 54 Premium Voices

🇺🇸 American English (19 voices)

Female (11 voices):

  • Heart - Warm & Friendly
  • Bella - Elegant & Smooth
  • Nicole - Professional
  • Aoede - Cheerful
  • Kore - Gentle
  • Sarah - Clear
  • Nova - Modern
  • Sky - Light
  • Alloy - Versatile
  • Jessica - Natural
  • River - Calm

Male (8 voices):

  • Michael - Deep & Authoritative
  • Fenrir - Strong
  • Puck - Playful
  • Echo - Resonant
  • Eric - Professional
  • Liam - Friendly
  • Onyx - Rich
  • Adam - Natural

🇬🇧 British English (8 voices)

Female (4 voices):

  • Emma - Refined
  • Isabella - Elegant
  • Alice - Clear
  • Lily - Soft

Male (4 voices):

  • George - Distinguished
  • Fable - Storyteller
  • Lewis - Smooth
  • Daniel - Professional
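
The voice lists above can be kept as a small lookup table. The sketch below assumes the upstream Kokoro-82M naming convention for voice identifiers (accent letter, gender letter, underscore, lowercase name, e.g. `af_heart`); the identifiers this Space's `app.py` actually uses are not shown in this README, so `voice_id` and `all_voice_ids` are hypothetical helpers.

```python
# Voice names exactly as listed in this README, grouped by accent and gender.
VOICES = {
    ("American English", "Female"): [
        "Heart", "Bella", "Nicole", "Aoede", "Kore", "Sarah",
        "Nova", "Sky", "Alloy", "Jessica", "River",
    ],
    ("American English", "Male"): [
        "Michael", "Fenrir", "Puck", "Echo", "Eric", "Liam", "Onyx", "Adam",
    ],
    ("British English", "Female"): ["Emma", "Isabella", "Alice", "Lily"],
    ("British English", "Male"): ["George", "Fable", "Lewis", "Daniel"],
}

# Assumption: prefixes follow the upstream Kokoro-82M voice naming scheme.
ACCENT_PREFIX = {"American English": "a", "British English": "b"}
GENDER_PREFIX = {"Female": "f", "Male": "m"}


def voice_id(accent: str, gender: str, name: str) -> str:
    """Build an identifier such as 'af_heart' for a voice listed above."""
    if name not in VOICES.get((accent, gender), []):
        raise ValueError(f"unknown voice: {accent} / {gender} / {name}")
    return f"{ACCENT_PREFIX[accent]}{GENDER_PREFIX[gender]}_{name.lower()}"


def all_voice_ids() -> list[str]:
    """Return identifiers for every voice in the table, in listing order."""
    return [
        voice_id(accent, gender, name)
        for (accent, gender), names in VOICES.items()
        for name in names
    ]
```

Keeping the display name and the identifier separate lets a dropdown show "Heart - Warm & Friendly" while the synthesis call receives the short identifier.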

🏗️ Model Architecture

Kokoro-82M is based on StyleTTS 2 and uses an ISTFTNet decoder.


🎯 Features

  • ✅ 54 Unique Voices - American & British accents
  • ✅ Natural Prosody - Human-like intonation
  • ✅ Fast Generation - 2-5 seconds per sentence
  • ✅ Speed Control - 0.5x to 2x playback
  • ✅ High Quality - StyleTTS 2 architecture
  • ✅ Open Source - Apache 2.0 license


💻 Technology Stack

  • Backend: Gradio + Hugging Face Inference API
  • Model: Kokoro-82M (hexgrad/Kokoro-82M)
  • Architecture: StyleTTS 2 + ISTFTNet
  • Deployment: Hugging Face Spaces

🚀 Usage

  1. Choose Voice - Select from 54 premium voices
  2. Enter Text - Type or paste your content
  3. Adjust Speed - Control playback rate (0.5x - 2x)
  4. Generate - Click to synthesize speech
  5. Download - Save audio as WAV file
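
The five steps above can be sketched in plain Python. This is a minimal illustration, not the Space's actual `app.py`: the model call is replaced by a hypothetical stand-in (`fake_synthesize`) that produces a tone, and `clamp_speed`, `save_wav`, and `generate` are illustrative names. The 0.5x to 2x range comes from this README; the 24 kHz sample rate is the rate at which Kokoro-82M produces audio.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000               # Kokoro-82M produces 24 kHz audio
MIN_SPEED, MAX_SPEED = 0.5, 2.0    # speed range stated in this README


def clamp_speed(speed: float) -> float:
    """Step 3: keep the requested speed inside the supported range."""
    return max(MIN_SPEED, min(MAX_SPEED, float(speed)))


def save_wav(samples, path: str, sample_rate: int = SAMPLE_RATE) -> None:
    """Step 5: write float samples in [-1, 1] as 16-bit mono PCM."""
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)


def fake_synthesize(text: str, voice: str, speed: float):
    """Hypothetical stand-in for the model: a tone that gets shorter as speed rises."""
    duration = 0.05 * len(text) / speed
    count = int(duration * SAMPLE_RATE)
    return [0.3 * math.sin(2 * math.pi * 220 * i / SAMPLE_RATE) for i in range(count)]


def generate(text: str, voice: str, speed: float, path: str,
             synthesize=fake_synthesize) -> float:
    """Steps 1-5 in order; returns the audio length in seconds."""
    if not text.strip():
        raise ValueError("text must not be empty")
    samples = synthesize(text, voice, clamp_speed(speed))  # step 4
    save_wav(samples, path)
    return len(samples) / SAMPLE_RATE
```

Passing the synthesis function as a parameter keeps validation and file writing testable without downloading the model; a real deployment would pass a function that calls Kokoro-82M instead of `fake_synthesize`.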

📊 Comparison with Other Models

| Feature | Kokoro-82M | SpeechT5 | VITS |
|---------|------------|----------|------|
| Voices | 54 | 1 | Variable |
| Quality | Excellent | Good | Good |
| Speed | Fast | Medium | Fast |
| Accents | US/UK | Generic | Variable |
| License | Apache 2.0 | Apache 2.0 | MIT |

🎓 Credits

  • Model: hexgrad/Kokoro-82M
  • Base Architecture: StyleTTS 2 by Li et al.
  • Decoder: ISTFTNet
  • Training data: Permissively licensed audio only

📝 License

Apache 2.0 - Free for commercial use


🔗 Links


Built with ❤️ using Kokoro-82M & Gradio