Major Update: Kokoro-82M with 54 Premium Voices

#8
by masbudjj - opened

πŸŽ™οΈ Kokoro-82M Implementation - 54 Premium Voices

Major Changes:

  • ✅ Replace SpeechT5 with Kokoro-82M
  • ✅ 54 premium voices (American & British)
  • ✅ StyleTTS 2 architecture (82M parameters)
  • ✅ Gradio backend for better UX
  • ✅ HF Inference API integration

Voice Categories:

  1. 🇺🇸 American Female (11 voices)
  2. 🇺🇸 American Male (8 voices)
  3. 🇬🇧 British Female (4 voices)
  4. 🇬🇧 British Male (4 voices)
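
As a rough illustration of how these four categories might be represented in code, the sketch below groups voice IDs by their two-letter prefix. It assumes the `af_` / `am_` / `bf_` / `bm_` naming used for hexgrad/Kokoro-82M voices; the helper name `group_voices` and the short sample list are hypothetical and not taken from this PR.

```python
from collections import defaultdict

# Assumed prefix convention: first letter = accent (a/b), second = gender (f/m).
CATEGORY_BY_PREFIX = {
    "af": "American Female",
    "am": "American Male",
    "bf": "British Female",
    "bm": "British Male",
}

def group_voices(voice_ids):
    """Group voice IDs such as 'af_heart' under a category label (hypothetical helper)."""
    groups = defaultdict(list)
    for voice_id in voice_ids:
        prefix = voice_id.split("_", 1)[0]
        # Anything with an unrecognized prefix is collected under "Other".
        label = CATEGORY_BY_PREFIX.get(prefix, "Other")
        groups[label].append(voice_id)
    return dict(groups)

# Illustrative sample only, not the full voice list from this PR.
sample = ["af_heart", "am_adam", "bf_emma", "bm_george", "af_bella"]
grouped = group_voices(sample)
```

A mapping like this could feed a grouped dropdown in the UI, though the PR does not say how the voice picker is built.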

Technology:

  • Model: hexgrad/Kokoro-82M
  • Architecture: StyleTTS 2 + ISTFTNet
  • Backend: Gradio 4.x
  • API: Hugging Face Inference

Features:

  • 54 unique voice characters
  • Speed control (0.5x - 2x)
  • High-quality audio output
  • Natural prosody & emotion
  • Fast generation (~2-5s)
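
A minimal sketch of how the 0.5x - 2x speed range could be enforced before a request is sent to the model. The constant and function names here are hypothetical; the PR does not show how the range is actually validated.

```python
# Bounds taken from the "Speed control (0.5x - 2x)" feature above.
MIN_SPEED = 0.5
MAX_SPEED = 2.0

def clamp_speed(speed):
    """Return the requested speed limited to the supported range (hypothetical helper)."""
    return max(MIN_SPEED, min(MAX_SPEED, float(speed)))
```

Clamping rather than rejecting out-of-range values keeps a slider or API caller from failing on a slightly invalid input; raising an error instead would be an equally reasonable choice.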
masbudjj changed pull request status to merged
