Fix: Voice cloning working + Custom WAV encoder

#5
by masbudjj - opened
WS YB YT org

FINAL FIX - Voice Cloning Working!

Fixed Issues:

  1. βœ… WAV encoding: Implemented custom encodeWAV function
  2. βœ… Speaker encoder: Use Web Audio API (no WavLM dependency)
  3. βœ… Voice extraction: Spectral analysis (RMS, ZCR, centroid)
  4. βœ… Default voice: Working perfectly
  5. βœ… Cloned voice: Working with uploaded audio

Voice Cloning Algorithm:

  • Extract spectral features from uploaded audio
  • RMS energy (loudness)
  • Zero-crossing rate (pitch)
  • Spectral centroid (timbre)
  • Project to 512-dim embedding space
  • Blend 60% custom + 40% default for stability

Improvements:

  • No external model dependencies (faster loading)
  • Simplified but effective voice extraction
  • Better error handling
  • More stable voice cloning
masbudjj changed pull request status to merged

Sign up or log in to comment