title: Voice Cloning Studio
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
preload_from_hub:
  - coqui/XTTS-v2
  - openai/whisper-base

🎭 Voice Cloning Studio

Voice-to-voice and text-to-speech voice cloning, built on Coqui XTTS-v2 and OpenAI Whisper.

✨ Features

  • 🎤 Voice-to-Voice Cloning: Transform input audio using reference voice characteristics
  • 📝 Text-to-Speech: Generate speech in any cloned voice
  • 🌍 Multi-language Support: 8+ languages supported
  • 🎵 High Quality: 24 kHz audio output
  • ⚡ Real-time Processing: Fast voice cloning with XTTS-v2

🚀 How to Use

Voice-to-Voice Cloning

  1. Upload Reference Voice - 6+ seconds of clear speech from the person to clone
  2. Upload Input Audio - Speech content you want to transform
  3. Select Language - Choose target language
  4. Click "Clone Voice" - Whisper transcribes the input speech and XTTS-v2 re-synthesizes it in the reference voice
  5. Download Result - New audio with same content, different voice
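
The steps above amount to a transcribe-then-synthesize pipeline. The sketch below is one minimal way to express it, not this Space's actual `app.py`. It assumes the `openai-whisper` and Coqui `TTS` Python packages are installed; the function names (`clone_voice`, `load_whisper_transcriber`, `load_xtts_synthesizer`) and the default output path are hypothetical.

```python
from typing import Callable


def load_whisper_transcriber(model_name: str = "base") -> Callable[[str], str]:
    """Return a function mapping an audio path to its transcript.

    Assumption: the openai-whisper package is installed.
    """
    import whisper  # imported lazily because loading the model is slow

    model = whisper.load_model(model_name)
    return lambda audio_path: model.transcribe(audio_path)["text"].strip()


def load_xtts_synthesizer() -> Callable[..., object]:
    """Return Coqui TTS's tts_to_file method bound to XTTS-v2.

    Assumption: the Coqui TTS package is installed.
    """
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    return tts.tts_to_file


def clone_voice(reference_wav, input_wav, language, transcribe, synthesize,
                output_path="cloned.wav"):
    """Transcribe input_wav, then speak the transcript in the reference voice."""
    text = transcribe(input_wav)
    if not text:
        raise ValueError("No speech could be transcribed from the input audio.")
    synthesize(text=text, speaker_wav=reference_wav,
               language=language, file_path=output_path)
    return output_path


# Usage (downloads both models on first run):
# clone_voice("reference.wav", "input.wav", "en",
#             load_whisper_transcriber(), load_xtts_synthesizer())
```

Passing the transcriber and synthesizer in as callables keeps the pipeline logic separate from model loading, so the models can be loaded once at startup and reused across requests.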

Text-to-Speech Cloning

  1. Upload Reference Voice - Voice sample to clone
  2. Enter Text - Type what you want the cloned voice to say
  3. Generate Speech - Create natural speech in the cloned voice
  4. Download Result - High-quality synthesized audio
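
For readers who want to reproduce the text-to-speech flow outside the web UI, here is a hedged sketch using the Coqui `TTS` Python API. It assumes the `TTS` package is installed and that the reference recording is a local audio file; `text_to_speech` and its default output path are hypothetical names, and the Space's own handler may differ.

```python
def text_to_speech(text, reference_wav, language="en",
                   output_path="speech.wav", tts=None):
    """Synthesize `text` in the voice of `reference_wav`; return the output path."""
    text = text.strip()
    if not text:
        raise ValueError("Enter some text to synthesize.")
    if tts is None:
        # Assumption: Coqui TTS is installed. Loaded lazily because the
        # XTTS-v2 checkpoint is large and slow to initialize.
        from TTS.api import TTS

        tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(text=text, speaker_wav=reference_wav,
                    language=language, file_path=output_path)
    return output_path


# Usage:
# text_to_speech("Hello there.", "reference.wav", language="en")
```

The optional `tts` argument lets a caller pass in an already loaded model instead of reloading it on every request.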

🔧 Technical Details

  • TTS Model: XTTS-v2 (Coqui AI) - State-of-the-art voice cloning
  • Speech Recognition: Whisper (OpenAI) - Accurate transcription
  • Languages: English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese
  • Quality: 24 kHz audio output
  • Processing: Runs on GPU when available, with automatic fallback to CPU
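
As an illustration of the language list and the CPU/GPU fallback, the sketch below maps the eight language names above to the codes XTTS-v2 expects and picks a device. The helper names (`LANGUAGE_CODES`, `language_code`, `pick_device`) are hypothetical, and the fallback logic is an assumption about how such a check is commonly written with PyTorch, not a copy of this Space's code.

```python
# XTTS-v2 language codes for the eight languages listed above.
LANGUAGE_CODES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Portuguese": "pt",
    "Chinese": "zh-cn",
    "Japanese": "ja",
}


def language_code(name):
    """Translate a display name such as "French" into an XTTS-v2 code."""
    try:
        return LANGUAGE_CODES[name]
    except KeyError:
        raise ValueError(f"Unsupported language: {name!r}") from None


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch available: nothing to place on a GPU
    return "cuda" if torch.cuda.is_available() else "cpu"


# Usage with Coqui TTS (assumed installed):
# tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(pick_device())
```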

💡 Tips for Best Results

  • Reference Audio: Use clear, single-speaker recordings with minimal background noise
  • Length: 6-10 seconds of reference audio works best
  • Quality: Higher quality input leads to better cloning results
  • Language: For best results, select the language spoken in the reference recording when possible
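
The length tip can be checked before uploading. The following sketch uses only Python's standard `wave` module, so it handles uncompressed PCM WAV files only; the function name `check_reference_length` and the default 6 to 10 second bounds (taken from the tip above) are illustrative. It does not assess background noise or the number of speakers.

```python
import wave


def check_reference_length(path, min_seconds=6.0, max_seconds=10.0):
    """Return a list of warnings about the clip length (empty if it is in range)."""
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        duration = wav.getnframes() / wav.getframerate()

    warnings = []
    if duration < min_seconds:
        warnings.append(
            f"Reference is {duration:.1f} s; aim for at least {min_seconds:.0f} s."
        )
    elif duration > max_seconds:
        warnings.append(
            f"Reference is {duration:.1f} s; {min_seconds:.0f}-{max_seconds:.0f} s works best."
        )
    return warnings


# Usage:
# for message in check_reference_length("reference.wav"):
#     print(message)
```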

🛠️ Built With


Note: This Space implements real voice cloning technology. Please use it responsibly and respect others' voice rights and privacy.