
title: Voice Cloning Studio
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
preload_from_hub:
  - coqui/XTTS-v2
  - openai/whisper-base

🎭 Voice Cloning Studio

Voice-to-voice and text-to-speech voice cloning built on XTTS-v2 (Coqui) and Whisper (OpenAI).

✨ Features

🎤 Voice-to-Voice Cloning: Transform input audio using reference voice characteristics
📝 Text-to-Speech: Generate speech in any cloned voice
🌍 Multi-language Support: 8 languages, listed under Technical Details
🎵 High Quality: Professional 24kHz audio output
⚡ Real-time Processing: Fast voice cloning with XTTS-v2

🚀 How to Use

Voice-to-Voice Cloning

Upload Reference Voice - 6+ seconds of clear speech from the person to clone
Upload Input Audio - Speech content you want to transform
Select Language - Choose target language
Click "Clone Voice" - Whisper transcribes the input audio, then XTTS-v2 speaks the transcript in the reference voice
Download Result - New audio with same content, different voice
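The voice-to-voice steps above can be sketched as a two-stage pipeline: transcribe the input audio, then synthesize the transcript in the reference voice. This is a minimal sketch, not the code in app.py. The names `clone_voice`, `load_whisper_transcriber`, `Transcriber`, and `Synthesizer` are hypothetical, and loading Whisper through the `transformers` ASR pipeline is an assumption based on the `openai/whisper-base` entry in the Space metadata.

```python
from typing import Callable

# Hypothetical aliases for this sketch; they are not names taken from app.py.
Transcriber = Callable[[str], str]  # audio path -> transcript
# text, reference wav path, language code, output path -> output path
Synthesizer = Callable[[str, str, str, str], str]


def load_whisper_transcriber(model_id: str = "openai/whisper-base") -> Transcriber:
    """Build a transcriber from the Hugging Face `transformers` ASR pipeline.

    Assumption: the Space loads Whisper this way. The import is lazy because
    `transformers` is a heavy dependency.
    """
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)

    def transcribe(audio_path: str) -> str:
        return asr(audio_path)["text"].strip()

    return transcribe


def clone_voice(
    reference_wav: str,
    input_audio: str,
    language: str,
    transcribe: Transcriber,
    synthesize: Synthesizer,
    out_path: str = "cloned.wav",
) -> str:
    """Extract the spoken content, then re-speak it in the reference voice."""
    text = transcribe(input_audio)
    if not text:
        raise ValueError("No speech could be transcribed from the input audio.")
    return synthesize(text, reference_wav, language, out_path)
```

Passing `transcribe` and `synthesize` in as callables keeps the pipeline independent of any one model. Here `synthesize` stands for any function that produces speech in the reference voice, presumably an XTTS-v2 wrapper in this Space. Because only the transcript is carried across, the timing and intonation of the input audio are not expected to survive.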

Text-to-Speech Cloning

Upload Reference Voice - Voice sample to clone
Enter Text - Type what you want the cloned voice to say
Generate Speech - Create natural speech in the cloned voice
Download Result - High-quality synthesized audio
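The text-to-speech steps above map onto a single synthesis call. The sketch below assumes the Coqui `TTS` Python package and its `tts_to_file` method; the helper names `load_xtts` and `speak` are hypothetical and are not taken from app.py.

```python
def load_xtts(device: str = "cpu"):
    """Load XTTS-v2 through the Coqui `TTS` package (assumed dependency).

    The import is lazy because the package and the model weights are large.
    """
    from TTS.api import TTS

    return TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)


def speak(tts, text: str, reference_wav: str, language: str = "en",
          out_path: str = "speech.wav") -> str:
    """Synthesize `text` in the voice of `reference_wav` and return the output path."""
    text = text.strip()
    if not text:
        raise ValueError("Enter some text to synthesize.")
    # speaker_wav is the reference recording whose voice is cloned.
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wav,
        language=language,
        file_path=out_path,
    )
    return out_path
```

A typical call would be `speak(load_xtts(), "Hello there", "reference.wav", language="en")`. Loading the model once and reusing the `tts` object across requests avoids paying the load time on every click.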

🔧 Technical Details

TTS Model: XTTS-v2 (Coqui AI) - State-of-the-art voice cloning
Speech Recognition: Whisper (OpenAI) - Accurate transcription
Languages: English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese
Quality: 24kHz professional audio generation
Processing: Runs on GPU when available and falls back to CPU automatically
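As a rough illustration of the details above, the language names can be mapped to the codes XTTS-v2 expects, and the device can be chosen with a CPU fallback. The helper names `language_code` and `pick_device` are hypothetical, and the fallback logic is an assumption about how app.py behaves, not a copy of it.

```python
# Codes XTTS-v2 uses for the eight languages listed above.
LANGUAGE_CODES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Portuguese": "pt",
    "Chinese": "zh-cn",
    "Japanese": "ja",
}

SAMPLE_RATE_HZ = 24_000  # output sample rate stated above


def language_code(name: str) -> str:
    """Translate a display name such as "French" into an XTTS-v2 language code."""
    try:
        return LANGUAGE_CODES[name.strip().title()]
    except KeyError:
        raise ValueError(f"Unsupported language: {name!r}") from None


def pick_device() -> str:
    """Prefer CUDA when PyTorch reports a GPU; otherwise fall back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```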

💡 Tips for Best Results

Reference Audio: Use clear, single-speaker recordings with minimal background noise
Length: 6-10 seconds of reference audio works best
Quality: Higher quality input leads to better cloning results
Language: When possible, use a reference recording in the same language as the speech you want to generate
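The length tip above can be checked before uploading. The sketch below uses only the standard-library `wave` module, so it handles uncompressed PCM WAV files only; the function name `check_reference_clip` and the warning wording are hypothetical.

```python
import wave

MIN_SECONDS = 6.0   # lower bound from the tips above
MAX_SECONDS = 10.0  # upper bound from the tips above


def check_reference_clip(path: str) -> list[str]:
    """Return warnings for a reference WAV; an empty list means the length is in range."""
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()

    warnings = []
    if duration < MIN_SECONDS:
        warnings.append(
            f"Clip is {duration:.1f}s long; at least {MIN_SECONDS:.0f}s is recommended."
        )
    elif duration > MAX_SECONDS:
        warnings.append(
            f"Clip is {duration:.1f}s long; {MIN_SECONDS:.0f}-{MAX_SECONDS:.0f}s works best."
        )
    return warnings
```

This only measures length. Background noise and multiple speakers still need to be judged by ear.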

🛠️ Built With

XTTS-v2 - Voice cloning model
Whisper - Speech recognition
Gradio - Web interface
HuggingFace Spaces - Hosting platform

Note: This Space implements real voice cloning technology. Please use it responsibly and respect others' voice rights and privacy.