voiceclone-dev / README.md

crackuser

Update README.md

60dcf48 verified 4 months ago

	---
	title: Voice Cloning Studio
	emoji: 🎤
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: "4.44.0"
	app_file: app.py
	pinned: false
	preload_from_hub:
	- coqui/XTTS-v2
	- openai/whisper-base
	---

	# 🎭 Voice Cloning Studio

	Voice-to-voice and text-to-speech cloning built on XTTS-v2 (speech synthesis) and Whisper (speech recognition).

	## ✨ Features

	- 🎤 Voice-to-Voice Cloning: Transform input audio using reference voice characteristics
	- 📝 Text-to-Speech: Generate speech in any cloned voice
	- 🌍 Multi-language Support: 8+ languages supported
	- 🎵 High Quality: 24 kHz audio output
	- ⚡ Real-time Processing: Fast voice cloning with XTTS-v2

	## 🚀 How to Use

	### Voice-to-Voice Cloning
	1. Upload Reference Voice - 6+ seconds of clear speech from the person to clone
	2. Upload Input Audio - Speech content you want to transform
	3. Select Language - Choose target language
	4. Click "Clone Voice" - Whisper transcribes the input audio, and XTTS-v2 speaks the transcript in the reference voice
	5. Download Result - New audio with same content, different voice
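
The voice-to-voice steps above amount to a transcribe-then-synthesize pipeline. The sketch below is one possible way to wire it up, not the code in this Space's `app.py`. It assumes the `openai-whisper` and `TTS` (Coqui) packages are installed; the function names and the `cloned.wav` default are hypothetical.

```python
from typing import Callable


def transcribe_with_whisper(audio_path: str) -> str:
    """Transcribe speech with the open-source Whisper package (imported lazily)."""
    import whisper  # assumption: installed via `pip install openai-whisper`

    # Loading the model on every call is slow; a real app would load it once.
    model = whisper.load_model("base")
    return model.transcribe(audio_path)["text"].strip()


def synthesize_with_xtts(text: str, reference_wav: str, language: str, output_path: str) -> str:
    """Speak `text` in the voice of `reference_wav` using Coqui XTTS-v2."""
    from TTS.api import TTS  # assumption: installed via `pip install TTS`

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wav,
        language=language,
        file_path=output_path,
    )
    return output_path


def clone_voice_to_voice(
    reference_wav: str,
    input_audio: str,
    language: str,
    output_path: str = "cloned.wav",  # hypothetical default
    transcribe: Callable[[str], str] = transcribe_with_whisper,
    synthesize: Callable[[str, str, str, str], str] = synthesize_with_xtts,
) -> str:
    """Extract the spoken content of `input_audio` and re-speak it in the reference voice."""
    text = transcribe(input_audio)
    if not text:
        raise ValueError("No speech was recognized in the input audio.")
    return synthesize(text, reference_wav, language, output_path)
```

Passing `transcribe` and `synthesize` as parameters keeps the pipeline testable without downloading either model. Note that `language` here is an XTTS language code such as `"en"`.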

	### Text-to-Speech Cloning
	1. Upload Reference Voice - Voice sample to clone
	2. Enter Text - Type what you want the cloned voice to say
	3. Generate Speech - Create natural speech in the cloned voice
	4. Download Result - High-quality synthesized audio
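
The text-to-speech flow skips transcription and goes straight to synthesis. A minimal sketch follows, assuming the Coqui `TTS` package; `build_tts_request`, `text_to_speech`, and the `speech.wav` default are illustrative names, not taken from this Space.

```python
from functools import lru_cache

XTTS_MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"


@lru_cache(maxsize=1)
def load_xtts():
    """Load XTTS-v2 once and reuse it across requests."""
    from TTS.api import TTS  # assumption: installed via `pip install TTS`

    return TTS(XTTS_MODEL_NAME)


def build_tts_request(
    text: str,
    reference_wav: str,
    language: str = "en",
    output_path: str = "speech.wav",  # hypothetical default
) -> dict:
    """Validate the text and build keyword arguments for `tts_to_file`."""
    cleaned = " ".join(text.split())  # collapse runs of whitespace
    if not cleaned:
        raise ValueError("Enter some text to synthesize.")
    return {
        "text": cleaned,
        "speaker_wav": reference_wav,
        "language": language,
        "file_path": output_path,
    }


def text_to_speech(text: str, reference_wav: str, language: str = "en",
                   output_path: str = "speech.wav") -> str:
    """Generate speech for `text` in the voice of `reference_wav`."""
    request = build_tts_request(text, reference_wav, language, output_path)
    load_xtts().tts_to_file(**request)
    return request["file_path"]
```

Separating request building from synthesis lets empty input be rejected before the model is loaded.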

	## 🔧 Technical Details

	- TTS Model: XTTS-v2 (Coqui AI) - State-of-the-art voice cloning
	- Speech Recognition: Whisper base (OpenAI) - Transcribes the input audio
	- Languages: English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese
	- Quality: 24 kHz audio output
	- Processing: Runs on GPU when available, with automatic fallback to CPU
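
As an illustration of the language list and the CPU/GPU fallback, here is a small sketch. The codes are the XTTS-v2 language codes as I understand them (check the model card before relying on them), and `pick_device` and `language_code` are hypothetical helpers rather than functions from this Space.

```python
# Display name -> XTTS-v2 language code (assumed; verify against the model card).
LANGUAGE_CODES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Portuguese": "pt",
    "Chinese": "zh-cn",
    "Japanese": "ja",
}


def pick_device() -> str:
    """Return "cuda" when a GPU is usable, otherwise fall back to "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch available, so no GPU path either
    return "cuda" if torch.cuda.is_available() else "cpu"


def language_code(name: str) -> str:
    """Map a display name such as "French" to its language code."""
    try:
        return LANGUAGE_CODES[name.strip().title()]
    except KeyError:
        raise ValueError(f"Unsupported language: {name!r}") from None
```

With the Coqui API, the chosen device would typically be applied as `TTS(model_name).to(pick_device())`.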

	## 💡 Tips for Best Results

	- Reference Audio: Use clear, single-speaker recordings with minimal background noise
	- Length: 6-10 seconds of reference audio works best
	- Quality: Higher quality input leads to better cloning results
	- Language: When possible, choose the same target language as the one spoken in the reference recording
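
The length tip can be checked before uploading. The sketch below uses only the Python standard library and handles uncompressed WAV files only; the 6 and 10 second bounds come from the tips above, while `check_reference_clip` is a hypothetical helper.

```python
import wave
from typing import List

MIN_SECONDS = 6.0   # lower bound from the tips above
MAX_SECONDS = 10.0  # upper bound from the tips above


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_reference_clip(path: str) -> List[str]:
    """Return a list of warnings; an empty list means the length looks fine."""
    warnings = []
    duration = wav_duration_seconds(path)
    if duration < MIN_SECONDS:
        warnings.append(
            f"Clip is {duration:.1f} s; at least {MIN_SECONDS:.0f} s is recommended."
        )
    elif duration > MAX_SECONDS:
        warnings.append(
            f"Clip is {duration:.1f} s; {MIN_SECONDS:.0f}-{MAX_SECONDS:.0f} s works best."
        )
    return warnings
```

This only checks duration. Background noise and multiple speakers still need to be judged by ear.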

	## 🛠️ Built With

	- [XTTS-v2](https://huggingface.co/coqui/XTTS-v2) - Voice cloning model
	- [Whisper](https://github.com/openai/whisper) - Speech recognition
	- [Gradio](https://gradio.app/) - Web interface
	- [HuggingFace Spaces](https://huggingface.co/spaces) - Hosting platform

	---

	Note: This Space performs real voice cloning. Please use it responsibly and respect other people's voice rights and privacy.