---
title: Hindi Voice Cloning (VibeVoice)
emoji: 🎙️
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
---

# 🇮🇳 Hindi Voice Cloning with Emotion

This Hugging Face Space provides **high-quality Hindi Text-to-Speech with voice cloning and expressive emotion**.

Users can upload a short reference voice sample and generate Hindi speech in the **same voice, tone, and emotional style**.

The system is powered by **VibeVoice-7B** with **Hindi LoRA fine-tuning**, optimized for natural prosody and long-form speech.

---

## ✨ Features

- 🎙️ Voice cloning from uploaded reference audio
- 🎭 Emotion & speaking style transfer
- 🗣️ Natural-sounding Hindi TTS
- 📄 Long-form narration support
- 🚀 GPU-accelerated inference
- 🎚️ Expression strength control (CFG scale)

---

## 🧪 How to Use

1. Enter Hindi text in the text box
2. Upload a **reference voice (WAV format)**
3. Adjust **Expression Strength (CFG Scale)**
4. Click **🚀 Generate Voice**
5. Listen to or download the generated audio

---

## 🎧 Reference Voice Guidelines (Very Important)

For best quality voice cloning:

- WAV format only
- 10–30 seconds duration recommended
- Single speaker
- Clear audio, minimal background noise
- Natural emotion (happy, calm, sad, etc.)

> ⚠️ Emotion is copied from the **reference voice**, not from the text.

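
Before uploading, the format and duration guidelines above can be checked locally. The following is a minimal sketch using only Python's standard `wave` module; the helper name `check_reference_wav` and the two constants are hypothetical and are not part of this Space's `app.py`. It only checks that the file is a readable PCM WAV and that its length falls in the recommended 10 to 30 second range; it cannot tell whether the clip has a single speaker or low background noise.

```python
import wave

# Duration bounds from the guidelines above (10 to 30 seconds recommended).
MIN_SECONDS = 10.0
MAX_SECONDS = 30.0


def check_reference_wav(path):
    """Return a list of problems found in a reference clip (empty list = none found).

    Hypothetical helper for illustration; it is not part of this Space's code.
    """
    try:
        with wave.open(path, "rb") as wav:
            frame_count = wav.getnframes()
            sample_rate = wav.getframerate()
    except (wave.Error, EOFError) as exc:
        # The wave module only reads uncompressed PCM WAV files.
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    duration = frame_count / sample_rate if sample_rate else 0.0
    if duration < MIN_SECONDS:
        problems.append(
            f"clip is {duration:.1f}s; at least {MIN_SECONDS:.0f}s is recommended"
        )
    elif duration > MAX_SECONDS:
        problems.append(
            f"clip is {duration:.1f}s; at most {MAX_SECONDS:.0f}s is recommended"
        )
    return problems
```
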
---

## 🎭 Expression Control (CFG Scale)

| CFG Scale | Effect |
|-----------|--------|
| 0.8 – 1.0 | Calm / neutral |
| 1.2 – 1.4 | Natural & expressive (recommended) |
| 1.5 – 2.0 | Strong emotion (may distort if too high) |

---

## ⚠️ System Requirements

- ✅ GPU required
- Recommended: A10 / A100 / H100
- ❌ CPU-only Spaces will not work
- ⏳ First run may take time due to model loading

---

## 🔐 Privacy & Data Handling

- Uploaded voice files are used **only for generation**
- Voice files are overwritten per request
- No permanent storage or reuse of user voices

---

## 🚫 Responsible Use Policy

This Space is intended for **research and demonstration purposes only**.

- ❌ Do NOT clone voices of real individuals without **explicit consent**
- ❌ Do NOT use for impersonation, fraud, or misinformation
- ❌ Do NOT present generated audio as real recordings
- ✔ Always disclose AI-generated audio when sharing publicly

---

## 🧠 Model Information

- **Base Model:** VibeVoice-7B
- **Hindi Fine-Tuning:** Hindi LoRA adapters
- **Architecture:** LLM + acoustic & semantic tokenizers + diffusion head
- **Technique:** LoRA (parameter-efficient fine-tuning)

---

## 📜 License

MIT License
(same as the base VibeVoice model and adapters)

---

## 🙏 Acknowledgements

- Microsoft Research – VibeVoice
- VibeVoice Community
- Hugging Face Open-Source Ecosystem

---

### ⚡ Note

This is a **research/demo Space**, not recommended for production or real-time applications.

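
For anyone scripting around the demo, the ranges in the Expression Control (CFG Scale) table can be written as a small lookup. This is only a sketch of the table, not the Space's own logic: the function name `describe_cfg_scale` is hypothetical, and values that fall between or outside the listed ranges are reported as undocumented rather than guessed.

```python
def describe_cfg_scale(cfg):
    """Map a CFG scale value to the effect listed in the Expression Control table.

    Hypothetical helper for illustration; the ranges are copied from the table.
    """
    if 0.8 <= cfg <= 1.0:
        return "calm / neutral"
    if 1.2 <= cfg <= 1.4:
        return "natural & expressive (recommended)"
    if 1.5 <= cfg <= 2.0:
        return "strong emotion (may distort if too high)"
    # The table leaves gaps (for example 1.0 to 1.2), so no effect is claimed.
    return "not covered by the documented ranges"


for value in (0.9, 1.3, 1.8):
    print(value, "->", describe_cfg_scale(value))
```
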