---
title: Vakya 2.0 - Text-to-Speech
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: mit
---
# 🎙️ Vakya 2.0 - Text-to-Speech Playground
**Vakya** is a high-quality Text-to-Speech model based on the IndicF5 architecture, supporting **11 Indian languages**.
## 🌟 Features
- **Multi-language Support**: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu
- **Voice Cloning**: Uses reference audio to clone voice characteristics
- **High Quality**: 24kHz sample rate, 0.4B parameter model
- **Easy to Use**: Simple interface for testing and experimentation
## 🚀 How to Use
1. **Load Model**: Click the "Load Model" button (the first load may take a few minutes while the model downloads)
2. **Upload Reference Audio**: Upload a short audio clip (<15 seconds recommended) that represents the voice you want to clone
3. **Enter Reference Text** (Optional): Type what is spoken in the reference audio. If left blank, the model will auto-transcribe it
4. **Enter Text to Generate**: Type the text you want to synthesize in any supported language
5. **Adjust Settings** (Optional):
   - Speed: Control the speech rate (0.5x to 2.0x)
   - Remove Silences: Experimental feature to remove pauses
6. **Generate**: Click "Generate Speech" and wait for the audio output
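
The inputs from the steps above can be sketched as a small request object with basic validation. This is a minimal illustration, not the Space's actual code: `GenerationRequest` and `validate_request` are hypothetical names, the default speed of 1.0 is an assumption, and treating reference audio as mandatory is inferred from the voice-cloning workflow rather than stated above. Only the 0.5x to 2.0x speed range comes from the settings list.

```python
from dataclasses import dataclass

# Range taken from the Speed setting described in the steps above.
MIN_SPEED = 0.5
MAX_SPEED = 2.0


@dataclass
class GenerationRequest:
    """Hypothetical container for the inputs listed in the steps above."""
    ref_audio_path: str
    gen_text: str
    ref_text: str = ""            # blank: the model auto-transcribes the reference
    speed: float = 1.0            # assumed default, not documented above
    remove_silences: bool = False


def validate_request(req: GenerationRequest) -> GenerationRequest:
    """Return a cleaned copy of the request, or raise ValueError if unusable."""
    if not req.ref_audio_path:
        raise ValueError("reference audio is required")
    if not req.gen_text.strip():
        raise ValueError("text to generate must not be empty")
    # Clamp the speed into the documented 0.5x to 2.0x range.
    speed = min(MAX_SPEED, max(MIN_SPEED, req.speed))
    return GenerationRequest(
        ref_audio_path=req.ref_audio_path,
        gen_text=req.gen_text.strip(),
        ref_text=req.ref_text.strip(),
        speed=speed,
        remove_silences=req.remove_silences,
    )
```

Clamping the speed rather than rejecting it mirrors how a slider behaves in a UI; a stricter caller could raise an error for out-of-range values instead.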
## 📋 Model Information
- **Model**: Vakya 2.0
- **Repository**: [ashishkblink/vakya2.0](https://huggingface.co/ashishkblink/vakya2.0)
- **Based on**: [IndicF5](https://github.com/AI4Bharat/IndicF5) by AI4Bharat (IIT Madras)
- **Model Size**: 0.4B parameters
- **Sample Rate**: 24000 Hz
- **Training Data**: 1417 hours of high-quality speech
- **License**: MIT License
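
To make the 24000 Hz sample rate concrete, the sketch below converts a duration into a sample count and an approximate raw size. The helper names are hypothetical, and the 16-bit mono PCM layout is an assumption for illustration; the output format is not specified above.

```python
SAMPLE_RATE_HZ = 24_000  # sample rate from the model information above


def num_samples(duration_seconds: float, sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples in a clip of the given duration."""
    return round(duration_seconds * sample_rate)


def pcm16_mono_bytes(duration_seconds: float, sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Raw size in bytes, assuming 16-bit (2-byte) mono PCM samples."""
    return num_samples(duration_seconds, sample_rate) * 2
```

For example, a 15-second clip at this rate holds 360,000 samples, or 720,000 bytes under the 16-bit mono assumption.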
## 💡 Tips for Best Results
- Keep reference audio clips short (<15 seconds)
- Use clear, high-quality reference audio
- Provide reference text when possible for better voice matching
- The model works best when the reference audio comes from a native speaker of the target language
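
The clip-length tip can be checked before uploading. The sketch below uses Python's standard `wave` module, so it only handles uncompressed PCM WAV files; other formats would need a different reader. The function names and the `MAX_REF_SECONDS` constant are hypothetical, and the 15-second value is the recommendation from the tips above, not a hard limit enforced by the model.

```python
import wave

MAX_REF_SECONDS = 15.0  # recommendation from the tips above, not a hard limit


def wav_duration_seconds(path: str) -> float:
    """Duration of an uncompressed PCM WAV file, in seconds."""
    with wave.open(path, "rb") as wf:
        # Frames divided by frames-per-second gives the duration.
        return wf.getnframes() / wf.getframerate()


def is_recommended_length(path: str, limit: float = MAX_REF_SECONDS) -> bool:
    """True if the clip is shorter than the recommended limit."""
    return wav_duration_seconds(path) < limit
```

A caller might warn rather than reject when `is_recommended_length` returns `False`, since the limit is a recommendation.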
## ⚠️ Terms of Use
- You must have explicit permission to clone voices
- Unauthorized voice cloning is strictly prohibited
- Any misuse of this model is the responsibility of the user
- This model is for research and educational purposes
## 🔗 Links
- **Model Repository**: [ashishkblink/vakya2.0](https://huggingface.co/ashishkblink/vakya2.0)
- **GitHub**: [ashishkblink/vakya](https://github.com/ashishkblink/vakya)
- **IndicF5**: [AI4Bharat/IndicF5](https://github.com/AI4Bharat/IndicF5)
## 🙏 Acknowledgments
This model is based on **IndicF5** developed by AI4Bharat (IIT Madras).
---
**Vakya** - Bringing voices to Indian languages 🎙️