---
title: Vakya 2.0 - Text-to-Speech
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: mit
---

# 🎙️ Vakya 2.0 - Text-to-Speech Playground

**Vakya** is a high-quality Text-to-Speech model based on the IndicF5 architecture, supporting **11 Indian languages**.

## 🌟 Features

- **Multi-language Support**: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu
- **Voice Cloning**: Uses reference audio to clone voice characteristics
- **High Quality**: 24 kHz sample rate, 0.4B parameter model
- **Easy to Use**: Simple interface for testing and experimentation

## 🚀 How to Use

1. **Load Model**: Click the "Load Model" button. The first load may take a few minutes while the model downloads.
2. **Upload Reference Audio**: Upload a short audio clip (under 15 seconds recommended) of the voice you want to clone.
3. **Enter Reference Text** (Optional): Type exactly what is spoken in the reference audio. If left blank, the model will auto-transcribe it.
4. **Enter Text to Generate**: Type the text you want to synthesize in any supported language
5. **Adjust Settings** (Optional): 
   - Speed: Control the speech rate (0.5x to 2.0x)
   - Remove Silences: Experimental feature to remove pauses
6. **Generate**: Click "Generate Speech" and wait for the audio output
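
The inputs collected in the steps above can be checked before a generation request is sent. The following is a minimal sketch only: `GenerationRequest` and `validate_request` are hypothetical names that are not taken from this Space's `app.py`, and the default speed of `1.0` is an assumption. The 0.5 to 2.0 speed range and the optional reference text come from the steps above.

```python
from dataclasses import dataclass

# Speed range taken from the "Adjust Settings" step above.
MIN_SPEED = 0.5
MAX_SPEED = 2.0


@dataclass
class GenerationRequest:
    """Hypothetical container for the inputs described in the steps above."""
    text: str                      # text to synthesize
    ref_audio_path: str            # path to the uploaded reference clip
    ref_text: str = ""             # empty means "let the model auto-transcribe"
    speed: float = 1.0             # assumed default, not confirmed by the source
    remove_silences: bool = False  # experimental option


def validate_request(req: GenerationRequest) -> list[str]:
    """Return a list of problems; an empty list means the request looks usable."""
    problems = []
    if not req.text.strip():
        problems.append("text to generate is empty")
    if not req.ref_audio_path:
        problems.append("reference audio is missing")
    if not MIN_SPEED <= req.speed <= MAX_SPEED:
        problems.append(f"speed {req.speed} is outside {MIN_SPEED}-{MAX_SPEED}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a UI show every issue at once.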

## 📋 Model Information

- **Model**: Vakya 2.0
- **Repository**: [ashishkblink/vakya2.0](https://huggingface.co/ashishkblink/vakya2.0)
- **Based on**: [IndicF5](https://github.com/AI4Bharat/IndicF5) by AI4Bharat (IIT Madras)
- **Model Size**: 0.4B parameters
- **Sample Rate**: 24 kHz (24,000 Hz)
- **Training Data**: 1,417 hours of high-quality speech
- **License**: MIT License

## 💡 Tips for Best Results

- Keep reference audio clips short (under 15 seconds)
- Use clear, high-quality reference audio
- Provide reference text when possible for better voice matching
- The model works best when the reference audio comes from a native speaker of the target language
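
The length recommendation above can be checked locally before uploading. This is a minimal sketch that assumes the reference clip is an uncompressed PCM WAV file. It uses only Python's standard `wave` module, and the helper names are hypothetical. Other formats such as MP3 or FLAC would need a different reader.

```python
import wave

# Upper bound recommended in the tips above.
MAX_REF_SECONDS = 15.0


def wav_duration_seconds(path: str) -> float:
    """Duration of an uncompressed PCM WAV file, in seconds."""
    with wave.open(path, "rb") as wav:
        # frames / frames-per-second = seconds
        return wav.getnframes() / wav.getframerate()


def is_good_reference_length(path: str) -> bool:
    """True if the clip is non-empty and shorter than the recommended limit."""
    return 0 < wav_duration_seconds(path) < MAX_REF_SECONDS
```

For example, `is_good_reference_length("my_voice.wav")` (a hypothetical file name) returns `False` for an empty clip or for one that runs 15 seconds or longer.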

## ⚠️ Terms of Use

- You must have explicit permission to clone voices
- Unauthorized voice cloning is strictly prohibited
- Any misuse of this model is the responsibility of the user
- This model is for research and educational purposes

## 🔗 Links

- **Model Repository**: [ashishkblink/vakya2.0](https://huggingface.co/ashishkblink/vakya2.0)
- **GitHub**: [ashishkblink/vakya](https://github.com/ashishkblink/vakya)
- **IndicF5**: [AI4Bharat/IndicF5](https://github.com/AI4Bharat/IndicF5)

## 🙏 Acknowledgments

This model is based on **IndicF5** developed by AI4Bharat (IIT Madras).

---

**Vakya** - Bringing voices to Indian languages 🎙️