Update README.md
README.md CHANGED
@@ -1,3 +1,59 @@
----
-license: gpl-3.0
-
+---
+license: gpl-3.0
+datasets:
+- ai4bharat/IndicVoices
+language:
+- hi
+- bn
+- ta
+- te
+- ml
+- kn
+- gu
+- mr
+- or
+- pa
+- as
+- en
+base_model:
+- Plachta/Seed-VC
+pipeline_tag: audio-to-audio
+tags:
+- voice-conversion
+- Voice-Changer
+- Voice
+---
+
+# IndicVoiceChanger
+
+**IndicVoiceChanger** is a fine-tuned version of the [Seed Voice Conversion](https://huggingface.co/Plachta/Seed-VC/tree/main) model, adapted for Indian languages.
+It enables high-quality voice conversion across multiple Indian languages, converting a source speaker's voice to a target speaker's while preserving the spoken content.
+
+## Overview
+This model is built upon the Seed Voice Conversion checkpoints and fine-tuned on a mix of publicly available open-source datasets and our proprietary dataset.
+It is designed to work well on speech data from diverse Indian languages, accents, and speaking styles.
+
+## Try It Out
+
+Experience the model firsthand at: **[Hugging Face Spaces Demo](https://huggingface.co/spaces/DreamSyncCo/IndicVoiceChanger)**
+
+## Fine-tuning on Custom Data
+
+While the zero-shot performance of this model is usually good, it can be further improved with fine-tuning. The model can be fine-tuned on custom speakers with little data and short training runs:
+
+- **Minimal Data Requirements**: Train on new speakers with as little as **1 utterance per speaker**
+- **Ultra-Fast Training**: Achieve good results in just **100 training steps** (approximately **2 minutes on a T4 GPU**)
+- **Speaker Adaptation**: Significantly improve performance on specific target speakers through personalized fine-tuning
+
+### Getting Started with Fine-tuning
+
+For detailed instructions on installation, usage, and fine-tuning, please refer to the comprehensive guide at: **https://github.com/Plachtaa/seed-vc**
+
+**Note**: Example notebooks demonstrating the fine-tuning process will be added to this repository soon.
+
+**Important**: When fine-tuning for Indian languages, start from the checkpoints provided in this repository for optimal performance.
+
+## Acknowledgments
+
+Special thanks to the [SeedVC](https://github.com/Plachtaa/seed-vc) project for providing the foundational architecture and training framework that made this Indian language adaptation possible.
+
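
As a rough planning aid for the fine-tuning figure quoted in the README above (about 100 training steps in about 2 minutes on a T4 GPU), the sketch below extrapolates wall-clock time for other step counts. It is not part of this commit or of the Seed-VC tooling. The function name `estimate_minutes` is hypothetical, and the constant time per step is an assumption: real runs will vary with batch size, utterance length, and hardware.

```python
# Reference point quoted in the README: ~100 steps in ~2 minutes on a T4 GPU.
REF_STEPS = 100
REF_MINUTES = 2.0


def estimate_minutes(steps: int,
                     ref_steps: int = REF_STEPS,
                     ref_minutes: float = REF_MINUTES) -> float:
    """Linearly extrapolate training time in minutes.

    Assumption (not from the README): time per step is constant,
    so total time scales linearly with the number of steps.
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return steps * ref_minutes / ref_steps


if __name__ == "__main__":
    for steps in (100, 250, 500):
        print(f"{steps} steps: about {estimate_minutes(steps):.1f} min")
```

Treat the result as an order-of-magnitude estimate only. The README's figure is a single approximate data point, so it is worth timing a short run on your own hardware before committing to a longer schedule.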