IndexTTS2 Kazakh Finetune
The GPT component of IndexTTS2, fine-tuned on Kazakh-language data.
Training Details
- Base model: IndexTeam/IndexTTS-2
- Dataset: InflexionLab/ISSAI-KSC2-Structured (4.8GB subset)
- Hardware: Google Colab A100 40GB
- Training: 3 epochs, 912 steps
- BPE vocab size: 2000 tokens (Kazakh-specific)
Validated on Tesla V100 16GB. Inference uses approximately 10GB VRAM.
Audio Sample
Text: Менің досымның жақсы және жаман әдеттері бар. Жақсы әдетт ол маған көмектеседі. Ол мені әрқашан қолдайтын болады. Ал жаман әдетті кейде ол менің затымды рұқсатсыз алып кетуі мүмкін. Кейде ол мені ренжітеді. Бірақ артынан менден кешірім сұрайды.
Training Curves
Limitations
The entire 4.8GB training subset comes from a single speaker, so voice cloning does not yet generalize to other voices. The model produces Kazakh speech, but it defaults to the training speaker's voice regardless of the reference audio provided.
Training on a multi-speaker dataset should improve this. Contributions and experiments are welcome.
Installation
Download the base model first:
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="IndexTeam/IndexTTS-2",
    local_dir="checkpoints",
)
Then replace the GPT checkpoint and tokenizer with the Kazakh versions:
# Download this repo
hf download AnuarSv/IndexTTS2-Kazakh --local-dir kz_files
# Replace
cp kz_files/gpt.pth checkpoints/gpt.pth
cp kz_files/bpe.model checkpoints/bpe.model
cp kz_files/config.yaml checkpoints/config.yaml
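Where `cp` is not available (a notebook cell or Windows, for example), the same replacement step could be sketched in Python. The helper name `overlay_files` is hypothetical and not part of IndexTTS2 or this repo; the directory and file names are the ones used in the commands above.

```python
import shutil
from pathlib import Path

# File names taken from the cp commands above.
KZ_FILES = ("gpt.pth", "bpe.model", "config.yaml")


def overlay_files(src_dir, dst_dir, names=KZ_FILES):
    """Copy each named file from src_dir into dst_dir, overwriting existing copies.

    Hypothetical helper for illustration. Raises FileNotFoundError before
    copying anything if a source file is missing.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    missing = [n for n in names if not (src / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src}: {', '.join(missing)}")
    dst.mkdir(parents=True, exist_ok=True)
    for name in names:
        shutil.copy2(src / name, dst / name)
    return [dst / name for name in names]
```

Calling `overlay_files("kz_files", "checkpoints")` would mirror the three `cp` commands. Checking for missing files up front avoids leaving `checkpoints` with a mix of base and Kazakh files.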
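Clone the training branch, install its dependencies, and launch the web UI. The --model_dir argument must point to the checkpoints directory prepared above, so adjust the path if that directory sits outside the clone: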
git clone -b training_v2 https://github.com/JarodMica/index-tts.git
cd index-tts
uv sync --all-extras
uv run webui.py --model_dir checkpoints --fp16
Config Change
The base model uses number_text_tokens: 12000. This finetune uses 2000 to match the Kazakh BPE vocabulary. The config.yaml in this repo already has this applied.
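A mismatch between `number_text_tokens` and the tokenizer vocabulary is an easy mistake when mixing base and Kazakh files, so a quick consistency check may be useful. The sketch below assumes PyYAML and `sentencepiece` are installed and that the config is plain YAML. It searches the parsed config for the key rather than assuming where it is nested. `find_key` and `check_vocab` are hypothetical helper names, not part of IndexTTS2.

```python
def find_key(node, key):
    """Return the first value stored under `key` in a nested dict/list, or None."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def check_vocab(config_path="checkpoints/config.yaml",
                bpe_path="checkpoints/bpe.model"):
    """Return (number_text_tokens from the config, tokenizer vocab size)."""
    # Imported here so find_key stays usable without these packages.
    import yaml
    import sentencepiece as spm

    with open(config_path, encoding="utf-8") as f:
        config = yaml.safe_load(f)
    configured = find_key(config, "number_text_tokens")
    vocab_size = spm.SentencePieceProcessor(model_file=bpe_path).get_piece_size()
    return configured, vocab_size
```

With the Kazakh files in place, `check_vocab()` would be expected to return 2000 for both values, based on the figures given in this README.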
What's in This Repo
| File | Description |
|---|---|
| gpt.pth | Fine-tuned GPT checkpoint (Kazakh) |
| bpe.model | SentencePiece BPE tokenizer trained on Kazakh text |
| config.yaml | Model config with number_text_tokens: 2000 |
Files not included (s2mel.pth, feat1.pt, feat2.pt, qwen0.6bemo4-merge) are part of the base model and should be downloaded from IndexTeam/IndexTTS-2.
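Because this repo ships only three files, a launch can fail if the base-model files were not downloaded first. A small pre-flight check along these lines could catch that. `missing_entries` is a hypothetical helper, and the list of names is copied from this README rather than from the IndexTTS2 code, so it may not be exhaustive.

```python
from pathlib import Path

# Names listed in this README; the base model may require more than these.
EXPECTED = (
    "gpt.pth", "bpe.model", "config.yaml",                      # from this repo
    "s2mel.pth", "feat1.pt", "feat2.pt", "qwen0.6bemo4-merge",  # from the base model
)


def missing_entries(model_dir="checkpoints", expected=EXPECTED):
    """Return the expected files or directories that do not exist in model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).exists()]
```

An empty list from `missing_entries("checkpoints")` suggests the directory holds everything named here. Any names it returns point to a download or copy step that still needs to run.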