# XTTS v2 – Gujarati & Hindi fine-tune
A fine-tuned XTTS v2 model for Gujarati and Hindi text-to-speech with voice cloning.
## Model details
| Attribute | Value |
|---|---|
| Base model | coqui/XTTS-v2 |
| Languages | Gujarati (gu), Hindi (hi) |
| Training | 5 epochs on NVIDIA L4 (24 GB) |
| Effective batch size | 32 (batch_size=4, grad_acumm=8) |
| Learning rate | 5e-6 |
| Vocab extension | +404 Gujarati tokens |
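The effective batch size above comes from gradient accumulation: 8 micro-batches of 4 clips each are accumulated before every optimizer step. A minimal sketch of that arithmetic (illustrative only; the function name `optimizer_steps_per_epoch` is not from the training code):

```python
def optimizer_steps_per_epoch(num_clips: int, batch_size: int = 4, grad_acumm: int = 8) -> int:
    """Number of optimizer updates in one epoch when gradients are
    accumulated over `grad_acumm` micro-batches of `batch_size` clips."""
    micro_batches = num_clips // batch_size   # forward/backward passes per epoch
    return micro_batches // grad_acumm        # one weight update per 8 micro-batches

effective_batch = 4 * 8                       # 32 clips contribute to each update
steps = optimizer_steps_per_epoch(51_000)     # ~40K Gujarati + ~11K Hindi clips
```

Accumulation lets a 24 GB L4 reach the gradient quality of a batch of 32 while only holding 4 clips in memory at a time.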
## Training data
Fine-tuned on Arjun4707/gu-hi-tts: ~40K Gujarati + ~11K Hindi clips.
Data source: audio clips scraped from publicly available YouTube videos, transcribed, cleaned (silence-trimmed, peak-normalized to -3 dBFS), and stored as 24 kHz mono 16-bit PCM WAV.
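Peak normalization to -3 dBFS means scaling each clip so its loudest sample sits at 10^(-3/20) ≈ 0.708 of full scale. A hypothetical sketch of that cleaning step (the actual preprocessing scripts are not published here):

```python
import math

def peak_normalize(samples: list[float], target_dbfs: float = -3.0) -> list[float]:
    """Scale a waveform so its peak amplitude equals target_dbfs
    (decibels relative to full scale, where full scale is 1.0)."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:                            # silent clip: nothing to scale
        return list(samples)
    target = 10.0 ** (target_dbfs / 20.0)      # -3 dBFS ~= 0.708
    return [s * target / peak for s in samples]

# A quiet 220 Hz tone sampled at 24 kHz, boosted so its peak hits -3 dBFS
tone = [0.1 * math.sin(2 * math.pi * 220 * n / 24000) for n in range(24000)]
cleaned = peak_normalize(tone)
```

Normalizing to -3 dBFS rather than 0 dBFS leaves a little headroom so resampling or filtering downstream does not clip.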
## Files
| File | Description |
|---|---|
| `model.pth` | Fine-tuned GPT encoder weights |
| `dvae.pth` | Discrete VAE (from base XTTS v2) |
| `vocab.json` | Extended vocabulary (base + 404 Gujarati tokens) |
| `config.json` | Model configuration |
| `mel_stats.pth` | Mel spectrogram statistics |
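The extended `vocab.json` adds 404 Gujarati tokens on top of the base XTTS v2 tokenizer; new tokens must receive fresh ids so the existing ids still line up with the base model's embedding rows. A hypothetical sketch of that merge (not the actual extension script):

```python
def extend_vocab(base_vocab: dict[str, int], new_tokens: list[str]) -> dict[str, int]:
    """Append unseen tokens after the highest existing id, leaving
    the base token-to-id mapping untouched."""
    vocab = dict(base_vocab)
    next_id = max(vocab.values()) + 1
    for token in new_tokens:
        if token not in vocab:       # skip tokens the base tokenizer already has
            vocab[token] = next_id
            next_id += 1
    return vocab

base = {"[PAD]": 0, "a": 1, "b": 2}
extended = extend_vocab(base, ["ક", "ખ", "a"])   # "a" is already present, so it is skipped
```

Because base ids are never reassigned, the pretrained embedding table only needs new rows appended for the 404 additions, not a full re-initialization.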
## Known limitations
- Short sentences (fewer than 5 words) may produce babbling artifacts, likely due to noise in the scraped training data
- Longer sentences (10+ words) produce noticeably better quality
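Given the short-sentence artifact above, a caller can cheaply flag risky prompts before synthesis. A hypothetical word-count guard (not part of the released model code):

```python
def is_risky_input(text: str, min_words: int = 5) -> bool:
    """True when the prompt is short enough to risk babbling artifacts
    with this fine-tune (fewer than `min_words` whitespace-separated words)."""
    return len(text.split()) < min_words

# Example: warn the caller instead of synthesizing a 2-word prompt directly
if is_risky_input("નમસ્તે મિત્રો"):
    print("warning: short prompt; consider padding the sentence")
```

A whitespace split is a crude proxy for both Gujarati and Hindi, but it matches the word counts this card uses to describe the limitation.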
## Training code
Full training pipeline, patches, and troubleshooting: BhammarArjun/TTS_1_training
## License

CC-BY-NC-4.0 – non-commercial use only.
The base XTTS v2 is licensed under Coqui Public Model License (non-commercial). The training data was sourced from YouTube audio. Both factors require a non-commercial license.
## Citation

```bibtex
@misc{arjun2026xttsguhi,
  title  = {XTTS v2 fine-tuned for Gujarati and Hindi},
  author = {Arjun Bhammar},
  year   = {2026},
  url    = {https://huggingface.co/Arjun4707/xtts-v2-gujarati-hindi}
}
```
## Acknowledgements
- Coqui AI / TTS for the original XTTS v2
- anhnh2002 for the fine-tuning framework