	---
	language:
	- saz
	- ta
	license: cc-by-4.0
	tags:
	- text-to-speech
	- tts
	- vits
	- sourashtra
	- low-resource
	pipeline_tag: text-to-speech
	---

	# Sourashtra VITS TTS Models

	VITS text-to-speech models for the [Sourashtra language](https://en.wikipedia.org/wiki/Sourashtra_language) (ISO 639-3: `saz`), a minority Indo-Aryan language spoken primarily in Tamil Nadu, India. The models were trained with [Coqui TTS](https://github.com/coqui-ai/TTS) on a custom annotated speech corpus.

	Four variants are provided: two speakers (male, female) × two input scripts (Tamil script, Sourashtra script).

	---

	## Models

	| Folder | Speaker | Input Script | Training Steps |
	|--------|---------|--------------|----------------|
	| `Sourashtra-Male_Script-tamil` | Male | Tamil (தமிழ்) | 300,000 |
	| `Sourashtra-Male_Script-sourashtra` | Male | Sourashtra (ꢪꢾꢥꢶꢒ) | 300,000 |
	| `Sourashtra-Female_Script-tamil` | Female | Tamil (தமிழ்) | 340,000 |
	| `Sourashtra-Female_Script-sourashtra` | Female | Sourashtra (ꢪꢾꢥꢶꢒ) | 340,000 |

	Each folder contains `best_model.pth`, `config.json`, `inference.py`, and `requirements.txt`.
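
The folder names in the table follow a regular `Sourashtra-<Speaker>_Script-<script>` pattern. As a minimal sketch, the helpers below build a folder name from a speaker and script and report which of the four listed files are missing from a local copy. The function names (`model_folder`, `expected_files`, `missing_files`) are hypothetical and are not part of this repository.

```python
from pathlib import Path
from typing import List

SPEAKERS = ("Male", "Female")
SCRIPTS = ("tamil", "sourashtra")
# The four files that each model folder is documented to contain.
MODEL_FILES = ("best_model.pth", "config.json", "inference.py", "requirements.txt")


def model_folder(speaker: str, script: str) -> str:
    """Return the folder name for a speaker/script pair, e.g. 'Sourashtra-Male_Script-tamil'."""
    speaker = speaker.strip().capitalize()
    script = script.strip().lower()
    if speaker not in SPEAKERS:
        raise ValueError(f"speaker must be one of {SPEAKERS}, got {speaker!r}")
    if script not in SCRIPTS:
        raise ValueError(f"script must be one of {SCRIPTS}, got {script!r}")
    return f"Sourashtra-{speaker}_Script-{script}"


def expected_files(root: str, speaker: str, script: str) -> List[Path]:
    """List the paths that should exist under `root` for one model variant."""
    folder = Path(root) / model_folder(speaker, script)
    return [folder / name for name in MODEL_FILES]


def missing_files(root: str, speaker: str, script: str) -> List[Path]:
    """Return the expected files that are not present on disk."""
    return [p for p in expected_files(root, speaker, script) if not p.is_file()]
```

For example, `missing_files(".", "female", "tamil")` returns an empty list when the female Tamil-script folder under the current directory is complete.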

	---

	## Setup

	```bash
	pip install -r requirements.txt
	```

	For GPU inference, first install the CUDA-enabled PyTorch build that matches your driver; see [pytorch.org](https://pytorch.org/get-started/locally/).

	---

	## Usage

	Run `inference.py` from inside the model folder. Each command below starts from the repository root and runs in a subshell, so the examples can be executed one after another:

	```bash
	# Male, Tamil script
	(cd Sourashtra-Male_Script-tamil && python inference.py "சொராஷ்ட்ர மொழி" -o output.wav)

	# Male, Sourashtra script
	(cd Sourashtra-Male_Script-sourashtra && python inference.py "ꢪꢾꢥꢶꢒ ꢪꢒꢡ" -o output.wav)

	# Female, Tamil script
	(cd Sourashtra-Female_Script-tamil && python inference.py "சொராஷ்ட்ர மொழி" -o output.wav)

	# Female, Sourashtra script
	(cd Sourashtra-Female_Script-sourashtra && python inference.py "ꢪꢾꢥꢶꢒ ꢪꢒꢡ" -o output.wav)
	```

	Use `--gpu <id>` to select a GPU, or `--cpu` to force CPU inference.
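
To call the script from Python instead of a shell, something like the following sketch may work. It only assembles the command line documented above (`-o`, `--gpu <id>`, `--cpu`) and runs it with the model folder as the working directory. The wrapper names `build_command` and `synthesize` are hypothetical, and the sketch assumes `--gpu` and `--cpu` are not meant to be combined.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def build_command(text: str, output: str = "output.wav",
                  gpu: Optional[int] = None, cpu: bool = False) -> List[str]:
    """Assemble the inference.py command line using only the documented flags."""
    if cpu and gpu is not None:
        # Assumption: the two device flags are mutually exclusive.
        raise ValueError("choose either cpu=True or a gpu id, not both")
    cmd = [sys.executable, "inference.py", text, "-o", output]
    if cpu:
        cmd.append("--cpu")
    elif gpu is not None:
        cmd += ["--gpu", str(gpu)]
    return cmd


def synthesize(model_dir: str, text: str, output: str = "output.wav",
               gpu: Optional[int] = None, cpu: bool = False) -> Path:
    """Run inference.py inside `model_dir` and return the path of the output file."""
    subprocess.run(build_command(text, output, gpu, cpu),
                   cwd=Path(model_dir), check=True)
    # A relative output path is resolved against the model folder.
    return Path(model_dir) / output
```

For example, `synthesize("Sourashtra-Female_Script-tamil", "சொராஷ்ட்ர மொழி", cpu=True)` runs the female Tamil-script model on the CPU.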

	---

	## Script Notes

	For a given speaker, the Tamil-script and Sourashtra-script models produce the same voice; only the input orthography differs. Choose the variant that matches the script of your source text.

	- Tamil-script models automatically strip `:`, `.`, and `'` from the input and apply NFC normalization.
	- Sourashtra-script models automatically strip the Sourashtra Danda (꣎) and Double Danda (꣏) from the input.
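
The cleanup described above can be approximated in a few lines, which may be useful for preprocessing text before it reaches the models. This is a sketch and not the code in `inference.py`: the function names are hypothetical, and stripping before NFC normalization is an assumed order.

```python
import unicodedata

# Characters removed for Tamil-script input: colon, full stop, apostrophe.
TAMIL_STRIP = ":.'"
# U+A8CE SAURASHTRA DANDA and U+A8CF SAURASHTRA DOUBLE DANDA.
SOURASHTRA_STRIP = "\ua8ce\ua8cf"


def clean_tamil_script(text: str) -> str:
    """Drop the three punctuation marks, then apply NFC normalization (assumed order)."""
    stripped = text.translate(str.maketrans("", "", TAMIL_STRIP))
    return unicodedata.normalize("NFC", stripped)


def clean_sourashtra_script(text: str) -> str:
    """Drop the Danda and Double Danda; all other characters are left untouched."""
    return text.translate(str.maketrans("", "", SOURASHTRA_STRIP))
```

NFC matters for Tamil because some vowel signs can be typed either as one code point or as a sequence of two; normalizing maps both spellings to the same character sequence.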

	---

	## Training

	| Parameter | Value |
	|-----------|-------|
	| Architecture | VITS (end-to-end, flow-based) |
	| Sample rate | 22050 Hz |
	| Mel bins | 80 |
	| Batch size | 16 |
	| Mixed precision | Yes |
	| Phonemes | No (character-level) |

	Male training data: ~9,800–10,000 utterances. Female training data: ~11,400 utterances.
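
To check that a downloaded `config.json` agrees with the table above, a sketch like the one below could be used. The key names (`audio.sample_rate`, `audio.num_mels`, `batch_size`, `mixed_precision`, `use_phonemes`) are assumed from Coqui TTS configuration conventions and may not match the shipped files exactly. The helper names are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple

# Values from the training table, keyed by an assumed path into config.json.
EXPECTED: Dict[Tuple[str, ...], Any] = {
    ("audio", "sample_rate"): 22050,
    ("audio", "num_mels"): 80,
    ("batch_size",): 16,
    ("mixed_precision",): True,
    ("use_phonemes",): False,
}


def lookup(cfg: Dict[str, Any], path: Tuple[str, ...]) -> Any:
    """Follow a key path through nested dicts; return None if any key is absent."""
    node: Any = cfg
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def config_mismatches(cfg: Dict[str, Any]) -> List[str]:
    """Describe every expected setting that is missing or has a different value."""
    problems = []
    for path, want in EXPECTED.items():
        got = lookup(cfg, path)
        if got != want:
            problems.append(f"{'.'.join(path)}: expected {want!r}, found {got!r}")
    return problems


def check_config_file(path: str = "config.json") -> List[str]:
    """Load a config file and return the list of mismatches (empty if none)."""
    with open(path, encoding="utf-8") as fh:
        return config_mismatches(json.load(fh))
```

Running `check_config_file()` from inside a model folder returns an empty list when all five settings match.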