	---
	license: mit
	language:
	- lb
	tags:
	- text-to-speech
	- tts
	- vits
	- coqui
	- luxembourgish
	library_name: coqui
	pipeline_tag: text-to-speech
	---

	# Coqui TTS - Maxine (Luxembourgish Female Voice)

	A VITS-based text-to-speech model for Luxembourgish, featuring a synthetic female voice.

	## Model Description

	This model was trained using the [Coqui TTS](https://github.com/coqui-ai/TTS) framework on Luxembourgish speech data from the [Lëtzebuerger Online Dictionnaire (LOD)](https://lod.lu) example sentences.

	"Maxine" is a synthetic female Luxembourgish voice created by modulating the original LOD recordings to produce a distinct female voice character.

	### Model Details

	- Architecture: VITS
	- Language: Luxembourgish (lb)
	- Speaker: Single speaker (female, synthetic)
	- Sample Rate: 22050 Hz
	- Checkpoint: ~90,000 steps
	- License: MIT

	## Usage

	Note: Text should be lowercased before synthesis. Additional text normalization may be required.

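	```python
	import numpy as np
	import scipy.io.wavfile as wavfile
	import torch
	from TTS.utils.synthesizer import Synthesizer

	# Load the model (runs on the GPU when CUDA is available)
	synthesizer = Synthesizer(
	    tts_checkpoint="path/to/coqui-tts-maxine.pth",
	    tts_config_path="path/to/config.json",
	    use_cuda=torch.cuda.is_available(),
	)

	# Generate speech; the input text is already lowercased
	wav = synthesizer.tts("moien, wéi geet et dir?")

	# Save to file at the model's 22050 Hz sample rate; np.asarray converts
	# the sequence of samples returned by tts() into an array scipy can write
	wavfile.write("output.wav", 22050, np.asarray(wav, dtype=np.float32))
	```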
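The note at the top of this section asks for lowercased input and mentions that further normalization may be needed. A minimal sketch of such a pre-processing step is shown below. The helper name `normalize_text` and the choice of Unicode NFC composition and whitespace collapsing are assumptions made for illustration; they are not part of the model or of Coqui TTS, and the model may need other rules (for example, expanding numbers or abbreviations).

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Hypothetical helper: NFC-normalize, lowercase, and tidy whitespace."""
    # Compose sequences such as "e" + combining acute accent into a single "é"
    text = unicodedata.normalize("NFC", text)
    # The model card asks for lowercased input
    text = text.lower()
    # Collapse runs of whitespace (including newlines) into single spaces
    return re.sub(r"\s+", " ", text).strip()


print(normalize_text("Moien,  Wéi geet et DIR?"))  # moien, wéi geet et dir?
```

The returned string can then be passed to `synthesizer.tts(...)` in place of the raw input.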

	## Technical Specifications

	| Parameter | Value |
	|-----------|-------|
	| Hidden Channels | 192 |
	| Text Encoder Layers | 6 |
	| Posterior Encoder Layers | 16 |
	| Flow Layers | 4 |
	| Mel Channels | 80 |
	| FFT Size | 1024 |
	## Citation

	If you use this model, please cite:

	```bibtex
	@misc{zls2025coquimaxine,
	  title={Coqui TTS Maxine - Luxembourgish Female Voice},
	  author={Zenter fir d'Lëtzebuerger Sprooch},
	  year={2025},
	  publisher={Hugging Face},
	  url={https://huggingface.co/ZLSCompLing/CoquiTTS-Maxine}
	}
	```

	## Acknowledgments

	Developed by [Zenter fir d'Lëtzebuerger Sprooch](https://zls.lu).

	Voice data sourced from the [Lëtzebuerger Online Dictionnaire (LOD)](https://lod.lu). The original audio files are available via the [LOD linguistic data on data.public.lu](https://data.public.lu/en/datasets/letzebuerger-online-dictionnaire-lod-linguistesch-daten/), which provides an XML file containing example sentence IDs. Audio files can be accessed at:

	```
	https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a
	```

	where `{folder}` is the first 2 characters of `{id}`.
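The URL rule above can be sketched as a small helper. The function name `lod_audio_url` and the example ID are hypothetical; real IDs come from the XML file in the LOD dataset, and their exact format is not described here.

```python
LOD_AUDIO_URL = "https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a"


def lod_audio_url(example_id: str) -> str:
    """Build the audio URL for an example sentence ID (hypothetical helper)."""
    if len(example_id) < 2:
        raise ValueError("example_id must have at least 2 characters")
    # The folder is the first 2 characters of the ID
    return LOD_AUDIO_URL.format(folder=example_id[:2], id=example_id)


# "ab12345" is a made-up ID used only to show the pattern
print(lod_audio_url("ab12345"))
# https://lod.lu/uploads/examples/AAC/ab/ab12345.m4a
```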

	This model is used in [Sproochmaschinn](https://sproochmaschinn.lu), a Luxembourgish speech processing platform.