---
license: cc-by-nc-4.0
language:
- lb
tags:
- text-to-speech
- tts
- vits
- coqui
- luxembourgish
library_name: coqui
pipeline_tag: text-to-speech
---
# Coqui TTS - Max (Luxembourgish Male Voice)
A VITS-based text-to-speech model for Luxembourgish, featuring a natural male voice.
## Model Description
This model was trained using the [Coqui TTS](https://github.com/coqui-ai/TTS) framework on Luxembourgish speech data from the [Lëtzebuerger Online Dictionnaire (LOD)](https://lod.lu) example sentences.
"Max" is a male Luxembourgish voice based on recordings from a real speaker.
### Model Details
- **Architecture:** VITS
- **Language:** Luxembourgish (lb)
- **Speaker:** Single speaker (male)
- **Sample Rate:** 22050 Hz
- **Checkpoint:** 50,000 steps
- **License:** CC BY-NC 4.0 (Non-commercial use only)
## License Notice
**This model is for non-commercial use only.** Commercial use is prohibited. The voice data is derived from recordings of a real speaker and may be used only for non-commercial purposes.
## Usage
**Note:** Text should be lowercased before synthesis. Additional text normalization may be required.
```python
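import numpy as np
import scipy.io.wavfile as wavfile
import torch
from TTS.utils.synthesizer import Synthesizer

# Load the model (replace the paths with your local checkpoint and config)
synthesizer = Synthesizer(
    tts_checkpoint="path/to/coqui-tts-max.pth",
    tts_config_path="path/to/config.json",
    use_cuda=torch.cuda.is_available(),
)

# Generate speech; the input text should already be lowercased
wav = synthesizer.tts("moien, wéi geet et dir?")

# Save to file: tts() returns a list of samples, so convert it to a float32 array first
wavfile.write("output.wav", 22050, np.asarray(wav, dtype=np.float32))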
```
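The lowercasing mentioned in the note can be done with a small helper before the text is passed to the synthesizer. The following is a minimal sketch, not part of Coqui TTS: the name `normalize_text` is hypothetical, and the Unicode NFC normalization and whitespace collapsing are assumptions about what "additional text normalization" might include. Only the lowercasing step comes from this model card.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Hypothetical helper: prepare raw text for synthesis."""
    # Assumption: compose accented characters (e.g. "e" + combining accent -> "é")
    text = unicodedata.normalize("NFC", text)
    # Required by the model card: lowercase the input
    text = text.lower()
    # Assumption: collapse runs of whitespace and trim both ends
    return " ".join(text.split())


if __name__ == "__main__":
    print(normalize_text("  Moien,   wéi geet et Dir? "))
```

The result can then be passed to `synthesizer.tts(...)`. Numbers, abbreviations, and other special tokens are not handled here.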
## Technical Specifications
| Parameter | Value |
|-----------|-------|
| Hidden Channels | 192 |
| Text Encoder Layers | 6 |
| Posterior Encoder Layers | 16 |
| Flow Layers | 4 |
| Mel Channels | 80 |
| FFT Size | 1024 |
## Citation
If you use this model, please cite:
```bibtex
@misc{zls2025coquimax,
title={Coqui TTS Max - Luxembourgish Male Voice},
author={Zenter fir d'Lëtzebuerger Sprooch},
year={2025},
publisher={Hugging Face},
url={https://huggingface.co/ZLSCompLing/CoquiTTS-Max}
}
```
## Acknowledgments
Originally trained by [Marco Barnig](https://huggingface.co/mbarnig). Now developed and maintained by [Zenter fir d'Lëtzebuerger Sprooch](https://zls.lu).
Voice data sourced from the [Lëtzebuerger Online Dictionnaire (LOD)](https://lod.lu). The original audio files are available via the [LOD linguistic data on data.public.lu](https://data.public.lu/en/datasets/letzebuerger-online-dictionnaire-lod-linguistesch-daten/), which provides an XML file containing example sentence IDs. Audio files can be accessed at:
```
https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a
```
where `{folder}` is the first two characters of `{id}`.
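The URL pattern above can be expressed as a short function. This is a sketch under stated assumptions: the function name `lod_audio_url` and the ID used in the example are hypothetical placeholders, and real IDs must be taken from the LOD XML file.

```python
LOD_AUDIO_TEMPLATE = "https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a"


def lod_audio_url(example_id: str) -> str:
    """Hypothetical helper: build the audio URL for one LOD example sentence ID."""
    if len(example_id) < 2:
        raise ValueError("example_id must contain at least two characters")
    # The folder is the first two characters of the ID
    folder = example_id[:2]
    return LOD_AUDIO_TEMPLATE.format(folder=folder, id=example_id)


if __name__ == "__main__":
    # "ab12345" is a made-up ID used only to illustrate the pattern
    print(lod_audio_url("ab12345"))
```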
This model is used in [Sproochmaschinn](https://sproochmaschinn.lu), a Luxembourgish speech processing platform.