ZLSCompLing committed on
Commit 248b831 · verified · 1 Parent(s): ce10279

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +91 -3
README.md CHANGED
@@ -1,3 +1,91 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - lb
+ tags:
+ - text-to-speech
+ - tts
+ - vits
+ - coqui
+ - luxembourgish
+ library_name: coqui
+ pipeline_tag: text-to-speech
+ ---
+
+ # Coqui TTS - Maxine (Luxembourgish Female Voice)
+
+ A VITS-based text-to-speech model for Luxembourgish, featuring a synthetic female voice.
+
+ ## Model Description
+
+ This model was trained using the [Coqui TTS](https://github.com/coqui-ai/TTS) framework on Luxembourgish speech data from the [Lëtzebuerger Online Dictionnaire (LOD)](https://lod.lu) example sentences.
+
+ "Maxine" is a synthetic female Luxembourgish voice created by modulating the original LOD recordings to produce a distinct female voice character.
+
+ ### Model Details
+
+ - **Architecture:** VITS
+ - **Language:** Luxembourgish (lb)
+ - **Speaker:** Single speaker (female, synthetic)
+ - **Sample Rate:** 22050 Hz
+ - **License:** MIT
+
+ ## Usage
+
+ ```python
36
+ import torch
37
+ import scipy.io.wavfile as wavfile
38
+ from TTS.utils.synthesizer import Synthesizer
39
+
40
+ # Load the model
41
+ synthesizer = Synthesizer(
42
+ tts_checkpoint="path/to/coqui-tts-maxine.pth",
43
+ tts_config_path="path/to/config.json",
44
+ use_cuda=torch.cuda.is_available()
45
+ )
46
+
47
+ # Generate speech
48
+ wav = synthesizer.tts("Moien, wéi geet et dir?")
49
+
50
+ # Save to file
51
+ wavfile.write("output.wav", 22050, wav)
52
+ ```
+
+ ## Technical Specifications
+
+ | Parameter | Value |
+ |-----------|-------|
+ | Hidden Channels | 192 |
+ | Text Encoder Layers | 6 |
+ | Posterior Encoder Layers | 16 |
+ | Flow Layers | 4 |
+ | Mel Channels | 80 |
+ | FFT Size | 1024 |
+
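To put the spectral parameters in more familiar units, the sketch below derives a few quantities from the FFT size in the table and the 22050 Hz sample rate listed under Model Details. It relies only on the standard real-FFT relationships. The variable names are illustrative, and nothing here is read from the model's `config.json`; in particular, the analysis window length and hop length are not stated in the README and are not assumed.

```python
# Values taken from the README; the names are illustrative only.
SAMPLE_RATE_HZ = 22050  # "Sample Rate" under Model Details
FFT_SIZE = 1024         # "FFT Size" in the table

# A real-valued FFT of N samples yields N // 2 + 1 non-redundant frequency bins.
freq_bins = FFT_SIZE // 2 + 1

# Spacing between adjacent FFT bins, in Hz.
bin_width_hz = SAMPLE_RATE_HZ / FFT_SIZE

# Duration covered by one block of FFT_SIZE samples, in milliseconds.
fft_span_ms = 1000 * FFT_SIZE / SAMPLE_RATE_HZ

# Highest frequency representable at this sample rate.
nyquist_hz = SAMPLE_RATE_HZ / 2

print(f"frequency bins: {freq_bins}")
print(f"bin width:      {bin_width_hz:.2f} Hz")
print(f"FFT span:       {fft_span_ms:.2f} ms")
print(f"Nyquist limit:  {nyquist_hz:.0f} Hz")
```

The 80 mel channels in the table are a compressed view of these linear-frequency bins; how the mel filterbank is laid out depends on settings that the README does not list.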
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @misc{zls2025coquimaxine,
+   title={Coqui TTS Maxine - Luxembourgish Female Voice},
+   author={Zenter fir d'Lëtzebuerger Sprooch},
+   year={2025},
+   publisher={Hugging Face},
+   url={https://huggingface.co/ZLSCompLing/CoquiTTS-Maxine}
+ }
+ ```
+
+ ## Acknowledgments
+
+ Developed by [Zenter fir d'Lëtzebuerger Sprooch](https://zls.lu).
+
+ Voice data sourced from the [Lëtzebuerger Online Dictionnaire (LOD)](https://lod.lu). The original audio files are available via the [LOD linguistic data on data.public.lu](https://data.public.lu/en/datasets/letzebuerger-online-dictionnaire-lod-linguistesch-daten/), which provides an XML file containing example sentence IDs. Audio files can be accessed at:
+
+ ```
+ https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a
+ ```
+
+ where `{folder}` is the first 2 characters of `{id}`.
+
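The URL rule above can be sketched as a small helper. This is a minimal illustration, not part of the commit: the function name `lod_audio_url` and the sample ID `AB1234` are made up for the example, and the code only builds the string. It does not check that the ID exists in the LOD XML file or that the server serves the resulting file.

```python
# URL template copied from the README.
LOD_AUDIO_URL = "https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a"


def lod_audio_url(example_id: str) -> str:
    """Build the audio URL for one LOD example sentence ID (hypothetical helper)."""
    if len(example_id) < 2:
        raise ValueError("example ID must have at least two characters")
    # The folder is the first two characters of the ID.
    folder = example_id[:2]
    return LOD_AUDIO_URL.format(folder=folder, id=example_id)


# "AB1234" is a made-up ID used only to show the URL shape.
print(lod_audio_url("AB1234"))
```

Real IDs would come from the XML file published on data.public.lu; whether the IDs need any normalisation before use is not stated in the README.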
+ This model is used in [Sproochmaschinn](https://sproochmaschinn.lu), a Luxembourgish speech processing platform.