---
license: apache-2.0
language:
- eu
base_model:
- HiTZ/TTS-eu_maider
pipeline_tag: text-to-speech
tags:
- TTS
- speech-synthesis
- Basque
- piper
datasets:
- itzune/maider-dataset
---

# Basque TTS: Maider (Piper Version)

This repository contains a [Piper](https://github.com/OHF-Voice/piper1-gpl)-compatible version of the **Maider** Basque text-to-speech model. The original model was developed by **HiTZ Basque Center for Language Technology - Aholab Signal Processing Laboratory** (University of the Basque Country UPV/EHU).

This version has been exported/trained specifically for use with Piper, a fast, local neural text-to-speech engine.

## Model Details

- **Language:** Basque (eu)
- **Speaker:** Maider (Female)
- **Architecture:** VITS (Optimized for Piper)
- **Original Credits:** HiTZ Center / Aholab (Project ILENIA)
- **Format:** Piper (`.onnx` and `.onnx.json` config)

## Training Details

- **Dataset:** [itzune/maider-dataset](https://huggingface.co/datasets/itzune/maider-dataset)
- **Data Volume:** 99,996 high-quality audio samples (~100k files).
- **Architecture:** VITS
- **Training Engine:** Piper (PyTorch Lightning)
- **Iterations:** 22 epochs (258,750 steps)
- **Sample Rate:** 22050 Hz
- **Phonemization:** espeak-ng (Basque)
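
As a rough sanity check on the iteration count, the step total can be divided by the number of epochs. The sketch below is only an illustration: it assumes (this is not stated in this card) that the `epoch=22` in the checkpoint filename listed under Files Included may be PyTorch Lightning's zero-based epoch index, in which case 23 epochs would have completed. Both readings are computed side by side.

```python
# Figures quoted in the training details of this model card.
TOTAL_STEPS = 258_750
STATED_EPOCHS = 22


def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Average number of logged steps per epoch."""
    return total_steps / epochs


# Reading 1: 22 completed epochs, as stated.
print(f"{STATED_EPOCHS} epochs: {steps_per_epoch(TOTAL_STEPS, STATED_EPOCHS):.1f} steps/epoch")

# Reading 2 (assumption): index 22 is zero-based, so 23 epochs completed.
print(f"{STATED_EPOCHS + 1} epochs: {steps_per_epoch(TOTAL_STEPS, STATED_EPOCHS + 1):.1f} steps/epoch")
```

The step total divides evenly only under the second reading (11,250 steps per epoch), which is consistent with a zero-based index but does not prove it.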

## Data Source & Dataset Integration

This model has been fine-tuned/trained using the **maider-dataset**, a large-scale Basque speech corpus specifically curated for high-fidelity synthesis.

- **Link to Dataset:** [Maider Dataset on Hugging Face](https://huggingface.co/datasets/itzune/maider-dataset)
- **Dataset structure:** After the Piper training process, the data was packaged with WebDataset (sharded `.tar` files) to handle the ~100k samples efficiently.
- **Content:** Each audio file is paired with its corresponding Basque transcription in a `metadata.csv` file, so audio and text stay correctly matched throughout the 22 epochs of training.
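
To illustrate the audio/transcription pairing, the sketch below parses a `metadata.csv`. It assumes the pipe-delimited `file|text` layout that Piper training commonly uses (LJSpeech style); the actual column layout of this dataset is not stated here, and the file names in the sample rows are hypothetical.

```python
import csv
import io


def read_metadata(text: str) -> list[tuple[str, str]]:
    """Parse `file|transcription` rows into (audio_file, transcription) pairs."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    pairs = []
    for row in reader:
        if len(row) < 2:  # skip blank or malformed rows
            continue
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


# Hypothetical sample rows; the file names are illustrative only.
sample = "0001.wav|Kaixo, hau Maider da.\n0002.wav|Gaur egun eguzkitsua dugu.\n"
for audio_file, transcription in read_metadata(sample):
    print(audio_file, "->", transcription)
```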

## Files Included

* `eu-maider-medium.onnx`: The exported model for fast inference.
* `eu-maider-medium.onnx.json`: The configuration file (includes phoneme map and synthesis settings).
* `epoch=22-step=258750.ckpt`: The PyTorch Lightning checkpoint saved at epoch 22, step 258,750 (useful for further training/fine-tuning).
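
To see what the configuration file holds, it can be read as plain JSON. The sketch below assumes the usual Piper config keys (`audio.sample_rate`, `espeak.voice`, `phoneme_id_map`, `inference`); these key names are assumptions about this particular file, so missing keys fall back to `None` (or a count of 0). `summarize_config` is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_config(config: dict) -> dict:
    """Pick out a few commonly present fields from a Piper-style config dict."""
    return {
        "sample_rate": config.get("audio", {}).get("sample_rate"),
        "espeak_voice": config.get("espeak", {}).get("voice"),
        "num_phonemes": len(config.get("phoneme_id_map", {})),
        "inference": config.get("inference"),
    }


config_path = Path("eu-maider-medium.onnx.json")
if config_path.exists():
    config = json.loads(config_path.read_text(encoding="utf-8"))
    print(summarize_config(config))
else:
    print(f"{config_path} not found; download it next to the .onnx model first.")
```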

## Usage

### Using Piper CLI

You can run the model locally using the Piper binary:

```bash
# The sample sentence is Basque for "Hello, this is Maider, speaking Basque using the Piper engine."
echo "Kaixo, hau Maider da, Piper motorra erabiliz euskaraz hitz egiten." | \
  ./piper --model eu-maider-medium.onnx --output_file output.wav
```
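
### Python API

```python
import wave

from piper.voice import PiperVoice

# Load the exported model together with its JSON config.
voice = PiperVoice.load("eu-maider-medium.onnx", "eu-maider-medium.onnx.json")
# synthesize_wav writes through a wave.Wave_write object, so open the file with wave.open.
with wave.open("output.wav", "wb") as wav_file:
    # Basque for "Today we have a sunny day."
    voice.synthesize_wav("Gaur egun eguzkitsua dugu.", wav_file)
```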
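
After synthesis, the result can be checked with the standard-library `wave` module. This sketch assumes `output.wav` was produced by one of the commands above and compares its sample rate with the 22050 Hz listed in the training details; `describe_wav` is a hypothetical helper name.

```python
import wave
from pathlib import Path

EXPECTED_SAMPLE_RATE = 22050  # from the training details of this model card


def describe_wav(path: str) -> dict:
    """Return sample rate, channel count and duration (seconds) of a WAV file."""
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        frames = wav_file.getnframes()
        return {
            "sample_rate": rate,
            "channels": wav_file.getnchannels(),
            "duration_s": frames / rate,
        }


if Path("output.wav").exists():
    info = describe_wav("output.wav")
    print(info)
    if info["sample_rate"] != EXPECTED_SAMPLE_RATE:
        print("Unexpected sample rate; check that the matching .onnx.json was used.")
```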

## Original Model & Data Source

The base model belongs to the Aholab TTS collection. All voices in this collection are based on the VITS architecture proposed by Kim et al. (2021).

- **Maider & Antton:** Developed by HiTZ with funding from Project ILENIA.
- **License:** Creative Commons Attribution 4.0 (for the voice resource) and Apache License 2.0 (for the code/model).

## Authors & Credits

The original Maider model was created by the HiTZ Basque Center for Language Technology - Aholab Signal Processing Laboratory, University of the Basque Country (UPV/EHU).