---
license: apache-2.0
language:
- eu
base_model:
- HiTZ/TTS-eu_maider
pipeline_tag: text-to-speech
tags:
- TTS
- speech-synthesis
- Basque
- piper
datasets:
- itzune/maider-dataset
---

# Basque TTS: Maider (Piper Version)

This repository contains a [Piper](https://github.com/OHF-Voice/piper1-gpl) compatible version of the **Maider** Basque text-to-speech model.

The original model was developed by **HiTZ Basque Center for Language Technology - Aholab Signal Processing Laboratory** (University of the Basque Country UPV/EHU). This version has been exported/trained specifically for use with Piper, a fast, local neural text-to-speech engine.

## Model Details

- **Language:** Basque (eu)
- **Speaker:** Maider (Female)
- **Architecture:** VITS (Optimized for Piper)
- **Original Credits:** HiTZ Center / Aholab (Project ILENIA)
- **Format:** Piper (`.onnx` model and `.onnx.json` config)

## Training Details

- **Dataset:** [itzune/maider-dataset](https://huggingface.co/datasets/itzune/maider-dataset)
- **Data Volume:** 99,996 high-quality audio samples (~100k files)
- **Architecture:** VITS
- **Training Engine:** Piper (PyTorch Lightning)
- **Iterations:** 22 epochs (258,750 steps)
- **Sample Rate:** 22050 Hz
- **Phonemization:** espeak-ng (Basque)

## Data Source & Dataset Integration

This model has been fine-tuned/trained using **maider-dataset**, a large-scale Basque speech corpus curated for high-fidelity synthesis.

- **Link to Dataset:** [Maider Dataset on Hugging Face](https://huggingface.co/datasets/itzune/maider-dataset)
- **Dataset structure:** After the Piper training process, the data was packaged with WebDataset (sharded `.tar` files) so that the ~100k samples can be handled efficiently.
- **Content:** Each audio file is paired with its Basque transcription in a `metadata.csv` file, which keeps text and audio aligned throughout the 22 epochs of training.

## Files Included

* `eu-maider-medium.onnx`: The exported model for fast inference.
* `eu-maider-medium.onnx.json`: The configuration file (includes the phoneme map and synthesis settings).
* `epoch=22-step=258750.ckpt`: The PyTorch Lightning checkpoint saved at epoch 22, step 258,750 (useful for further training/fine-tuning).

## Usage

### Using Piper CLI

You can run the model locally using the Piper binary:

```bash
echo "Kaixo, hau Maider da, Piper motorra erabiliz euskaraz hitz egiten." | \
  ./piper --model eu-maider-medium.onnx --output_file output.wav
```

### Python API

```python
import wave

from piper.voice import PiperVoice

voice = PiperVoice.load("eu-maider-medium.onnx", "eu-maider-medium.onnx.json")
# synthesize_wav writes through a wave.Wave_write object, not a plain file handle.
with wave.open("output.wav", "wb") as wav_file:
    voice.synthesize_wav("Gaur egun eguzkitsua dugu.", wav_file)
```

## Original Model & Data Source

The base model belongs to the Aholab TTS collection. All voices in this collection are based on the VITS architecture proposed by Kim et al. (2021).

- **Maider & Antton:** Developed by HiTZ with funding from Project ILENIA.
- **License:** Creative Commons Attribution 4.0 (for the voice resource) and Apache License 2.0 (for the code/model).

## Authors & Credits

The original Maider model was created by HiTZ Basque Center for Language Technology - Aholab Signal Processing Laboratory, University of the Basque Country EHU.
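
## Checking the Output

As a quick sanity check after running either usage example, the generated file can be inspected with Python's standard `wave` module. The sketch below assumes the output should match the 22050 Hz sample rate listed under Training Details. The helper names `describe_wav` and `check_wav` are hypothetical and are not part of Piper.

```python
import wave

# Assumption: the voice produces audio at the sample rate stated in Training Details.
EXPECTED_SAMPLE_RATE = 22050


def describe_wav(path):
    """Return (sample_rate, channels, duration_seconds) for a WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        duration = wav_file.getnframes() / sample_rate
    return sample_rate, channels, duration


def check_wav(path, expected_rate=EXPECTED_SAMPLE_RATE):
    """Raise ValueError if the file's sample rate differs from expected_rate."""
    sample_rate, channels, duration = describe_wav(path)
    if sample_rate != expected_rate:
        raise ValueError(f"expected {expected_rate} Hz, got {sample_rate} Hz")
    return sample_rate, channels, duration
```

Calling `check_wav("output.wav")` after synthesis returns the sample rate, channel count, and duration in seconds, or raises `ValueError` if the sample rate is not the expected one.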