| # Saudi Arabic (MSA) TTS Model - Piper |
|
|
| This repository contains a high-quality Piper TTS model trained on Saudi Arabic (Modern Standard Arabic) dataset for **455 epochs**. |
|
|
| ## Model Details |
|
|
| - **Language**: Arabic (Saudi dialect) |
| - **Framework**: Piper TTS |
| - **Sample Rate**: 22050 Hz |
| - **Training Epochs**: 455 |
| - **Dataset Size**: 11,592 audio samples |
| - **Speakers**: 5 speakers (SPK1-SPK5) |
| - **Model Quality**: Professional grade |
|
|
| ## Model Files |
|
|
| - `checkpoints/epoch=455-step=1189248.ckpt` - PyTorch Lightning checkpoint (807 MB) |
| - `config.json` - Model configuration file |
| - `training_data.csv` - Training dataset metadata |
| - `scripts/export_jit.py` - ONNX export script |
|
|
| ## Quick Start |
|
|
| ### Export to ONNX |
|
|
| ```bash |
| python3 scripts/export_jit.py |
| ``` |
|
|
| This will create an ONNX model file that can be used with Piper for inference. |
|
|
| ### Usage with Piper |
|
|
| ```bash |
| # Install Piper TTS |
| pip install piper-tts |
| |
| # After exporting to ONNX |
| echo 'ู
ุฑุญุจุง ุจู ูู ูุธุงู
ุงูุชุญููู ุงููุตู ุฅูู ููุงู
' | \ |
| piper --model saudi_msa_epoch455.onnx --output_file output.wav |
| ``` |
|
|
| ### Python Usage |
|
|
| ```python |
| from piper import PiperVoice |
| |
| voice = PiperVoice.load("saudi_msa_epoch455.onnx") |
| |
| # Synthesize speech |
| with open("output.wav", "wb") as f: |
| voice.synthesize_stream_raw("ู
ุฑุญุจุง ุจู", f) |
| ``` |
|
|
| ## Training Details |
|
|
| ### Dataset Statistics |
|
|
| | Speaker | Samples | |
| |---------|---------| |
| | SPK1 | 3,000 | |
| | SPK2 | 714 | |
| | SPK3 | 1,656 | |
| | SPK4 | 2,057 | |
| | SPK5 | 4,193 | |
| | **Total** | **11,592** | |
|
|
| ### Training Configuration |
|
|
| ```yaml |
| voice_name: saudi_msa |
| sample_rate: 22050 |
| espeak_voice: ar |
| batch_size: 8 |
| epochs: 455 |
| optimizer: Adam |
| ``` |
|
|
| ### Training Environment |
|
|
| - Python 3.11 |
| - PyTorch 2.x with CUDA |
| - Lightning 2.x |
| - Total training time: ~85+ hours |
|
|
| ## Model Performance |
|
|
| This model has been trained for **455 epochs**, providing: |
|
|
| - โ
**Excellent audio quality** with minimal background noise |
| - โ
**Clear pronunciation** of Arabic words |
| - โ
**Natural prosody** and intonation |
| - โ
**Professional-grade output** suitable for production use |
|
|
| The model performs exceptionally well on: |
| - Customer service dialogues |
| - Banking and financial terminology |
| - General conversational Arabic |
| - Saudi dialect expressions |
|
|
| ## Export Instructions |
|
|
| To export the checkpoint to ONNX format: |
|
|
| ```bash |
| cd scripts |
| python3 export_jit.py |
| ``` |
|
|
| The script will: |
| 1. Load the checkpoint from `checkpoints/epoch=455-step=1189248.ckpt` |
| 2. Export to ONNX format with optimizations |
| 3. Create `saudi_msa_epoch455.onnx` file |
|
|
| Make sure to copy the `config.json` file alongside the ONNX model: |
|
|
| ```bash |
| cp config.json saudi_msa_epoch455.onnx.json |
| ``` |
|
|
| ## Files Structure |
|
|
| ``` |
| . |
| โโโ README.md |
| โโโ config.json # Model configuration |
| โโโ training_data.csv # Dataset metadata |
| โโโ checkpoints/ |
| โ โโโ epoch=455-step=1189248.ckpt # Latest checkpoint (807 MB) |
| โโโ scripts/ |
| โโโ export_jit.py # ONNX export script |
| โโโ train_piper.sh # Training script |
| โโโ create_training_file.py # Data preparation script |
| ``` |
|
|
| ## License |
|
|
| This model is trained using the Piper TTS framework which is licensed under GPL-3.0. |
|
|
| ## Citation |
|
|
| If you use this model, please cite: |
|
|
| ```bibtex |
| @misc{saudi_msa_piper_2026, |
| title={Saudi Arabic TTS Model for Piper - Epoch 455}, |
| author={Piper MSA Project}, |
| year={2026}, |
| publisher={Hugging Face}, |
| howpublished={\url{https://huggingface.co/YOUR_USERNAME/saudi-msa-piper}} |
| } |
| ``` |
|
|
| ## Acknowledgments |
|
|
| - Piper TTS: https://github.com/rhasspy/piper |
| - eSpeak-ng for Arabic phonemization |
| - Original dataset contributors |
|
|
| ## Sample Usage |
|
|
| ```python |
| # Example: Generate customer service greeting |
| text = "ุญูุงู ุงููู ุนู
ูููุง ุงูุนุฒูุฒุ ููู ุงูุฏุฑ ุงุณุงุนุฏู ุงูููู
ุ" |
| echo text | piper --model saudi_msa_epoch455.onnx --output_file greeting.wav |
| ``` |
|
|
| ## Model Comparison |
|
|
| | Epoch | Quality | Noise Level | Clarity | |
| |-------|---------|-------------|---------| |
| | 65 | Good | Moderate | Fair | |
| | 176 | Very Good | Low | Good | |
| | 438 | Excellent | Very Low | Excellent | |
| | **455** | **Professional** | **Minimal** | **Excellent** | |
|
|
| --- |
|
|
| For questions or issues, please open an issue on the repository. |
|
|