# Saudi Arabic (MSA) TTS Model - Piper

This repository contains a high-quality Piper TTS model trained on a Saudi Arabic (Modern Standard Arabic) dataset for **455 epochs**.

## Model Details

- **Language**: Arabic (Saudi dialect)
- **Framework**: Piper TTS
- **Sample Rate**: 22050 Hz
- **Training Epochs**: 455
- **Dataset Size**: 11,592 audio samples
- **Speakers**: 5 speakers (SPK1-SPK5)
- **Model Quality**: Professional grade

## Model Files

- `checkpoints/epoch=455-step=1189248.ckpt` - PyTorch Lightning checkpoint (807 MB)
- `config.json` - Model configuration file
- `training_data.csv` - Training dataset metadata
- `scripts/export_jit.py` - ONNX export script

## Quick Start

### Export to ONNX

```bash
python3 scripts/export_jit.py
```

This creates an ONNX model file that Piper can use for inference.

### Usage with Piper

```bash
# Install Piper TTS
pip install piper-tts

# After exporting to ONNX
# (text: "Welcome to the text-to-speech system")
echo 'مرحبا بك في نظام التحويل النصي إلى كلام' | \
  piper --model saudi_msa_epoch455.onnx --output_file output.wav
```

### Python Usage

```python
import wave
from piper import PiperVoice

voice = PiperVoice.load("saudi_msa_epoch455.onnx")

# Synthesize speech ("Welcome") into a WAV file.
# Assumes a recent piper-tts release; older 1.2.x releases use
# voice.synthesize(text, wav_file) instead.
with wave.open("output.wav", "wb") as wav_file:
    voice.synthesize_wav("مرحبا بك", wav_file)
```

## Training Details

### Dataset Statistics

| Speaker | Samples |
|---------|---------|
| SPK1 | 3,000 |
| SPK2 | 714 |
| SPK3 | 1,656 |
| SPK4 | 2,057 |
| SPK5 | 4,193 |
| **Total** | **11,592** |

### Training Configuration

```yaml
voice_name: saudi_msa
sample_rate: 22050
espeak_voice: ar
batch_size: 8
epochs: 455
optimizer: Adam
```

### Training Environment

- Python 3.11
- PyTorch 2.x with CUDA
- Lightning 2.x
- Total training time: more than 85 hours

## Model Performance

This model has been trained for **455 epochs** and provides:

- ✅ **Excellent audio quality** with minimal background noise
- ✅ **Clear pronunciation** of Arabic words
- ✅ **Natural prosody** and intonation
- ✅ **Professional-grade output** suitable for production use

The model performs exceptionally well on:

- Customer service dialogues
- Banking and financial terminology
- General conversational Arabic
- Saudi dialect expressions

## Export Instructions

To export the checkpoint to ONNX format:

```bash
cd scripts
python3 export_jit.py
```

The script will:

1. Load the checkpoint from `checkpoints/epoch=455-step=1189248.ckpt`
2. Export to ONNX format with optimizations
3. Create the `saudi_msa_epoch455.onnx` file

Piper looks for the voice configuration at `<model>.onnx.json`, so copy `config.json` alongside the ONNX model under that name:

```bash
cp config.json saudi_msa_epoch455.onnx.json
```

## File Structure

```
.
├── README.md
├── config.json                          # Model configuration
├── training_data.csv                    # Dataset metadata
├── checkpoints/
│   └── epoch=455-step=1189248.ckpt      # Latest checkpoint (807 MB)
└── scripts/
    ├── export_jit.py                    # ONNX export script
    ├── train_piper.sh                   # Training script
    └── create_training_file.py          # Data preparation script
```

## License

This model was trained using the Piper TTS framework, which is licensed under GPL-3.0.

## Citation

If you use this model, please cite:

```bibtex
@misc{saudi_msa_piper_2026,
  title={Saudi Arabic TTS Model for Piper - Epoch 455},
  author={Piper MSA Project},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/YOUR_USERNAME/saudi-msa-piper}}
}
```

## Acknowledgments

- Piper TTS: https://github.com/rhasspy/piper
- eSpeak-ng for Arabic phonemization
- Original dataset contributors

## Sample Usage

```bash
# Example: Generate customer service greeting
# (text: "Welcome, dear customer, how can I help you today?")
echo 'حياك الله عميلنا العزيز، كيف اقدر اساعدك اليوم؟' | \
  piper --model saudi_msa_epoch455.onnx --output_file greeting.wav
```

## Model Comparison

| Epoch | Quality | Noise Level | Clarity |
|-------|---------|-------------|---------|
| 65 | Good | Moderate | Fair |
| 176 | Very Good | Low | Good |
| 438 | Excellent | Very Low | Excellent |
| **455** | **Professional** | **Minimal** | **Excellent** |

---

For questions or issues, please open an issue on the repository.
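
## Scripted Config Copy (Sketch)

The `cp config.json saudi_msa_epoch455.onnx.json` step from the export instructions can also be done from Python, which leaves room for a quick sanity check before the model is handed to Piper. The sketch below is only an illustration: the helper name `install_voice_config` is hypothetical, and the `audio.sample_rate` key is an assumption about how `config.json` is laid out (it matches the usual Piper voice config, but this repository's file has not been verified here).

```python
import json
import shutil
from pathlib import Path


def install_voice_config(config_path, onnx_path, expected_sample_rate=22050):
    """Copy a voice config next to an ONNX model as '<model>.onnx.json'.

    Hypothetical helper, not part of Piper. Returns the path of the copy.
    """
    config_path = Path(config_path)
    onnx_path = Path(onnx_path)
    # Same directory as the model, with ".json" appended to the file name.
    target = onnx_path.with_name(onnx_path.name + ".json")

    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)

    # Assumed layout: {"audio": {"sample_rate": 22050}, ...}.
    # If the key is absent, the check is skipped rather than failing.
    sample_rate = config.get("audio", {}).get("sample_rate")
    if sample_rate is not None and sample_rate != expected_sample_rate:
        raise ValueError(
            f"config sample rate {sample_rate} != expected {expected_sample_rate}"
        )

    shutil.copyfile(config_path, target)
    return target
```

Calling `install_voice_config("config.json", "saudi_msa_epoch455.onnx")` from the directory that holds both files would write `saudi_msa_epoch455.onnx.json` next to the model, or raise `ValueError` if the config declares a sample rate other than 22050 Hz.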