# Saudi Arabic (MSA) TTS Model - Piper
This repository contains a high-quality Piper TTS model trained on a Saudi Arabic (Modern Standard Arabic) dataset for **455 epochs**.
## Model Details
- **Language**: Arabic (Saudi dialect)
- **Framework**: Piper TTS
- **Sample Rate**: 22050 Hz
- **Training Epochs**: 455
- **Dataset Size**: 11,592 audio samples
- **Speakers**: 5 speakers (SPK1-SPK5)
- **Model Quality**: Professional grade
## Model Files
- `checkpoints/epoch=455-step=1189248.ckpt` - PyTorch Lightning checkpoint (807 MB)
- `config.json` - Model configuration file
- `training_data.csv` - Training dataset metadata
- `scripts/export_jit.py` - ONNX export script
## Quick Start
### Export to ONNX
```bash
python3 scripts/export_jit.py
```
This will create an ONNX model file that can be used with Piper for inference.
### Usage with Piper
```bash
# Install Piper TTS
pip install piper-tts
# After exporting to ONNX
echo 'مرحبا بك في نظام التحويل النصي إلى كلام' | \
piper --model saudi_msa_epoch455.onnx --output_file output.wav
```
### Python Usage
```python
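import wave
from piper import PiperVoice
voice = PiperVoice.load("saudi_msa_epoch455.onnx")
# Synthesize speech into a WAV file. synthesize(text, wav_file) is the
# piper-tts 1.2.x call; newer releases expose synthesize_wav(text, wav_file).
with wave.open("output.wav", "wb") as wav_file:
    voice.synthesize("مرحبا بك", wav_file)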
```
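Because the model was trained on five speakers, a caller will usually want to choose which voice to render. The sketch below writes one WAV file per speaker ID. It assumes the piper-tts 1.2.x signature `synthesize(text, wav_file, speaker_id=...)` and that the speakers are numbered 0 to 4; the helper name `synthesize_per_speaker` and its parameters are illustrative, and the actual ID mapping should be checked in `config.json`.

```python
import wave
from pathlib import Path


def synthesize_per_speaker(voice, text, speaker_ids, out_dir):
    """Write one WAV per speaker ID and return the paths created.

    `voice` is expected to behave like PiperVoice in piper-tts 1.2.x, i.e. to
    offer synthesize(text, wav_file, speaker_id=...). This helper is a
    hypothetical convenience wrapper, not part of this repository.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for speaker_id in speaker_ids:
        path = out_dir / f"speaker_{speaker_id}.wav"
        with wave.open(str(path), "wb") as wav_file:
            voice.synthesize(text, wav_file, speaker_id=speaker_id)
        paths.append(path)
    return paths


# Example call (assumes the exported model and piper-tts are available):
# from piper import PiperVoice
# voice = PiperVoice.load("saudi_msa_epoch455.onnx")
# synthesize_per_speaker(voice, "مرحبا بك", range(5), "samples")
```

Listening to the five outputs side by side is a quick way to pick a default speaker before wiring the model into an application.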
## Training Details
### Dataset Statistics
| Speaker | Samples |
|---------|---------|
| SPK1 | 3,000 |
| SPK2 | 714 |
| SPK3 | 1,656 |
| SPK4 | 2,057 |
| SPK5 | 4,193 |
| **Total** | **11,592** |
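The split across speakers is uneven, which matters when judging per-speaker quality. The short sketch below recomputes each speaker's share from the counts in the table above. Note that the rows as listed add up to 11,620, which is 28 more than the stated total of 11,592; this README does not say which figure is authoritative, so the code reports both.

```python
# Per-speaker sample counts copied from the table above.
samples_per_speaker = {
    "SPK1": 3000,
    "SPK2": 714,
    "SPK3": 1656,
    "SPK4": 2057,
    "SPK5": 4193,
}
stated_total = 11592  # total given in the table and under Model Details


def speaker_shares(counts):
    """Return each speaker's fraction of the summed sample count."""
    total = sum(counts.values())
    return {speaker: n / total for speaker, n in counts.items()}


row_sum = sum(samples_per_speaker.values())
shares = speaker_shares(samples_per_speaker)

for speaker, share in shares.items():
    print(f"{speaker}: {share:.1%}")
print(f"row sum = {row_sum}, stated total = {stated_total}, "
      f"difference = {row_sum - stated_total}")
```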
### Training Configuration
```yaml
voice_name: saudi_msa
sample_rate: 22050
espeak_voice: ar
batch_size: 8
epochs: 455
optimizer: Adam
```
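The configuration and the checkpoint file name can be cross-checked with a little arithmetic. The sketch below assumes that Lightning's `epoch=` counter in the file name is zero-based, so `epoch=455` corresponds to 456 completed epochs; all other numbers are taken from this README.

```python
import math

dataset_size = 11592        # samples, from Model Details
batch_size = 8              # from the training configuration
checkpoint_epoch = 455      # "epoch=455" in the checkpoint file name
checkpoint_step = 1189248   # "step=1189248" in the checkpoint file name

# Upper bound on batches per epoch if every sample is used for training.
max_batches_per_epoch = math.ceil(dataset_size / batch_size)

# Assumption: the epoch counter is zero-based, so add one.
completed_epochs = checkpoint_epoch + 1
steps_per_epoch, remainder = divmod(checkpoint_step, completed_epochs)

print(max_batches_per_epoch)       # 1449
print(steps_per_epoch, remainder)  # 2608 0
```

The recorded step count divides evenly into 2,608 steps per epoch, which is not the same as the 1,449 batches implied by the dataset size. The README does not explain the difference; a held-out validation split or more than one optimizer step per batch (common in GAN-style training such as VITS, on which Piper is based) could both contribute, but that is a guess rather than something documented here.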
### Training Environment
- Python 3.11
- PyTorch 2.x with CUDA
- Lightning 2.x
- Total training time: more than 85 hours (approximate)
## Model Performance
This model has been trained for **455 epochs**, providing:
- ✅ **Excellent audio quality** with minimal background noise
- ✅ **Clear pronunciation** of Arabic words
- ✅ **Natural prosody** and intonation
- ✅ **Professional-grade output** suitable for production use
The model performs exceptionally well on:
- Customer service dialogues
- Banking and financial terminology
- General conversational Arabic
- Saudi dialect expressions
## Export Instructions
To export the checkpoint to ONNX format:
```bash
cd scripts
python3 export_jit.py
```
The script will:
1. Load the checkpoint from `checkpoints/epoch=455-step=1189248.ckpt`
2. Export to ONNX format with optimizations
3. Create `saudi_msa_epoch455.onnx` file
Make sure to copy the `config.json` file alongside the ONNX model:
```bash
cp config.json saudi_msa_epoch455.onnx.json
```
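The same copy step can be scripted so that a mismatched configuration is caught early. This is a minimal sketch, assuming `config.json` follows the usual Piper layout with an `audio.sample_rate` field; the helper name `place_config` and its parameters are hypothetical.

```python
import json
import shutil
from pathlib import Path


def place_config(config_path, onnx_path, expected_sample_rate=22050):
    """Copy config.json to <model>.onnx.json after a basic sanity check.

    Assumption: the config stores the sample rate under audio.sample_rate,
    as Piper voice configs typically do. If the field is absent, the check
    is skipped and the file is copied as-is.
    """
    config_path = Path(config_path)
    target = Path(str(onnx_path) + ".json")

    config = json.loads(config_path.read_text(encoding="utf-8"))
    sample_rate = config.get("audio", {}).get("sample_rate")
    if sample_rate is not None and sample_rate != expected_sample_rate:
        raise ValueError(
            f"config sample rate {sample_rate} != expected {expected_sample_rate}"
        )

    shutil.copyfile(config_path, target)
    return target


# Example call, run from the directory that holds both files:
# place_config("config.json", "saudi_msa_epoch455.onnx")
```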
## Files Structure
```
.
├── README.md
├── config.json                          # Model configuration
├── training_data.csv                    # Dataset metadata
├── checkpoints/
│   └── epoch=455-step=1189248.ckpt      # Latest checkpoint (807 MB)
└── scripts/
    ├── export_jit.py                    # ONNX export script
    ├── train_piper.sh                   # Training script
    └── create_training_file.py          # Data preparation script
```
## License
This model was trained using the Piper TTS framework, which is licensed under GPL-3.0.
## Citation
If you use this model, please cite:
```bibtex
@misc{saudi_msa_piper_2026,
title={Saudi Arabic TTS Model for Piper - Epoch 455},
author={Piper MSA Project},
year={2026},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/YOUR_USERNAME/saudi-msa-piper}}
}
```
## Acknowledgments
- Piper TTS: https://github.com/rhasspy/piper
- eSpeak-ng for Arabic phonemization
- Original dataset contributors
## Sample Usage
```bash
# Example: generate a customer service greeting
text='حياك الله عميلنا العزيز، كيف اقدر اساعدك اليوم؟'
echo "$text" | piper --model saudi_msa_epoch455.onnx --output_file greeting.wav
```
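When the CLI needs to be driven from an application, the same command can be assembled in Python. The sketch below is one way to do that, assuming the `piper` executable is on `PATH` and accepts `--model`, `--output_file`, and `--speaker` as in the piper-tts command-line tool; the function names are illustrative.

```python
import subprocess


def build_piper_command(model_path, output_path, speaker=None):
    """Return the argument list for a piper CLI call."""
    cmd = ["piper", "--model", str(model_path), "--output_file", str(output_path)]
    if speaker is not None:
        # Assumption: speaker IDs are integers as defined in the model config.
        cmd += ["--speaker", str(speaker)]
    return cmd


def synthesize_with_cli(text, model_path, output_path, speaker=None):
    """Pipe UTF-8 text to piper on stdin and raise if the command fails."""
    cmd = build_piper_command(model_path, output_path, speaker)
    subprocess.run(cmd, input=text.encode("utf-8"), check=True)


# Example call (requires piper and the exported model):
# synthesize_with_cli(
#     "حياك الله عميلنا العزيز، كيف اقدر اساعدك اليوم؟",
#     "saudi_msa_epoch455.onnx",
#     "greeting.wav",
# )
```

Passing the arguments as a list avoids shell quoting problems with Arabic punctuation, and `check=True` surfaces a failed synthesis as an exception instead of a silently missing file.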
## Model Comparison
| Epoch | Quality | Noise Level | Clarity |
|-------|---------|-------------|---------|
| 65 | Good | Moderate | Fair |
| 176 | Very Good | Low | Good |
| 438 | Excellent | Very Low | Excellent |
| **455** | **Professional** | **Minimal** | **Excellent** |
---
For questions or issues, please open an issue on the repository.