# Saudi MSA Piper TTS - Inference Guide

Complete guide for running the Saudi Arabic TTS model on any computer.

## Quick Start

### 1. Download the Model

```bash
# Clone the repository
git clone https://huggingface.co/ISTNetworks/saudi-msa-piper
cd saudi-msa-piper

# Or download specific files
wget https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx
wget https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx.json
```

### 2. Install Dependencies

```bash
# Install piper-tts
pip install piper-tts

# Or install all dependencies
pip install -r requirements.txt
```

### 3. Run Inference

**Option A: Using the provided Python script**

```bash
python3 inference.py -t "مرحبا بك في نظام التحويل النصي إلى كلام" -o output.wav
```

**Option B: Using the bash script**

```bash
chmod +x inference.sh
./inference.sh "مرحبا بك" output.wav
```

**Option C: Using piper directly**

```bash
echo "مرحبا بك" | piper --model saudi_msa_epoch455.onnx --output_file output.wav
```

## Detailed Usage

### Python Script (inference.py)

The Python script provides the most flexibility and error handling.

**Basic usage:**

```bash
python3 inference.py -t "Arabic text here" -o output.wav
```

**Read from stdin:**

```bash
echo "مرحبا بك" | python3 inference.py -o output.wav
```

**Read from file:**

```bash
cat arabic_text.txt | python3 inference.py -o output.wav
```

**Specify custom model path:**

```bash
python3 inference.py -t "مرحبا بك" -m /path/to/model.onnx -o output.wav
```

**Full options:**

```bash
python3 inference.py --help

Options:
  -t, --text TEXT     Arabic text to synthesize
  -m, --model PATH    Path to ONNX model file
  -o, --output PATH   Output WAV file path (required)
  -c, --config PATH   Path to config JSON file (auto-detected)
```

### Bash Script (inference.sh)

Simple shell script for quick inference.
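The script itself is not reproduced in this guide, but its core logic (default model path, `MODEL_FILE` override, stdin handling) might look roughly like the function below. This is a hedged sketch, not the actual repository script, which may differ in details:

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the inference.sh wrapper logic -- the real script
# shipped in the repository may differ.

synthesize() {
    local text="$1"
    local out="$2"
    # MODEL_FILE can be overridden from the environment (see example below)
    local model="${MODEL_FILE:-saudi_msa_epoch455.onnx}"

    if [ ! -f "$model" ]; then
        echo "error: model file not found: $model" >&2
        return 1
    fi

    if [ "$text" = "-" ]; then
        # Read the text from stdin
        piper --model "$model" --output_file "$out"
    else
        printf '%s\n' "$text" | piper --model "$model" --output_file "$out"
    fi
}
```

A standalone script would simply end with `synthesize "$@"` to dispatch its arguments.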
**Basic usage:**

```bash
./inference.sh "مرحبا بك" output.wav
```

**Read from stdin:**

```bash
echo "مرحبا بك" | ./inference.sh - output.wav
```

**Custom model path:**

```bash
MODEL_FILE=/path/to/model.onnx ./inference.sh "مرحبا بك" output.wav
```

### Direct Piper Usage

For advanced users who want direct control.

**Basic:**

```bash
echo "مرحبا بك" | piper --model saudi_msa_epoch455.onnx --output_file output.wav
```

**With custom config:**

```bash
echo "مرحبا بك" | piper \
  --model saudi_msa_epoch455.onnx \
  --config saudi_msa_epoch455.onnx.json \
  --output_file output.wav
```

**Output to stdout (for piping):**

```bash
echo "مرحبا بك" | piper --model saudi_msa_epoch455.onnx --output-raw | \
  aplay -r 22050 -f S16_LE -t raw -
```

## Python API Usage

For integration into Python applications:

```python
import wave

from piper import PiperVoice

# Load the model
voice = PiperVoice.load("saudi_msa_epoch455.onnx")

# Synthesize to a WAV file
with wave.open("output.wav", "wb") as wav_file:
    voice.synthesize("مرحبا بك في نظام التحويل النصي إلى كلام", wav_file)
```

**Advanced usage:**

```python
import wave

from piper import PiperVoice

# Load the model once and reuse it for multiple texts
voice = PiperVoice.load("saudi_msa_epoch455.onnx")

texts = ["مرحبا بك", "مرحبا بك في نظام التحويل النصي إلى كلام"]

for i, text in enumerate(texts):
    with wave.open(f"output_{i}.wav", "wb") as wav_file:
        voice.synthesize(text, wav_file)

# Or stream raw 16-bit PCM chunks instead of writing a file
for audio_bytes in voice.synthesize_stream_raw("مرحبا بك"):
    pass  # feed each chunk to a player or network stream

print("Audio generated successfully!")
```

## System Requirements

### Minimum Requirements

- **OS:** Linux, macOS, or Windows
- **Python:** 3.8 or higher
- **RAM:** 2 GB
- **Storage:** 100 MB for model files

### Recommended Requirements

- **OS:** Linux or macOS
- **Python:** 3.10 or higher
- **RAM:** 4 GB
- **Storage:** 1 GB

## Installation on Different Systems

### Ubuntu/Debian

```bash
# Install system dependencies
sudo apt-get update
sudo apt-get install -y python3 python3-pip

# Install piper-tts
pip3 install piper-tts

# Download model
wget https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx
wget https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx.json
```

### macOS

```bash
# Install Python (if not installed)
brew install python3

# Install piper-tts
pip3 install piper-tts

# Download model
curl -L -o saudi_msa_epoch455.onnx \
  https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx
curl -L -o saudi_msa_epoch455.onnx.json \
  https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx.json
```

### Windows

```powershell
# Install Python from python.org

# Install piper-tts
pip install piper-tts

# Download model (using PowerShell)
Invoke-WebRequest -Uri "https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx" -OutFile "saudi_msa_epoch455.onnx"
Invoke-WebRequest -Uri "https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx.json" -OutFile "saudi_msa_epoch455.onnx.json"
```

## Example Use Cases

### Customer Service Greeting

```bash
python3 inference.py -t "حياك الله عميلنا العزيز، كيف اقدر اساعدك اليوم؟" -o greeting.wav
```

### Banking Message

```bash
python3 inference.py -t "تراني راسلت الفرع الرئيسي باكر الصبح، وان شا الله بيردون علينا قبل الظهر" -o banking.wav
```

### Batch Processing

```bash
# Process multiple texts, one per line, naming each file by the MD5 of its text
while IFS= read -r line; do
    filename=$(echo "$line" | md5sum | cut -d' ' -f1).wav
    python3 inference.py -t "$line" -o "$filename"
done < texts.txt
```

### Web Service Integration

```python
import tempfile
import wave

from flask import Flask, request, send_file
from piper import PiperVoice

app = Flask(__name__)
voice = PiperVoice.load("saudi_msa_epoch455.onnx")

@app.route('/synthesize', methods=['POST'])
def synthesize():
    text = request.json.get('text')

    # Write the synthesized audio to a temporary WAV file
    with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as f:
        temp_path = f.name
    with wave.open(temp_path, 'wb') as wav_file:
        voice.synthesize(text, wav_file)

    return send_file(temp_path, mimetype='audio/wav')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

## Troubleshooting

### Model file not found

```bash
# Make sure you're in the correct directory
ls -lh saudi_msa_epoch455.onnx

# Or specify the full path
python3 inference.py -m /full/path/to/saudi_msa_epoch455.onnx -t "مرحبا" -o output.wav
```

### Config file not found

```bash
# The config file should have the same name as the model, with a .json extension:
# saudi_msa_epoch455.onnx -> saudi_msa_epoch455.onnx.json

# Or specify it manually
python3 inference.py -t "مرحبا" -c config.json -o output.wav
```

### piper-tts not installed

```bash
pip install piper-tts

# If that fails, try:
pip install --upgrade pip
pip install piper-tts
```

### Permission denied

```bash
chmod +x inference.sh
chmod +x inference.py
```

## Performance Tips

1. **First run is slower:** the model is loaded into memory on first use.
2. **Batch processing:** load the model once and reuse it for multiple texts.
3. **Memory usage:** the model uses roughly 500 MB of RAM when loaded.
4. **CPU vs GPU:** this model runs on CPU; no GPU is required.

## File Structure

After downloading, you should have:

```
saudi-msa-piper/
├── saudi_msa_epoch455.onnx       # Main model file (61 MB)
├── saudi_msa_epoch455.onnx.json  # Config file (5 KB)
├── inference.py                  # Python inference script
├── inference.sh                  # Bash inference script
├── INFERENCE_GUIDE.md            # This guide
└── requirements.txt              # Python dependencies
```

## Support

For issues or questions:

- Repository: https://huggingface.co/ISTNetworks/saudi-msa-piper
- Piper TTS: https://github.com/rhasspy/piper

## License

This model is based on Piper TTS (GPL-3.0 license).
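As a final tip: after running any of the synthesis examples in this guide, the output WAV header can be inspected with Python's standard `wave` module. The mono, 16-bit, 22050 Hz format matches the `aplay` flags shown earlier, but verify against the model's own `.json` config; the helper name below is illustrative:

```python
import wave

def describe_wav(path):
    """Return basic WAV header properties for a quick sanity check."""
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),            # expect 1 (mono)
            "sample_width_bytes": w.getsampwidth(),  # expect 2 (16-bit, S16_LE)
            "sample_rate": rate,                     # expect 22050
            "duration_seconds": frames / rate,       # should be > 0
        }
```

For example, `print(describe_wav("output.wav"))` after a successful run should report one channel at 22050 Hz with a nonzero duration.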