ISTNetworks/saudi-msa-piper

ISTNetworks committed on Jan 8

Commit 6bc88af (verified) · 1 Parent(s): b67625e

Add comprehensive inference guide

INFERENCE_GUIDE.md +328 -0

	@@ -0,0 +1,328 @@

+# Saudi MSA Piper TTS - Inference Guide
+Complete guide for running the Saudi Arabic TTS model on Linux, macOS, or Windows.
+## Quick Start
+### 1. Download the Model
+```bash
+# Clone the repository
+git clone https://huggingface.co/ISTNetworks/saudi-msa-piper
+cd saudi-msa-piper
+# Or download specific files
+wget https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx
+wget https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx.json
+```
+### 2. Install Dependencies
+```bash
+# Install piper-tts
+pip install piper-tts
+# Or install all dependencies
+pip install -r requirements.txt
+```
+### 3. Run Inference
+**Option A: Using the provided Python script**
+```bash
+python3 inference.py -t "مرحبا بك في نظام التحويل النصي إلى كلام" -o output.wav
+```
+**Option B: Using the bash script**
+```bash
+chmod +x inference.sh
+./inference.sh "مرحبا بك" output.wav
+```
+**Option C: Using piper directly**
+```bash
+echo "مرحبا بك" | piper --model saudi_msa_epoch455.onnx --output_file output.wav
+```
+## Detailed Usage
+### Python Script (inference.py)
+The Python script provides the most flexibility and error handling.
+**Basic usage:**
+```bash
+python3 inference.py -t "Arabic text here" -o output.wav
+```
+**Read from stdin:**
+```bash
+echo "مرحبا بك" | python3 inference.py -o output.wav
+```
+**Read from file:**
+```bash
+python3 inference.py -o output.wav < arabic_text.txt
+```
+**Specify custom model path:**
+```bash
+python3 inference.py -t "مرحبا بك" -m /path/to/model.onnx -o output.wav
+```
+**Full options:**
+```bash
+python3 inference.py --help
+Options:
+  -t, --text TEXT       Arabic text to synthesize
+  -m, --model PATH      Path to ONNX model file
+  -o, --output PATH     Output WAV file path (required)
+  -c, --config PATH     Path to config JSON file (auto-detected)
+```
+### Bash Script (inference.sh)
+Simple shell script for quick inference.
+**Basic usage:**
+```bash
+./inference.sh "مرحبا بك" output.wav
+```
+**Read from stdin:**
+```bash
+echo "مرحبا بك" | ./inference.sh - output.wav
+```
+**Custom model path:**
+```bash
+MODEL_FILE=/path/to/model.onnx ./inference.sh "مرحبا بك" output.wav
+```
+### Direct Piper Usage
+For advanced users who want direct control.
+**Basic:**
+```bash
+echo "مرحبا بك" | piper --model saudi_msa_epoch455.onnx --output_file output.wav
+```
+**With custom config:**
+```bash
+echo "مرحبا بك" | piper \
+  --model saudi_msa_epoch455.onnx \
+  --config saudi_msa_epoch455.onnx.json \
+  --output_file output.wav
+```
+**Output to stdout (for piping):**
+```bash
+echo "مرحبا بك" | piper --model saudi_msa_epoch455.onnx --output-raw | \
+  aplay -r 22050 -f S16_LE -t raw -
+```
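The raw stream above is headerless signed 16-bit little-endian PCM, which is why `aplay` needs the rate and format flags. If the raw bytes are saved instead of played, they can be wrapped in a WAV container with Python's standard `wave` module. The sketch below is only an illustration: the function name `pcm_to_wav` is hypothetical, and the 22050 Hz mono defaults are taken from the `aplay` flags above rather than read from the model config, so confirm the sample rate in `saudi_msa_epoch455.onnx.json` before relying on them.

```python
import wave


def pcm_to_wav(pcm_bytes: bytes, wav_path: str, sample_rate: int = 22050) -> None:
    """Wrap raw signed 16-bit little-endian mono PCM in a WAV container."""
    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono (assumption)
        wav_file.setsampwidth(2)            # 2 bytes per sample, i.e. S16_LE
        wav_file.setframerate(sample_rate)  # assumed to match the model config
        wav_file.writeframes(pcm_bytes)


# Example (hypothetical file names): after running
#   echo "..." | piper --model saudi_msa_epoch455.onnx --output-raw > speech.raw
# the raw file could be converted with:
#   with open("speech.raw", "rb") as f:
#       pcm_to_wav(f.read(), "speech.wav")
```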
+## Python API Usage
+For integration into Python applications:
+```python
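+import wave
+from piper import PiperVoice
+# Load the model (the .onnx.json config is looked up next to it)
+voice = PiperVoice.load("saudi_msa_epoch455.onnx")
+# Synthesize to a WAV file. synthesize_wav() is the piper-tts 1.3+ API;
+# in piper-tts 1.2.x the equivalent call is voice.synthesize(text, wav_file).
+with wave.open("output.wav", "wb") as wav_file:
+    voice.synthesize_wav("مرحبا بك في نظام التحويل النصي إلى كلام", wav_file)
+# Or get raw 16-bit PCM audio data chunk by chunk (piper-tts 1.3+)
+audio_data = b"".join(chunk.audio_int16_bytes for chunk in voice.synthesize("مرحبا بك"))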
+```
+**Advanced usage:**
+```python
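+import wave
+from piper import PiperVoice, SynthesisConfig
+# Load model
+voice = PiperVoice.load("saudi_msa_epoch455.onnx")
+# Synthesize with custom parameters (SynthesisConfig requires piper-tts 1.3+)
+text = "مرحبا بك"
+syn_config = SynthesisConfig(
+    length_scale=1.2,  # values above 1.0 slow the speech down
+)
+# Write a WAV file with a proper header
+with wave.open("output.wav", "wb") as wav_file:
+    voice.synthesize_wav(text, wav_file, syn_config=syn_config)
+print("Audio generated successfully!")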
+```
+## System Requirements
+### Minimum Requirements
+- **OS:** Linux, macOS, or Windows
+- **Python:** 3.8 or higher
+- **RAM:** 2 GB
+- **Storage:** 100 MB for model files
+### Recommended Requirements
+- **OS:** Linux or macOS
+- **Python:** 3.10 or higher
+- **RAM:** 4 GB
+- **Storage:** 1 GB
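The software side of these requirements can be checked before downloading anything. The following is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, it checks only the Python version and whether the `piper` package is importable, and it does not measure RAM or free disk space.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 8)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_version:
        found = ".".join(map(str, sys.version_info[:3]))
        wanted = ".".join(map(str, min_version))
        problems.append(f"Python {wanted}+ required, found {found}")
    # The piper-tts package is imported as `piper`, as in the API examples.
    if importlib.util.find_spec("piper") is None:
        problems.append("piper-tts is not installed (pip install piper-tts)")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```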
+## Installation on Different Systems
+### Ubuntu/Debian
+```bash
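+# Install system dependencies
+sudo apt-get update
+sudo apt-get install -y python3 python3-pip python3-venv
+# Create and activate a virtual environment (recent Debian/Ubuntu releases
+# block system-wide pip installs, see PEP 668)
+python3 -m venv .venv
+source .venv/bin/activate
+# Install piper-tts
+pip install piper-tts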
+# Download model
+wget https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx
+wget https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx.json
+```
+### macOS
+```bash
+# Install Python (if not installed)
+brew install python3
+# Install piper-tts
+pip3 install piper-tts
+# Download model
+curl -L -o saudi_msa_epoch455.onnx \
+  https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx
+curl -L -o saudi_msa_epoch455.onnx.json \
+  https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx.json
+```
+### Windows
+```powershell
+# Install Python from python.org
+# Install piper-tts
+pip install piper-tts
+# Download model (using PowerShell)
+Invoke-WebRequest -Uri "https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx" -OutFile "saudi_msa_epoch455.onnx"
+Invoke-WebRequest -Uri "https://huggingface.co/ISTNetworks/saudi-msa-piper/resolve/main/saudi_msa_epoch455.onnx.json" -OutFile "saudi_msa_epoch455.onnx.json"
+```
+## Example Use Cases
+### Customer Service Greeting
+```bash
+python3 inference.py -t "حياك الله عميلنا العزيز، كيف اقدر اساعدك اليوم؟" -o greeting.wav
+```
+### Banking Message
+```bash
+python3 inference.py -t "تراني راسلت الفرع الرئيسي باكر الصبح، وان شا الله بيردون علينا قبل الظهر" -o banking.wav
+```
+### Batch Processing
+```bash
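+# Process multiple texts, one per line (each call starts a new process and reloads the model)
+while IFS= read -r line; do
+    [ -z "$line" ] && continue  # skip blank lines
+    filename=$(echo "$line" | md5sum | cut -d' ' -f1).wav
+    python3 inference.py -t "$line" -o "$filename"
+done < texts.txt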
+```
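The shell loop above starts a new Python process, and therefore reloads the model, for every line. A rough sketch of the same batch job with the model loaded once is shown below. The helper name `batch_synthesize` is hypothetical, and it takes the synthesis function as an argument so it does not depend on a particular piper-tts version. Files are named by the MD5 of the stripped text, so the names will not match those from the shell loop, which also hashes the trailing newline added by `echo`.

```python
import hashlib
import wave
from pathlib import Path


def batch_synthesize(lines, synthesize_wav, out_dir="."):
    """Write one WAV per non-empty line; return the list of output paths.

    `synthesize_wav(text, wav_file)` is any callable that writes audio for
    `text` into an open `wave.Wave_write` object.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        # Name each file after the MD5 hex digest of its text
        name = hashlib.md5(text.encode("utf-8")).hexdigest() + ".wav"
        path = out_dir / name
        with wave.open(str(path), "wb") as wav_file:
            synthesize_wav(text, wav_file)
        outputs.append(path)
    return outputs


# Usage with Piper (assumes piper-tts 1.3+, where PiperVoice has synthesize_wav):
#   from piper import PiperVoice
#   voice = PiperVoice.load("saudi_msa_epoch455.onnx")  # loaded once
#   with open("texts.txt", encoding="utf-8") as f:
#       batch_synthesize(f, voice.synthesize_wav, "out")
```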
+### Web Service Integration
+```python
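+import io
+import wave
+from flask import Flask, request, send_file
+from piper import PiperVoice
+app = Flask(__name__)
+# Load the model once at startup and reuse it for every request
+voice = PiperVoice.load("saudi_msa_epoch455.onnx")
+@app.route('/synthesize', methods=['POST'])
+def synthesize():
+    payload = request.get_json(silent=True) or {}
+    text = payload.get('text')
+    if not text:
+        return {'error': 'Missing "text" field'}, 400
+    # Write a complete WAV file (header + audio) into an in-memory buffer.
+    # synthesize_wav() is the piper-tts 1.3+ API; in 1.2.x use
+    # voice.synthesize(text, wav_file) instead.
+    buffer = io.BytesIO()
+    with wave.open(buffer, 'wb') as wav_file:
+        voice.synthesize_wav(text, wav_file)
+    buffer.seek(0)
+    return send_file(buffer, mimetype='audio/wav')
+if __name__ == '__main__':
+    app.run(host='0.0.0.0', port=5000)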
+```
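For completeness, here is one way a client might call the `/synthesize` route above using only the standard library. This is a sketch under assumptions: the helper names are hypothetical, and the default base URL assumes the Flask app is running locally on port 5000 as in the example.

```python
import json
import urllib.request


def build_synthesize_request(text, base_url="http://localhost:5000"):
    """Build the JSON POST request expected by the /synthesize route."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/synthesize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text, out_path, base_url="http://localhost:5000"):
    """Send the request and save the returned WAV bytes to out_path."""
    request = build_synthesize_request(text, base_url)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

With the server running, `synthesize_to_file("مرحبا بك", "reply.wav")` would save the response body as a WAV file.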
+## Troubleshooting
+### Model file not found
+```bash
+# Make sure you're in the correct directory
+ls -lh saudi_msa_epoch455.onnx
+# Or specify full path
+python3 inference.py -m /full/path/to/saudi_msa_epoch455.onnx -t "مرحبا" -o output.wav
+```
+### Config file not found
+```bash
+# The config file should have the same name as the model with .json extension
+# saudi_msa_epoch455.onnx -> saudi_msa_epoch455.onnx.json
+# Or specify manually
+python3 inference.py -t "مرحبا" -c config.json -o output.wav
+```
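The naming convention described above (model path plus `.json`) can be expressed as a small lookup. The sketch below illustrates that convention only; `resolve_config` is a hypothetical helper and is not necessarily how `inference.py` performs its auto-detection.

```python
from pathlib import Path


def resolve_config(model_path, config_path=None):
    """Return the config path to use, or raise FileNotFoundError."""
    model = Path(model_path)
    # Convention from this guide: model.onnx -> model.onnx.json
    candidate = Path(config_path) if config_path else Path(str(model) + ".json")
    if not model.is_file():
        raise FileNotFoundError(f"Model file not found: {model}")
    if not candidate.is_file():
        raise FileNotFoundError(f"Config file not found: {candidate}")
    return candidate
```

An explicit `config_path` takes precedence, which mirrors the `-c` option of `inference.py`.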
+### piper-tts not installed
+```bash
+pip install piper-tts
+# If that fails, try:
+pip install --upgrade pip
+pip install piper-tts
+```
+### Permission denied
+```bash
+chmod +x inference.sh
+chmod +x inference.py
+```
+## Performance Tips
+1. **First run is slower:** Each new process must load the model into memory before its first synthesis
+2. **Batch processing:** Load the model once and reuse for multiple texts
+3. **Memory usage:** The model uses ~500 MB RAM when loaded
+4. **CPU vs GPU:** This model runs on CPU; no GPU required
+## File Structure
+After downloading, you should have:
+```
+saudi-msa-piper/
+├── saudi_msa_epoch455.onnx          # Main model file (61 MB)
+├── saudi_msa_epoch455.onnx.json     # Config file (5 KB)
+├── inference.py                      # Python inference script
+├── inference.sh                      # Bash inference script
+├── INFERENCE_GUIDE.md               # This guide
+└── requirements.txt                  # Python dependencies
+```
+## Support
+For issues or questions:
+- Repository: https://huggingface.co/ISTNetworks/saudi-msa-piper
+- Piper TTS: https://github.com/rhasspy/piper
+## License
+This model is based on Piper TTS (GPL-3.0 license).