---
license: mit
language:
- ar
base_model:
- ResembleAI/chatterbox
pipeline_tag: text-to-speech
tags:
- Saudi
- Arabic
- Saudi-Dialect
- Chatterbox
- TTS
- voice-cloning
- multilingual-tts
library_name: chatterbox
---

![NAMAA Saudi TTS Banner](https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/2d4VIgVYji-CS2w8n_3tS.png)

# 🇸🇦 NAMAA-Saudi-TTS

**NAMAA-Saudi-TTS** is a Saudi Arabic Text-to-Speech (TTS) model built on top of the **Chatterbox Multilingual TTS** architecture.
The model is configured and refined to generate **natural Saudi dialect speech**, targeting everyday conversational usage rather than Modern Standard Arabic (MSA).

This model is developed and released by **NAMAA Community (Network for Advancing Modern Arabic AI)** as part of its efforts to advance high-quality Arabic speech and language technologies.

---

## 🔊 Live Demo (Hugging Face Space)

👉 **Try the model here:**
https://huggingface.co/spaces/omarelshehy/NAMAA-Saudi-Voice

---

## ✨ Model Capabilities

The model supports:

- **Saudi Arabic text input** (`language_id = "ar"`)
- Natural conversational prosody
- Saudi dialect phrasing and rhythm
- Optional **reference audio prompting** for:
  - Speaker similarity
  - Style and tone transfer
- GPU-accelerated inference

This repository contains all required **model checkpoints and assets** for local or hosted inference.

---

## 🗣️ Example Text (Saudi Dialect)

```text
آبي أروح البقالة أشتري كم غرض وأرجع بسرعة.
```

## ⚠️ Limitations

Please be aware of the following current limitations:

- Lack of tashkeel may affect pronunciation accuracy.
- Numeric normalization will be improved in future releases.
- This is a known limitation of the current flow-based generation.

These limitations are actively being addressed in upcoming versions.
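
Until these limitations are addressed, it can help to inspect the input text before synthesis. The snippet below is a minimal sketch, not part of this repository or of Chatterbox: the helper name `check_tts_input` and the two patterns are hypothetical, and it assumes that tashkeel means the Unicode marks U+064B through U+0652 and that digits are either ASCII or Arabic-Indic (U+0660 through U+0669). It only reports what it finds and does not rewrite the text.

```python
import re

# Assumption: tashkeel = tanween, fatha, damma, kasra, shadda, sukun (U+064B..U+0652).
TASHKEEL = re.compile(r"[\u064B-\u0652]")
# Assumption: numbers appear as runs of ASCII or Arabic-Indic digits.
DIGITS = re.compile(r"[0-9\u0660-\u0669]+")


def check_tts_input(text: str) -> dict:
    """Report features of `text` related to the limitations listed above.

    Hypothetical helper: it only inspects the text and never modifies it.
    """
    return {
        # True if at least one diacritic mark is present.
        "has_tashkeel": bool(TASHKEEL.search(text)),
        # Digit runs you may want to spell out by hand before synthesis.
        "digit_runs": DIGITS.findall(text),
    }
```

If `digit_runs` is non-empty, spelling the numbers out in words before calling the model is one possible workaround. If `has_tashkeel` is false and a word is mispronounced, adding diacritics to that word may help.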

## 🧪 Example Usage (Inference)

```python
import torchaudio as ta
from huggingface_hub import snapshot_download
from safetensors.torch import load_file as load_safetensors
from chatterbox import mtl_tts

device = "cuda"  # or "cpu" / "mps"

# Download the checkpoint files from the Hugging Face Hub (cached after the first call)
ckpt_dir = snapshot_download(
    repo_id="NAMAA-Space/NAMAA-Saudi-TTS",
    repo_type="model",
    revision="main"
)

# Load the base multilingual model
model = mtl_tts.ChatterboxMultilingualTTS.from_pretrained(device=device)

# Replace the T3 weights with the checkpoint from this repository
t3_state = load_safetensors(
    f"{ckpt_dir}/t3_mtl23ls_v2.safetensors",
    device=device
)
model.t3.load_state_dict(t3_state)
model.t3.to(device).eval()

# Saudi Arabic text
text = "أنا الحين بروح الشغل وإذا رجعت بمرّ البقالة"

# Synthesize and save at the model's sample rate
wav = model.generate(text, language_id="ar")
ta.save("namma_saudi.wav", wav, model.sr)
```

### 🔹 Inference with Reference Audio (Voice / Style Transfer)

```python
# Reuses `model` and `ta` from the example above
text = "آبي أخلص الشغل اليوم وأرتاح بكرة"

wav = model.generate(
    text,
    language_id="ar",
    audio_prompt_path="/content/reference_saudi.wav"  # path to your reference clip
)

ta.save("namma_saudi_ref.wav", wav, model.sr)
```

## 🧠 Base Model

This model is built on top of:

- **ResembleAI/chatterbox**
- **Chatterbox Multilingual TTS architecture**

The Saudi dialect behavior is achieved through **specialized configuration, prompting, and curated usage patterns**, rather than through training focused on Modern Standard Arabic (MSA).

---

## 📜 License

This model is released under the **MIT License**, allowing both **research and commercial usage** with proper attribution.

---

## 🤝 Community & Contributions

Developed and maintained by **NAMAA Community**
*(Network for Advancing Modern Arabic NLP & AI)*

We welcome:

- Feedback and evaluations
- Dialect-specific test cases
- Contributions toward improving Arabic Text-to-Speech systems

---

## 📌 Citation

If you use this model in research or production, please cite:

```bibtex
@misc{namaa_saudi_tts,
  title  = {NAMAA-Saudi-TTS: Saudi Dialect Text-to-Speech},
  author = {{NAMAA Community}},
  year   = {2026},
  url    = {https://huggingface.co/NAMAA-Space/NAMAA-Saudi-TTS}
}
```