
🇪🇬 NAMAA-Egyptian-TTS

NAMAA-Egyptian-TTS is an Egyptian Arabic Text-to-Speech (TTS) model built on top of the Chatterbox Multilingual TTS architecture.
The model is configured and refined to generate natural Egyptian dialect speech, targeting everyday conversational usage rather than Modern Standard Arabic (MSA).

This model is developed and released by NAMAA Community (Network for Advancing Modern Arabic AI) as part of its efforts to advance high-quality Arabic speech and language technologies.


🔊 Live Demo (Hugging Face Space)

👉 Try the model here:
https://huggingface.co/spaces/omarelshehy/NAMAA-Egyptian-Voice


✨ Model Capabilities

The model supports:

  • Egyptian Arabic text input (language_id = "ar")
  • Natural conversational prosody
  • Egyptian dialect phrasing and rhythm
  • Optional reference audio prompting for:
    • Speaker similarity
    • Style and tone transfer
  • GPU-accelerated inference

This repository contains all required model checkpoints and assets for local or hosted inference.


🗣️ Example Text (Egyptian Dialect)

انا سبت الشغل و راجع دلوقتي علي طول.
(English: "I left work and I'm heading back right away.")

⚠️ Limitations

Please be aware of the following current limitations:

  • The Egyptian pronunciation of "ق" is sometimes missed.
  • Numbers are sometimes not uttered correctly.
  • Depending on the audio prompt, the flow of the generated audio may not always match the input text.

These limitations are actively being addressed in upcoming versions.
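
Until number handling improves, one possible workaround is to spell numbers out as words before synthesis. The sketch below only detects digit runs (Western and Arabic-Indic) so that such inputs can be flagged for manual rewriting. The helper name `find_digit_runs` is hypothetical and is not part of Chatterbox or this repository.

```python
import re
from typing import List

# Western digits (0-9), Arabic-Indic digits (U+0660-U+0669) and
# Extended Arabic-Indic digits (U+06F0-U+06F9).
_DIGIT_RUN = re.compile(r"[0-9\u0660-\u0669\u06F0-\u06F9]+")


def find_digit_runs(text: str) -> List[str]:
    """Return every run of consecutive digits in `text` (hypothetical helper)."""
    return _DIGIT_RUN.findall(text)


if __name__ == "__main__":
    sample = "عندي 3 كتب و ٢٥ قلم"
    runs = find_digit_runs(sample)
    if runs:
        # Only the count is printed; rewriting the numbers is left to the user.
        print(f"{len(runs)} number(s) found; consider spelling them out as words.")
```

Converting the digits to Egyptian Arabic words automatically is deliberately left out, since the correct spoken form depends on context.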

🧪 Example Usage (Inference)

import torchaudio as ta
from huggingface_hub import snapshot_download
from safetensors.torch import load_file as load_safetensors
from chatterbox import mtl_tts

device = "cuda"  # or "cpu" / "mps"

ckpt_dir = snapshot_download(
    repo_id="NAMAA-Space/NAMAA-Egyptian-TTS",
    repo_type="model",
    revision="main"
)

# Load the base Chatterbox multilingual model
model = mtl_tts.ChatterboxMultilingualTTS.from_pretrained(device=device)

# Load the T3 weights shipped in this repository and swap them into the model
t3_state = load_safetensors(
    f"{ckpt_dir}/t3_mtl23ls_v2.safetensors",
    device=device
)
model.t3.load_state_dict(t3_state)
model.t3.to(device).eval()

# Egyptian Arabic text
text = "انا سبت الشغل و راجع دلوقتي علي طول"

wav = model.generate(text, language_id="ar")
ta.save("namma_egyptian.wav", wav, model.sr)
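
The snippet above hard-codes `device = "cuda"` and assumes the checkpoint file is present in the downloaded snapshot. A slightly more defensive setup might look like the sketch below. Both `pick_device` and `require_checkpoint` are hypothetical helpers introduced here for illustration; only the checkpoint filename comes from the example above.

```python
from pathlib import Path


def pick_device() -> str:
    """Return "cuda", "mps", or "cpu", depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # Inference needs PyTorch anyway; "cpu" is only a placeholder here.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is missing on older PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def require_checkpoint(ckpt_dir: str,
                       filename: str = "t3_mtl23ls_v2.safetensors") -> Path:
    """Return the checkpoint path, or raise a clear error if it is missing."""
    path = Path(ckpt_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"Expected checkpoint not found: {path}")
    return path
```

With these in place, `device = pick_device()` could replace the hard-coded value, and `require_checkpoint(ckpt_dir)` could be called before loading the weights so that a missing or renamed file fails with a readable message.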

🔹 Inference with Reference Audio (Voice / Style Transfer)

text = "انا سبت الشغل و راجع دلوقتي علي طول"

wav = model.generate(
    text,
    language_id="ar",
    audio_prompt_path="/content/reference_egyptian.wav"
)

ta.save("namma_egyptian_ref.wav", wav, model.sr)
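
Because the reference audio is optional, it can be convenient to build the `generate` arguments in one place and fail early when a reference path is wrong. The following is a minimal sketch under that assumption. `build_generate_kwargs` is a hypothetical helper, and it relies only on the `language_id` and `audio_prompt_path` parameters already used in the examples above.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def build_generate_kwargs(audio_prompt_path: Optional[str] = None,
                          language_id: str = "ar") -> Dict[str, Any]:
    """Build keyword arguments for model.generate (hypothetical helper).

    With no reference path, only the language is set. With a path, the
    file must exist; otherwise FileNotFoundError is raised.
    """
    kwargs: Dict[str, Any] = {"language_id": language_id}
    if audio_prompt_path is not None:
        path = Path(audio_prompt_path)
        if not path.is_file():
            raise FileNotFoundError(f"Reference audio not found: {path}")
        kwargs["audio_prompt_path"] = str(path)
    return kwargs


# Intended usage, assuming `model` and `text` are defined as in the examples:
#   wav = model.generate(text, **build_generate_kwargs("/content/reference_egyptian.wav"))
#   wav = model.generate(text, **build_generate_kwargs())  # no reference audio
```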

🧠 Base Model

This model is built on top of:

  • ResembleAI/chatterbox
  • Chatterbox Multilingual TTS architecture

The Egyptian dialect behavior is achieved through specialized configuration, prompting, and curated usage patterns, rather than training focused on Modern Standard Arabic (MSA).


📜 License

This model is released under the MIT License, allowing both research and commercial usage with proper attribution.


🤝 Community & Contributions

Developed and maintained by NAMAA Community
(Network for Advancing Modern Arabic NLP & AI)

We welcome:

  • Feedback and evaluations
  • Dialect-specific test cases
  • Contributions toward improving Arabic Text-to-Speech systems

📌 Citation

If you use this model in research or production, please cite:

@misc{namaa_egyptian_tts,
  title = {NAMAA-Egyptian-TTS: Egyptian Dialect Text-to-Speech},
  author = {{NAMAA Community}},
  year = {2026},
  url = {https://huggingface.co/NAMAA-Space/NAMAA-Egyptian-TTS}
}