Overview

KasbahTTS is the first open-source Text-to-Speech model purpose-built for Algerian Dardja (الدارجة الجزائرية).

Most Arabic TTS systems speak only Modern Standard Arabic, a formal register that few people use in everyday conversation. KasbahTTS aims to speak the way Algerians actually do: the Dardja of Algiers' alleyways, the warmth of Oran's markets, the rhythm of Constantine's conversations.

Built on the F5-TTS architecture (DiT-based flow matching) and fine-tuned from Habibi-TTS, KasbahTTS brings Algerian speech synthesis into the open-source world.

The first open-source TTS model that speaks Algerian Dardja.

Audio Samples

Listen to KasbahTTS in action — each sample shows the reference voice the model clones from, followed by the generated speech:

[Audio players omitted: Samples 1 to 3, each with a Reference clip and a Generated clip.]

Model Details


Model	KasbahTTS V0
Task	Text-to-Speech (Zero-Shot Voice Cloning)
Architecture	F5-TTS — DiT-based flow matching
Base Model	Habibi-TTS
Dialect	Algerian Dardja (الدارجة الجزائرية)
License	MIT

Features

Zero-shot voice cloning — Give it a few seconds of any voice, and it speaks Dardja in that voice
Native Algerian Dardja — Trained on real Algerian conversational speech, not textbook Arabic
F5-TTS backbone — State-of-the-art DiT architecture with flow matching for natural, high-fidelity synthesis
Open source — Fully open weights under MIT license, use it however you want

Quick Start

Installation

pip install habibi-tts

Note: On first run, the Vocos vocoder (~40 MB) is downloaded automatically from Hugging Face. After that, inference runs fully offline.

Python Inference

import torch
import soundfile as sf
from f5_tts.infer.utils_infer import load_model, load_vocoder, preprocess_ref_audio_text
from f5_tts.model import DiT
from habibi_tts.infer.utils_infer import infer_process

# Load vocoder
vocoder = load_vocoder(vocoder_name="vocos", is_local=False)

# Load KasbahTTS
model = load_model(
    DiT,
    dict(dim=1024, depth=22, heads=16, ff_mult=2, text_dim=512, conv_layers=4),
    ckpt_path="ALGERIA.safetensors",
    vocab_file="vocab.txt",
)

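# Prepare reference audio: a short clip of the target voice plus its transcript.
# The Arabic string below is a placeholder ("reference text here"); replace it
# with what is actually said in reference.wav.
ref_audio, ref_text = preprocess_ref_audio_text(
    "reference.wav",
    "النص المرجعي هنا"
)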

# Generate speech
audio, sr, _ = infer_process(
    ref_audio=ref_audio,
    ref_text=ref_text,
    gen_text="واش راك خويا، لاباس عليك؟",
    model_obj=model,
    vocoder=vocoder,
    speed=1.0,
)

sf.write("output.wav", audio, sr)

CLI Usage

habibi-tts_infer-cli \
  --model_cls DiT \
  --ckpt_file ALGERIA.safetensors \
  --vocab_file vocab.txt \
  --ref_audio reference.wav \
  --ref_text "النص المرجعي" \
  --gen_text "صباح الخير، واش راك اليوم؟"

Known Limitations

Arabic Only — No French Code-Switching

This version handles Arabic Dardja text only. Algerian Dardja naturally mixes Arabic and French in daily conversation, but KasbahTTS V0 does not support French words or Latin script. Mixing in French text will produce unpredictable results. Write your input in Arabic script only.

❌ واش راك؟ ça va bien → Unpredictable

✅ واش راك؟ لاباس عليك → Works great
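
Because mixed-script input produces unpredictable output, it can help to check the text before synthesis. The sketch below is not part of KasbahTTS or habibi-tts: `find_latin_words` is a hypothetical helper, and the character range it treats as Latin script (basic letters plus common accented forms) is an assumption.

```python
import re
from typing import List

# Assumption: basic Latin letters plus the accented Latin ranges used by French
# (ç, é, è, à, œ, ...). The multiplication and division signs are excluded.
_LATIN_RUN = re.compile(r"[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F]+")


def find_latin_words(text: str) -> List[str]:
    """Return every run of Latin-script letters found in text."""
    return _LATIN_RUN.findall(text)


def require_arabic_script(text: str) -> str:
    """Return text unchanged, or raise if it contains Latin-script words."""
    offending = find_latin_words(text)
    if offending:
        raise ValueError(f"Latin-script words found: {offending}")
    return text
```

Passing `gen_text` through `require_arabic_script` first turns a bad generation into an explicit error that names the offending words.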

No Number Handling

The model cannot process numerical digits. Numbers in the input text (like "5" or "2024") will not be spoken correctly. Write numbers out as words instead.

❌ عندي 3 خاوتي → Won't work

✅ عندي ثلاثة خاوتي → Works
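
A similar check can flag digits so they can be rewritten as words by hand. This is a minimal sketch under the assumption that both Western digits and Arabic-Indic digits are affected; `find_digit_runs` is a hypothetical helper, not something the model or habibi-tts provides.

```python
import re
from typing import List

# In Python str patterns, \d matches any Unicode decimal digit, so this catches
# Western digits (0-9) as well as Arabic-Indic digits (U+0660..U+0669).
_DIGIT_RUN = re.compile(r"\d+")


def find_digit_runs(text: str) -> List[str]:
    """Return every run of consecutive digits found in text."""
    return _DIGIT_RUN.findall(text)
```

The helper only detects digits. Converting them to Dardja number words is left to the caller, since the spoken forms vary by region.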

No Diacritics (تشكيل)

This version does not support Arabic diacritics (harakat). Input text should be plain, unvocalized Arabic. Diacritic support is planned for future versions.
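
If the source text is vocalized, one option is to strip the marks before synthesis. The sketch below assumes that "diacritics" means the code points U+064B to U+0652 (tanwin, short vowels, shadda, sukun) plus the superscript alef U+0670; `strip_diacritics` is a hypothetical helper, and whether stripped text sounds right for a given word is not something this card documents.

```python
import re

# Assumption: the marks to remove are U+064B..U+0652 and U+0670.
_HARAKAT = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Return text with Arabic short-vowel and related marks removed."""
    return _HARAKAT.sub("", text)
```

Base letters, spaces, and punctuation are left untouched, so plain input passes through unchanged.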

Repetition

The model may occasionally repeat words or phrases. To reduce this:

Try different nfe_step values (32 or 64)
Adjust cfg_strength (default: 2.0)
Break long text into shorter segments
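
The last suggestion, breaking long text into shorter segments, can be sketched as follows. This is an illustration only: `split_into_segments` and its `max_chars` default are hypothetical, and the choice of punctuation marks treated as boundaries is an assumption. Each segment would then be synthesized separately and the audio concatenated.

```python
import re
from typing import List

# Split on whitespace that follows Latin or Arabic sentence/clause punctuation.
_BOUNDARY = re.compile(r"(?<=[.!?؟،؛])\s+")


def split_into_segments(text: str, max_chars: int = 80) -> List[str]:
    """Greedily pack punctuation-delimited chunks into segments.

    A segment is closed as soon as adding the next chunk would exceed
    max_chars. A single chunk longer than max_chars is kept whole.
    """
    chunks = [c for c in _BOUNDARY.split(text.strip()) if c]
    segments: List[str] = []
    current = ""
    for chunk in chunks:
        candidate = f"{current} {chunk}".strip()
        if current and len(candidate) > max_chars:
            segments.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments
```

Splitting at punctuation rather than at a fixed character count keeps each segment a complete phrase.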

Dialect Scope

KasbahTTS is trained specifically on Algerian Dardja. Other Arabic dialects or MSA text may produce lower quality or unexpected output.

What's Next

KasbahTTS V0 is just the beginning. Planned improvements include:

French code-switching support (Dardja-French mixing)
Number and digit handling
Diacritics (تشكيل) support
Longer and more stable generation
Additional Algerian regional accents

Citation

If you use KasbahTTS in your research or projects, please cite the underlying Habibi-TTS work:

@article{habibi2025,
  title={Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis},
  author={...},
  journal={arXiv preprint arXiv:2601.13802},
  year={2026}
}

Built with ❤️ for Algeria by MenaVoice

من القصبة للعالم — From the Kasbah to the world


Paper: Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis (arXiv:2601.13802, published Jan 20)