Whisper Tunisian Dialect ASR (TuniSpeech‑21h)

Overview

Whisper Tunisian Dialect ASR is a fine-tuned speech recognition model based on OpenAI Whisper-large-v2, adapted to the Tunisian Arabic dialect using the TuniSpeech corpus (~21 hours).

The model is designed for:

  • Automatic Speech Recognition (ASR) in Tunisian dialect
  • Transcription of short spontaneous speech
  • Research on low-resource Arabic dialects

Repository:
https://huggingface.co/TuniSpeech-AI/whisper-tunisian-dialect


Model Architecture

  • Base model: Whisper-large-v2
  • Fine-tuning strategy: LoRA → merged into full model weights
  • Framework: Hugging Face Transformers
  • Decoding:
    • Forced language: Arabic (ar)
    • Task: transcription (not translation)

After training, LoRA adapters were fully merged, producing a standalone Whisper checkpoint compatible with standard inference pipelines.
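
To make the merge step concrete, the sketch below shows the arithmetic usually meant by merging LoRA adapters: the low-rank update is scaled and added to the frozen weight, W' = W + (alpha / rank) * B * A. It uses plain Python lists and toy values. The function names, shapes, and alpha/rank values are illustrative assumptions, not details of this model's training run.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a).

    Shapes: weight (out, in), lora_b (out, rank), lora_a (rank, in).
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example (hypothetical values): a 2x2 layer with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # (rank, in)
B = [[1.0], [3.0]]  # (out, rank)
merged = merge_lora(W, A, B, alpha=2, rank=1)

# The merged layer gives the same output as base + scaled adapter applied separately.
x = [[1.0], [1.0]]
base_plus_adapter = [
    [b[0] + 2.0 * a[0]] for b, a in zip(matmul(W, x), matmul(B, matmul(A, x)))
]
assert matmul(merged, x) == base_plus_adapter
```

Because the update is folded into the weights, the merged checkpoint needs no adapter code at inference time. With the Hugging Face PEFT library this operation is typically performed by `merge_and_unload()`; the source does not state which tooling was used.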


Training Data

TuniSpeech Corpus

  • Language: Tunisian Arabic dialect
  • Duration: ~21 hours
  • Recording conditions:
    • Real speech
    • Spontaneous pronunciation
    • Speaker variability

The dataset targets dialectal phonetics and vocabulary, which are poorly covered by standard Arabic ASR systems.


Experimental Setup

  • Input audio:
    • Resampled to 16 kHz mono
  • Training method:
    • Parameter-efficient fine-tuning (LoRA)
    • Followed by weight merging
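
As a small illustration of the 16 kHz mono input format, the sketch below inspects a WAV file with Python's standard `wave` module and reports whether it already matches that format. The helper names are hypothetical, and the check only covers uncompressed PCM WAV files.

```python
import wave

TARGET_SAMPLE_RATE = 16_000  # Hz, as stated in the setup above
TARGET_CHANNELS = 1          # mono


def describe_wav(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / sample_rate
    return sample_rate, channels, duration


def matches_model_input(path):
    """True if the file is already 16 kHz mono and needs no resampling."""
    sample_rate, channels, _ = describe_wav(path)
    return sample_rate == TARGET_SAMPLE_RATE and channels == TARGET_CHANNELS
```

If the check fails, one option is `librosa.load(path, sr=16000, mono=True)`, which resamples and downmixes while loading.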

Evaluation focused on:

  • Dialectal word recognition
  • Short utterance transcription
  • Real-world speech robustness
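
The source does not say which metric was used for these evaluations. Word error rate (WER) is the usual choice for ASR, so the sketch below shows how it could be computed for a single utterance, assuming whitespace tokenization and no text normalization. The function name and the example strings are illustrative only.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edit distance between the processed ref prefix and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Placeholder tokens: one substitution and one deletion over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

For dialectal Arabic, results depend heavily on how spelling variants and diacritics are normalized before scoring, so any normalization step should be reported alongside the numbers.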

Usage

Installation

pip install torch transformers librosa gradio

Python Inference

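from transformers import pipeline

# Load the merged checkpoint from the Hugging Face Hub
pipe = pipeline(
    "automatic-speech-recognition",
    model="TuniSpeech-AI/whisper-tunisian-dialect"
)

# Transcribe a local audio file and print the recognized text
result = pipe("audio.wav")
print(result["text"])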
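
The decoding settings listed under Model Architecture (Arabic, transcription) can also be passed explicitly at call time. The following is a minimal sketch, assuming a recent version of Transformers in which Whisper generation accepts `language` and `task` through `generate_kwargs`. The `transcribe` helper and the `GENERATE_KWARGS` name are hypothetical.

```python
# Assumption: these mirror the "Decoding" settings listed under Model Architecture.
GENERATE_KWARGS = {"language": "arabic", "task": "transcribe"}


def transcribe(audio_path, model_id="TuniSpeech-AI/whisper-tunisian-dialect"):
    """Transcribe one audio file with language and task set explicitly."""
    # Imported lazily so this module loads even where transformers is not installed.
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model=model_id)
    result = pipe(audio_path, generate_kwargs=GENERATE_KWARGS)
    return result["text"]
```

Calling `transcribe("audio.wav")` downloads the checkpoint on first use. Setting the language explicitly avoids relying on Whisper's automatic language detection, which may be unreliable on dialectal speech.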

Hugging Face Space

An interactive demo is available as a Hugging Face Space: upload or record Tunisian speech and get a transcription.

⚠️ On CPU, processing is slow.
Recommended:

  • Use short audio (< 5 s)
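
One way to follow this recommendation with longer recordings is to cut them into short segments before transcription. The sketch below only computes segment boundaries in samples. The helper name and the fixed-length strategy are assumptions, and cutting at fixed offsets can split words, so splitting on silence may work better in practice.

```python
def split_into_chunks(num_samples, sample_rate=16_000, max_seconds=5.0):
    """Return (start, end) sample indices covering the audio in short chunks."""
    if num_samples < 0 or sample_rate <= 0 or max_seconds <= 0:
        raise ValueError("num_samples must be >= 0; sample_rate and max_seconds must be > 0")
    # Guard against a zero step when sample_rate * max_seconds rounds down to 0.
    chunk_size = max(1, int(sample_rate * max_seconds))
    return [
        (start, min(start + chunk_size, num_samples))
        for start in range(0, num_samples, chunk_size)
    ]


# A 12-second recording at 16 kHz yields two 5-second chunks and one 2-second chunk.
chunks = split_into_chunks(16_000 * 12)
```

Each `(start, end)` pair can be used to slice the waveform array, and the slices can then be transcribed one at a time.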

Citation

If you use this model, please cite the associated work on Whisper fine-tuning for Tunisian dialect speech recognition using the TuniSpeech-21H corpus:

@inproceedings{tunispeech_whisper_2026,
  title     = {A New Tunisian Arabic Corpus and Benchmark for Automatic Speech Recognition},
  author    = {Sghaier, Mohamed Ali and Bellagha, Mohamed Lazhar and Zrigui, Mounir},
  booktitle = {Proceedings of the 18th International Conference on Agents and Artificial Intelligence},
  year      = {2026}
}