TajweedSST: Quranic Letter-Level Alignment & Tajweed Physics Engine
CTC Forced Alignment + Acoustic Physics Validation for Quranic Recitation
Overview
TajweedSST is a Python pipeline that produces letter-level timing data for Quranic recitation audio. It combines wav2vec2 CTC forced alignment with acoustic physics validation (Tajweed rules) to generate timing files consumed by MahQuranApp for real-time letter highlighting.
Pipeline Architecture
┌─────────────────────────────────────────────────────────────┐
│                     TajweedSST Pipeline                     │
│                                                             │
│  1. CTC Forced Alignment (wav2vec2)                         │
│     └─ Word-level timestamps from audio                     │
│                                                             │
│  2. Character Expansion                                     │
│     └─ Word timestamps → individual character timing        │
│                                                             │
│  3. Grapheme Matching                                       │
│     └─ Merge base + diacritics to match App.tsx rendering   │
│                                                             │
│  4. Tajweed Parsing                                         │
│     └─ Map letters to Tajweed rules (Qalqalah, Ghunnah...)  │
│                                                             │
│  5. Physics Validation                                      │
│     └─ RMS bounce, duration, formant analysis               │
│                                                             │
│  6. Export to MahQuranApp format                            │
│     └─ JSON with idx, char, ayah, start(ms), end, wordIdx   │
└─────────────────────────────────────────────────────────────┘
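Step 2 (Character Expansion) can be illustrated with a minimal sketch. It assumes a word's time span is shared across its characters in proportion to per-character weights (uniform by default); the function name `expand_word` and this splitting rule are hypothetical and are not taken from the TajweedSST source.

```python
def expand_word(word, start_ms, end_ms, weights=None):
    """Split one word's [start_ms, end_ms] span across its characters.

    Assumption: time is shared in proportion to per-character weights
    (uniform if none are given). Illustrative only, not TajweedSST's code.
    """
    chars = list(word)  # one entry per code point; diacritics stay separate
    if not chars:
        return []
    if weights is None:
        weights = [1.0] * len(chars)
    if len(weights) != len(chars):
        raise ValueError("need exactly one weight per character")

    total = sum(weights)
    span = end_ms - start_ms
    entries = []
    cursor = 0.0  # accumulated weight consumed so far
    for ch, w in zip(chars, weights):
        s = start_ms + round(span * cursor / total)
        cursor += w
        e = start_ms + round(span * cursor / total)
        entries.append({"char": ch, "start": s, "end": e, "duration": e - s})
    return entries
```

Because each boundary is computed from the accumulated weight, adjacent characters share a boundary and the last character ends at the word's end time. Base letters and diacritics are still separate entries at this point; merging them is the job of step 3.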
Quick Start
Prerequisites
cd /path/to/tajweedsst
python3 -m venv venv
source venv/bin/activate
pip install torch torchaudio ctc-forced-aligner librosa
Single Surah
# Align Surah 90 (Al-Balad) for Abdul Basit
python ctc_align_90.py
# Template script with physics validation
python ctc_align_91.py
Batch All Surahs
# Process all 114 surahs for Abdul Basit
python batch_align_all.py
Output Format
Each letter_timing_XX.json contains an array of timing entries. A single entry looks like this:
{
"idx": 0,
"char": "ΩΩ",
"ayah": 1,
"start": 3360,
"end": 3410,
"duration": 50,
"wordIdx": 0,
"weight": 1.0
}
Fields
| Field | Type | Description |
|---|---|---|
| idx | int | Sequential letter index |
| char | string | Arabic grapheme (base + diacritics) |
| ayah | int | Verse number (1-indexed) |
| start | int | Start time in milliseconds |
| end | int | End time in milliseconds |
| duration | int | Duration in milliseconds |
| wordIdx | int | Word index within the surah |
| weight | float | Confidence weight |
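The field definitions suggest a few consistency checks that can be run on a timing file before shipping it to the app. The sketch below is a hypothetical helper, not part of TajweedSST. It assumes that `idx` starts at 0 and increases by one, that `duration` equals `end - start`, and that entries are sorted by `start` without overlapping. It also shows how a consumer might find the letter active at a given playback time.

```python
from bisect import bisect_right


def check_entries(entries):
    """Return a list of human-readable problems found in timing entries."""
    problems = []
    prev_start = None
    for pos, e in enumerate(entries):
        if e["idx"] != pos:
            problems.append(f"entry {pos}: idx is {e['idx']}, expected {pos}")
        if e["end"] < e["start"]:
            problems.append(f"entry {pos}: end precedes start")
        if e["duration"] != e["end"] - e["start"]:
            problems.append(f"entry {pos}: duration != end - start")
        if prev_start is not None and e["start"] < prev_start:
            problems.append(f"entry {pos}: start goes backwards")
        prev_start = e["start"]
    return problems


def active_index(entries, t_ms):
    """Index of the entry whose [start, end) interval contains t_ms, else None."""
    starts = [e["start"] for e in entries]
    i = bisect_right(starts, t_ms) - 1  # last entry starting at or before t_ms
    if i >= 0 and t_ms < entries[i]["end"]:
        return i
    return None
```

A file loaded with `json.load` can be passed straight to `check_entries`; an empty list means none of the assumed invariants were violated.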
Critical: Grapheme Matching
The timing data must match the grapheme count produced by MahQuranApp's splitIntoGraphemes() function. This function combines base Arabic letters with their following diacritics:
App.tsx Diacritics Set:
Ω Ω Ω Ω Ω Ω Ω Ω Ω° Ϋ Ϋ Ϋ Ϋ Ϋ Ϋ Ϋ Ω Ω Ω
Plus Unicode ranges: 0x064B–0x0652 and 0x0610–0x061A
Example: The word ΩΩΨ’ splits into 2 graphemes: ['ΩΩ', 'Ψ’']
If the timing count doesn't match the grapheme count, highlighting will drift!
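To check counts on the Python side, the grouping rule can be approximated as follows. This is a minimal sketch, not a port of the actual `splitIntoGraphemes()` code: it treats only the two Unicode ranges named above as diacritics, so any extra marks in the App.tsx set would have to be added to `is_diacritic` for the counts to agree. The Python function names are hypothetical.

```python
def is_diacritic(ch):
    """True for code points in the two ranges named in this README."""
    cp = ord(ch)
    return 0x064B <= cp <= 0x0652 or 0x0610 <= cp <= 0x061A


def split_into_graphemes(word):
    """Group each base letter with the diacritics that follow it.

    Assumption: `word` is a single word without whitespace. A diacritic
    with no preceding base letter becomes its own grapheme.
    """
    graphemes = []
    for ch in word:
        if is_diacritic(ch) and graphemes:
            graphemes[-1] += ch  # attach the mark to the preceding base letter
        else:
            graphemes.append(ch)
    return graphemes
```

For an illustrative input such as `"\u0644\u064E\u0622"` (lam, fatha, alef with madda), the function returns two graphemes: the lam with its fatha, then the alef with madda. Comparing `len(split_into_graphemes(word))` against the number of timing entries for that word is a quick way to catch drift before export.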
Physics Validation
TajweedSST validates timing against acoustic physics:
| Rule | Check | Method |
|---|---|---|
| Qalqalah | RMS dip + spike | Envelope analysis |
| Ghunnah | Nasal duration | Duration measurement |
| Madd | Extended vowel | Duration ratio |
| Tafkheem | Heavy articulation | Formant F2 analysis |
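As an illustration of the Qalqalah row, the sketch below computes a frame-wise RMS envelope and looks for a dip followed by a rebound. It is a simplified stand-in, not the logic in physics_validator.py: the function names and both thresholds are hypothetical and uncalibrated, and the input is assumed to be mono float samples covering a single letter's segment. In practice the envelope could also be computed with `librosa.feature.rms`.

```python
import math


def rms_envelope(samples, frame_len):
    """RMS energy of consecutive non-overlapping frames."""
    env = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        env.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return env


def has_dip_then_spike(env, dip_ratio=0.5, rebound_ratio=0.6):
    """Detect an energy dip followed by a rebound in an RMS envelope.

    Assumed criteria (illustrative thresholds): the quietest interior
    frame falls to at most dip_ratio of the peak before it, and a later
    frame rises to at least rebound_ratio of that same peak.
    """
    if len(env) < 3:
        return False
    # Quietest frame, excluding the first and last frames.
    k = min(range(1, len(env) - 1), key=lambda i: env[i])
    before = max(env[:k])
    after = max(env[k + 1:])
    if before == 0:
        return False
    return env[k] <= dip_ratio * before and after >= rebound_ratio * before
```

Keeping `rebound_ratio` above `dip_ratio` ensures the rebound is actually louder than the dip; both values would need tuning per reciter and recording.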
Project Structure
tajweedsst/
├── src/
│   ├── tajweed_parser.py      # Tajweed rule detection
│   ├── physics_validator.py   # Acoustic validation
│   └── duration_model.py      # Duration calibration
├── tests/                     # 34 unit/integration tests
├── ctc_align_90.py            # Single surah alignment
├── ctc_align_91.py            # Template with physics
├── batch_align_all.py         # Batch all surahs
└── README.md
Reciter Support
Currently supported:
- Abdul Basit (114 surahs)
License
MIT