Space: hetchyy/Quran-multi-aligner

hetchyy Claude Opus 4.6 committed on Feb 15

Commit

045ee7d

1 Parent(s): eacc253

Extract build_interface() from app.py into src/ui/interface.py

Move the ~3000-line Gradio UI function (CSS, JS animation system,
component layout, event wiring) into src/ui/interface.py, reducing
app.py to an ~85-line bootstrap file. Phase 4 of the app.py refactor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Files changed (3)

CLAUDE.md +5 -3
app.py +0 -0
src/ui/interface.py +0 -0

CLAUDE.md CHANGED

@@ -12,12 +12,14 @@ Quran recitation alignment tool that segments audio recordings and aligns them w
 ### Entry Point
-`app.py` (~3900 lines) — Gradio UI, state management, JavaScript animation system, and API endpoint `/process_audio_json`. Being refactored into smaller modules.
 ### Extracted Modules (`src/`)
 - **`src/ui/segments.py`** — Segment rendering helpers (HTML cards, confidence classes, timestamps, audio encoding). Extracted from app.py in Phase 1.
 - **`src/pipeline/process.py`** — GPU-decorated pipeline functions: VAD+ASR GPU leases, post-VAD alignment pipeline, `process_audio`, `resegment_audio`, `retranscribe_audio`, `save_json_export`. Extracted from app.py in Phase 2.
 ### Core Algorithm (`segment_core/`)
@@ -26,7 +28,7 @@ Quran recitation alignment tool that segments audio recordings and aligns them w
 - **`phoneme_anchor.py`** — N-gram rarity-weighted voting to determine which chapter/verse a segment belongs to. Replaces earlier Whisper-based text matching.
 - **`phoneme_matcher.py`** — Substring Levenshtein DP alignment between ASR phonemes and reference Quran phonemes. Uses windowed alignment with lookback/lookahead.
 - **`_dp_core.pyx`** — Cython-accelerated DP inner loop (10-20x speedup). Falls back to pure Python if not compiled.
-- **`special_segments.py`** — Detects Basmala and Isti'adha via phoneme edit distance (threshold 0.35).
 - **`phoneme_matcher_cache.py`** — Pre-loads and caches phonemized chapter references from `data/phoneme_cache.pkl`.
 - **`ngram_index.py`** — N-gram index data structure used by anchor voting, loaded from `data/phoneme_ngram_index_5.pkl`.
 - **`zero_gpu.py`** — `@gpu_with_fallback` decorator for ZeroGPU quota handling with automatic CPU fallback.
@@ -50,7 +52,7 @@ Quran recitation alignment tool that segments audio recordings and aligns them w
 |-------|----|---------|
 | VAD | `obadx/recitation-segmenter-v2` | Voice activity detection |
 | ASR Base | `hetchyy/r15_95m` | Phoneme recognition (95M params) |
-| ASR Large | `hetchyy/r7` | Phoneme recognition (higher accuracy, 3x slower) |
 | MFA | External Space `hetchyy-quran-phoneme-mfa` | Word-level forced alignment |
 ### Key Patterns

 ### Entry Point
+`app.py` (~85 lines) — Bootstrap entry point: path setup, Cython build, imports `build_interface()` from `src/ui/interface.py`, and `__main__` block with model preloading.
 ### Extracted Modules (`src/`)
+- **`src/ui/interface.py`** — `build_interface()`: full Gradio layout (~3000 lines of CSS, JS animation system, component definitions, and event wiring). Extracted from app.py in Phase 4.
 - **`src/ui/segments.py`** — Segment rendering helpers (HTML cards, confidence classes, timestamps, audio encoding). Extracted from app.py in Phase 1.
 - **`src/pipeline/process.py`** — GPU-decorated pipeline functions: VAD+ASR GPU leases, post-VAD alignment pipeline, `process_audio`, `resegment_audio`, `retranscribe_audio`, `save_json_export`. Extracted from app.py in Phase 2.
+- **`src/mfa/timestamps.py`** — MFA forced-alignment integration: upload/submit to external MFA Space, SSE result polling, progress bar HTML, and `compute_mfa_timestamps` generator that injects word/letter timestamps into segment HTML. Extracted from app.py in Phase 3.
 ### Core Algorithm (`segment_core/`)
 - **`phoneme_anchor.py`** — N-gram rarity-weighted voting to determine which chapter/verse a segment belongs to. Replaces earlier Whisper-based text matching.
 - **`phoneme_matcher.py`** — Substring Levenshtein DP alignment between ASR phonemes and reference Quran phonemes. Uses windowed alignment with lookback/lookahead.
 - **`_dp_core.pyx`** — Cython-accelerated DP inner loop (10-20x speedup). Falls back to pure Python if not compiled.
+- **`special_segments.py`** — Detects Basmala and Isti'adha via phoneme edit distance.
 - **`phoneme_matcher_cache.py`** — Pre-loads and caches phonemized chapter references from `data/phoneme_cache.pkl`.
 - **`ngram_index.py`** — N-gram index data structure used by anchor voting, loaded from `data/phoneme_ngram_index_5.pkl`.
 - **`zero_gpu.py`** — `@gpu_with_fallback` decorator for ZeroGPU quota handling with automatic CPU fallback.
 |-------|----|---------|
 | VAD | `obadx/recitation-segmenter-v2` | Voice activity detection |
 | ASR Base | `hetchyy/r15_95m` | Phoneme recognition (95M params) |
+| ASR Large | `hetchyy/r7` | Phoneme recognition (higher accuracy, slower) |
 | MFA | External Space `hetchyy-quran-phoneme-mfa` | Word-level forced alignment |
 ### Key Patterns
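
Note on the `phoneme_matcher.py` entry above: "substring Levenshtein DP alignment" usually means the ASR phoneme sequence may start and end anywhere inside the reference, so only edits within the matched span are charged. The following is a minimal pure-Python sketch of that general technique, not the code in this repository. The function name `substring_edit_distance` and the phoneme tokens are hypothetical, and the windowing with lookback/lookahead mentioned in CLAUDE.md is omitted.

```python
def substring_edit_distance(pattern, text):
    """Best match of `pattern` against any contiguous span of `text`.

    Returns (distance, start, end) such that text[start:end] is a span
    with minimal Levenshtein distance to `pattern`.
    Hypothetical helper for illustration only.
    """
    n = len(text)
    # Row 0: matching an empty pattern costs nothing at any text position,
    # which is what lets the match start anywhere in the text.
    prev_cost = [0] * (n + 1)
    prev_start = list(range(n + 1))

    for i in range(1, len(pattern) + 1):
        # Column 0: all i pattern symbols are unmatched.
        cur_cost = [i]
        cur_start = [0]
        for j in range(1, n + 1):
            substitute = prev_cost[j - 1] + (pattern[i - 1] != text[j - 1])
            skip_pattern = prev_cost[j] + 1      # pattern symbol has no partner
            skip_text = cur_cost[j - 1] + 1      # extra symbol inside the span
            best = min(substitute, skip_pattern, skip_text)
            # Carry forward the start index of whichever path was chosen.
            if best == substitute:
                start = prev_start[j - 1]
            elif best == skip_pattern:
                start = prev_start[j]
            else:
                start = cur_start[j - 1]
            cur_cost.append(best)
            cur_start.append(start)
        prev_cost, prev_start = cur_cost, cur_start

    # The match may also end anywhere: take the cheapest cell of the last row.
    end = min(range(n + 1), key=prev_cost.__getitem__)
    return prev_cost[end], prev_start[end], end


# Illustrative tokens only; real phoneme inventories will differ.
reference = ["a", "b", "c", "d", "e"]
observed = ["b", "c", "d"]
distance, start, end = substring_edit_distance(observed, reference)
print(distance, reference[start:end])
```

Dividing the returned distance by `len(pattern)` gives a length-normalized score, which is one common way to compare such a match against a fixed threshold.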

app.py CHANGED

The diff for this file is too large to render. See raw diff

src/ui/interface.py ADDED

The diff for this file is too large to render. See raw diff