hetchyy Claude Opus 4.6 committed on
Commit 045ee7d · 1 parent: eacc253

Extract build_interface() from app.py into src/ui/interface.py

Move the ~3000-line Gradio UI function (CSS, JS animation system,
component layout, event wiring) into src/ui/interface.py, reducing
app.py to an ~85-line bootstrap file. Phase 4 of the app.py refactor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Files changed (3)
  1. CLAUDE.md +5 -3
  2. app.py +0 -0
  3. src/ui/interface.py +0 -0
CLAUDE.md CHANGED
@@ -12,12 +12,14 @@ Quran recitation alignment tool that segments audio recordings and aligns them w

  ### Entry Point

- `app.py` (~3900 lines) - Gradio UI, state management, JavaScript animation system, and API endpoint `/process_audio_json`. Being refactored into smaller modules.
+ `app.py` (~85 lines) - Bootstrap entry point: path setup, Cython build, imports `build_interface()` from `src/ui/interface.py`, and `__main__` block with model preloading.

  ### Extracted Modules (`src/`)

+ - **`src/ui/interface.py`** - `build_interface()`: full Gradio layout (~3000 lines of CSS, JS animation system, component definitions, and event wiring). Extracted from app.py in Phase 4.
  - **`src/ui/segments.py`** - Segment rendering helpers (HTML cards, confidence classes, timestamps, audio encoding). Extracted from app.py in Phase 1.
  - **`src/pipeline/process.py`** - GPU-decorated pipeline functions: VAD+ASR GPU leases, post-VAD alignment pipeline, `process_audio`, `resegment_audio`, `retranscribe_audio`, `save_json_export`. Extracted from app.py in Phase 2.
+ - **`src/mfa/timestamps.py`** - MFA forced-alignment integration: upload/submit to external MFA Space, SSE result polling, progress bar HTML, and `compute_mfa_timestamps` generator that injects word/letter timestamps into segment HTML. Extracted from app.py in Phase 3.

  ### Core Algorithm (`segment_core/`)

@@ -26,7 +28,7 @@ Quran recitation alignment tool that segments audio recordings and aligns them w
  - **`phoneme_anchor.py`** - N-gram rarity-weighted voting to determine which chapter/verse a segment belongs to. Replaces earlier Whisper-based text matching.
  - **`phoneme_matcher.py`** - Substring Levenshtein DP alignment between ASR phonemes and reference Quran phonemes. Uses windowed alignment with lookback/lookahead.
  - **`_dp_core.pyx`** - Cython-accelerated DP inner loop (10-20x speedup). Falls back to pure Python if not compiled.
- - **`special_segments.py`** - Detects Basmala and Isti'adha via phoneme edit distance (threshold 0.35).
+ - **`special_segments.py`** - Detects Basmala and Isti'adha via phoneme edit distance.
  - **`phoneme_matcher_cache.py`** - Pre-loads and caches phonemized chapter references from `data/phoneme_cache.pkl`.
  - **`ngram_index.py`** - N-gram index data structure used by anchor voting, loaded from `data/phoneme_ngram_index_5.pkl`.
  - **`zero_gpu.py`** - `@gpu_with_fallback` decorator for ZeroGPU quota handling with automatic CPU fallback.
@@ -50,7 +52,7 @@ Quran recitation alignment tool that segments audio recordings and aligns them w
  |-------|----|---------|
  | VAD | `obadx/recitation-segmenter-v2` | Voice activity detection |
  | ASR Base | `hetchyy/r15_95m` | Phoneme recognition (95M params) |
- | ASR Large | `hetchyy/r7` | Phoneme recognition (higher accuracy, 3x slower) |
+ | ASR Large | `hetchyy/r7` | Phoneme recognition (higher accuracy, slower) |
  | MFA | External Space `hetchyy-quran-phoneme-mfa` | Word-level forced alignment |

  ### Key Patterns
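
Reviewer note: the CLAUDE.md entry for `phoneme_matcher.py` mentions "substring Levenshtein DP alignment". For readers unfamiliar with the term, here is a minimal pure-Python sketch of that general technique. It is not the repository's implementation: the function name `substring_edit_distance`, the phoneme tokens, and the return shape are all assumptions made for illustration, and the windowing with lookback/lookahead is left out.

```python
from typing import Sequence, Tuple


def substring_edit_distance(
    query: Sequence[str], reference: Sequence[str]
) -> Tuple[int, int]:
    """Return (distance, end) for the best match of `query` inside `reference`.

    `distance` is the minimum Levenshtein distance between `query` and any
    contiguous slice of `reference`; `end` is the exclusive end index of the
    first slice that achieves it. Hypothetical helper, for illustration only.
    """
    # Row 0 is all zeros: an empty query matches an empty slice ending
    # anywhere in the reference, so the match may start at any position.
    prev = [0] * (len(reference) + 1)
    for i, q in enumerate(query, start=1):
        # Column 0: query[:i] against an empty slice costs i deletions.
        curr = [i] + [0] * len(reference)
        for j, r in enumerate(reference, start=1):
            curr[j] = min(
                prev[j - 1] + (q != r),  # match or substitution
                prev[j] + 1,             # drop a query phoneme
                curr[j - 1] + 1,         # skip a reference phoneme
            )
        prev = curr
    # The match may also end anywhere, so take the minimum of the last row.
    best_end = min(range(len(prev)), key=prev.__getitem__)
    return prev[best_end], best_end
```

The only difference from ordinary Levenshtein distance is the free start (zero-filled first row) and free end (minimum over the last row). Dividing the returned distance by `len(query)` gives a normalized score that could be compared against a threshold, which is one plausible reading of how `special_segments.py` uses phoneme edit distance.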
app.py CHANGED
The diff for this file is too large to render. See raw diff
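
Reviewer note: since the `app.py` diff is not rendered, the following is only a guess at the shape of the bootstrap that the CLAUDE.md entry describes (path setup, then importing `build_interface()` from `src/ui/interface.py`). The helper names `ensure_on_path` and `load_build_interface` are hypothetical, and the Cython build and model preloading steps are omitted because their details are not shown on this page.

```python
import importlib
import sys
from pathlib import Path
from typing import Callable


def ensure_on_path(root: Path) -> None:
    """Put the project root on sys.path so `src.*` imports resolve.

    Hypothetical helper; calling it repeatedly adds the entry only once.
    """
    root_str = str(root.resolve())
    if root_str not in sys.path:
        sys.path.insert(0, root_str)


def load_build_interface(module_name: str = "src.ui.interface") -> Callable:
    """Import the UI module by name and return its `build_interface` callable."""
    module = importlib.import_module(module_name)
    return getattr(module, "build_interface")


def main() -> None:
    """Sketch of the entry point: path setup, build the UI, launch it."""
    ensure_on_path(Path(__file__).parent)
    # Assumption: build_interface() returns a Gradio Blocks app.
    demo = load_build_interface()()
    demo.launch()
```

In the real file, `main()` (or its equivalent) would presumably be called from the `__main__` block mentioned in CLAUDE.md. Keeping the import behind a function rather than at module top level is a design choice of this sketch, not something the commit confirms.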
 
src/ui/interface.py ADDED
The diff for this file is too large to render. See raw diff