Extract build_interface() from app.py into src/ui/interface.py
Move the ~3000-line Gradio UI function (CSS, JS animation system,
component layout, event wiring) into src/ui/interface.py, reducing
app.py to an ~85-line bootstrap file. Phase 4 of the app.py refactor.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- CLAUDE.md +5 -3
- app.py +0 -0
- src/ui/interface.py +0 -0
CLAUDE.md
CHANGED
@@ -12,12 +12,14 @@ Quran recitation alignment tool that segments audio recordings and aligns them w

 ### Entry Point

-`app.py` (~
+`app.py` (~85 lines) - Bootstrap entry point: path setup, Cython build, imports `build_interface()` from `src/ui/interface.py`, and `__main__` block with model preloading.

 ### Extracted Modules (`src/`)

+- **`src/ui/interface.py`** - `build_interface()`: full Gradio layout (~3000 lines of CSS, JS animation system, component definitions, and event wiring). Extracted from app.py in Phase 4.
 - **`src/ui/segments.py`** - Segment rendering helpers (HTML cards, confidence classes, timestamps, audio encoding). Extracted from app.py in Phase 1.
 - **`src/pipeline/process.py`** - GPU-decorated pipeline functions: VAD+ASR GPU leases, post-VAD alignment pipeline, `process_audio`, `resegment_audio`, `retranscribe_audio`, `save_json_export`. Extracted from app.py in Phase 2.
+- **`src/mfa/timestamps.py`** - MFA forced-alignment integration: upload/submit to external MFA Space, SSE result polling, progress bar HTML, and `compute_mfa_timestamps` generator that injects word/letter timestamps into segment HTML. Extracted from app.py in Phase 3.

 ### Core Algorithm (`segment_core/`)

@@ -26,7 +28,7 @@ Quran recitation alignment tool that segments audio recordings and aligns them w
 - **`phoneme_anchor.py`** - N-gram rarity-weighted voting to determine which chapter/verse a segment belongs to. Replaces earlier Whisper-based text matching.
 - **`phoneme_matcher.py`** - Substring Levenshtein DP alignment between ASR phonemes and reference Quran phonemes. Uses windowed alignment with lookback/lookahead.
 - **`_dp_core.pyx`** - Cython-accelerated DP inner loop (10-20x speedup). Falls back to pure Python if not compiled.
-- **`special_segments.py`** - Detects Basmala and Isti'adha via phoneme edit distance
+- **`special_segments.py`** - Detects Basmala and Isti'adha via phoneme edit distance.
 - **`phoneme_matcher_cache.py`** - Pre-loads and caches phonemized chapter references from `data/phoneme_cache.pkl`.
 - **`ngram_index.py`** - N-gram index data structure used by anchor voting, loaded from `data/phoneme_ngram_index_5.pkl`.
 - **`zero_gpu.py`** - `@gpu_with_fallback` decorator for ZeroGPU quota handling with automatic CPU fallback.
@@ -50,7 +52,7 @@ Quran recitation alignment tool that segments audio recordings and aligns them w
 |-------|----|---------|
 | VAD | `obadx/recitation-segmenter-v2` | Voice activity detection |
 | ASR Base | `hetchyy/r15_95m` | Phoneme recognition (95M params) |
-| ASR Large | `hetchyy/r7` | Phoneme recognition (higher accuracy,
+| ASR Large | `hetchyy/r7` | Phoneme recognition (higher accuracy, slower) |
 | MFA | External Space `hetchyy-quran-phoneme-mfa` | Word-level forced alignment |

 ### Key Patterns
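For readers unfamiliar with the "substring Levenshtein DP alignment" that CLAUDE.md attributes to `phoneme_matcher.py`, the following is a minimal sketch of the general technique only, not the project's implementation. The function name `substring_edit_distance` is hypothetical, and the windowing with lookback/lookahead mentioned in CLAUDE.md is omitted. The idea is that the query may match any substring of the reference, so the first DP row is all zeros (free start) and the result is the minimum of the last row (free end).

```python
from typing import Sequence, Tuple


def substring_edit_distance(
    query: Sequence[str], reference: Sequence[str]
) -> Tuple[int, int]:
    """Return (distance, end) for the best match of `query` inside `reference`.

    `end` is the exclusive index in `reference` where the best match ends.
    Hypothetical helper for illustration; not taken from the repository.
    """
    # Row 0 is all zeros: the match may start anywhere in the reference for free.
    prev = [0] * (len(reference) + 1)
    for i, q in enumerate(query, start=1):
        # Column 0: i query symbols consumed before any reference symbol.
        curr = [i] + [0] * len(reference)
        for j, r in enumerate(reference, start=1):
            cost = 0 if q == r else 1
            curr[j] = min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # query symbol with no counterpart
                curr[j - 1] + 1,     # extra reference symbol inside the match
            )
        prev = curr
    # The match may also end anywhere, so take the minimum over the last row.
    end = min(range(len(prev)), key=prev.__getitem__)
    return prev[end], end


# Usage: the query appears verbatim inside the reference, ending at index 4.
print(substring_edit_distance(list("bcd"), list("abcde")))  # (0, 4)
```

Keeping only two rows bounds memory at the reference length, at the cost of not recovering the start index of the match; a full table or backpointers would be needed for that.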
app.py
CHANGED
The diff for this file is too large to render.
src/ui/interface.py
ADDED
The diff for this file is too large to render.