Commit 5bc4ed8 by hetchyy · 1 Parent(s): 23c18d5

update README.md

Files changed (1)
  1. README.md +2 -38
README.md CHANGED
@@ -35,31 +35,6 @@ Automatic forced alignment for Quran recitations. Upload an audio recording of a
  | [hetchyy/r7](https://huggingface.co/hetchyy/r7) | Phoneme ASR (Large — higher accuracy) |
  | [hetchyy/Quran-phoneme-mfa](https://huggingface.co/spaces/hetchyy/Quran-phoneme-mfa) | MFA forced alignment (external Space) |

- ## Running locally
-
- ```bash
- # Install dependencies
- pip install -r requirements.txt
-
- # Start the app (port 7860)
- python app.py
-
- # Dev mode — skip model preloading for fast startup
- python app.py --dev
-
- # With a public sharing link
- python app.py --share
- ```
-
- ### Optional: Cython acceleration
-
- The DP alignment inner loop has a Cython extension that provides 10-20x speedup. It is automatically compiled on startup, but if that fails (missing C compiler), the app falls back to pure Python.
-
- ```bash
- # Manual build
- python setup.py build_ext --inplace
- ```
-
  ## How it works

  ### Alignment algorithm
@@ -75,7 +50,7 @@ The core alignment uses **substring Levenshtein DP** with word-boundary constrai
  ### Retry and recovery

  When alignment fails for a segment:
- - **Tier 1:** Expanded search window (60 lookback, 40 lookahead)
+ - **Tier 1:** Expanded search window
  - **Tier 2:** Expanded window + relaxed confidence threshold (0.45)
  - **Re-anchoring:** After 2 consecutive failures, n-gram voting re-localizes position within the surah

@@ -83,15 +58,4 @@ When alignment fails for a segment:

  Two playback modes with real-time word highlighting:
  - **Per-segment** — Animate a single aligned segment with word/character-level karaoke
- - **Mega card** — Unified text flow across all segments with click-to-seek and configurable opacity windowing (Reveal, Fade, Spotlight, Isolate, Consume modes)
-
- ## Key dependencies
-
- - **[quranic-phonemizer](https://pypi.org/project/quranic-phonemizer/)** — Quran-specific grapheme-to-phoneme conversion with tajweed rules
- - **[recitations-segmenter](https://pypi.org/project/recitations-segmenter/)** — VAD model for Quran recitation audio
- - **torch 2.8** / **transformers 5.0** — Model inference
- - **Gradio ≥ 6.5.1** — Web UI framework
-
- ## License
-
- MIT
+ - **Mega card** — Unified text flow across all segments with click-to-seek and configurable opacity windowing (Reveal, Fade, Spotlight, Isolate, Consume modes)
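
Reviewer note on the "Retry and recovery" hunk: the README now names the tiers without giving window sizes, so a small sketch may help readers picture the control flow. The code below is an illustration only, not the project's implementation. `Window`, `align_with_retry`, `FailureTracker`, the `align` callback, and the base threshold are hypothetical names and values. Only the 0.45 relaxed threshold and the 2-failure re-anchoring trigger come from the README text, and the 60/40 window in the usage note is taken from the removed Tier 1 line.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Window:
    """Hypothetical search window, measured in words around the expected position."""
    lookback: int
    lookahead: int


def align_with_retry(
    align: Callable[[Window], Optional[float]],  # hypothetical: returns a confidence or None
    base: Window,
    expanded: Window,
    threshold: float,        # hypothetical base confidence threshold
    relaxed: float = 0.45,   # relaxed threshold stated in the README
) -> Optional[Tuple[str, float]]:
    """Try the base window, then Tier 1, then Tier 2; return (tier, confidence) or None."""
    attempts = [
        ("base", base, threshold),
        ("tier1", expanded, threshold),  # Tier 1: expanded search window
        ("tier2", expanded, relaxed),    # Tier 2: expanded window + relaxed threshold
    ]
    for name, window, min_conf in attempts:
        conf = align(window)
        if conf is not None and conf >= min_conf:
            return name, conf
    return None


class FailureTracker:
    """Counts consecutive segment failures and signals when to re-anchor."""

    def __init__(self, limit: int = 2) -> None:  # README: after 2 consecutive failures
        self.limit = limit
        self.consecutive = 0

    def record(self, success: bool) -> bool:
        """Record one segment result; return True when re-anchoring should run."""
        if success:
            self.consecutive = 0
            return False
        self.consecutive += 1
        if self.consecutive >= self.limit:
            self.consecutive = 0  # start counting again after re-anchoring
            return True
        return False
```

A caller would run `align_with_retry(align, Window(10, 10), Window(60, 40), threshold=0.6)` for each segment (the base window and 0.6 are made-up values), pass `result is not None` to `FailureTracker.record`, and invoke the n-gram re-anchoring step whenever `record` returns `True`.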