update README.md
README.md
CHANGED
@@ -35,31 +35,6 @@ Automatic forced alignment for Quran recitations. Upload an audio recording of a
 | [hetchyy/r7](https://huggingface.co/hetchyy/r7) | Phoneme ASR (Large — higher accuracy) |
 | [hetchyy/Quran-phoneme-mfa](https://huggingface.co/spaces/hetchyy/Quran-phoneme-mfa) | MFA forced alignment (external Space) |
 
-## Running locally
-
-```bash
-# Install dependencies
-pip install -r requirements.txt
-
-# Start the app (port 7860)
-python app.py
-
-# Dev mode — skip model preloading for fast startup
-python app.py --dev
-
-# With a public sharing link
-python app.py --share
-```
-
-### Optional: Cython acceleration
-
-The DP alignment inner loop has a Cython extension that provides 10-20x speedup. It is automatically compiled on startup, but if that fails (missing C compiler), the app falls back to pure Python.
-
-```bash
-# Manual build
-python setup.py build_ext --inplace
-```
-
 ## How it works
 
 ### Alignment algorithm
@@ -75,7 +50,7 @@ The core alignment uses **substring Levenshtein DP** with word-boundary constrai
 ### Retry and recovery
 
 When alignment fails for a segment:
-- **Tier 1:** Expanded search window
+- **Tier 1:** Expanded search window
 - **Tier 2:** Expanded window + relaxed confidence threshold (0.45)
 - **Re-anchoring:** After 2 consecutive failures, n-gram voting re-localizes position within the surah
 
@@ -83,15 +58,4 @@ When alignment fails for a segment:
 
 Two playback modes with real-time word highlighting:
 - **Per-segment** — Animate a single aligned segment with word/character-level karaoke
-- **Mega card** — Unified text flow across all segments with click-to-seek and configurable opacity windowing (Reveal, Fade, Spotlight, Isolate, Consume modes)
-
-## Key dependencies
-
-- **[quranic-phonemizer](https://pypi.org/project/quranic-phonemizer/)** — Quran-specific grapheme-to-phoneme conversion with tajweed rules
-- **[recitations-segmenter](https://pypi.org/project/recitations-segmenter/)** — VAD model for Quran recitation audio
-- **torch 2.8** / **transformers 5.0** — Model inference
-- **Gradio ≥ 6.5.1** — Web UI framework
-
-## License
-
-MIT
+- **Mega card** — Unified text flow across all segments with click-to-seek and configurable opacity windowing (Reveal, Fade, Spotlight, Isolate, Consume modes)
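
The "Retry and recovery" tiers in the README can be read as a small control loop. The following is a minimal sketch of that reading, not the Space's actual code: `AlignResult`, `align_with_retry`, `process_segments`, the `align` and `reanchor` callables, the window sizes, and the default threshold of 0.6 are all hypothetical. Only the relaxed threshold (0.45) and the re-anchoring trigger (2 consecutive failures) come from the README text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class AlignResult:
    start: int          # hypothetical: index of the first matched word
    end: int            # hypothetical: index of the last matched word
    confidence: float   # score in [0, 1]


# Hypothetical aligner signature: (segment, window) -> best candidate or None.
Aligner = Callable[[str, int], Optional[AlignResult]]


def align_with_retry(
    segment: str,
    align: Aligner,
    base_window: int = 50,           # assumed value
    expanded_window: int = 200,      # assumed value
    threshold: float = 0.6,          # assumed default threshold
    relaxed_threshold: float = 0.45, # from the README (Tier 2)
) -> Tuple[Optional[AlignResult], Optional[int]]:
    """Try the normal pass, then Tier 1, then Tier 2; return (result, tier)."""
    attempts = [
        (0, base_window, threshold),              # normal pass
        (1, expanded_window, threshold),          # Tier 1: wider window
        (2, expanded_window, relaxed_threshold),  # Tier 2: wider + relaxed
    ]
    for tier, window, min_conf in attempts:
        result = align(segment, window)
        if result is not None and result.confidence >= min_conf:
            return result, tier
    return None, None


def process_segments(
    segments: List[str],
    align: Aligner,
    reanchor: Callable[[str], None],
    max_consecutive_failures: int = 2,  # from the README
) -> List[Tuple[Optional[AlignResult], Optional[int]]]:
    """Align each segment; call reanchor after repeated consecutive failures."""
    failures = 0
    results = []
    for segment in segments:
        result, tier = align_with_retry(segment, align)
        if result is None:
            failures += 1
            if failures >= max_consecutive_failures:
                reanchor(segment)  # e.g. n-gram voting over the surah
                failures = 0
        else:
            failures = 0
        results.append((result, tier))
    return results
```

In this sketch the tiers are tried strictly in order, so a segment that passes at the normal threshold never uses the relaxed one, and the failure counter resets on any success so that only consecutive failures trigger re-anchoring.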