MuhammadHijazii commited on
Commit
c113304
·
verified ·
1 Parent(s): 52e080c

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +26 -15
README.md CHANGED
@@ -10,29 +10,40 @@ pinned: false
10
  license: apache-2.0
11
  ---
12
 
13
- # Faster-Whisper Large-v3 Arabic (Minimal)
14
 
15
- - تفريغ الصوت باستخدام `faster-whisper` مع `word_timestamps=True`.
16
- - إرجاع:
17
- - نص كامل مجمّع من الكلمات.
18
- - نص المقاطع مع الزمن.
19
- - جدول كلمات واحتمالاتها.
20
- - تقرير JSON صغير.
21
 
22
- > **ملاحظة**: نموذج `large-v3` يفضّل GPU Space.
 
 
 
23
 
24
- ## API
 
 
 
 
25
 
26
- Endpoint اسمه `/transcribe`:
 
27
 
 
 
28
  ```python
29
  from gradio_client import Client, file
30
  client = Client("<username>/<space_name>")
31
- transcript, segments, report, table = client.predict(
32
  audio=file("audio.wav"),
33
- model_name="large-v3",
34
- compute_type="float16",
 
35
  vad=True,
36
- api_name="/transcribe"
 
37
  )
38
- print(report)
 
10
  license: apache-2.0
11
  ---
12
 
13
+ # Samaali — Whisper ASR Post-Processing (Arabic)
14
 
15
+ - Transcribes audio with **faster-whisper** (word timestamps + probabilities)
16
+ - Aligns with the original text and distinguishes **ASR errors** vs **memorization errors**
17
+ - Restores ASR errors to the ground-truth and computes:
18
+ - **Literal score** (Levenshtein + word-overlap + BLEU-1)
19
+ - **Semantic score** (SBERT + MARBERT-CLS)
 
20
 
21
+ ## Usage
22
+ 1. Upload/record audio and paste the **Original Text**.
23
+ 2. Pick Whisper size (`large-v3` on GPU, `small/medium` on CPU).
24
+ 3. Click **Transcribe & Evaluate**.
25
 
26
+ Outputs:
27
+ - **Corrected Transcript** (ASR-only corrections applied)
28
+ - **Raw ASR Transcript**
29
+ - **JSON Report** (scores & thresholds)
30
+ - **Token-level decisions table**
31
 
32
+ ## API (Spaces Inference)
33
+ Two endpoints are exposed:
34
 
35
+ ### 1) `/run/evaluate` (UI-equivalent)
36
+ **Python**
37
  ```python
38
  from gradio_client import Client, file
39
  client = Client("<username>/<space_name>")
40
+ corrected, asr_out, report, table = client.predict(
41
  audio=file("audio.wav"),
42
+ original_text="النص الأصلي...",
43
+ whisper_size="small",
44
+ compute_type="int8",
45
  vad=True,
46
+ use_marbert=False, # True if GPU
47
+ api_name="/evaluate"
48
  )
49
+ print(report) # JSON