tvosch committed on
Commit
dcdbf5e
·
1 Parent(s): 7f2ce48

RTX with batch size of 1

Files changed (1)
  1. app.py +3 -4
app.py CHANGED
@@ -128,7 +128,7 @@ Local models are benchmarked on **H100 HBM2e GPUs** for consistent performance m
 
  **Word Error Rate (WER)** measures the percentage of words transcribed incorrectly compared to a reference transcript. It is calculated as `(substitutions + deletions + insertions) / total reference words × 100`. Lower is better. Results are normalized before scoring: lowercase, no punctuation, digits expanded to words, fillers removed.
 
- **Real-Time Factor (RTF)** measures how fast a model transcribes relative to the audio duration. An RTF of 0.1 means 1 second of audio is processed in 100 ms. Lower is faster. RTF measured via HTTP API includes network overhead.
+ **Real-Time Factor (RTF)** measures how fast a model transcribes relative to the audio duration. An RTF of 0.1 means 1 second of audio is processed in 100 ms. Lower is faster. Measured here at batch size 1; RTF measured via HTTP API includes network overhead.
 
  ## Datasets
 
@@ -145,10 +145,9 @@ with gr.Blocks(title="Dutch ASR Leaderboard", theme=gr.themes.Default()) as demo
  "# Dutch ASR Leaderboard\n"
  "**An independent, community-driven benchmark for Dutch automatic speech recognition.** \n"
  "Models are evaluated on standardized public test sets. Lower WER is better. "
- "Rankings serve as a proxy for comparison performance on your data may differ.\n\n"
- "> **Note:** Some models may be benchmaxxed trained or fine-tuned on data that overlaps "
+ "Rankings serve as a proxy for comparison, performance on your data may differ.\n\n"
+ "> **Note:** Some models may be benchmaxxed: trained or fine-tuned on data that overlaps "
  "with these test sets. Treat results as indicative, not definitive. "
- "How models compare here may not reflect how they perform on your specific domain, audio conditions, or use case.\n\n"
  "[Submit your model on GitHub →](https://github.com/tvosch/Dutch-ASR-leaderboard)"
  )
 
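
For readers who want to check the two metrics described in the changed text, here is a minimal Python sketch of the WER formula and the RTF ratio as the description states them. It is not the leaderboard's scoring code, which this diff does not show. The names `normalize`, `wer`, `rtf`, and the `FILLERS` set are hypothetical, the filler list is a guess, and digit-to-word expansion is left out because it needs a Dutch number verbalizer.

```python
import string

# Hypothetical filler list; the leaderboard's actual list is not shown in this diff.
FILLERS = {"uh", "uhm", "eh", "ehm"}


def normalize(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, and drop fillers.

    Simplification: digit expansion is omitted, and removing all ASCII
    punctuation also removes apostrophes and hyphens inside words.
    """
    table = str.maketrans("", "", string.punctuation)
    words = text.lower().translate(table).split()
    return [w for w in words if w not in FILLERS]


def wer(reference: str, hypothesis: str) -> float:
    """(substitutions + deletions + insertions) / reference words * 100."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference has no words after normalization")

    # Word-level Levenshtein distance, keeping one row at a time.
    # prev[j] is the distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # substitution, or match when cost is 0
                )
            )
        prev = curr
    return prev[-1] / len(ref) * 100


def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; lower is faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds
```

With these definitions, `rtf(0.1, 1.0)` matches the example in the text: 1 second of audio processed in 100 ms gives an RTF of 0.1. For a model served over HTTP, `processing_seconds` measured on the client side would include network overhead, as the description notes.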