include latest model
README.md
CHANGED
@@ -25,7 +25,9 @@ We host several models, which are specifically tailored to the processing of Fle
 
 ### Automatic Speech Recognition (ASR)
 
--- **
+-- **NeLF_S2T_Pytorch** (Recommended): The third version of our Automatic Speech Recognition and Subtitle Generation model. It is a fine-tuned version of ASR_subtitles_v2 without a Kaldi dependency (pure PyTorch), trained on refined data that leverages contextualisation techniques for pseudo-labeling.
+
+-- **ASR_subtitles_v2**: The second version of our Automatic Speech Recognition and Subtitle Generation model, with improved architecture and trained on 14,000 hours of Flemish broadcast subtitled speech data.
 It can generate both an exact verbatim transcription with annotation tags and a fully formatted, cleaned-up subtitle transcription.
 
 -- **ASR_subtitles_v2_small**: Smaller variant of ASR_subtitles_v2 with nearly the same performance.
@@ -49,7 +51,8 @@ Word Error Rates on different test sets.
 
 |Model Tag|Number of Parameters|Test CGN|Test Media|
 |:---|:---:|:---:|:---:|
-
+|NeLF_S2T_Pytorch|180M|6.65|8.23|
+|ASR_subtitles_v2|180M|6.49|8.63|
 |ASR_subtitles_v2_small|70M|6.93|9.30|
 |Whisper large finetuned|1550M|7.83|10.64|
 |Whisper large v3|1550M|11.54|13.76|
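
For readers comparing the rows above, here is a minimal sketch of how a Word Error Rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The README does not say which scoring tool or text normalization produced its figures, so the function name `word_error_rate` and the plain whitespace tokenization are assumptions made for illustration. The table values are presumably percentages, whereas this sketch returns a fraction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: word-level edit distance / reference length.

    Hypothetical helper for illustration; it is not the scoring script
    behind the table, and it applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (zit -> zat) and one deletion (de) over 6 reference words.
    wer = word_error_rate("de kat zit op de mat", "de kat zat op mat")
    print(f"{100 * wer:.2f}%")
```

Multiplying the result by 100 gives a value on the same scale as the table, under the percentage assumption stated above.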