MLSpeech committed · Commit a9fba60 · verified · 1 Parent(s): d736335

Update README.md

  1. README.md +118 -122
README.md CHANGED
@@ -1,129 +1,128 @@
  ---
- license: mit
  datasets:
- - openslr/librispeech_asr
- - facebook/multilingual_librispeech
  language:
- - en
- - fr
- - de
- - pt
- - es
  metrics:
- - wer
  base_model:
- - openai/whisper-large-v2
- - openai/whisper-small
- - openai/whisper-base
  pipeline_tag: automatic-speech-recognition
  tags:
- - streaming
- - asr
- - Transformer
- - encoder-decoder
- - pytorch
- - audio
- - speech
- - Whisper
  model-index:
- - name: CarelessWhisper-large-v2
- results:
- - task:
- type: streaming-transcription-chunk-300msec
- dataset:
- name: test-clean
- type: LibriSpeech
- metrics:
- - name: Word Error Rate (WER) [%]
- type: Word Error Rate (WER) [%]
- value: 5.29
- - name: Aligned-Relative Word Error Rate (ARWER) [%]
- type: Aligned-Relative Word Error Rate (WER) [%]
- value: 6.00
- - task:
- type: streaming-transcription-chunk-300msec
- dataset:
- name: test-other
- type: LibriSpeech
- metrics:
- - name: Word Error Rate (WER) [%]
- type: Word Error Rate (WER) [%]
- value: 10.74
- - name: Aligned-Relative Word Error Rate (ARWER) [%]
- type: Aligned-Relative Word Error Rate (WER) [%]
- value: 11.38
- - task:
- type: streaming-transcription-chunk-200msec
- dataset:
- name: test-clean
- type: LibriSpeech
- metrics:
- - name: Word Error Rate (WER) [%]
- type: Word Error Rate (WER) [%]
- value: 5.92
- - name: Aligned-Relative Word Error Rate (ARWER) [%]
- type: Aligned-Relative Word Error Rate (WER) [%]
- value: 6.63
- - task:
- type: streaming-transcription-chunk-200msec
- dataset:
- name: test-other
- type: LibriSpeech
- metrics:
- - name: Word Error Rate (WER) [%]
- type: Word Error Rate (WER) [%]
- value: 11.41
- - name: Aligned-Relative Word Error Rate (ARWER) [%]
- type: Aligned-Relative Word Error Rate (WER) [%]
- value: 12.60
- - task:
- type: streaming-transcription-chunk-100msec
- dataset:
- name: test-clean
- type: LibriSpeech
- metrics:
- - name: Word Error Rate (WER) [%]
- type: Word Error Rate (WER) [%]
- value: 6.33
- - name: Aligned-Relative Word Error Rate (ARWER) [%]
- type: Aligned-Relative Word Error Rate (WER) [%]
- value: 7.76
- - task:
- type: streaming-transcription-chunk-100msec
- dataset:
- name: test-other
- type: LibriSpeech
- metrics:
- - name: Word Error Rate (WER) [%]
- type: Word Error Rate (WER) [%]
- value: 13.06
- - name: Aligned-Relative Word Error Rate (ARWER) [%]
- type: Aligned-Relative Word Error Rate (WER) [%]
- value: 14.99
- - task:
- type: streaming-transcription-chunk-40msec
- dataset:
- name: test-clean
- type: LibriSpeech
- metrics:
- - name: Word Error Rate (WER) [%]
- type: Word Error Rate (WER) [%]
- value: 7.76
- - name: Aligned-Relative Word Error Rate (ARWER) [%]
- type: Aligned-Relative Word Error Rate (WER) [%]
- value: 9.94
- - task:
- type: streaming-transcription-chunk-40msec
- dataset:
- name: test-other
- type: LibriSpeech
- metrics:
- - name: Word Error Rate (WER) [%]
- type: Word Error Rate (WER) [%]
- value: 16.73
- - name: Aligned-Relative Word Error Rate (ARWER) [%]
- type: Aligned-Relative Word Error Rate (WER) [%]
- value: 19.28
  ---
  # CarelessWhisper - Causal Whisper Streaming Model
  Causal Whisper Streaming is a fine tuned version of OpenAI Whisper, which can handle causal data and perform real-time transcription.
@@ -310,7 +309,4 @@ Portions derived from [OpenAI Whisper](https://github.com/openai/whisper) are li
  [![CC BY-NC 4.0 License](https://img.shields.io/badge/License-CC--BY--NC%204.0-blue.svg)](https://creativecommons.org/licenses/by-nc/4.0/)
  All other original code in this repository is licensed under the **Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)**.

- See the [LICENSE](./LICENSE) file for full details.
-
-
-
 
  ---
  datasets:
+ - openslr/librispeech_asr
+ - facebook/multilingual_librispeech
  language:
+ - en
+ - fr
+ - de
+ - pt
+ - es
  metrics:
+ - wer
  base_model:
+ - openai/whisper-large-v2
+ - openai/whisper-small
+ - openai/whisper-base
  pipeline_tag: automatic-speech-recognition
  tags:
+ - streaming
+ - asr
+ - Transformer
+ - encoder-decoder
+ - pytorch
+ - audio
+ - speech
+ - Whisper
  model-index:
+ - name: CarelessWhisper-large-v2
+ results:
+ - task:
+ type: streaming-transcription-chunk-300msec
+ dataset:
+ name: test-clean
+ type: LibriSpeech
+ metrics:
+ - name: Word Error Rate (WER) [%]
+ type: Word Error Rate (WER) [%]
+ value: 5.29
+ - name: Aligned-Relative Word Error Rate (ARWER) [%]
+ type: Aligned-Relative Word Error Rate (WER) [%]
+ value: 6
+ - task:
+ type: streaming-transcription-chunk-300msec
+ dataset:
+ name: test-other
+ type: LibriSpeech
+ metrics:
+ - name: Word Error Rate (WER) [%]
+ type: Word Error Rate (WER) [%]
+ value: 10.74
+ - name: Aligned-Relative Word Error Rate (ARWER) [%]
+ type: Aligned-Relative Word Error Rate (WER) [%]
+ value: 11.38
+ - task:
+ type: streaming-transcription-chunk-200msec
+ dataset:
+ name: test-clean
+ type: LibriSpeech
+ metrics:
+ - name: Word Error Rate (WER) [%]
+ type: Word Error Rate (WER) [%]
+ value: 5.92
+ - name: Aligned-Relative Word Error Rate (ARWER) [%]
+ type: Aligned-Relative Word Error Rate (WER) [%]
+ value: 6.63
+ - task:
+ type: streaming-transcription-chunk-200msec
+ dataset:
+ name: test-other
+ type: LibriSpeech
+ metrics:
+ - name: Word Error Rate (WER) [%]
+ type: Word Error Rate (WER) [%]
+ value: 11.41
+ - name: Aligned-Relative Word Error Rate (ARWER) [%]
+ type: Aligned-Relative Word Error Rate (WER) [%]
+ value: 12.6
+ - task:
+ type: streaming-transcription-chunk-100msec
+ dataset:
+ name: test-clean
+ type: LibriSpeech
+ metrics:
+ - name: Word Error Rate (WER) [%]
+ type: Word Error Rate (WER) [%]
+ value: 6.33
+ - name: Aligned-Relative Word Error Rate (ARWER) [%]
+ type: Aligned-Relative Word Error Rate (WER) [%]
+ value: 7.76
+ - task:
+ type: streaming-transcription-chunk-100msec
+ dataset:
+ name: test-other
+ type: LibriSpeech
+ metrics:
+ - name: Word Error Rate (WER) [%]
+ type: Word Error Rate (WER) [%]
+ value: 13.06
+ - name: Aligned-Relative Word Error Rate (ARWER) [%]
+ type: Aligned-Relative Word Error Rate (WER) [%]
+ value: 14.99
+ - task:
+ type: streaming-transcription-chunk-40msec
+ dataset:
+ name: test-clean
+ type: LibriSpeech
+ metrics:
+ - name: Word Error Rate (WER) [%]
+ type: Word Error Rate (WER) [%]
+ value: 7.76
+ - name: Aligned-Relative Word Error Rate (ARWER) [%]
+ type: Aligned-Relative Word Error Rate (WER) [%]
+ value: 9.94
+ - task:
+ type: streaming-transcription-chunk-40msec
+ dataset:
+ name: test-other
+ type: LibriSpeech
+ metrics:
+ - name: Word Error Rate (WER) [%]
+ type: Word Error Rate (WER) [%]
+ value: 16.73
+ - name: Aligned-Relative Word Error Rate (ARWER) [%]
+ type: Aligned-Relative Word Error Rate (WER) [%]
+ value: 19.28
  ---
  # CarelessWhisper - Causal Whisper Streaming Model
  Causal Whisper Streaming is a fine tuned version of OpenAI Whisper, which can handle causal data and perform real-time transcription.
 
  [![CC BY-NC 4.0 License](https://img.shields.io/badge/License-CC--BY--NC%204.0-blue.svg)](https://creativecommons.org/licenses/by-nc/4.0/)
  All other original code in this repository is licensed under the **Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)**.

+ See the [LICENSE](./LICENSE) file for full details.