davidmeikle committed on
Commit
f441ae0
·
verified ·
1 Parent(s): dd1909c

Create README.md

  1. README.md +176 -0
README.md ADDED
@@ -0,0 +1,176 @@
+ ---
+ language:
+ - en
+ - zh
+ - de
+ - es
+ - ru
+ - ko
+ - fr
+ - ja
+ - pt
+ - tr
+ - pl
+ - ca
+ - nl
+ - ar
+ - sv
+ - it
+ - id
+ - hi
+ - fi
+ - vi
+ - he
+ - uk
+ - el
+ - ms
+ - cs
+ - ro
+ - da
+ - hu
+ - ta
+ - "no"
+ - th
+ - ur
+ - hr
+ - bg
+ - lt
+ - la
+ - mi
+ - ml
+ - cy
+ - sk
+ - te
+ - fa
+ - lv
+ - bn
+ - sr
+ - az
+ - sl
+ - kn
+ - et
+ - mk
+ - br
+ - eu
+ - is
+ - hy
+ - ne
+ - mn
+ - bs
+ - kk
+ - sq
+ - sw
+ - gl
+ - mr
+ - pa
+ - si
+ - km
+ - sn
+ - yo
+ - so
+ - af
+ - oc
+ - ka
+ - be
+ - tg
+ - sd
+ - gu
+ - am
+ - yi
+ - lo
+ - uz
+ - fo
+ - ht
+ - ps
+ - tk
+ - nn
+ - mt
+ - sa
+ - lb
+ - my
+ - bo
+ - tl
+ - mg
+ - as
+ - tt
+ - haw
+ - ln
+ - ha
+ - ba
+ - jw
+ - su
+ tags:
+ - audio
+ - automatic-speech-recognition
+ - eole
+ - whisper
+ license: apache-2.0
+ base_model: openai/whisper-medium
+ pipeline_tag: automatic-speech-recognition
+ ---
+
+ # Whisper Medium (eole)
+
+ This is [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) converted to [eole](https://github.com/eole-nlp/eole) format using `eole convert --model_dir openai/whisper-medium`.
+
+ No weights were modified — this is a format conversion only.
+
+ ## Model details
+
+ | | |
+ |---|---|
+ | **Original model** | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) |
+ | **Parameters** | 769M |
+ | **Encoder layers** | 24 |
+ | **Decoder layers** | 24 |
+ | **Hidden size** | 1024 |
+ | **Attention heads** | 16 |
+ | **Mel bins** | 80 |
+ | **Vocab size** | 51,865 |
+ | **License** | Apache 2.0 |
+
+ ## Usage
+
+ ```bash
+ pip install eole[wer]
+ ```
+
+ ### Transcribe
+
+ ```bash
+ eole predict \
+ -config eval_config.yaml \
+ -model_path whisper-medium-eole \
+ -src audio_files.txt \
+ -output transcriptions.txt \
+ -language en \
+ -task transcribe \
+ -gpu_ranks 0
+ ```
+
+ ## Evaluation
+
+ All evaluations use beam size 5.
+
+ | Benchmark | WER |
+ |---|---|
+ | LibriSpeech test-clean | 2.92% |
+
+ ## Conversion
+
+ ```bash
+ eole convert --model_dir openai/whisper-medium --output whisper-medium-eole
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @misc{radford2023robust,
+ title={Robust Speech Recognition via Large-Scale Weak Supervision},
+ author={Alec Radford and Jong Wook Kim and Tao Xu and Greg Brockman and Christine McLeavey and Ilya Sutskever},
+ year={2023},
+ eprint={2212.04356},
+ archivePrefix={arXiv},
+ primaryClass={eess.AS}
+ }
+ ```
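
Note on the Evaluation table in the README above: it reports word error rate (WER). As a reminder of what that metric measures, here is a minimal Python sketch. It is not eole's scoring code and is not part of this commit. WER is the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the plain whitespace tokenization are assumptions made for illustration; benchmark figures usually also depend on text normalization, which this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only; tokenization is a plain
    whitespace split with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Rolling single-row dynamic programme: at the start of iteration i,
    # prev[j] is the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sat on mat")` returns 1/6: one deleted word over six reference words. Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.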