davidmeikle/whisper-medium-eole

Automatic Speech Recognition

davidmeikle committed 1 day ago · Commit f441ae0 · verified · 1 parent: dd1909c

Create README.md

Files changed (1)

README.md +176 -0

README.md ADDED

	@@ -0,0 +1,176 @@

+---
+language:
+- en
+- zh
+- de
+- es
+- ru
+- ko
+- fr
+- ja
+- pt
+- tr
+- pl
+- ca
+- nl
+- ar
+- sv
+- it
+- id
+- hi
+- fi
+- vi
+- he
+- uk
+- el
+- ms
+- cs
+- ro
+- da
+- hu
+- ta
+- "no"
+- th
+- ur
+- hr
+- bg
+- lt
+- la
+- mi
+- ml
+- cy
+- sk
+- te
+- fa
+- lv
+- bn
+- sr
+- az
+- sl
+- kn
+- et
+- mk
+- br
+- eu
+- is
+- hy
+- ne
+- mn
+- bs
+- kk
+- sq
+- sw
+- gl
+- mr
+- pa
+- si
+- km
+- sn
+- yo
+- so
+- af
+- oc
+- ka
+- be
+- tg
+- sd
+- gu
+- am
+- yi
+- lo
+- uz
+- fo
+- ht
+- ps
+- tk
+- nn
+- mt
+- sa
+- lb
+- my
+- bo
+- tl
+- mg
+- as
+- tt
+- haw
+- ln
+- ha
+- ba
+- jw
+- su
+tags:
+- audio
+- automatic-speech-recognition
+- eole
+- whisper
+license: apache-2.0
+base_model: openai/whisper-medium
+pipeline_tag: automatic-speech-recognition
+---
+# Whisper Medium (eole)
+This is [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) converted to [eole](https://github.com/eole-nlp/eole) format with `eole convert --model_dir openai/whisper-medium --output whisper-medium-eole`.
+No weights were modified; this is a format conversion only.
+## Model details
+| Property | Value |
+|---|---|
+| **Original model** | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) |
+| **Parameters** | 769M |
+| **Encoder layers** | 24 |
+| **Decoder layers** | 24 |
+| **Hidden size** | 1024 |
+| **Attention heads** | 16 |
+| **Mel bins** | 80 |
+| **Vocab size** | 51,865 |
+| **License** | Apache 2.0 |
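The per-head attention dimension is not listed in the table, but it follows from the hidden size and the number of attention heads. A small sketch of that arithmetic, using only the values from the table above (the variable names are illustrative):

```bash
# Values copied from the "Model details" table.
hidden_size=1024      # "Hidden size" row
attention_heads=16    # "Attention heads" row

# Each head works on an equal slice of the hidden dimension.
head_dim=$((hidden_size / attention_heads))
echo "per-head dimension: $head_dim"   # prints "per-head dimension: 64"
```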
+## Usage
+```bash
+pip install "eole[wer]"
+```
+### Transcribe
+```bash
+eole predict \
+  -config eval_config.yaml \
+  -model_path whisper-medium-eole \
+  -src audio_files.txt \
+  -output transcriptions.txt \
+  -language en \
+  -task transcribe \
+  -gpu_ranks 0
+```
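The command above reads its inputs from `audio_files.txt`. The README does not describe that file's layout, so the sketch below assumes it is a plain-text list with one audio path per line. The helper name `make_audio_list`, the demo directory, and the `.wav`/`.flac` filter are all illustrative choices, not part of eole.

```bash
# Sketch: build a list file with one audio path per line.
# Assumption: the file passed to -src is a plain list of audio paths.
make_audio_list() {
  local audio_dir="$1" list_file="$2"
  # Collect .wav and .flac files; sort for a stable, reproducible order.
  find "$audio_dir" -type f \( -name '*.wav' -o -name '*.flac' \) \
    | LC_ALL=C sort > "$list_file"
}

# Example with a throwaway directory and hypothetical file names.
demo_dir="$(mktemp -d)"
touch "$demo_dir/b.wav" "$demo_dir/a.flac" "$demo_dir/notes.txt"
make_audio_list "$demo_dir" "$demo_dir/audio_files.txt"
cat "$demo_dir/audio_files.txt"   # lists a.flac, then b.wav; notes.txt is skipped
```

Sorting the paths keeps the order of the list stable across runs, which makes it easier to line up each path with its transcription if the output follows input order (also an assumption here).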
+## Evaluation
+All evaluations use beam size 5.
+| Benchmark | WER |
+|---|---|
+| LibriSpeech test-clean | 2.92% |
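For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The sketch below illustrates that definition with a hypothetical `wer` shell function built on `awk`. It is not the scoring code behind the figure in the table, and it applies no text normalization (casing, punctuation), which the README does not specify either.

```bash
# Sketch: WER (%) between a reference and a hypothesis string.
# Words are split on whitespace; no normalization is applied.
wer() {
  awk -v ref="$1" -v hyp="$2" 'BEGIN {
    n = split(ref, r, " ")   # reference words
    m = split(hyp, h, " ")   # hypothesis words
    if (n == 0) exit 1       # WER is undefined for an empty reference
    # d[i, j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    for (i = 0; i <= n; i++) d[i, 0] = i
    for (j = 0; j <= m; j++) d[0, j] = j
    for (i = 1; i <= n; i++) {
      for (j = 1; j <= m; j++) {
        cost = ((r[i] "") == (h[j] "")) ? 0 : 1           # force string compare
        best = d[i-1, j-1] + cost                         # match or substitution
        if (d[i-1, j] + 1 < best) best = d[i-1, j] + 1    # deletion
        if (d[i, j-1] + 1 < best) best = d[i, j-1] + 1    # insertion
        d[i, j] = best
      }
    }
    printf "%.2f\n", 100 * d[n, m] / n
  }'
}

# One substitution (sat -> sit) and one deletion (the) over 6 reference words.
wer "the cat sat on the mat" "the cat sit on mat"   # prints 33.33
```

Because the denominator is the reference length, a hypothesis with many insertions can score above 100%.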
+## Conversion
+```bash
+eole convert --model_dir openai/whisper-medium --output whisper-medium-eole
+```
+## Citation
+```bibtex
+@misc{radford2023robust,
+      title={Robust Speech Recognition via Large-Scale Weak Supervision},
+      author={Alec Radford and Jong Wook Kim and Tao Xu and Greg Brockman and Christine McLeavey and Ilya Sutskever},
+      year={2023},
+      eprint={2212.04356},
+      archivePrefix={arXiv},
+      primaryClass={eess.AS}
+}
+```