metadata
language:
- en
- zh
- de
- es
- ru
- ko
- fr
- ja
- pt
- tr
- pl
- ca
- nl
- ar
- sv
- it
- id
- hi
- fi
- vi
- he
- uk
- el
- ms
- cs
- ro
- da
- hu
- ta
- 'no'
- th
- ur
- hr
- bg
- lt
- la
- mi
- ml
- cy
- sk
- te
- fa
- lv
- bn
- sr
- az
- sl
- kn
- et
- mk
- br
- eu
- is
- hy
- ne
- mn
- bs
- kk
- sq
- sw
- gl
- mr
- pa
- si
- km
- sn
- yo
- so
- af
- oc
- ka
- be
- tg
- sd
- gu
- am
- yi
- lo
- uz
- fo
- ht
- ps
- tk
- nn
- mt
- sa
- lb
- my
- bo
- tl
- mg
- as
- tt
- haw
- ln
- ha
- ba
- jw
- su
tags:
- audio
- automatic-speech-recognition
- eole
- whisper
license: apache-2.0
base_model: openai/whisper-small
pipeline_tag: automatic-speech-recognition
Whisper Small (eole)
This is openai/whisper-small converted to eole format with eole convert --model_dir openai/whisper-small --output whisper-small-eole.
No weights were modified; this is a format conversion only.
Model details
| Property | Value |
|---|---|
| Original model | openai/whisper-small |
| Parameters | 244M |
| Encoder layers | 12 |
| Decoder layers | 12 |
| Hidden size | 768 |
| Attention heads | 12 |
| Mel bins | 80 |
| Vocab size | 51,865 |
| License | Apache 2.0 |
Usage
pip install "eole[wer]"
Transcribe
eole predict \
    -config eval_config.yaml \
    -model_path whisper-small-eole \
    -src audio_files.txt \
    -output transcriptions.txt \
    -language en \
    -task transcribe \
    -gpu_ranks 0
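The command above reads its inputs from audio_files.txt. Assuming that file is a plain-text list with one audio path per line (an assumption based on the file name, not something this card confirms), a minimal Python sketch for building it could look like the following. The helper name write_audio_list and the extension set are hypothetical and not part of eole.

```python
from pathlib import Path


def write_audio_list(audio_dir, list_path, extensions=(".wav", ".flac", ".mp3")):
    """Write one audio path per line to list_path and return the paths.

    ASSUMPTION: the -src file is a newline-separated list of audio paths.
    Check the eole documentation before relying on this layout.
    """
    # Collect matching files recursively, sorted so the order is reproducible.
    paths = sorted(
        p
        for p in Path(audio_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
    # One path per line, with a trailing newline after the last entry.
    Path(list_path).write_text("".join(f"{p}\n" for p in paths), encoding="utf-8")
    return paths
```

Calling write_audio_list("audio", "audio_files.txt"), where audio is a placeholder directory name, would then produce the file passed to -src. If the list format holds, each line of transcriptions.txt would presumably correspond to the path on the same line, which is why the sketch sorts the paths.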
Evaluation
All evaluations use beam size 5.
| Benchmark | WER |
|---|---|
| LibriSpeech test-clean | 3.30% |
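For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words, aggregated over the corpus. The sketch below illustrates that definition in plain Python. It is not eole's scoring code, the function names are hypothetical, and it applies no text normalization, so it will not necessarily reproduce the figure in the table.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two word lists."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            curr.append(
                min(
                    prev[j] + 1,             # deletion of a reference word
                    curr[j - 1] + 1,         # insertion of a hypothesis word
                    prev[j - 1] + (r != h),  # match (cost 0) or substitution (cost 1)
                )
            )
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Corpus-level WER: total word errors divided by total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        errors += edit_distance(ref_words, hyp.split())
        words += len(ref_words)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words
```

For example, scoring the hypothesis "the cat sit on mat" against the reference "the cat sat on the mat" counts one substitution and one deletion over six reference words, giving a WER of 2/6.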
Conversion
eole convert --model_dir openai/whisper-small --output whisper-small-eole
Citation
@misc{radford2023robust,
title={Robust Speech Recognition via Large-Scale Weak Supervision},
author={Alec Radford and Jong Wook Kim and Tao Xu and Greg Brockman and Christine McLeavey and Ilya Sutskever},
year={2023},
eprint={2212.04356},
archivePrefix={arXiv},
primaryClass={eess.AS}
}