# Whisper Large V3 (eole)

This is `openai/whisper-large-v3` converted to eole format using `eole convert --model_dir openai/whisper-large-v3`.

No weights were modified; this is a format conversion only.

## Model details

| Property | Value |
|---|---|
| Original model | `openai/whisper-large-v3` |
| Parameters | 1.55B |
| Encoder layers | 32 |
| Decoder layers | 32 |
| Hidden size | 1280 |
| Attention heads | 20 |
| Mel bins | 128 |
| Vocab size | 51,866 |
| License | Apache 2.0 |

## Usage

```shell
pip install eole[wer]
```

### Transcribe

```shell
eole predict \
  -config eval_config.yaml \
  -model_path whisper-large-v3-eole \
  -src audio_files.txt \
  -output transcriptions.txt \
  -language en \
  -task transcribe \
  -gpu_ranks 0
```
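The `-src` argument above points at a manifest file. As a hedged sketch (assuming `audio_files.txt` lists one audio file path per line; check the eole documentation for the exact input format your version expects), the manifest can be generated like this:

```python
from pathlib import Path

def write_manifest(audio_dir: str, manifest: str = "audio_files.txt") -> int:
    """Write a sorted list of .wav paths, one per line, and return the count.

    Hypothetical helper for illustration; adjust the glob pattern to match
    whatever audio formats your pipeline uses.
    """
    paths = sorted(Path(audio_dir).glob("*.wav"))
    Path(manifest).write_text("\n".join(str(p) for p in paths) + "\n")
    return len(paths)
```

Sorting keeps the manifest order deterministic, so output lines in `transcriptions.txt` can be matched back to their input files.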

### Evaluate on FLEURS

```shell
cd recipes/whisper/eval
python eval_fleurs.py \
  --model_path whisper-large-v3-eole \
  --languages hi_in \
  --output_dir ./results
```

## Evaluation

All evaluations use beam size 5.

| Benchmark | WER |
|---|---|
| LibriSpeech test-clean | 1.91% |
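For reference, word error rate is the word-level edit distance between reference and hypothesis, normalized by the reference length. A minimal sketch (illustrative only; eole's own scorer may apply additional text normalization before comparison):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: Levenshtein distance over words / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[-1][-1] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words
```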

## Conversion

```shell
eole convert --model_dir openai/whisper-large-v3 --output whisper-large-v3-eole
```

## Citation

```bibtex
@misc{radford2023robust,
      title={Robust Speech Recognition via Large-Scale Weak Supervision},
      author={Alec Radford and Jong Wook Kim and Tao Xu and Greg Brockman and Christine McLeavey and Ilya Sutskever},
      year={2023},
      eprint={2212.04356},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}
```