Whisper Medium (ExecuTorch, XNNPACK, bfloat16)

This folder contains an ExecuTorch .pte export of openai/whisper-medium for CPU inference via the XNNPACK backend.

Contents

  • model.pte: ExecuTorch program (methods: encoder, text_decoder)
  • whisper_preprocessor.pte: mel-spectrogram preprocessor (feature size 80)
  • tokenizer.json, vocab.json, merges.txt: tokenizer artifacts
  • config.json, generation_config.json, preprocessor_config.json, tokenizer_config.json, special_tokens_map.json: metadata files from the upstream Hugging Face repo

Export details

  • Task: automatic-speech-recognition
  • Recipe: xnnpack
  • Weight dtype: bfloat16 (no quantization)
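bfloat16 keeps float32's sign bit and 8-bit exponent but only the top 7 mantissa bits, so it is effectively a float32 with the low 16 bits dropped: same dynamic range, half the storage, less precision. A minimal pure-Python sketch of that truncation (illustrative only, not how the exporter converts weights internally):

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Truncate a float32 to its bfloat16 bit pattern (round-toward-zero).
    bfloat16 keeps float32's sign and 8-bit exponent and drops the low
    16 mantissa bits."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 16

def bfloat16_bits_to_float(bits: int) -> float:
    """Expand a bfloat16 bit pattern back to a float by zero-padding
    the low 16 bits."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits << 16))
    return x

# 1.0 survives exactly: 0x3F800000 -> 0x3F80 -> 1.0
print(hex(float32_to_bfloat16_bits(1.0)))  # 0x3f80
# 1.001 does not: near 1.0 the bfloat16 step is 1/128, so truncation
# collapses it back to 1.0
print(bfloat16_bits_to_float(float32_to_bfloat16_bits(1.001)))
```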

Tooling versions used to generate these artifacts:

  • ExecuTorch: executorch==1.2.0a0+efe4f0c (git efe4f0cce3)
  • Optimum ExecuTorch: optimum-executorch==0.2.0.dev0 (git 4c62ed7)

Command used:

optimum-cli export executorch \
  --model "openai/whisper-medium" \
  --task "automatic-speech-recognition" \
  --recipe "xnnpack" \
  --dtype "bfloat16" \
  --output_dir "<output_dir>"

Preprocessor command used:

python -m executorch.extension.audio.mel_spectrogram \
  --feature_size 80 \
  --stack_output \
  --max_audio_len 300 \
  --output_file whisper_preprocessor.pte

Run with the ExecuTorch Whisper runner

Build the runner from the ExecuTorch repo root:

make whisper-cpu

Run (expects a 16 kHz mono WAV):

cmake-out/examples/models/whisper/whisper_runner \
  --model_path model.pte \
  --tokenizer_path ./ \
  --audio_path output.wav \
  --processor_path whisper_preprocessor.pte \
  --temperature 0
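If the runner rejects your audio, it is usually a format mismatch. A small stdlib check that a WAV file matches the expected input format (an illustrative helper, not part of the ExecuTorch runner; the 16-bit PCM check is an assumption, the 16 kHz mono requirement is stated above):

```python
import wave

def check_whisper_wav(path: str) -> None:
    """Verify a WAV file is 16 kHz, mono, 16-bit PCM; raise ValueError
    otherwise. (Hypothetical helper for pre-flight validation.)"""
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != 16_000:
            raise ValueError(f"expected 16 kHz, got {wf.getframerate()} Hz")
        if wf.getnchannels() != 1:
            raise ValueError(f"expected mono, got {wf.getnchannels()} channels")
        if wf.getsampwidth() != 2:
            raise ValueError(f"expected 16-bit PCM, got {wf.getsampwidth() * 8}-bit")
```

Files that fail the check can be converted with, e.g., ffmpeg's `-ar 16000 -ac 1` options before being passed to the runner.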