Whisper Medium (ExecuTorch, XNNPACK, bfloat16)

This folder contains an ExecuTorch .pte export of openai/whisper-medium for CPU inference via the XNNPACK backend.

Contents

  • model.pte: ExecuTorch program (methods: encoder, text_decoder)
  • whisper_preprocessor.pte: mel-spectrogram preprocessor (feature size 80)
  • tokenizer.json, vocab.json, merges.txt: tokenizer artifacts
  • config.json, generation_config.json, preprocessor_config.json, tokenizer_config.json, special_tokens_map.json: metadata files from the upstream Hugging Face repo

Export details

  • Task: automatic-speech-recognition
  • Recipe: xnnpack
  • Weight dtype: bfloat16 (no quantization)
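bfloat16 keeps float32's sign bit and 8-bit exponent but only the top 7 mantissa bits, so it is effectively a float32 with the low 16 bits dropped: same dynamic range, half the storage, less precision. A minimal pure-Python sketch of that truncation (illustrative only, not how the exporter converts weights internally):

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Truncate a float32 to its bfloat16 bit pattern (round-toward-zero).
    bfloat16 keeps float32's sign and 8-bit exponent and drops the low
    16 mantissa bits."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 16

def bfloat16_bits_to_float(bits: int) -> float:
    """Expand a bfloat16 bit pattern back to a float by zero-padding
    the low 16 bits."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits << 16))
    return x

# 1.0 survives exactly: 0x3F800000 -> 0x3F80 -> 1.0
print(hex(float32_to_bfloat16_bits(1.0)))  # 0x3f80
# 1.001 does not: near 1.0 the bfloat16 step is 1/128, so truncation
# collapses it back to 1.0
print(bfloat16_bits_to_float(float32_to_bfloat16_bits(1.001)))
```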

Tooling versions used to generate these artifacts:

  • ExecuTorch: executorch==1.2.0a0+efe4f0c (git efe4f0cce3)
  • Optimum ExecuTorch: optimum-executorch==0.2.0.dev0 (git 4c62ed7)

Command used:

optimum-cli export executorch \
  --model "openai/whisper-medium" \
  --task "automatic-speech-recognition" \
  --recipe "xnnpack" \
  --dtype "bfloat16" \
  --output_dir "<output_dir>"

Preprocessor command used:

python -m executorch.extension.audio.mel_spectrogram \
  --feature_size 80 \
  --stack_output \
  --max_audio_len 300 \
  --output_file whisper_preprocessor.pte

Run with the ExecuTorch Whisper runner

Build the runner from the ExecuTorch repo root:

make whisper-cpu

Run (expects a 16 kHz mono WAV):

cmake-out/examples/models/whisper/whisper_runner \
  --model_path model.pte \
  --tokenizer_path ./ \
  --audio_path output.wav \
  --processor_path whisper_preprocessor.pte \
  --temperature 0
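If the runner rejects your audio, it is usually a format mismatch. A small stdlib check that a WAV file matches the expected input format (an illustrative helper, not part of the ExecuTorch runner; the 16-bit PCM check is an assumption, the 16 kHz mono requirement is stated above):

```python
import wave

def check_whisper_wav(path: str) -> None:
    """Verify a WAV file is 16 kHz, mono, 16-bit PCM; raise ValueError
    otherwise. (Hypothetical helper for pre-flight validation.)"""
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != 16_000:
            raise ValueError(f"expected 16 kHz, got {wf.getframerate()} Hz")
        if wf.getnchannels() != 1:
            raise ValueError(f"expected mono, got {wf.getnchannels()} channels")
        if wf.getsampwidth() != 2:
            raise ValueError(f"expected 16-bit PCM, got {wf.getsampwidth() * 8}-bit")
```

Files that fail the check can be converted with, e.g., ffmpeg's `-ar 16000 -ac 1` options before being passed to the runner.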