# Whisper Tiny (ExecuTorch, XNNPACK, 8da4w)
This folder contains an ExecuTorch `.pte` export of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) for CPU inference via the XNNPACK backend, with post-training quantization applied.
## Contents
- `model.pte`: ExecuTorch program (methods: `encoder`, `text_decoder`)
- `whisper_preprocessor.pte`: mel-spectrogram preprocessor (feature size 80)
- `tokenizer.json`, `vocab.json`, `merges.txt`: tokenizer artifacts
- `config.json`, `generation_config.json`, `preprocessor_config.json`, `tokenizer_config.json`, `special_tokens_map.json`: metadata files from the upstream Hugging Face repo
## Quantization
Export flags:
- `--qlinear 8da4w`: decoder linear layers use 8-bit dynamic activations + 4-bit weights
- `--qlinear_encoder 8da4w`: encoder linear layers use 8-bit dynamic activations + 4-bit weights
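The idea behind "8da4w" can be sketched in plain Python: weights are quantized offline to 4-bit integers with one scale per small group, while activations are quantized to 8 bits at runtime with a scale computed from the tensor itself ("dynamic"). This is an illustrative sketch only; the group size (32 here) and the exact symmetric scheme are assumptions, not the actual torchao/ExecuTorch kernels.

```python
def quantize_weights_4bit(weights, group_size=32):
    """Symmetric per-group 4-bit quantization: values mapped into [-8, 7]."""
    qweights, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # One scale per group, chosen so the largest magnitude maps to 7.
        scale = max(abs(w) for w in group) / 7 or 1e-8
        qweights.extend(max(-8, min(7, round(w / scale))) for w in group)
        scales.append(scale)
    return qweights, scales


def quantize_activations_8bit(acts):
    """Symmetric 8-bit dynamic quantization: scale picked per tensor at runtime."""
    scale = max(abs(a) for a in acts) / 127 or 1e-8
    return [max(-128, min(127, round(a / scale))) for a in acts], scale
```

Dequantizing multiplies each integer back by its scale, so the 4-bit weights cost roughly an eighth of the memory of float32 at the price of per-group rounding error.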
Other export settings:

- Task: `automatic-speech-recognition`
- Recipe: `xnnpack`
Tooling versions used to generate these artifacts:

- ExecuTorch: `executorch==1.2.0a0+efe4f0c` (git `efe4f0cce3`)
- Optimum ExecuTorch: `optimum-executorch==0.2.0.dev0` (git `4c62ed7`)
Command used:

```shell
optimum-cli export executorch \
  --model "openai/whisper-tiny" \
  --task "automatic-speech-recognition" \
  --recipe "xnnpack" \
  --qlinear "8da4w" \
  --qlinear_encoder "8da4w" \
  --output_dir "<output_dir>"
```
Preprocessor command used:

```shell
python -m executorch.extension.audio.mel_spectrogram \
  --feature_size 80 \
  --stack_output \
  --max_audio_len 300 \
  --output_file whisper_preprocessor.pte
```
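As a back-of-the-envelope check of what the preprocessor produces, the arithmetic below assumes Whisper's standard front end (16 kHz audio, 10 ms hop, 80 mel bins, fixed 30-second chunks) and that `--max_audio_len 300` is measured in seconds; those constants come from the Whisper reference implementation, not from this export.

```python
SAMPLE_RATE = 16_000      # Hz; the runner also expects 16 kHz input
HOP_LENGTH = 160          # samples between frames (10 ms)
CHUNK_SECONDS = 30        # Whisper processes fixed 30 s windows
N_MELS = 80               # matches --feature_size 80
MAX_AUDIO_SECONDS = 300   # assumption: --max_audio_len is in seconds

# Each 30 s chunk becomes an (N_MELS, frames_per_chunk) mel tensor.
frames_per_chunk = CHUNK_SECONDS * SAMPLE_RATE // HOP_LENGTH   # 3000
# --stack_output stacks the chunks; 300 s of audio is at most 10 of them.
max_chunks = MAX_AUDIO_SECONDS // CHUNK_SECONDS                # 10
```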
## Run with the ExecuTorch Whisper runner
Build the runner from the ExecuTorch repo root:

```shell
make whisper-cpu
```
Run (expects a 16 kHz mono WAV):

```shell
cmake-out/examples/models/whisper/whisper_runner \
  --model_path model.pte \
  --tokenizer_path ./ \
  --audio_path output.wav \
  --processor_path whisper_preprocessor.pte \
  --temperature 0
```
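Since the runner expects a 16 kHz mono WAV, a quick stdlib-only sanity check on the input file can save a confusing failure. `check_wav` is a hypothetical helper, not part of the runner, and the 16-bit PCM requirement is an assumption beyond what the runner documents.

```python
import wave

def check_wav(path):
    """Return duration in seconds if the file looks runner-compatible."""
    with wave.open(path, "rb") as f:
        assert f.getframerate() == 16_000, "expected 16 kHz sample rate"
        assert f.getnchannels() == 1, "expected mono audio"
        assert f.getsampwidth() == 2, "expected 16-bit PCM (assumption)"
        return f.getnframes() / f.getframerate()
```

Audio in other formats can be converted first, e.g. with `ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav`.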