Whisper Small - ONNX

Official OpenAI Whisper small checkpoint exported as ONNX encoder/decoder graphs for speech-core and sherpa-onnx-compatible runtimes.

Part of the soniqo.audio speech toolkit. This is the ONNX Runtime bundle used by speech-core for server, desktop, and Android-style runtimes; the graphs remain sherpa-onnx compatible. Browse ONNX bundles in the soniqo ONNX collection.

Model


Source	openai/whisper-small
Export format	ONNX for sherpa-onnx
Variants	FP32, INT8, FP16
Runtime	speech-core `OnnxWhisperStt` / ONNX Runtime; sherpa-onnx compatible
Total artifact size	1744.05 MiB

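The FP16 graphs store their weights in external *.onnx.data files. Large-style bundles also use external FP32 *.weights files. Keep each external-data file in the same directory as its matching .onnx file: ONNX Runtime resolves external data relative to the model's location, so a graph moved without its data file will fail to load.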
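A quick pre-flight check can catch a missing data file before a runtime does. The following is a minimal sketch, assuming the naming used in this bundle (an FP16 graph `X.fp16.onnx` paired with `X.fp16.onnx.data`); the helper name `missing_external_data` is hypothetical and not part of speech-core or sherpa-onnx.

```python
from pathlib import Path


def missing_external_data(bundle_dir):
    """Return names of expected external-data files that are absent.

    Assumption: every `*.fp16.onnx` graph in the directory keeps its
    tensors in a sibling file named `<graph name>.data`.
    """
    bundle = Path(bundle_dir)
    missing = []
    for graph in sorted(bundle.glob("*.fp16.onnx")):
        data_file = graph.with_name(graph.name + ".data")
        if not data_file.is_file():
            missing.append(data_file.name)
    return missing


if __name__ == "__main__":
    # Check the current directory and report anything that is missing.
    for name in missing_external_data("."):
        print("missing external data file:", name)
```

An empty list means every FP16 graph found has its data file beside it.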

Files

File	Size	Description
`small-decoder.fp16.onnx`	1.0 MB	Whisper decoder graph (FP16; weights in the adjacent .onnx.data file)
`small-decoder.fp16.onnx.data`	265.91 MB	External tensor data for the adjacent ONNX graph
`small-decoder.int8.onnx`	250.05 MB	Whisper decoder graph (INT8)
`small-decoder.onnx`	533.19 MB	Whisper decoder graph (FP32)
`small-encoder.fp16.onnx`	0.22 MB	Whisper encoder graph (FP16; weights in the adjacent .onnx.data file)
`small-encoder.fp16.onnx.data`	195.16 MB	External tensor data for the adjacent ONNX graph
`small-encoder.int8.onnx`	107.21 MB	Whisper encoder graph (INT8)
`small-encoder.onnx`	390.52 MB	Whisper encoder graph (FP32)
`small-tokens.txt`	0.78 MB	Tokenizer tokens for speech-core and sherpa-onnx-compatible runtimes
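Only one variant needs to be downloaded at a time. The sketch below groups the files in the table by variant and sums the footprint of each; the sizes are copied from the table in the units printed there, and the names `FILES`, `files_for`, and `size_for` are illustrative only.

```python
# Per-variant file sizes, copied from the Files table above.
FILES = {
    "fp32": {
        "small-encoder.onnx": 390.52,
        "small-decoder.onnx": 533.19,
    },
    "int8": {
        "small-encoder.int8.onnx": 107.21,
        "small-decoder.int8.onnx": 250.05,
    },
    "fp16": {
        "small-encoder.fp16.onnx": 0.22,
        "small-encoder.fp16.onnx.data": 195.16,
        "small-decoder.fp16.onnx": 1.0,
        "small-decoder.fp16.onnx.data": 265.91,
    },
}

# The token table is shared by every variant.
TOKENS_NAME, TOKENS_SIZE = "small-tokens.txt", 0.78


def files_for(variant):
    """List every file needed to run one variant, tokens included."""
    return list(FILES[variant]) + [TOKENS_NAME]


def size_for(variant):
    """Total size of one variant (graphs, external data, tokens)."""
    return round(sum(FILES[variant].values()) + TOKENS_SIZE, 2)


if __name__ == "__main__":
    for variant in FILES:
        print(variant, size_for(variant), files_for(variant))
```

Note that the FP16 variant consists of four graph-related files rather than two, because each graph is paired with its external data file.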

Usage

Use with speech-core's native ONNX Whisper runtime:

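#include <speech_core/models/onnx_whisper_stt.h>

// Load the INT8 encoder/decoder pair together with the shared token table.
speech_core::OnnxWhisperStt stt(
    "small-encoder.int8.onnx",
    "small-decoder.int8.onnx",
    "small-tokens.txt");

// `audio` / `length`: caller-supplied sample buffer and its length
// (exact types are defined in the speech-core header).
// 16000: sample rate in Hz; Whisper expects 16 kHz audio.
auto result = stt.transcribe(audio, length, 16000);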

The same encoder/decoder/token files can also be loaded by sherpa-onnx:

import sherpa_onnx

recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(
    encoder="small-encoder.fp16.onnx",
    decoder="small-decoder.fp16.onnx",
    tokens="small-tokens.txt",
    language="en",
    task="transcribe",
    provider="cpu",
)
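The snippet above only constructs the recognizer. A minimal sketch of decoding one file with it follows, assuming a 16-bit mono PCM WAV as input. The helper names `read_wav_mono16` and `transcribe` are hypothetical; `create_stream`, `accept_waveform`, `decode_stream`, and `stream.result.text` are the calls used in sherpa-onnx's own Python examples.

```python
import struct
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file.

    Returns (samples, sample_rate) with samples as floats in [-1, 1).
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2
    # WAV PCM data is little-endian signed 16-bit.
    ints = struct.unpack("<%dh" % count, frames)
    return [s / 32768.0 for s in ints], rate


def transcribe(recognizer, wav_path):
    """Decode one WAV file with an existing sherpa-onnx OfflineRecognizer."""
    # Imported lazily so the WAV helper works without NumPy installed.
    import numpy as np

    samples, rate = read_wav_mono16(wav_path)
    stream = recognizer.create_stream()
    stream.accept_waveform(rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

With the `recognizer` built above, `transcribe(recognizer, "sample.wav")` would return the recognized text; `sample.wav` is a placeholder file name.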

Benchmarks

This repository ships the exported model artifacts. Published WER/RTF figures are currently available only for the turbo bundle; for this variant, measure WER and RTF locally.
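For local numbers, both metrics can be computed from their standard definitions: WER is the word-level edit distance between reference and hypothesis divided by the reference word count, and RTF is processing time divided by audio duration. The sketch below is not the benchmark tooling behind the published turbo figures, and it applies no text normalization (casing, punctuation), which published WER numbers usually do; the function names are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] is the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds, audio_seconds):
    """RTF below 1.0 means decoding runs faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds
```

Timing the decode call with `time.perf_counter()` and dividing by the clip length gives the `processing_seconds` and `audio_seconds` inputs.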
