Whisper Small - ONNX

Official OpenAI Whisper small checkpoint exported as ONNX encoder/decoder graphs for speech-core and sherpa-onnx-compatible runtimes.

Part of the soniqo.audio speech toolkit. This is the ONNX Runtime bundle used by speech-core for server, desktop, and Android-style runtimes; the graphs remain sherpa-onnx compatible. Browse ONNX bundles in the soniqo ONNX collection.

Model

| Property | Value |
|---|---|
| Source | openai/whisper-small |
| Export format | ONNX for sherpa-onnx |
| Variants | FP32, INT8, FP16 |
| Runtime | speech-core OnnxWhisperStt / ONNX Runtime; sherpa-onnx compatible |
| Total artifact size | 1744.05 MiB |

The FP16 graphs use external *.onnx.data files. Large-style bundles also use external FP32 *.weights files. Keep external-data files beside their matching .onnx files.
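Because a graph with external data fails to load when its sidecar is missing, a quick pre-flight check can help after copying or downloading files. The following is a minimal sketch, assuming (as described for this bundle) that only graphs named `*.fp16.onnx` need a sidecar; the function name `missing_external_data` is hypothetical and not part of speech-core or sherpa-onnx.

```python
from pathlib import Path


def missing_external_data(bundle_dir):
    """Return the names of expected *.onnx.data sidecars that are absent.

    Assumption: only graphs whose file name ends in ".fp16.onnx" store
    their tensors externally, as stated for this bundle.
    """
    bundle = Path(bundle_dir)
    missing = []
    for graph in sorted(bundle.glob("*.fp16.onnx")):
        # The sidecar must sit in the same directory as the graph.
        sidecar = graph.with_name(graph.name + ".data")
        if not sidecar.is_file():
            missing.append(sidecar.name)
    return missing
```

An empty list means every FP16 graph in the directory has its data file next to it.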

Files

| File | Size | Description |
|---|---|---|
| small-decoder.fp16.onnx | 1.0 MB | Whisper decoder graph |
| small-decoder.fp16.onnx.data | 265.91 MB | External tensor data for the adjacent ONNX graph |
| small-decoder.int8.onnx | 250.05 MB | Whisper decoder graph |
| small-decoder.onnx | 533.19 MB | Whisper decoder graph |
| small-encoder.fp16.onnx | 0.22 MB | Whisper encoder graph |
| small-encoder.fp16.onnx.data | 195.16 MB | External tensor data for the adjacent ONNX graph |
| small-encoder.int8.onnx | 107.21 MB | Whisper encoder graph |
| small-encoder.onnx | 390.52 MB | Whisper encoder graph |
| small-tokens.txt | 0.78 MB | Tokenizer tokens for speech-core and sherpa-onnx-compatible runtimes |
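Only one encoder/decoder pair plus the tokens file is needed at a time, so the per-variant download is much smaller than the full bundle. The sketch below lists the files for each variant and sums their sizes from the table above. It assumes the unsuffixed `small-encoder.onnx` and `small-decoder.onnx` are the FP32 graphs; the helper names are hypothetical.

```python
# Sizes in MB, copied from the file table above.
SIZES_MB = {
    "small-decoder.fp16.onnx": 1.0,
    "small-decoder.fp16.onnx.data": 265.91,
    "small-decoder.int8.onnx": 250.05,
    "small-decoder.onnx": 533.19,
    "small-encoder.fp16.onnx": 0.22,
    "small-encoder.fp16.onnx.data": 195.16,
    "small-encoder.int8.onnx": 107.21,
    "small-encoder.onnx": 390.52,
    "small-tokens.txt": 0.78,
}

# Assumption: graphs without a precision suffix are the FP32 export.
SUFFIX = {"fp32": "", "int8": ".int8", "fp16": ".fp16"}


def files_for_variant(variant):
    """Return the files one variant needs: graphs, any sidecars, tokens."""
    files = []
    for part in ("encoder", "decoder"):
        graph = f"small-{part}{SUFFIX[variant]}.onnx"
        files.append(graph)
        if graph + ".data" in SIZES_MB:  # external tensor data, if listed
            files.append(graph + ".data")
    files.append("small-tokens.txt")
    return files


def download_size_mb(variant):
    """Sum the table sizes for one variant, rounded to two decimals."""
    return round(sum(SIZES_MB[name] for name in files_for_variant(variant)), 2)
```

With the sizes listed here, the INT8 set is the smallest download and the FP32 set the largest.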

Usage

Use with speech-core's native ONNX Whisper runtime:

#include <speech_core/models/onnx_whisper_stt.h>

speech_core::OnnxWhisperStt stt(
    "small-encoder.int8.onnx",
    "small-decoder.int8.onnx",
    "small-tokens.txt");

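// `audio` and `length` stand for the caller's sample buffer and its sample count;
// 16000 matches Whisper's 16 kHz input rate.
auto result = stt.transcribe(audio, length, 16000);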

The same encoder/decoder/token files can also be loaded by sherpa-onnx:

import sherpa_onnx

recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(
    encoder="small-encoder.fp16.onnx",
    decoder="small-decoder.fp16.onnx",
    tokens="small-tokens.txt",
    language="en",
    task="transcribe",
    provider="cpu",
)
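The snippet above only constructs the recognizer. A minimal sketch of decoding one file with it follows, using the stream calls from the sherpa-onnx Python examples (`create_stream`, `accept_waveform`, `decode_stream`, `result.text`). The helper names `read_wav_mono_float` and `transcribe` are hypothetical, and the reader assumes a 16-bit mono PCM WAV; other formats would need conversion first.

```python
import array
import sys
import wave


def read_wav_mono_float(path):
    """Read a 16-bit mono PCM WAV; return (sample_rate, samples in [-1, 1))."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        sample_rate = wav.getframerate()
        pcm = array.array("h")
        pcm.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return sample_rate, [s / 32768.0 for s in pcm]


def transcribe(recognizer, wav_path):
    """Decode one WAV file with an already constructed OfflineRecognizer."""
    import numpy as np  # imported lazily; needed only when decoding

    sample_rate, samples = read_wav_mono_float(wav_path)
    stream = recognizer.create_stream()
    # sherpa-onnx expects float32 samples normalised to [-1, 1].
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

For example, `print(transcribe(recognizer, "speech.wav"))`, where `speech.wav` is a placeholder path.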

Benchmarks

This repository ships the exported model artifacts. Full published WER/RTF figures are currently available for the turbo bundle; for this variant, run a benchmark locally to obtain numbers.
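For local numbers, both metrics can be computed in a few lines. The sketch below is not the project's benchmark tooling: it uses the standard definitions, with WER as word-level edit distance divided by the reference length and RTF as processing time divided by audio duration. No text normalisation is applied, so results may differ from published figures that normalise case and punctuation.

```python
import time


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


def real_time_factor(transcribe_fn, audio_seconds):
    """RTF = wall-clock processing time / audio duration (lower is faster)."""
    start = time.perf_counter()
    transcribe_fn()
    return (time.perf_counter() - start) / audio_seconds
```

An RTF below 1.0 means the audio is transcribed faster than real time. Averaging over several runs after a warm-up pass gives steadier numbers.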
