SenseVoiceSmall ONNX INT8 for sherpa-onnx

This repository contains a sherpa-onnx compatible ONNX INT8 export of FunAudioLLM/SenseVoiceSmall.

It is intended for local or embedded inference with sherpa-onnx on ONNX Runtime. The model supports Mandarin, Cantonese, English, Japanese, and Korean, plus automatic language detection, optional inverse text normalization (ITN), and the SenseVoice CTC output format.

Attribution

Base model and upstream project:

Base model: https://huggingface.co/FunAudioLLM/SenseVoiceSmall
Upstream code: https://github.com/FunAudioLLM/SenseVoice
Upstream license: https://github.com/modelscope/FunASR/blob/main/MODEL_LICENSE

This is a derivative export and is not an official FunAudioLLM release.

Files

model.int8.onnx - sherpa-onnx compatible INT8 ONNX model
tokens.txt - token table generated from the upstream SentencePiece model

Model Metadata

The ONNX model includes sherpa-onnx runtime metadata, including:

model_type=sense_voice_ctc
lfr_window_size=7
lfr_window_shift=6
CMVN statistics: neg_mean, inv_stddev
language IDs for auto, zh, en, yue, ja, ko, nospeech
text normalization IDs for with_itn and without_itn
vocab_size=25055
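
The lfr_window_size, lfr_window_shift, and CMVN entries describe feature preprocessing that the runtime applies before the network sees the audio. The sketch below illustrates the idea in plain Python. It assumes 80-dimensional filterbank frames, applies no edge padding, and uses the hypothetical helper names apply_lfr and apply_cmvn; the actual sherpa-onnx implementation may handle edges differently.

```python
from typing import List

Frame = List[float]


def apply_lfr(frames: List[Frame], window_size: int = 7,
              window_shift: int = 6) -> List[Frame]:
    """Low frame rate stacking: concatenate `window_size` consecutive
    frames, then advance by `window_shift` frames.

    Simplification: no padding, so trailing frames that do not fill a
    whole window are dropped.
    """
    stacked = []
    for start in range(0, len(frames) - window_size + 1, window_shift):
        window = frames[start:start + window_size]
        stacked.append([value for frame in window for value in frame])
    return stacked


def apply_cmvn(frames: List[Frame], neg_mean: List[float],
               inv_stddev: List[float]) -> List[Frame]:
    """Normalize every stacked frame as (x + neg_mean) * inv_stddev."""
    return [
        [(x + m) * s for x, m, s in zip(frame, neg_mean, inv_stddev)]
        for frame in frames
    ]


# 20 dummy frames of 80 dims (assumed feature size) become
# 3 stacked frames of 7 * 80 = 560 dims.
dummy = [[float(t)] * 80 for t in range(20)]
lfr = apply_lfr(dummy)
print(len(lfr), len(lfr[0]))  # 3 560
```

With a window of 7 and a shift of 6, the frame rate drops by a factor of 6 while each output frame carries the context of 7 input frames.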

Usage

For other platforms or language bindings, follow the official sherpa-onnx documentation. To install the Python package with pip:

pip install sherpa-onnx

Example Python usage:

import sherpa_onnx

recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(
    model="model.int8.onnx",
    tokens="tokens.txt",
    num_threads=4,
    use_itn=True,
    debug=False,
)

The snippet above only constructs the recognizer. Audio loading and resampling are left to your application. SenseVoice expects 16 kHz audio.
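
As one possible way to feed audio to the recognizer, the sketch below reads a 16-bit mono PCM WAV file with the standard library and decodes it. The helper names read_wav_mono16 and transcribe are hypothetical, the file is assumed to be at 16 kHz already (no resampling is done), and numpy and sherpa_onnx are imported lazily so the loader can be used on its own.

```python
import sys
import wave
from array import array
from typing import List, Tuple


def read_wav_mono16(path: str) -> Tuple[List[float], int]:
    """Read a 16-bit mono PCM WAV file.

    Returns (samples scaled to [-1, 1), sample_rate).
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = wav.getframerate()
        pcm = array("h")
        pcm.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return [s / 32768.0 for s in pcm], sample_rate


def transcribe(wav_path: str, model: str = "model.int8.onnx",
               tokens: str = "tokens.txt") -> str:
    """Decode one file with the recognizer configured as in the example."""
    import numpy as np
    import sherpa_onnx

    samples, sample_rate = read_wav_mono16(wav_path)
    if sample_rate != 16000:
        # Resampling is out of scope for this sketch.
        raise ValueError(f"expected 16 kHz audio, got {sample_rate} Hz")

    recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(
        model=model,
        tokens=tokens,
        num_threads=4,
        use_itn=True,
        debug=False,
    )
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

Calling transcribe("speech.wav"), where speech.wav is a placeholder file name, returns the recognized text. When decoding many files, construct the recognizer once and reuse it instead of rebuilding it per call.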

Reproduction

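This artifact was generated with OpenASR Model Factory. The command below uses PowerShell backtick line continuations; in a POSIX shell, replace each trailing backtick with a backslash: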

openasr-model-factory quantize-sensevoice `
  --input-dir downloads/FunAudioLLM/SenseVoiceSmall `
  --output-dir outputs/sensevoice-small-onnx

The export follows the sherpa-onnx SenseVoice layout:

ONNX inputs: x, x_length, language, text_norm
ONNX output: logits
Quantization: dynamic 8-bit quantization of MatMul weights (weight type QUInt8)
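
To make the quantization step concrete, the sketch below shows the standard affine uint8 scheme on a small weight list, followed by a wrapper around ONNX Runtime's quantize_dynamic. It is an assumption that OpenASR Model Factory calls ONNX Runtime this way. The function names are hypothetical, and the arithmetic is illustrative rather than bit-exact with any particular runtime.

```python
from typing import List, Tuple


def quantize_uint8(weights: List[float]) -> Tuple[List[int], float, int]:
    """Affine-quantize floats to uint8.

    Returns (quantized values, scale, zero_point).
    """
    rmin = min(0.0, min(weights))  # widen the range so 0.0 is representable
    rmax = max(0.0, max(weights))
    scale = (rmax - rmin) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero weights: any positive scale works
    zero_point = round(-rmin / scale)
    quantized = [
        min(255, max(0, round(w / scale) + zero_point)) for w in weights
    ]
    return quantized, scale, zero_point


def dequantize_uint8(quantized: List[int], scale: float,
                     zero_point: int) -> List[float]:
    """Map uint8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


def quantize_matmul_weights(src: str, dst: str) -> None:
    """Dynamically quantize only MatMul weights of an ONNX model to QUInt8.

    Assumed equivalent of the export step; requires onnxruntime.
    """
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=src,
        model_output=dst,
        op_types_to_quantize=["MatMul"],
        weight_type=QuantType.QUInt8,
    )
```

Only the weights are stored in 8 bits. With dynamic quantization, activation ranges are computed at inference time, so no calibration dataset is needed.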

Limitations

INT8 quantization may change recognition output compared with the original PyTorch model.
Validate accuracy and latency in your target environment before production use.
This artifact inherits upstream model limitations and license requirements.
