SenseVoiceSmall ONNX INT8 for sherpa-onnx

This repository contains a sherpa-onnx compatible ONNX INT8 export of FunAudioLLM/SenseVoiceSmall.

It is intended for local or embedded ONNX Runtime inference with sherpa-onnx. The model supports Mandarin, Cantonese, English, Japanese, and Korean, with automatic language detection, optional inverse text normalization (ITN), and the SenseVoice CTC output format.

Attribution

Base model and upstream project: FunAudioLLM/SenseVoiceSmall

This is a derivative export and is not an official FunAudioLLM release.

Files

  • model.int8.onnx - sherpa-onnx compatible INT8 ONNX model
  • tokens.txt - token table generated from the upstream SentencePiece model

Model Metadata

The ONNX model includes sherpa-onnx runtime metadata, including:

  • model_type=sense_voice_ctc
  • lfr_window_size=7
  • lfr_window_shift=6
  • CMVN statistics: neg_mean, inv_stddev
  • language IDs for auto, zh, en, yue, ja, ko, nospeech
  • text normalization IDs for with_itn and without_itn
  • vocab_size=25055
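To make the lfr_window_size, lfr_window_shift, and CMVN entries more concrete, the following is a minimal pure-Python sketch of how such values are commonly applied to feature frames: consecutive frames are stacked in windows of 7 with a shift of 6, and each stacked vector is normalized as (x + neg_mean) * inv_stddev, as the metadata names suggest. The function names apply_lfr and apply_cmvn are hypothetical, the 80-dimensional input frames are an assumption, and no edge padding is done. sherpa-onnx performs this step internally, and its exact behavior may differ.

```python
from typing import List, Sequence

Frame = List[float]


def apply_lfr(frames: Sequence[Frame], window_size: int = 7,
              window_shift: int = 6) -> List[Frame]:
    """Stack `window_size` consecutive frames, advancing by `window_shift`."""
    stacked: List[Frame] = []
    # Only full windows are emitted; trailing frames that cannot fill
    # a whole window are dropped in this simplified version.
    for start in range(0, len(frames) - window_size + 1, window_shift):
        merged: Frame = []
        for frame in frames[start:start + window_size]:
            merged.extend(frame)
        stacked.append(merged)
    return stacked


def apply_cmvn(frames: Sequence[Frame], neg_mean: Sequence[float],
               inv_stddev: Sequence[float]) -> List[Frame]:
    """Normalize every frame element-wise as (x + neg_mean) * inv_stddev."""
    return [
        [(v + m) * s for v, m, s in zip(frame, neg_mean, inv_stddev)]
        for frame in frames
    ]


# 13 dummy frames of 80 values each (the 80-dim size is an assumption).
frames = [[float(t)] * 80 for t in range(13)]
lfr = apply_lfr(frames)
print(len(lfr), len(lfr[0]))  # 2 560
```

With a window of 7 and a shift of 6, the frame rate drops by roughly a factor of six while each output vector grows sevenfold.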

Usage

Install sherpa-onnx with pip, or follow the official documentation for platform-specific builds:

pip install sherpa-onnx

Example Python usage:

import sherpa_onnx

recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(
    model="model.int8.onnx",
    tokens="tokens.txt",
    num_threads=4,
    use_itn=True,
    debug=False,
)

Please adapt audio loading and resampling to your application. SenseVoice expects 16 kHz audio.
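As one possible way to connect audio loading to the recognizer, the sketch below reads a 16-bit PCM mono WAV file with the Python standard library and decodes it. The helper names read_wav_mono16 and transcribe are hypothetical, the sketch assumes numpy is installed alongside sherpa-onnx, and it rejects files that are not already 16 kHz instead of resampling them.

```python
import struct
import wave
from typing import List, Tuple


def read_wav_mono16(path: str) -> Tuple[int, List[float]]:
    """Read a 16-bit PCM mono WAV file and scale samples to [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM mono audio")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # WAV stores samples as little-endian signed 16-bit integers.
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return sample_rate, [s / 32768.0 for s in ints]


def transcribe(path: str, model: str = "model.int8.onnx",
               tokens: str = "tokens.txt") -> str:
    """Decode a single WAV file and return the recognized text."""
    # Imported lazily so the WAV helper works without these packages.
    import numpy as np
    import sherpa_onnx

    sample_rate, samples = read_wav_mono16(path)
    if sample_rate != 16000:
        raise ValueError("resample the audio to 16 kHz first")

    recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(
        model=model,
        tokens=tokens,
        num_threads=4,
        use_itn=True,
        debug=False,
    )
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

Calling transcribe("speech.wav"), where speech.wav is a placeholder file name, returns the transcript as a string. For repeated calls it is better to create the recognizer once and reuse it, since loading the model is the expensive step.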

Reproduction

This artifact was generated with OpenASR Model Factory. The command below uses PowerShell syntax, where a trailing backtick continues the line:

openasr-model-factory quantize-sensevoice `
  --input-dir downloads/FunAudioLLM/SenseVoiceSmall `
  --output-dir outputs/sensevoice-small-onnx

The export follows the sherpa-onnx SenseVoice layout:

  • ONNX inputs: x, x_length, language, text_norm
  • ONNX output: logits
  • Dynamic INT8 quantization for MatMul weights with QUInt8
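The internals of the quantize-sensevoice command are not shown here. For readers who want a comparable quantization step from an existing float ONNX export, a minimal sketch using ONNX Runtime's quantize_dynamic is given below. The wrapper name quantize_matmul_weights and the input file name are hypothetical, and this is not necessarily the code path the tool itself uses.

```python
def quantize_matmul_weights(model_input: str, model_output: str) -> None:
    """Dynamically quantize MatMul weights of an ONNX model to unsigned 8-bit."""
    # Imported lazily so this module loads even without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=model_input,
        model_output=model_output,
        op_types_to_quantize=["MatMul"],  # leave all other operators in float
        weight_type=QuantType.QUInt8,     # unsigned 8-bit weights
    )
```

A call such as quantize_matmul_weights("model.onnx", "model.int8.onnx") would write the quantized file next to the float one. Dynamic quantization stores weights as 8-bit integers and computes activation scales at run time, so no calibration data set is needed.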

Limitations

  • INT8 quantization may change recognition output compared with the original PyTorch model.
  • Validate accuracy and latency in your target environment before production use.
  • This artifact inherits upstream model limitations and license requirements.