Higgs Audio v3 TTS 4-bit ONNX Vocoder

This repository contains an ONNX export of the Higgs Audio v3 bundled codec decoder/vocoder:

audio_codes [B, 8, T] int64 -> audio_values [B, 1, samples] float32

It is a browser-deployment component for Higgs Audio TTS, not the complete text-to-speech pipeline. The autoregressive Qwen3 text-to-codebook generator is still a separate custom SGLang graph and is not included here.
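
As a rough illustration of the signature above, the sketch below runs the vocoder with ONNX Runtime on CPU. It assumes the graph's input and output are literally named audio_codes and audio_values, which should be confirmed with session.get_inputs() and session.get_outputs(). The helper names and the codebook size of 1024 are placeholders, not values taken from this repository.

```python
import numpy as np

NUM_CODEBOOKS = 8  # from the documented input shape [B, 8, T]


def make_dummy_codes(batch=1, frames=50, codebook_size=1024, seed=0):
    """Random codes with the documented shape and dtype.

    codebook_size=1024 is a placeholder assumption; real codes should come
    from the text-to-codebook generator.
    """
    rng = np.random.default_rng(seed)
    return rng.integers(
        0, codebook_size, size=(batch, NUM_CODEBOOKS, frames), dtype=np.int64
    )


def decode(model_path, audio_codes):
    """Run the vocoder on CPU and return audio_values [B, 1, samples]."""
    if audio_codes.ndim != 3 or audio_codes.shape[1] != NUM_CODEBOOKS:
        raise ValueError(
            f"expected [B, {NUM_CODEBOOKS}, T], got {tuple(audio_codes.shape)}"
        )
    import onnxruntime as ort  # imported lazily so the helpers work without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    # Tensor names are assumed from the signature line in this README.
    (audio_values,) = session.run(
        ["audio_values"], {"audio_codes": audio_codes.astype(np.int64)}
    )
    return audio_values
```

With the model files downloaded next to the script, decode("higgs_audio_v3_vocoder_decode.onnx", make_dummy_codes()) should return a float32 array, although random codes will only produce noise.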

Files

  • higgs_audio_v3_vocoder_decode.onnx + .onnx.data: fp32 ONNX vocoder.
  • higgs_audio_v3_vocoder_decode_matmul4.onnx + .onnx.data: ONNX Runtime weight-only 4-bit quantization for supported MatMul and Gather nodes.
  • vocoder_onnx_export_report.json: PyTorch vs ONNX Runtime validation.
  • vocoder_onnx_matmul4_report.json: 4-bit ONNX Runtime validation.

Validation

The fp32 vocoder export passed the ONNX checker and ONNX Runtime CPU inference:

  • max absolute diff vs PyTorch: 0.0007368475
  • mean absolute diff vs PyTorch: 0.0000280040
  • mean relative diff vs PyTorch: 0.0053399284
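
For readers who want to reproduce numbers of this kind, here is a minimal sketch of the three metrics. The exact formulas used to produce the report are not stated here, so the relative-difference definition (absolute difference divided by the reference magnitude plus a small eps) is an assumption, as is the function name diff_metrics.

```python
import numpy as np


def diff_metrics(reference, candidate, eps=1e-8):
    """Compare two waveforms of identical shape.

    reference: e.g. the PyTorch output; candidate: e.g. the ONNX Runtime output.
    The eps-stabilised relative difference is an assumed definition.
    """
    ref = np.asarray(reference, dtype=np.float64)
    cand = np.asarray(candidate, dtype=np.float64)
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    abs_diff = np.abs(ref - cand)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "mean_rel_diff": float((abs_diff / (np.abs(ref) + eps)).mean()),
    }
```

The same function can be used to compare the 4-bit output against the fp32 ONNX output by passing the fp32 result as the reference.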

The 4-bit quantization pass replaced the following nodes:

  • Gather: 8 -> GatherBlockQuantized: 8
  • MatMul: 9 -> MatMulNBits: 9

The resulting artifact passed ONNX Runtime CPU inference when compared against the fp32 ONNX graph.
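
One way to check the node replacement yourself is to count operator types in both graphs with the onnx package. The sketch below only inspects top-level graph nodes (nested subgraphs are not traversed), and the helper names are hypothetical.

```python
from collections import Counter

# (original op, quantized op) pairs listed in this README
REPLACEMENTS = (("Gather", "GatherBlockQuantized"), ("MatMul", "MatMulNBits"))


def count_op_types(nodes):
    """Count op_type over any iterable of objects that have an .op_type field."""
    return Counter(node.op_type for node in nodes)


def load_op_counts(model_path):
    """Load only the graph structure (no weight data) and count its ops."""
    import onnx  # imported lazily so count_op_types works without it

    model = onnx.load(model_path, load_external_data=False)
    return count_op_types(model.graph.node)


def summarize_replacements(fp32_counts, quant_counts, pairs=REPLACEMENTS):
    """For each pair, report the original count, the quantized-op count,
    and how many original ops remain in the quantized graph."""
    return {
        old: {
            "fp32": fp32_counts.get(old, 0),
            "quantized_as": quant_counts.get(new, 0),
            "left_unquantized": quant_counts.get(old, 0),
        }
        for old, new in pairs
    }
```

Calling summarize_replacements(load_op_counts("higgs_audio_v3_vocoder_decode.onnx"), load_op_counts("higgs_audio_v3_vocoder_decode_matmul4.onnx")) gives a quick view of which nodes were converted and which were left in floating point.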

Limitation

This is not a full in-browser TTS model yet. A complete browser version still needs two things: the autoregressive Higgs/Qwen3 text-to-codebook generator exported to a graph compatible with ONNX Runtime Web, and a JavaScript sampling loop for the 8-codebook delayed-token generation.
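
To make the "delayed-token" part concrete, the sketch below shows one common delay layout, in which codebook k is shifted right by k frames, and how to undo it before passing codes to the vocoder. This layout is an assumption for illustration only; the actual Higgs Audio delay scheme and padding token are not described here, and the function names and pad_id are hypothetical. The logic is written in Python, though a browser version would need the same steps in JavaScript.

```python
import numpy as np


def apply_delay(codes, pad_id):
    """Shift codebook k right by k frames (assumed layout).

    codes: [K, T] int array -> delayed: [K, T + K - 1], padded with pad_id.
    """
    num_codebooks, frames = codes.shape
    delayed = np.full((num_codebooks, frames + num_codebooks - 1), pad_id, dtype=codes.dtype)
    for k in range(num_codebooks):
        delayed[k, k:k + frames] = codes[k]
    return delayed


def undo_delay(delayed):
    """Invert apply_delay: realign all codebooks to a common frame index."""
    num_codebooks, length = delayed.shape
    frames = length - num_codebooks + 1
    if frames <= 0:
        raise ValueError("sequence is shorter than the delay span")
    return np.stack([delayed[k, k:k + frames] for k in range(num_codebooks)])
```

Under this assumed layout, the generator's delayed output would be passed through undo_delay and given a leading batch axis to form the [B, 8, T] audio_codes tensor.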

