Chatterbox Multilingual TTS - Q4 Quantized ONNX

Q4 weight-only quantized version of onnx-community/chatterbox-multilingual-ONNX for use with Transformers.js and ONNX Runtime Web.

Key Features

About 75% smaller: 790 MB vs. 3.2 GB for the FP32 original
Single-file ONNX: No external data files, compatible with Transformers.js
Near-original quality: Minimal quality loss from Q4 quantization
23 languages supported: ar, da, de, el, en, es, fi, fr, he, hi, it, ja, ko, ms, nl, no, pl, pt, ru, sv, sw, tr, zh

Model Sizes

Model	Original (FP32)	Q4 Quantized
speech_encoder.onnx	564 MB	172 MB
embed_tokens.onnx	66 MB	65 MB
language_model.onnx	2.0 GB	338 MB
conditional_decoder.onnx	510 MB	215 MB
Total	3.2 GB	790 MB
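The per-file savings can be checked with a few lines of arithmetic. The sketch below copies the sizes from the table and assumes 2.0 GB means 2000 MB; the constant and function names are illustrative only.

```python
# Sizes in MB, copied from the table above (2.0 GB is assumed to mean 2000 MB).
SIZES_MB = {
    "speech_encoder.onnx": (564, 172),
    "embed_tokens.onnx": (66, 65),
    "language_model.onnx": (2000, 338),
    "conditional_decoder.onnx": (510, 215),
}


def reduction_percent(original_mb, quantized_mb):
    """Return how much smaller the quantized file is, as a percentage."""
    return 100.0 * (1.0 - quantized_mb / original_mb)


total_original = sum(original for original, _ in SIZES_MB.values())
total_quantized = sum(quantized for _, quantized in SIZES_MB.values())

for name, (original, quantized) in SIZES_MB.items():
    print(f"{name}: {reduction_percent(original, quantized):.1f}% smaller")

print(
    f"Total: {total_original} MB -> {total_quantized} MB "
    f"({reduction_percent(total_original, total_quantized):.1f}% smaller)"
)
```

The rounded per-file figures sum to slightly less than the 3.2 GB total in the table, presumably because of rounding in the listed sizes. embed_tokens.onnx barely shrinks, which would be consistent with the quantizer targeting MatMul weights rather than embedding lookups, though the card does not state this.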

Usage

With ONNX Runtime (Python)

import onnxruntime

# Load Q4 models - single files, no external data needed
speech_encoder = onnxruntime.InferenceSession("onnx/speech_encoder.onnx")
embed_tokens = onnxruntime.InferenceSession("onnx/embed_tokens.onnx")
language_model = onnxruntime.InferenceSession("onnx/language_model.onnx")
conditional_decoder = onnxruntime.InferenceSession("onnx/conditional_decoder.onnx")
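This card does not list the input and output tensor names of the four models, so it can help to inspect them before wiring up inference. The helper below is a minimal sketch; `describe_io` is a hypothetical name, and it relies only on the `get_inputs()` and `get_outputs()` methods of an ONNX Runtime session.

```python
def describe_io(session):
    """Summarize the input and output tensors of an ONNX Runtime session.

    Works with any object exposing get_inputs() / get_outputs(), such as
    onnxruntime.InferenceSession.
    """

    def fmt(node):
        # Each node exposes a name, an element type string, and a shape
        # (dynamic dimensions appear as strings or None).
        return {"name": node.name, "type": node.type, "shape": node.shape}

    return {
        "inputs": [fmt(node) for node in session.get_inputs()],
        "outputs": [fmt(node) for node in session.get_outputs()],
    }


# Usage (assumes onnxruntime is installed and the files are in ./onnx):
#   import onnxruntime
#   session = onnxruntime.InferenceSession("onnx/embed_tokens.onnx")
#   for kind, tensors in describe_io(session).items():
#       for tensor in tensors:
#           print(kind, tensor["name"], tensor["type"], tensor["shape"])
```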

With Transformers.js (JavaScript)

import { AutoTokenizer } from '@huggingface/transformers';

// Load the tokenizer from the Hub. The four ONNX models are single-file,
// so ONNX Runtime Web can load each one without external data files.
const tokenizer = await AutoTokenizer.from_pretrained('ipsilondev/chatterbox-multilingual-ONNX-q4');

Quantization Details

Method: Q4 weight-only quantization using MatMulNBitsQuantizer
Block size: 32
Symmetric: Yes
Format: Single-file ONNX (no external data) for web compatibility
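To make "symmetric, block size 32" concrete, here is a simplified pure-Python illustration of block-wise symmetric 4-bit quantization. It is a sketch of the general idea, not the MatMulNBitsQuantizer implementation: the real operator packs two 4-bit codes per byte and uses its own storage layout, and the signed code range, the divisor of 7, and the 4-byte scale used here are assumptions.

```python
BLOCK_SIZE = 32  # matches the block size listed above


def quantize_block(weights):
    """Symmetric 4-bit quantization of one block: one scale, no zero point."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to code 7; an all-zero block keeps scale 1.0.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate weights from the 4-bit codes."""
    return [code * scale for code in codes]


def quantize(weights, block_size=BLOCK_SIZE):
    """Split a flat weight list into blocks and quantize each independently."""
    return [
        quantize_block(weights[i:i + block_size])
        for i in range(0, len(weights), block_size)
    ]


def packed_bytes(num_weights, block_size=BLOCK_SIZE, scale_bytes=4):
    """Estimate storage: 4 bits per weight plus one scale per block."""
    num_blocks = -(-num_weights // block_size)  # ceiling division
    return (num_weights + 1) // 2 + num_blocks * scale_bytes
```

Under these assumptions a block of 32 weights takes 20 bytes instead of 128 bytes in FP32, roughly 16% of the original size. That is in the same range as the language_model.onnx row in the size table (338 MB vs. 2.0 GB).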

Important Parameters

Use the following sampling parameters with these models:

repetition_penalty = 1.2  # CRITICAL: Do NOT use 2.0 - causes infinite loops
temperature = 0.8
top_p = 0.95
min_p = 0.05
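For readers implementing their own generation loop, the sketch below shows one common way these four parameters act on the language model's logits for a single step. It follows the usual definitions of repetition penalty, temperature, min-p, and top-p; the function names and the order of operations are assumptions, and the reference pipeline may differ.

```python
import math
import random


def apply_repetition_penalty(logits, generated_ids, penalty):
    """Make already-generated tokens less likely (penalty > 1)."""
    out = list(logits)
    for token in set(generated_ids):
        out[token] = out[token] / penalty if out[token] > 0 else out[token] * penalty
    return out


def softmax(logits, temperature):
    """Convert logits to probabilities after temperature scaling."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def normalize(probs):
    total = sum(probs)
    return [p / total for p in probs]


def filter_min_p(probs, min_p):
    """Drop tokens whose probability is below min_p times the top probability."""
    threshold = min_p * max(probs)
    return [p if p >= threshold else 0.0 for p in probs]


def filter_top_p(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = [0.0] * len(probs)
    cumulative = 0.0
    for i in order:
        kept[i] = probs[i]
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


def sample_next_token(logits, generated_ids, repetition_penalty=1.2,
                      temperature=0.8, top_p=0.95, min_p=0.05, rng=None):
    """Sample one token id using the parameters recommended above."""
    rng = rng or random
    logits = apply_repetition_penalty(logits, generated_ids, repetition_penalty)
    probs = softmax(logits, temperature)
    probs = normalize(filter_min_p(probs, min_p))
    probs = filter_top_p(probs, top_p)
    # random.choices accepts unnormalized weights.
    return rng.choices(range(len(probs)), weights=probs)[0]
```

In a generation loop, `generated_ids` would hold the token ids produced so far, and the returned id would be appended before the next step. The default `repetition_penalty` here is the 1.2 recommended above, not 2.0.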

Supported Languages

Code	Language	Code	Language
ar	Arabic	ko	Korean
da	Danish	ms	Malay
de	German	nl	Dutch
el	Greek	no	Norwegian
en	English	pl	Polish
es	Spanish	pt	Portuguese
fi	Finnish	ru	Russian
fr	French	sv	Swedish
he	Hebrew	sw	Swahili
hi	Hindi	tr	Turkish
it	Italian	zh	Chinese
ja	Japanese

Credits

Original model: onnx-community/chatterbox-multilingual-ONNX
Base model: ResembleAI/chatterbox
Quantization by: ipsilondev

License

MIT License (same as original model)

