Chatterbox Multilingual TTS - Q4 Quantized ONNX
A Q4 weight-only quantized version of onnx-community/chatterbox-multilingual-ONNX, for use with Transformers.js and ONNX Runtime Web.
Key Features
- About 75% smaller: 790 MB vs. 3.2 GB for the original
- Single-file ONNX: No external data files, compatible with Transformers.js
- Near-original quality: Minimal quality loss from Q4 quantization
- 23 languages supported: ar, da, de, el, en, es, fi, fr, he, hi, it, ja, ko, ms, nl, no, pl, pt, ru, sv, sw, tr, zh
Model Sizes
| Model | Original (FP32) | Q4 Quantized |
|---|---|---|
| speech_encoder.onnx | 564 MB | 172 MB |
| embed_tokens.onnx | 66 MB | 65 MB |
| language_model.onnx | 2.0 GB | 338 MB |
| conditional_decoder.onnx | 510 MB | 215 MB |
| Total | 3.2 GB | 790 MB |
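The savings can be recomputed from the table. The sketch below uses the rounded sizes listed above and treats 2.0 GB as 2000 MB (an assumption), so the percentages are approximate.

```python
# Sizes in MB, copied from the table above as (original, quantized).
# Assumption: the 2.0 GB language model is approximated as 2000 MB.
SIZES_MB = {
    "speech_encoder.onnx": (564, 172),
    "embed_tokens.onnx": (66, 65),
    "language_model.onnx": (2000, 338),
    "conditional_decoder.onnx": (510, 215),
}


def reduction_percent(original_mb, quantized_mb):
    """Return how much smaller the quantized file is, as a percentage."""
    return 100.0 * (1.0 - quantized_mb / original_mb)


def summarize(sizes):
    """Return per-file reductions plus a 'Total' entry."""
    report = {name: reduction_percent(o, q) for name, (o, q) in sizes.items()}
    total_original = sum(o for o, _ in sizes.values())
    total_quantized = sum(q for _, q in sizes.values())
    report["Total"] = reduction_percent(total_original, total_quantized)
    return report


if __name__ == "__main__":
    for name, pct in summarize(SIZES_MB).items():
        print(f"{name}: {pct:.1f}% smaller")
```

Most of the savings come from language_model.onnx. embed_tokens.onnx barely changes, which would be consistent with the quantizer targeting MatMul weights rather than embedding lookups, though the source does not state this.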
Usage
With ONNX Runtime (Python)
import onnxruntime
# Load Q4 models - single files, no external data needed
speech_encoder = onnxruntime.InferenceSession("onnx/speech_encoder.onnx")
embed_tokens = onnxruntime.InferenceSession("onnx/embed_tokens.onnx")
language_model = onnxruntime.InferenceSession("onnx/language_model.onnx")
conditional_decoder = onnxruntime.InferenceSession("onnx/conditional_decoder.onnx")
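Before wiring the four sessions together, it can help to list the tensor names each one expects. The sketch below relies only on `get_inputs()` and `get_outputs()` from `onnxruntime.InferenceSession`; the helper names `describe_io` and `print_model_io` and the `onnx/` directory layout are illustrative assumptions, not part of this repository.

```python
def describe_io(session):
    """Return (inputs, outputs) as lists of (name, type, shape) tuples.

    Works with any object exposing get_inputs()/get_outputs(), such as
    an onnxruntime.InferenceSession.
    """
    def rows(args):
        return [(a.name, a.type, a.shape) for a in args]

    return rows(session.get_inputs()), rows(session.get_outputs())


def print_model_io(model_dir="onnx"):
    """Load each model file and print its input and output signatures."""
    # Imported here so describe_io stays usable without onnxruntime installed.
    import onnxruntime

    stems = ("speech_encoder", "embed_tokens",
             "language_model", "conditional_decoder")
    for stem in stems:
        session = onnxruntime.InferenceSession(f"{model_dir}/{stem}.onnx")
        inputs, outputs = describe_io(session)
        print(stem)
        for name, dtype, shape in inputs:
            print(f"  in : {name} {dtype} {shape}")
        for name, dtype, shape in outputs:
            print(f"  out: {name} {dtype} {shape}")

# Call print_model_io() once the model files have been downloaded.
```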
With Transformers.js (JavaScript)
// Models are single-file ONNX format, compatible with ONNX Runtime Web
import { AutoTokenizer } from '@huggingface/transformers';
const tokenizer = await AutoTokenizer.from_pretrained('ipsilondev/chatterbox-multilingual-ONNX-q4');
Quantization Details
- Method: Q4 weight-only quantization using MatMulNBitsQuantizer
- Block size: 32
- Symmetric: Yes
- Format: Single-file ONNX (no external data) for web compatibility
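To illustrate what block-wise symmetric 4-bit quantization means, here is a simplified pure-Python sketch. It is not the MatMulNBitsQuantizer implementation: the code range of [-7, 7], the flat weight list, and all function names are assumptions chosen for readability.

```python
def quantize_block(block):
    """Symmetric 4-bit quantization of one block of floats.

    Returns (codes, scale), where each code is an int in [-7, 7].
    """
    max_abs = max(abs(w) for w in block)
    if max_abs == 0.0:
        return [0] * len(block), 1.0
    scale = max_abs / 7.0  # map the largest magnitude to +/-7
    codes = [max(-7, min(7, round(w / scale))) for w in block]
    return codes, scale


def dequantize_block(codes, scale):
    """Recover approximate float weights from codes and their scale."""
    return [c * scale for c in codes]


def quantize_weights(weights, block_size=32):
    """Split a flat weight list into blocks and quantize each independently."""
    blocks = []
    for start in range(0, len(weights), block_size):
        blocks.append(quantize_block(weights[start:start + block_size]))
    return blocks


def dequantize_weights(blocks):
    """Concatenate the dequantized blocks back into one flat list."""
    out = []
    for codes, scale in blocks:
        out.extend(dequantize_block(codes, scale))
    return out
```

Because each block of 32 weights has its own scale, one outlier only degrades the precision of its own block. In this sketch the reconstruction error of any weight is at most half of its block's scale.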
Important Parameters
Use the following sampling parameters with these models:
repetition_penalty = 1.2 # CRITICAL: Do NOT use 2.0 - causes infinite loops
temperature = 0.8
top_p = 0.95
min_p = 0.05
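As a rough guide to what these four parameters do, the sketch below applies them to one step of token sampling. The order of operations and the exact formulas follow common practice and are assumptions; the reference Chatterbox pipeline may differ. `sample_next_token` is a hypothetical helper, not an API of this model.

```python
import math
import random


def sample_next_token(logits, generated, repetition_penalty=1.2,
                      temperature=0.8, top_p=0.95, min_p=0.05, rng=random):
    """Pick the next token id from raw logits using the listed parameters."""
    logits = list(logits)

    # Repetition penalty: make already-generated tokens less likely.
    for tok in set(generated):
        if logits[tok] > 0:
            logits[tok] /= repetition_penalty
        else:
            logits[tok] *= repetition_penalty

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # min_p: drop tokens whose probability is below min_p * max probability.
    threshold = min_p * max(probs)
    candidates = [(p, i) for i, p in enumerate(probs) if p >= threshold]

    # top_p: keep the smallest high-probability set whose mass reaches top_p.
    candidates.sort(reverse=True)
    kept, mass = [], 0.0
    for p, i in candidates:
        kept.append((p, i))
        mass += p
        if mass >= top_p:
            break

    # Sample from the survivors; random.choices normalizes the weights.
    ids = [i for _, i in kept]
    weights = [p for p, _ in kept]
    return rng.choices(ids, weights=weights, k=1)[0]
```

A larger `repetition_penalty` pushes previously generated tokens down harder. That may include whichever token ends generation in the real pipeline, which is one plausible reading of the warning against 2.0 above.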
Supported Languages
| Code | Language | Code | Language |
|---|---|---|---|
| ar | Arabic | ko | Korean |
| da | Danish | ms | Malay |
| de | German | nl | Dutch |
| el | Greek | no | Norwegian |
| en | English | pl | Polish |
| es | Spanish | pt | Portuguese |
| fi | Finnish | ru | Russian |
| fr | French | sv | Swedish |
| he | Hebrew | sw | Swahili |
| hi | Hindi | tr | Turkish |
| it | Italian | zh | Chinese |
| ja | Japanese | | |
Credits
- Original model: onnx-community/chatterbox-multilingual-ONNX
- Base model: ResembleAI/chatterbox
- Quantization by: ipsilondev
License
MIT License (same as original model)