Blue ONNX v2 — INT8 (experimental)

Weight-only dynamic INT8 (QUInt8) ONNX graphs for Blue / Light Blue TTS inference with ONNX Runtime. Same multilingual pipeline and file layout as the full-precision bundle (notmax123/blue-onnx-v2): Hebrew, English, Spanish, Italian, and German (including mixed-language text with XML-style tags).

Trade-offs

Aspect	FP32 slim (blue-onnx-v2)	This repo (INT8)
Quality	Recommended default	Experimental; may degrade prosody / clarity
Size	Larger	Smaller disk footprint
Graph cleanup	After export: onnxslim	Not slimmed (quantization runs on the unslimmed export)

Use this bundle when you need a smaller download or want to benchmark INT8 on CPU/GPU; for production quality, prefer the FP32 slim Hub model.
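For a rough latency comparison between the two bundles, a small timing harness is usually enough. The sketch below is not part of `blue_onnx`: `time_call` is a hypothetical helper that times any zero-argument callable, so you can wrap one synthesis call per bundle and compare the medians.

```python
import statistics
import time
from typing import Callable


def time_call(fn: Callable[[], object], warmup: int = 1, runs: int = 5) -> dict:
    """Time a zero-argument callable; report median and best wall time in seconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not measured (the first call often pays setup costs)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {"median_s": statistics.median(samples), "best_s": min(samples), "runs": runs}


# Usage idea (assumes tts_int8 and tts_fp32 are BlueTTS instances, one per bundle):
#   int8_stats = time_call(lambda: tts_int8.synthesize("Hello, world.", lang="en"))
#   fp32_stats = time_call(lambda: tts_fp32.synthesize("Hello, world.", lang="en"))
```

Timing alone says nothing about audio quality, so listen to the outputs side by side as well.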

Files in this repository

File	Role
`text_encoder.onnx`	Text encoder; phone IDs + per-voice style (TTL)
`vector_estimator.onnx`	Flow / vector field; CFG and unconditional latents baked in
`vocoder.onnx`	Decoder to waveform; stats baked in-graph
`duration_predictor.onnx`	Style-conditioned duration; `style_dp` from voice JSON
`codec_encoder.onnx`	Mel → latent for reference zero-shot style
`style_encoder.onnx`	Reference latent → style_ttl
`duration_style_encoder.onnx`	Reference latent → style_dp
`tts.json`	Runtime dimensions / version (`tts_version`)
`vocab.json`	Character vocabulary for the text encoder

Voice style JSON files are not included. Use the per-voice JSON files (e.g. `female1.json`) from the BlueTTS repo and pass one to `BlueTTS(..., style_json=...)`.

For Hebrew G2P, place the renikud `model.onnx` next to your app (see the main Blue README).
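Before loading the bundle, it can help to confirm that the download is complete. The following is a minimal sketch using only the standard library; `missing_files` and `read_tts_version` are hypothetical helpers, and the second one assumes that `tts.json` is a JSON object with `tts_version` as a top-level key.

```python
import json
from pathlib import Path

# File names copied from the table in this section.
EXPECTED_FILES = [
    "text_encoder.onnx",
    "vector_estimator.onnx",
    "vocoder.onnx",
    "duration_predictor.onnx",
    "codec_encoder.onnx",
    "style_encoder.onnx",
    "duration_style_encoder.onnx",
    "tts.json",
    "vocab.json",
]


def missing_files(onnx_dir: str) -> list[str]:
    """Return the expected bundle files that are absent from onnx_dir."""
    root = Path(onnx_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def read_tts_version(onnx_dir: str):
    """Return the tts_version field of tts.json, or None if the key is absent.

    Assumption: tts_version is a top-level key of the JSON object.
    """
    with open(Path(onnx_dir) / "tts.json", encoding="utf-8") as f:
        return json.load(f).get("tts_version")


# Example:
#   print(missing_files("onnx_int8"))      # an empty list means the bundle is complete
#   print(read_tts_version("onnx_int8"))
```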

Download

hf download notmax123/bluev2-onnx-int8 --repo-type model --local-dir ./onnx_int8
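The same download can be scripted from Python. This sketch assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the `download_bundle` wrapper is a hypothetical helper, not part of BlueTTS.

```python
REPO_ID = "notmax123/bluev2-onnx-int8"  # repo name taken from the command above


def download_bundle(local_dir: str = "./onnx_int8") -> str:
    """Download the INT8 bundle and return the path of the local folder."""
    # Imported lazily so this module loads even if huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, repo_type="model", local_dir=local_dir)


# Example (requires network access):
#   onnx_dir = download_bundle()
```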

Usage (Python)

Requires `pip install blue-onnx` (or a clone of BlueTTS with `blue_onnx` on the import path). Point `onnx_dir` at the folder downloaded above:

import soundfile as sf
from blue_onnx import BlueTTS

tts = BlueTTS(
    onnx_dir="onnx_int8",
    style_json="voices/female1.json",
    renikud_path="model.onnx",  # optional; Hebrew
)

audio, sr = tts.synthesize("Hello, world.", lang="en")
sf.write("out.wav", audio, sr)

How these graphs were produced

Exported from PyTorch checkpoints with `exports/export_onnx.py` using `--int8` only (no `--slim`). This applies ONNX Runtime dynamic quantization with per-tensor QUInt8 weights (`per_channel=False`). Ops that quantization does not cover remain in float, and you may see warnings during export.
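The internals of the export script are not reproduced here, but the settings described match ONNX Runtime's `quantize_dynamic` function called with `weight_type=QuantType.QUInt8` and `per_channel=False`. The sketch below shows such a call on a single FP32 graph. It assumes the `onnxruntime` and `onnx` packages are installed; `int8_path`, `quantize_graph`, and the directory names are hypothetical and may differ from what `exports/export_onnx.py` actually does.

```python
from pathlib import Path


def int8_path(fp32_path: str, out_dir: str) -> Path:
    """Keep the same file name, but place the quantized graph in out_dir."""
    return Path(out_dir) / Path(fp32_path).name


def quantize_graph(fp32_path: str, out_dir: str) -> Path:
    """Dynamically quantize one ONNX graph with per-tensor QUInt8 weights."""
    # Lazy import: needs the onnxruntime and onnx packages.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    out_path = int8_path(fp32_path, out_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    quantize_dynamic(
        model_input=fp32_path,
        model_output=str(out_path),
        per_channel=False,             # one scale / zero-point per weight tensor
        weight_type=QuantType.QUInt8,  # unsigned 8-bit weights
    )
    return out_path


# Example (hypothetical paths):
#   quantize_graph("fp32_export/vocoder.onnx", "onnx_int8")
```

With dynamic quantization only the weights are stored as 8-bit values; activation scales are computed at inference time, so no calibration dataset is needed.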

License

Same as the BlueTTS project: MIT.
