# OmniVoice int8 g=64 (MLX)
8-bit, group-size-64 affine quantization of k2-fsa/OmniVoice, produced with mlx-audio for Apple Silicon. This is the quality-conservative variant: use it if lightsofapollo/omnivoice-mlx-q4-g64 sounds off on your prompts.
## Sizes

| Variant | Backbone | Total |
|---|---|---|
| original (bf16) | ~1.2 GB | ~1.6 GB |
| this repo (int8 g=64 backbone, bf16 tokenizer) | ~620 MB | ~1.0 GB |
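The backbone number in the table can be sanity-checked with back-of-envelope arithmetic, assuming MLX's affine layout stores an fp16 scale and fp16 bias per 64-weight group (a rough estimate, not an exact accounting of the checkpoint):

```python
# Rough size check: ~1.2 GB at bf16 (2 bytes/param) implies ~0.6B backbone params.
params = 1.2e9 / 2

# int8 code per weight, plus fp16 scale + fp16 bias shared by each group of 64.
bits_per_weight = 8 + (16 + 16) / 64    # = 8.5 bits

size_gb = params * bits_per_weight / 8 / 1e9
print(f"{size_gb:.2f} GB")  # -> 0.64 GB, close to the ~620 MB above
```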
Quantization applies only to the Qwen3 backbone Linear layers (and the tied audio embedding/head matmuls). The Higgs Audio V2 acoustic tokenizer (decoder, RVQ, semantic) is left at bfloat16.
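The affine scheme stores an 8-bit code per weight plus a per-group scale and offset. A minimal pure-Python round trip for one group of 64 weights, purely illustrative (MLX's actual kernels and storage layout differ):

```python
# Illustrative affine int8 quantization of a single group of 64 weights.
import random

random.seed(0)
group = [random.uniform(-0.1, 0.1) for _ in range(64)]

lo, hi = min(group), max(group)
scale = (hi - lo) / 255 or 1.0          # 8 bits -> 256 levels; guard constant groups
zero_point = lo

codes = [round((w - zero_point) / scale) for w in group]   # ints in [0, 255]
dequant = [c * scale + zero_point for c in codes]

# Reconstruction error is bounded by half a quantization step.
max_err = max(abs(w - d) for w, d in zip(group, dequant))
assert all(0 <= c <= 255 for c in codes)
assert max_err <= scale / 2 + 1e-12
```

A smaller group size means more scale/offset pairs (more overhead, less error); g=64 is the middle-of-the-road setting used here.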
## Performance (M-series, mlx-audio 0.x)

| Prompt | RTF (bf16) | RTF (this repo) |
|---|---|---|
| "Voice synthesis on Apple Silicon has come a long way. We can now generate full sentences in real time." | 3.68× | 4.60× (+25%) |
Whisper-small round-trip: identical transcript to bf16 on the long prompt.
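Assuming RTF here is seconds of audio generated per wall-clock second (higher is better), the +25% in the table is just the ratio of the two figures:

```python
# Real-time factor = seconds of audio produced / seconds of wall-clock time.
rtf_bf16 = 3.68
rtf_q8 = 4.60

speedup = rtf_q8 / rtf_bf16 - 1.0
print(f"{speedup:+.0%}")  # -> +25%
```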
## Usage (mlx-audio Python)

```python
import json

import mlx.core as mx
import mlx.nn as nn
from huggingface_hub import snapshot_download
from mlx_audio.tts.models.omnivoice.config import OmniVoiceConfig
from mlx_audio.tts.models.omnivoice.omnivoice import Model

path = snapshot_download("lightsofapollo/omnivoice-mlx-q8-g64")
with open(f"{path}/config.json") as f:
    cfg_dict = json.load(f)
model = Model(OmniVoiceConfig(**{
    k: v for k, v in cfg_dict.items()
    if k in OmniVoiceConfig.__dataclass_fields__
}))

# Re-apply the quantization recorded in config.json before loading weights,
# so the module tree matches the quantized tensors in the checkpoint.
q = cfg_dict["quantization"]
nn.quantize(
    model,
    group_size=q["group_size"],
    bits=q["bits"],
    mode=q.get("mode", "affine"),
    class_predicate=lambda _p, m: hasattr(m, "to_quantized"),
)

raw = dict(mx.load(f"{path}/model.safetensors"))
model.load_weights(list(model.sanitize(raw).items()))
mx.eval(model.parameters())
```
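The `class_predicate` passed to `nn.quantize` above selects only modules that can quantize themselves (those exposing `to_quantized`, such as Linear layers), which is how the acoustic tokenizer stays in bf16. A toy pure-Python illustration of that filtering idea, with hypothetical stand-in classes:

```python
# Toy illustration of the class_predicate filter: only modules exposing a
# `to_quantized` method are selected; everything else is left untouched.
class Linear:
    def to_quantized(self, group_size, bits):
        return f"QuantizedLinear(g={group_size}, b={bits})"

class RVQDecoder:          # stand-in for the bf16 acoustic-tokenizer parts
    pass

modules = {"backbone.layer0": Linear(), "tokenizer.rvq": RVQDecoder()}
predicate = lambda _path, m: hasattr(m, "to_quantized")

quantized = {p for p, m in modules.items() if predicate(p, m)}
print(quantized)  # -> {'backbone.layer0'}
```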
## How this was made

```shell
# 1) Convert the original checkpoint to bf16 MLX weights
python -m mlx_audio.tts.models.omnivoice.convert \
    --model k2-fsa/OmniVoice --output omnivoice-bf16 --dtype bfloat16

# 2) Quantize to int8 with group size 64
python -m mlx_audio.convert \
    --hf-path omnivoice-bf16 --mlx-path omnivoice-q8-g64 \
    --quantize --q-bits 8 --q-group-size 64
```
## License
Apache-2.0 (inherited from k2-fsa/OmniVoice).