VibeVoice 7B — MLX

MLX-converted fp16 weights for vibevoice/VibeVoice-7B.

For inference code, benchmarks, and documentation, see vibevoice-mlx (https://github.com/gafiatulin/vibevoice-mlx).

Quick start

git clone https://github.com/gafiatulin/vibevoice-mlx && cd vibevoice-mlx
uv sync

# Basic synthesis (weights download automatically)
uv run vibevoice-mlx --model gafiatulin/vibevoice-7b-mlx --text "Hello, world!" --output hello.wav

# Voice cloning with INT8 quantization
uv run vibevoice-mlx --model gafiatulin/vibevoice-7b-mlx --quantize 8 \
  --ref-audio speaker.wav --text "Clone this voice" --output cloned.wav
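
To synthesize several lines in one go, the commands above can be generated from a script. The following is a minimal sketch, not part of vibevoice-mlx: `build_command` and the `line_N.wav` naming are hypothetical helpers, and only the flags shown in the quick start (`--model`, `--quantize`, `--ref-audio`, `--text`, `--output`) are used. It prints the commands rather than running them; the `subprocess.run` line is left commented out because it assumes you are inside the cloned repository with `uv sync` already done.

```python
import shlex
import subprocess  # only needed if you uncomment the run line below

MODEL = "gafiatulin/vibevoice-7b-mlx"


def build_command(text, output, ref_audio=None, quantize=None, model=MODEL):
    """Assemble the argv list for one `uv run vibevoice-mlx` call.

    Hypothetical helper: it only uses flags that appear in the quick start.
    """
    cmd = ["uv", "run", "vibevoice-mlx", "--model", model]
    if quantize is not None:
        cmd += ["--quantize", str(quantize)]
    if ref_audio is not None:
        cmd += ["--ref-audio", ref_audio]
    cmd += ["--text", text, "--output", output]
    return cmd


lines = ["Hello, world!", "Clone this voice"]
commands = [
    build_command(text, f"line_{i}.wav", ref_audio="speaker.wav", quantize=8)
    for i, text in enumerate(lines)
]

for cmd in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment inside the cloned repo
```

Passing the arguments as a list avoids shell-quoting problems when the text contains quotes or exclamation marks.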

Performance

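Benchmarked on an M4 Max (64 GB) with voice cloning, generating roughly 30 s of audio per run. The figures are consistent with RTF being audio duration divided by generation time, so values above 1x are faster than real time.

Config	RTF	Gen time	Peak memory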
fp16	0.53x	53.0s	21.7 GB
int8	1.06x	29.6s	14.9 GB
int4	1.16x	25.8s	11.2 GB
int8, no-semantic	1.24x	23.3s	13.6 GB
int4, no-semantic	1.37x	19.5s	9.8 GB
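
The relationship between the columns can be checked with a few lines of arithmetic. This is a sketch under one assumption that the table does not state outright: that RTF here means audio seconds divided by generation seconds. Under that reading, RTF multiplied by generation time gives the implied audio length of each run, which should land near the stated ~30 s. The function names are illustrative only.

```python
# Rows copied from the table above: (RTF, generation seconds, peak memory in GB).
rows = {
    "fp16": (0.53, 53.0, 21.7),
    "int8": (1.06, 29.6, 14.9),
    "int4": (1.16, 25.8, 11.2),
    "int8, no-semantic": (1.24, 23.3, 13.6),
    "int4, no-semantic": (1.37, 19.5, 9.8),
}


def implied_audio_seconds(rtf, gen_seconds):
    """Audio length implied by the table, ASSUMING RTF = audio / generation time."""
    return rtf * gen_seconds


def memory_saving(peak_gb, baseline_gb):
    """Fraction of peak memory saved relative to a baseline config."""
    return 1.0 - peak_gb / baseline_gb


baseline_mem = rows["fp16"][2]
implied = {}
saving = {}
for name, (rtf, gen, mem) in rows.items():
    implied[name] = implied_audio_seconds(rtf, gen)
    saving[name] = memory_saving(mem, baseline_mem)
    print(f"{name:20s} audio ~{implied[name]:5.1f} s, "
          f"peak memory saved vs fp16: {saving[name]:.0%}")
```

Every implied duration falls between about 26 s and 32 s, which supports the assumed definition and suggests the generated length varies slightly from run to run.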

Safetensors

Model size: 9B params
Tensor type: BF16
