VibeVoice 1.5B — MLX

MLX-converted fp16 weights for microsoft/VibeVoice-1.5B.

For inference code, benchmarks, and documentation, see vibevoice-mlx (https://github.com/gafiatulin/vibevoice-mlx).

Quick start

git clone https://github.com/gafiatulin/vibevoice-mlx && cd vibevoice-mlx
uv sync

# Basic synthesis (weights download automatically)
uv run vibevoice-mlx --text "Hello, world!" --output hello.wav

# Voice cloning
uv run vibevoice-mlx \
  --ref-audio speaker.wav --text "Clone this voice" --output cloned.wav

Performance

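Benchmarked on an M4 Max (64 GB) with voice cloning, generating about 30 s of audio. Judging by the figures, RTF is reported as a speed factor (audio duration divided by generation time), so higher is faster:

Config	RTF	Generation time	Peak memory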
fp16	1.85x	15.5s	8.5 GB
int8	2.65x	10.3s	5.7 GB
int4	2.72x	9.7s	4.6 GB
int8, no-semantic	3.42x	7.8s	4.4 GB
int4, no-semantic	3.92x	6.4s	3.3 GB
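
The relationship between the columns can be checked with a little arithmetic. The sketch below is not part of vibevoice-mlx. It copies the rows from the table and assumes RTF means audio duration divided by generation time, an interpretation inferred from the numbers rather than stated by the source. The helper names `implied_audio_seconds` and `speedup_vs` are hypothetical.

```python
# Rows copied from the benchmark table:
# (config, RTF, generation time in seconds, peak memory in GB)
ROWS = [
    ("fp16", 1.85, 15.5, 8.5),
    ("int8", 2.65, 10.3, 5.7),
    ("int4", 2.72, 9.7, 4.6),
    ("int8, no-semantic", 3.42, 7.8, 4.4),
    ("int4, no-semantic", 3.92, 6.4, 3.3),
]


def implied_audio_seconds(rtf, gen_seconds):
    """Audio length implied by a row.

    Assumption (not confirmed by the source): RTF = audio duration / generation time.
    """
    return rtf * gen_seconds


def speedup_vs(baseline, row):
    """Ratio of generation times: how many times faster `row` is than `baseline`."""
    return baseline[2] / row[2]


if __name__ == "__main__":
    fp16 = ROWS[0]
    for row in ROWS:
        name, rtf, gen, mem = row
        print(
            f"{name:20s} audio ~{implied_audio_seconds(rtf, gen):5.1f} s  "
            f"speedup vs fp16 {speedup_vs(fp16, row):4.2f}x  "
            f"memory {mem / fp16[3]:.0%} of fp16"
        )
```

Under that assumption every row implies between 25 and 30 seconds of generated audio, which is consistent with the stated "~30s" clip. The implied length differs from row to row, so generation-time ratios between configurations are only approximate comparisons.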
