license: mit
base_model: vibevoice/VibeVoice-7B
tags:
  - mlx
  - tts
  - vibevoice
  - apple-silicon
  - voice-cloning

VibeVoice 7B — MLX

MLX-converted fp16 weights for vibevoice/VibeVoice-7B.

For inference code, benchmarks, and documentation, see vibevoice-mlx (https://github.com/gafiatulin/vibevoice-mlx).

Quick start

git clone https://github.com/gafiatulin/vibevoice-mlx && cd vibevoice-mlx
uv sync

# Basic synthesis (weights download automatically)
uv run vibevoice-mlx --model gafiatulin/vibevoice-7b-mlx --text "Hello, world!" --output hello.wav

# Voice cloning with INT8 quantization
uv run vibevoice-mlx --model gafiatulin/vibevoice-7b-mlx --quantize 8 \
  --ref-audio speaker.wav --text "Clone this voice" --output cloned.wav
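
To synthesize several utterances in one go, the basic command above can be wrapped in a loop. This is a minimal sketch, assuming a text file with one utterance per line. It uses only the --model, --text, and --output flags shown above. The synth_all function, the DRY_RUN variable, and the out_NNN.wav naming are illustrative choices, not part of vibevoice-mlx.

```shell
#!/usr/bin/env bash
# Sketch: write one WAV per non-empty line of a text file.
# synth_all, DRY_RUN, and the out_NNN.wav naming are illustrative assumptions.
MODEL="gafiatulin/vibevoice-7b-mlx"

synth_all() {
  local input="$1"
  local n=0
  local line out
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue            # skip blank lines
    n=$((n + 1))
    out=$(printf 'out_%03d.wav' "$n")
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command instead of running it
      printf 'uv run vibevoice-mlx --model %s --text "%s" --output %s\n' \
        "$MODEL" "$line" "$out"
    else
      # Redirect stdin so the CLI cannot consume the rest of the input file
      uv run vibevoice-mlx --model "$MODEL" --text "$line" --output "$out" < /dev/null
    fi
  done < "$input"
}
```

Calling `DRY_RUN=1 synth_all lines.txt` prints the commands without running them, which is a cheap way to check the output names before committing to a long synthesis run.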

Performance

Benchmarked on an M4 Max with 64 GB of memory, using voice cloning (~30 s of audio):

Config	RTF	Gen time	Peak memory
fp16	0.53x	53.0s	21.7 GB
int8	1.06x	29.6s	14.9 GB
int4	1.16x	25.8s	11.2 GB
int8, no-semantic	1.24x	23.3s	13.6 GB
int4, no-semantic	1.37x	19.5s	9.8 GB
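
The RTF values appear to be audio duration divided by generation time, that is, a speed factor where higher is faster and anything above 1x is faster than real time. That reading is an inference from the numbers, not something the table states. The sketch below reproduces the arithmetic for the int4 row under that assumption; speed_factor is a hypothetical helper name.

```shell
# speed_factor AUDIO_SECONDS GEN_SECONDS
# Prints audio duration / generation time, rounded to two decimals.
# Assumes the table's RTF column uses this definition.
speed_factor() {
  awk -v a="$1" -v g="$2" 'BEGIN { printf "%.2f\n", a / g }'
}

speed_factor 30 25.8   # int4 row: prints 1.16
```

Because the audio length is only given as approximately 30 s, the other rows will not match to the last digit.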