How to use mlx-community/MuseTalk-1.5-q4 with MLX:
# Download the model from the Hub (the quotes keep zsh from globbing the brackets)
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir MuseTalk-1.5-q4 mlx-community/MuseTalk-1.5-q4
MuseTalk 1.5 — MLX (q4)
Apple-MLX port of MuseTalk 1.5 (TMElyralab / Tencent Music) — realtime, high-quality lip-sync via single-step latent-space inpainting (not diffusion). Runs natively on Apple Silicon. MIT-licensed, commercial use OK.
This variant: UNet Linear layers quantized to int4 (group_size 64); VAE and Whisper encoder kept in fp16. Per-pass cosine similarity vs fp16 = 0.99985. Decoded-face error vs the PyTorch reference: mean |Δ| ≈ 2.74/255.
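To make "int4, group_size 64" concrete, here is a minimal pure-Python sketch of group-wise 4-bit affine quantization followed by a cosine-similarity check. It is an illustration under assumptions: the helper names are hypothetical, the weights are synthetic, and the scheme (one scale and offset per group of 64 values) is a simplified stand-in that is not guaranteed to match MLX's quantizer bit for bit. It also measures similarity on the weights themselves, whereas the 0.99985 figure above is measured on per-pass outputs.

```python
import math
import random


def quantize_group(values, bits=4):
    """Affine-quantize one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1              # 15 for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0     # fall back to 1.0 for a constant group
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return [c * scale + lo for c in codes]


def quantize_roundtrip(weights, group_size=64, bits=4):
    """Quantize and dequantize a flat weight list, one group at a time."""
    restored = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        codes, scale, lo = quantize_group(group, bits)
        restored.extend(dequantize_group(codes, scale, lo))
    return restored


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(64 * 16)]   # synthetic weights
restored = quantize_roundtrip(weights)
similarity = cosine_similarity(weights, restored)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"cosine={similarity:.5f} max_abs_error={max_error:.2e}")
```

Smaller groups give each scale less range to cover, so the rounding error per weight shrinks at the cost of storing more scales and offsets.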
Components (all in this repo, self-contained, torch-free)
| File | What |
|---|---|
| unet.safetensors | SD1.x UNet2DConditionModel (in=8, out=4, cross_attn=384), single-step t=0 |
| vae.safetensors | sd-vae-ft-mse AutoencoderKL (fp16) |
| whisper_encoder.safetensors | whisper-tiny audio encoder (fp16) |
| config.json | dtype / quantization / scaling factor |
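Since the repo is meant to be self-contained, a quick check that all four files arrived can save a confusing load error later. The sketch below uses only the standard library; `missing_components` is a hypothetical helper, and the directory name assumes the `--local-dir MuseTalk-1.5-q4` target from the download command.

```python
from pathlib import Path

# File names taken from the component table above.
EXPECTED_FILES = (
    "unet.safetensors",
    "vae.safetensors",
    "whisper_encoder.safetensors",
    "config.json",
)


def missing_components(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumed local directory; adjust to wherever the model was downloaded.
    missing = missing_components("MuseTalk-1.5-q4")
    print("all components present" if not missing else f"missing: {missing}")
```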
Performance
Realtime on an M-series GPU: ~34 generated 256² faces/sec at batch 8 (above the 25 fps video rate), ~7 GB peak memory. fp16 inference.
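The arithmetic behind the realtime claim can be spelled out as below. The inputs are the approximate figures quoted above; the per-batch latency is derived on the assumption that the throughput is a steady-state average, which the card does not state explicitly.

```python
faces_per_second = 34.0   # approximate throughput quoted above
batch_size = 8
video_fps = 25.0          # target video frame rate

seconds_per_batch = batch_size / faces_per_second   # time to produce one batch
ms_per_face = 1000.0 / faces_per_second             # amortized cost per face
realtime_factor = faces_per_second / video_fps      # above 1.0 means faster than playback

print(f"{seconds_per_batch * 1000:.0f} ms per batch of {batch_size}")
print(f"{ms_per_face:.1f} ms per face, {realtime_factor:.2f}x realtime at {video_fps:.0f} fps")
```

This works out to roughly 235 ms per batch of 8 and about 1.36x realtime, so the generator itself leaves some headroom for the CPU-side preprocessing and blending.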
Usage
from musetalk_mlx.pipeline_mlx import MuseTalkPipeline
pipe = MuseTalkPipeline.from_pretrained_mlx("MuseTalk-1.5-MLX-q4")
# crop_bgr: a 256x256 face crop; audio_chunks: (N,50,384) whisper audio features
latents = pipe.get_latents_for_unet(crop_bgr)
faces = pipe.generate_faces(latents, audio_chunks) # BGR uint8 lip-synced faces
Face detection / cropping / paste-back blending use the upstream (MuseTalk) CPU preprocessing.
Parity (vs PyTorch, CPU fp32)
VAE encode 1.7e-5 · decode 3.4e-5 · UNet forward 1.4e-6 · whisper encoder 1.6e-5 · face-level e2e recon ≤ 2/255.
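For readers who want to reproduce a face-level number in the same "/255" units, a mean absolute pixel difference can be computed as sketched below. This is an assumption about the style of metric: the card does not say exactly how each figure was measured, the helper name is hypothetical, and the pixel values are synthetic.

```python
def mean_abs_pixel_error(image_a, image_b):
    """Mean absolute difference between two equally sized sequences of 0..255 values."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)


# Tiny synthetic example: flattened pixel values in the 0..255 range.
reference = [120, 64, 200, 33, 255, 0]
candidate = [122, 63, 197, 33, 254, 2]

error = mean_abs_pixel_error(reference, candidate)
print(f"mean |delta| = {error:.2f}/255")
```

With real frames, flatten both uint8 images to the same channel order (the pipeline returns BGR) before comparing.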
License
MIT (mirrors upstream MuseTalk). Dependency models keep their own permissive licenses. Port by MVS Collective (xocialize-code).