MuseTalk V15 UNet — AEmotionStudio Mirror

Mirror of the MuseTalk V15 UNet weights for use with ComfyUI-FFMPEGA.

About

MuseTalk is a real-time, high-quality lip sync model that synchronizes lip movements in video to match provided audio. It supports:

  • Video + Audio lip sync — make a person in a video speak new dialogue
  • Image + Audio talking head — animate a portrait photo with speech audio
  • Multi-face support — sync multiple faces in a single video
  • Batch inference — process multiple frames simultaneously for speed

Files

File                              | Precision | Size    | Description
musetalkV15/unet_fp16.safetensors | fp16      | ~1.6 GB | Recommended: half-precision UNet weights
musetalkV15/unet.safetensors      | fp32      | ~3.2 GB | Full-precision UNet weights (fallback)
musetalkV15/musetalk.json         | -         | < 1 KB  | Model configuration

Usage with ComfyUI-FFMPEGA

This model is auto-downloaded when you use the lip_sync skill in ComfyUI-FFMPEGA.

Example Prompts

Lip sync this video to the provided audio
Make the person's lips match the speech
Dub this video with the new voiceover

When use_float16 is enabled (the default), the fp16 variant is loaded. If the fp16 file is unavailable, loading falls back to the fp32 weights.
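The fp16-first selection with fp32 fallback can be sketched as follows. This is a minimal illustration, not the ComfyUI-FFMPEGA loader itself: the function name select_unet_weights and its model_dir parameter are hypothetical, and the behavior when use_float16 is disabled but only the fp16 file exists is an assumption (here it raises an error).

```python
from pathlib import Path

# File names as listed in the Files table.
FP16_NAME = "unet_fp16.safetensors"
FP32_NAME = "unet.safetensors"


def select_unet_weights(model_dir, use_float16=True):
    """Return the weights file to load (hypothetical helper, not the real loader).

    Prefers the fp16 file when use_float16 is set and the file exists,
    otherwise falls back to the fp32 file.
    """
    model_dir = Path(model_dir)
    fp16 = model_dir / FP16_NAME
    fp32 = model_dir / FP32_NAME

    if use_float16 and fp16.is_file():
        return fp16
    if fp32.is_file():
        return fp32
    # Assumption: with no usable file on disk, fail loudly rather than guess.
    raise FileNotFoundError(f"No usable UNet weights found in {model_dir}")
```

For example, select_unet_weights("ComfyUI/models/musetalk/musetalkV15") returns the fp16 path when that file is present and the fp32 path otherwise.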

Manual Download

If auto-download is disabled, download the files and place them in:

ComfyUI/models/musetalk/musetalkV15/
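A manual download could be scripted roughly as below, using only the Python standard library. Several things here are assumptions rather than documented facts: the repository id AEmotionStudio/musetalk-models (taken from this page's repository name), the main revision, and the helper names resolve_url, target_path, and download_missing, which are hypothetical. The URL pattern is the standard Hugging Face "resolve" form.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: repository id and revision; adjust if the mirror lives elsewhere.
REPO_ID = "AEmotionStudio/musetalk-models"
# Recommended fp16 weights plus the model configuration.
FILES = [
    "musetalkV15/unet_fp16.safetensors",
    "musetalkV15/musetalk.json",
]


def resolve_url(filename, repo_id=REPO_ID, revision="main"):
    """Build a direct-download URL for one file in the repository."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def target_path(comfy_root, filename):
    """Map a repository file to its place under ComfyUI/models/musetalk/."""
    return Path(comfy_root) / "models" / "musetalk" / filename


def download_missing(comfy_root, files=FILES):
    """Download only the files that are not already on disk."""
    fetched = []
    for name in files:
        dest = target_path(comfy_root, name)
        if dest.is_file():
            continue  # already present, skip the download
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(resolve_url(name), str(dest))
        fetched.append(dest)
    return fetched
```

Calling download_missing("ComfyUI") places the files in ComfyUI/models/musetalk/musetalkV15/ and returns the list of paths it actually fetched. To use the fp32 weights instead, swap in musetalkV15/unet.safetensors.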

Additional Dependencies

MuseTalk also requires these models (auto-downloaded from Hugging Face on first use):

  • SD-VAE (stabilityai/sd-vae-ft-mse) — ~335 MB
  • Whisper-tiny (openai/whisper-tiny) — ~75 MB

VRAM Requirements

  • Minimum: ~4 GB
  • Recommended: ~6 GB
  • Uses subprocess isolation to prevent CUDA memory leaks
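The idea behind subprocess isolation is that GPU memory held by a process is returned to the driver when that process exits, so running inference in a short-lived child keeps leaks out of the long-running host. The sketch below shows the general pattern only; it is not the ComfyUI-FFMPEGA implementation. The names run_isolated and WORKER are hypothetical, and the worker is a stand-in that counts frames instead of loading a model.

```python
import json
import subprocess
import sys


def run_isolated(script, payload, timeout=600.0):
    """Run `script` in a fresh Python process and return its JSON result.

    The payload is passed as JSON on stdin; the child writes JSON to stdout.
    Any memory the child allocated is released when it exits.
    """
    proc = subprocess.run(
        [sys.executable, "-c", script],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(f"worker failed: {proc.stderr.strip()}")
    return json.loads(proc.stdout)


# Stand-in worker: a real one would load the model and run inference here.
WORKER = """
import json, sys
job = json.load(sys.stdin)
json.dump({"frames": len(job["frames"])}, sys.stdout)
"""
```

A call such as run_isolated(WORKER, {"frames": [1, 2, 3]}) starts the child, waits for it to finish, and returns the parsed result. The trade-off is the startup cost of a new interpreter and model load on every call.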

License

Citation

@article{zhang2024musetalk,
  title={MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting},
  author={Zhang, Yue and Liu, Minhao and Chen, Zhaokang and Wu, Bin and others},
  journal={arXiv preprint arXiv:2410.10122},
  year={2024}
}