# Video Avatar Installers (2024-12-24)

Automated installation scripts for lip-sync avatar generation on GPU servers (VAST.ai, RunPod, etc.).
## Available Engines
| Engine | Model Size | VRAM | Quality | Speed | Best For |
|---|---|---|---|---|---|
| MuseTalk V1.5 | ~3.4GB | ~8-12GB | Excellent | Medium | Single user, high quality |
| Wav2Lip-ONNX-HQ | ~700MB | ~2-4GB | Good | Fast | Multi-user, lightweight |
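As a rough guide to choosing between the two engines, the VRAM column above can be checked programmatically. A minimal sketch (the `pick_engine` helper and its thresholds are assumptions derived from the table, not part of either installer):

```shell
# Hypothetical helper: suggest an engine from free VRAM (in GB).
# Thresholds follow the table above: MuseTalk wants ~8-12GB headroom,
# Wav2Lip-ONNX-HQ runs in ~2-4GB.
pick_engine() {
  vram_gb="$1"
  if [ "$vram_gb" -ge 12 ]; then
    echo "MuseTalk V1.5"
  elif [ "$vram_gb" -ge 4 ]; then
    echo "Wav2Lip-ONNX-HQ"
  else
    echo "insufficient VRAM"
  fi
}

# On a live GPU box you could feed it real numbers, e.g.:
#   free_mb=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | head -n1)
#   pick_engine $((free_mb / 1024))
pick_engine 24   # prints: MuseTalk V1.5
```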
## Quick Start

### Option 1: MuseTalk V1.5 (High Quality)
wget https://huggingface.co/dumont/video-avatar-2024-12-24/raw/main/install_musetalk.sh
chmod +x install_musetalk.sh
sudo ./install_musetalk.sh
Usage:
source /opt/miniconda/bin/activate musetalk
cd /workspace/MuseTalk
# Create config
cat > configs/inference/test.yaml << EOF
task_0:
  video_path: "data/video/your_avatar.mp4"
  audio_path: "path/to/audio.wav"
EOF
# Run inference
PYTHONPATH=. python3 scripts/inference.py --version v15 --inference_config configs/inference/test.yaml
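For repeated runs, the config step above can be scripted. A minimal sketch (the `write_config` helper is an assumption, not part of the installer; in practice you would write to `configs/inference/test.yaml` and then run the inference command shown above):

```shell
# Hypothetical helper: write a one-task MuseTalk inference config for a
# given video/audio pair. Mirrors the heredoc shown above.
write_config() {
  # $1 = output yaml, $2 = video path, $3 = audio path
  cat > "$1" << EOF
task_0:
  video_path: "$2"
  audio_path: "$3"
EOF
}

write_config test.yaml data/video/your_avatar.mp4 path/to/audio.wav
# Then launch inference as above:
#   PYTHONPATH=. python3 scripts/inference.py --version v15 --inference_config test.yaml
```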
### Option 2: Wav2Lip-ONNX-HQ (Lightweight)
wget https://huggingface.co/dumont/video-avatar-2024-12-24/raw/main/install_wav2lip.sh
chmod +x install_wav2lip.sh
sudo ./install_wav2lip.sh
Usage:
cd /workspace/wav2lip-onnx-HQ
python3 inference_onnxModel.py \
    --face /path/to/video.mp4 \
    --audio /path/to/audio.wav \
    --checkpoint_path checkpoints/wav2lip_gan.onnx \
    --outfile output.mp4 \
    --hq_output
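Since Wav2Lip-ONNX-HQ is the lightweight multi-user option, you may want to run it over many audio files. A minimal sketch (the `wav2lip_cmd` helper and the output-naming scheme are assumptions; the flags mirror the single invocation above):

```shell
# Hypothetical helper: build the Wav2Lip-ONNX-HQ command line for one
# audio file, deriving the output name from the audio filename. Echoes
# the command so you can inspect it before committing to a long batch.
wav2lip_cmd() {
  face="$1"; wav="$2"
  out="output_$(basename "$wav" .wav).mp4"
  echo "python3 inference_onnxModel.py --face $face --audio $wav" \
       "--checkpoint_path checkpoints/wav2lip_gan.onnx --outfile $out --hq_output"
}

# Batch example: one run per .wav in audio/ (pipe to sh to execute):
#   for w in audio/*.wav; do wav2lip_cmd avatar.mp4 "$w" | sh; done
wav2lip_cmd avatar.mp4 audio/take1.wav
```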
## Tested Environment
- GPU: NVIDIA RTX 4090 (24GB VRAM)
- CUDA: 12.x
- OS: Ubuntu 22.04
- Platform: VAST.ai, RunPod
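On a rented instance it is worth confirming the CUDA version before installing. A minimal sketch (the `cuda_major_ok` helper and the `nvidia-smi` banner parsing are assumptions, not part of the installers):

```shell
# Hypothetical check: confirm the CUDA major version matches the tested
# environment (12.x). Pass in a version string such as "12.4".
cuda_major_ok() {
  major="${1%%.*}"   # strip everything after the first dot
  [ "$major" -ge 12 ]
}

# On a live box, the version can usually be scraped from the nvidia-smi
# banner, e.g.:
#   cuda_major_ok "$(nvidia-smi | sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p')"
if cuda_major_ok "12.4"; then echo "CUDA OK"; else echo "CUDA too old"; fi
```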
## Sample Videos

The `samples/` folder contains example outputs from both engines for comparison.
## Credits
- MuseTalk by TMElyralab/Tencent
- Wav2Lip-ONNX-HQ by instant-high
- Installer scripts by Dumont AI Team