# Video Avatar Installers (2024-12-24)

Automated installation scripts for lip-sync avatar generation on GPU servers (VAST.ai, RunPod, etc.).
## Available Engines
| Engine | Model Size | VRAM | Quality | Speed | Best For |
|---|---|---|---|---|---|
| MuseTalk V1.5 | ~3.4GB | ~8-12GB | Excellent | Medium | Single user, high quality |
| Wav2Lip-ONNX-HQ | ~700MB | ~2-4GB | Good | Fast | Multi-user, lightweight |
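As a rough guide to choosing between the two engines, the VRAM column above can be checked programmatically. A minimal sketch (the `pick_engine` helper and its thresholds are assumptions derived from the table, not part of either installer):

```shell
# Hypothetical helper: suggest an engine from free VRAM (in GB).
# Thresholds follow the table above: MuseTalk wants ~8-12GB headroom,
# Wav2Lip-ONNX-HQ runs in ~2-4GB.
pick_engine() {
  vram_gb="$1"
  if [ "$vram_gb" -ge 12 ]; then
    echo "MuseTalk V1.5"
  elif [ "$vram_gb" -ge 4 ]; then
    echo "Wav2Lip-ONNX-HQ"
  else
    echo "insufficient VRAM"
  fi
}

# On a live GPU box you could feed it real numbers, e.g.:
#   free_mb=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | head -n1)
#   pick_engine $((free_mb / 1024))
pick_engine 24   # prints: MuseTalk V1.5
```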
## Quick Start

### Option 1: MuseTalk V1.5 (High Quality)
wget https://huggingface.co/dumont/video-avatar-2024-12-24/raw/main/install_musetalk.sh
chmod +x install_musetalk.sh
sudo ./install_musetalk.sh
Usage:
source /opt/miniconda/bin/activate musetalk
cd /workspace/MuseTalk
# Create config
cat > configs/inference/test.yaml << EOF
task_0:
  video_path: "data/video/your_avatar.mp4"
  audio_path: "path/to/audio.wav"
EOF
# Run inference
PYTHONPATH=. python3 scripts/inference.py --version v15 --inference_config configs/inference/test.yaml
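For repeated runs, the config step above can be scripted. A minimal sketch (the `write_config` helper is an assumption, not part of the installer; in practice you would write to `configs/inference/test.yaml` and then run the inference command shown above):

```shell
# Hypothetical helper: write a one-task MuseTalk inference config for a
# given video/audio pair. Mirrors the heredoc shown above.
write_config() {
  # $1 = output yaml, $2 = video path, $3 = audio path
  cat > "$1" << EOF
task_0:
  video_path: "$2"
  audio_path: "$3"
EOF
}

write_config test.yaml data/video/your_avatar.mp4 path/to/audio.wav
# Then launch inference as above:
#   PYTHONPATH=. python3 scripts/inference.py --version v15 --inference_config test.yaml
```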
### Option 2: Wav2Lip-ONNX-HQ (Lightweight)
wget https://huggingface.co/dumont/video-avatar-2024-12-24/raw/main/install_wav2lip.sh
chmod +x install_wav2lip.sh
sudo ./install_wav2lip.sh
Usage:
cd /workspace/wav2lip-onnx-HQ
python3 inference_onnxModel.py \
    --face /path/to/video.mp4 \
    --audio /path/to/audio.wav \
    --checkpoint_path checkpoints/wav2lip_gan.onnx \
    --outfile output.mp4 \
    --hq_output
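Since Wav2Lip-ONNX-HQ is the lightweight multi-user option, you may want to run it over many audio files. A minimal sketch (the `wav2lip_cmd` helper and the output-naming scheme are assumptions; the flags mirror the single invocation above):

```shell
# Hypothetical helper: build the Wav2Lip-ONNX-HQ command line for one
# audio file, deriving the output name from the audio filename. Echoes
# the command so you can inspect it before committing to a long batch.
wav2lip_cmd() {
  face="$1"; wav="$2"
  out="output_$(basename "$wav" .wav).mp4"
  echo "python3 inference_onnxModel.py --face $face --audio $wav" \
       "--checkpoint_path checkpoints/wav2lip_gan.onnx --outfile $out --hq_output"
}

# Batch example: one run per .wav in audio/ (pipe to sh to execute):
#   for w in audio/*.wav; do wav2lip_cmd avatar.mp4 "$w" | sh; done
wav2lip_cmd avatar.mp4 audio/take1.wav
```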
## Tested Environment
- GPU: NVIDIA RTX 4090 (24GB VRAM)
- CUDA: 12.x
- OS: Ubuntu 22.04
- Platform: VAST.ai, RunPod
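On a rented instance it is worth confirming the CUDA version before installing. A minimal sketch (the `cuda_major_ok` helper and the `nvidia-smi` banner parsing are assumptions, not part of the installers):

```shell
# Hypothetical check: confirm the CUDA major version matches the tested
# environment (12.x). Pass in a version string such as "12.4".
cuda_major_ok() {
  major="${1%%.*}"   # strip everything after the first dot
  [ "$major" -ge 12 ]
}

# On a live box, the version can usually be scraped from the nvidia-smi
# banner, e.g.:
#   cuda_major_ok "$(nvidia-smi | sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p')"
if cuda_major_ok "12.4"; then echo "CUDA OK"; else echo "CUDA too old"; fi
```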
## Sample Videos

The `samples/` folder contains example outputs from both engines for comparison.
## Credits
- MuseTalk by TMElyralab/Tencent
- Wav2Lip-ONNX-HQ by instant-high
- Installer scripts by Dumont AI Team