---
title: SadTalker
emoji: 😭
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: mit
---
# SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation (CVPR 2023)

This is a Gradio app for SadTalker, which generates talking-face videos from a single image and an audio clip.
## Features
- Generate talking face videos from single image + audio
- Multiple preprocessing options
- Face enhancement with GFPGAN
- Multiple pose styles
- Still mode for fewer head movements
## Setup
After cloning this repository to your Hugging Face Space, you'll need to:
- Upload model files: Download the following model files and upload them to your repository:
### Required Model Files

**SadTalker models** (upload to the `checkpoints/` folder):

- `SadTalker_V0.0.2_256.safetensors`
- `SadTalker_V0.0.2_512.safetensors`
- `mapping_00109-model.pth.tar`
- `mapping_00229-model.pth.tar`
**GFPGAN models** (upload to the `gfpgan/weights/` folder):

- `alignment_WFLW_4HG.pth`
- `detection_Resnet50_Final.pth`
- `GFPGANv1.4.pth`
- `parsing_parsenet.pth`
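Before launching the app, it can help to confirm that every required file landed in the expected path. A minimal sketch (the paths come from this README; the helper name `missing_models` is ours, not part of SadTalker):

```python
from pathlib import Path

# Model files listed in this README; adjust root if your Space layout differs.
REQUIRED_FILES = [
    "checkpoints/SadTalker_V0.0.2_256.safetensors",
    "checkpoints/SadTalker_V0.0.2_512.safetensors",
    "checkpoints/mapping_00109-model.pth.tar",
    "checkpoints/mapping_00229-model.pth.tar",
    "gfpgan/weights/alignment_WFLW_4HG.pth",
    "gfpgan/weights/detection_Resnet50_Final.pth",
    "gfpgan/weights/GFPGANv1.4.pth",
    "gfpgan/weights/parsing_parsenet.pth",
]

def missing_models(root="."):
    """Return the required model files that are not present under root."""
    root = Path(root)
    return [p for p in REQUIRED_FILES if not (root / p).is_file()]

if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing model files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All model files found.")
```

Running this from the repository root prints any file that still needs to be uploaded.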
### Where to get the models
- Download from the original SadTalker repository: https://github.com/OpenTalker/SadTalker
- Or from the model links provided in their documentation
### Upload Instructions

1. Go to your Hugging Face Space repository
2. Click "Upload files"
3. Create the folder structure and upload the model files
4. Make sure the files are in the correct paths as listed above
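If you prefer working locally before uploading, the folder layout from the steps above can be created like this (the `mv` lines are illustrative; your download location will differ):

```shell
# Create the folder layout this README expects
mkdir -p checkpoints gfpgan/weights

# Then place the downloaded model files, for example:
# mv ~/Downloads/SadTalker_V0.0.2_256.safetensors checkpoints/
# mv ~/Downloads/GFPGANv1.4.pth gfpgan/weights/
```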
## Usage

1. Upload a source image (preferably a portrait with a clearly visible face)
2. Upload an audio file
3. Adjust settings as needed
4. Click **Generate** to create your talking-face video
## Citation
```bibtex
@InProceedings{zhang2023sadtalker,
    author    = {Zhang, Wenxuan and Cun, Xiaodong and Wang, Xuan and Zhang, Yong and Shen, Xi and Guo, Yu and Shan, Ying and Wang, Fei},
    title     = {SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation},
    booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023}
}
```