---
title: SadTalker
emoji: 😭
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: mit
---

<!-- Alternative deployment options:

For Streamlit:
sdk: streamlit
app_file: app_streamlit.py

For FastAPI:
sdk: docker
app_port: 7860

For Docker:
sdk: docker
app_port: 7860
-->

# SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation (CVPR 2023)

This is a Gradio app for SadTalker, which generates a talking-face video from a single portrait image and an audio clip.

## Features
- Generate talking face videos from a single image and an audio file
- Multiple preprocessing options
- Face enhancement with GFPGAN
- Multiple pose styles
- Still mode for fewer head movements

## Setup

After cloning this repository to your Hugging Face Space, download the model files listed below and upload them to your repository.

### Required Model Files:

#### SadTalker Models (upload to `checkpoints/` folder):
- `SadTalker_V0.0.2_256.safetensors`
- `SadTalker_V0.0.2_512.safetensors`
- `mapping_00109-model.pth.tar`
- `mapping_00229-model.pth.tar`

#### GFPGAN Models (upload to `gfpgan/weights/` folder):
- `alignment_WFLW_4HG.pth`
- `detection_Resnet50_Final.pth`
- `GFPGANv1.4.pth`
- `parsing_parsenet.pth`
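
The layout above can be checked before you start the app. The following is a minimal sketch, not part of the original SadTalker code: the helper name `missing_model_files` is hypothetical, and it assumes the script runs from the repository root. The file names and folders are copied from the lists above.

```python
from pathlib import Path

# Folder -> required file names, copied from the lists in this README.
REQUIRED_FILES = {
    "checkpoints": [
        "SadTalker_V0.0.2_256.safetensors",
        "SadTalker_V0.0.2_512.safetensors",
        "mapping_00109-model.pth.tar",
        "mapping_00229-model.pth.tar",
    ],
    "gfpgan/weights": [
        "alignment_WFLW_4HG.pth",
        "detection_Resnet50_Final.pth",
        "GFPGANv1.4.pth",
        "parsing_parsenet.pth",
    ],
}


def missing_model_files(repo_root="."):
    """Return relative paths of required model files not found under repo_root.

    Hypothetical helper for illustration; it only checks that each file exists,
    not that its contents are valid.
    """
    root = Path(repo_root)
    missing = []
    for folder, names in REQUIRED_FILES.items():
        for name in names:
            rel = f"{folder}/{name}"
            if not (root / rel).is_file():
                missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = missing_model_files(".")
    if missing:
        print("Missing model files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All required model files are in place.")
```

An empty result means every file listed above was found at its expected path.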

### Where to get the models:
- The original SadTalker repository: https://github.com/OpenTalker/SadTalker
- The model download links provided in that repository's documentation

### Upload Instructions:
1. Go to your Hugging Face Space repository
2. Click "Upload files"
3. Create the folder structure and upload the model files
4. Make sure the files are in the correct paths as listed above
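
As an alternative to the web interface, the upload can be scripted with the `huggingface_hub` Python package. This is a rough sketch under several assumptions: the package is installed, you are already authenticated (for example through the `HF_TOKEN` environment variable), and the model files already sit in `checkpoints/` and `gfpgan/weights/` under a local folder. The helper names `plan_uploads` and `upload_models` are hypothetical, and `your-username/your-space` is a placeholder for your own Space ID.

```python
from pathlib import Path

# Model folders named in this README.
MODEL_FOLDERS = ("checkpoints", "gfpgan/weights")


def plan_uploads(local_root):
    """List (local_path, path_in_repo) pairs for every file in the model folders."""
    root = Path(local_root)
    plan = []
    for folder in MODEL_FOLDERS:
        for path in sorted((root / folder).glob("*")):
            if path.is_file():
                plan.append((str(path), f"{folder}/{path.name}"))
    return plan


def upload_models(local_root, space_id):
    """Upload each planned file to the Space, keeping the folder structure."""
    # Imported here so plan_uploads works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    for local_path, path_in_repo in plan_uploads(local_root):
        api.upload_file(
            path_or_fileobj=local_path,
            path_in_repo=path_in_repo,
            repo_id=space_id,
            repo_type="space",
        )


# Example call (placeholder Space ID, not run automatically):
# upload_models(".", "your-username/your-space")
```

Calling `plan_uploads(".")` first prints nothing and uploads nothing; it only returns the list of files that `upload_models` would send, which makes it easy to confirm the target paths before any transfer starts.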

## Usage
1. Upload a source image (preferably a portrait with a clearly visible face)
2. Upload an audio file
3. Adjust settings as needed
4. Click Generate to create your talking face video

## Citation
```bibtex
@InProceedings{zhang2023sadtalker,
  author={Zhang, Wenxuan and Cun, Xiaodong and Wang, Xuan and Zhang, Yong and Shen, Xi and Guo, Yu and Shan, Ying and Wang, Fei},
  title={SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation},
  booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month={June},
  year={2023}
}
```

## Links
- [Paper](https://arxiv.org/abs/2211.12194)
- [Project Page](https://sadtalker.github.io)
- [Original Repository](https://github.com/OpenTalker/SadTalker)