ONNX
File size: 5,131 Bytes
ed5553a
 
 
 
 
2d3be86
774970e
 
 
2d3be86
 
774970e
2d3be86
 
774970e
2d3be86
 
774970e
2d3be86
 
774970e
 
 
 
 
 
 
2d3be86
 
 
 
774970e
 
 
 
 
 
2d3be86
774970e
 
2d3be86
774970e
2d3be86
774970e
 
 
2d3be86
 
774970e
2d3be86
 
774970e
2d3be86
 
774970e
2d3be86
774970e
 
 
 
 
 
2d3be86
774970e
 
 
 
 
 
2d3be86
774970e
 
 
 
2d3be86
 
774970e
 
 
2d3be86
 
 
 
774970e
2d3be86
 
774970e
2d3be86
774970e
2d3be86
 
 
774970e
 
 
2d3be86
 
 
774970e
2d3be86
 
774970e
 
 
 
 
 
 
 
 
2d3be86
ed5553a
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
---
license: apache-2.0
datasets:
- acvlab/FantasyPortrait-Multi-Expr
---
# FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers

[![Home Page](https://img.shields.io/badge/Project-FantasyPortrait-blue.svg)](https://fantasy-amap.github.io/fantasy-portrait/)
[![arXiv](https://img.shields.io/badge/Arxiv-2507.12956-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2507.12956)
[![hf_dataset](https://img.shields.io/badge/🤗%20Dataset-FantasyPortrait-yellow.svg)](https://huggingface.co/datasets/acvlab/FantasyPortrait-Multi-Expr)
[![hf_model](https://img.shields.io/badge/🤗%20Model-FantasyPortrait-green.svg)](https://huggingface.co/acvlab/FantasyPortrait)
[![hf_paper](https://img.shields.io/badge/🤗-FantasyPortrait-red.svg)](https://huggingface.co/papers/2507.12956)
[![ms_model](https://img.shields.io/badge/ModelScope-Model-9cf.svg)](https://modelscope.cn/models/amap_cvlab/FantasyPortrait)
[![ms_dataset](https://img.shields.io/badge/ModelScope-Dataset-ff69b4.svg)](https://www.modelscope.cn/datasets/amap_cvlab/FantasyPortrait-Multi-Expr)

## 🔥 Latest News!!
* August 12, 2025: We released the inference code, model weights and datasets.

## Demo
For more interesting results, please visit our [website](https://fantasy-amap.github.io/fantasy-portrait/).

| ![单人示例](./danren_1.gif) | ![对比](./duibi.gif) |
| :---: | :---: |
| ![动物](./dongwu.gif) | ![双人1](./shuangren_1.gif) |
| ![双人2](./shuangren_2.gif) | ![三人](./sanren.gif) |


## Quickstart
### 🛠️Installation

Clone the repo:

```
git clone https://github.com/Fantasy-AMAP/fantasy-portrait.git
cd fantasy-portrait
```

Install dependencies:
```
apt-get install ffmpeg
# Ensure torch >= 2.0.0
pip install -r requirements.txt
# Note: flash attention must be installed
pip install flash_attn
```

### 📦Multi-Expr Dataset
We make public the first multi-portrait facial expression video dataset **Multi-Expr Dataset**, Please download it via the [ModelScope](https://www.modelscope.cn/datasets/amap_cvlab/FantasyPortrait-Multi-Expr) or [Huggingface](https://huggingface.co/datasets/acvlab/FantasyPortrait-Multi-Expr).

### 🧱Model Download
| Models        |                       Download Link                                           |    Notes                      |
| --------------|-------------------------------------------------------------------------------|-------------------------------|
| Wan2.1-I2V-14B-720P  |      🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P)    🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P)     | Base model
| FantasyPortrait      |      🤗 [Huggingface](https://huggingface.co/acvlab/FantasyPortrait/)     🤖 [ModelScope](https://www.modelscope.cn/models/amap_cvlab/FantasyPortrait/)         | Our emo condition weights

Download models using huggingface-cli:
``` sh
pip install "huggingface_hub[cli]"
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P --local-dir ./models/Wan2.1-I2V-14B-720P
huggingface-cli download acvlab/FantasyPortrait --local-dir ./models
```

Download models using modelscope-cli:
``` sh
pip install modelscope
modelscope download Wan-AI/Wan2.1-I2V-14B-720P --local_dir ./models/Wan2.1-I2V-14B-720P
modelscope download amap_cvlab/FantasyPortrait  --local_dir ./models
```

### 🔑 Single-Portrait Inference
``` sh
bash infer_single.sh
```

### 🔑 Multi-Portrait Inference
If you use input image and drive videos with multiple people, you can run as follows:
``` sh
bash infer_multi.sh
```
If you use input image with multiple people and different multiple single-human driven videos, you can run as follows:
```sh
bash infer_multi_diff.sh
```

### 📦Speed and VRAM Usage
We present a detailed table here. The model is tested on a single A100.

|`torch_dtype`|`num_persistent_param_in_dit`|Speed|Required VRAM|
|-|-|-|-|
|torch.bfloat16|None (unlimited)|15.5s/it|40G|
|torch.bfloat16|7*10**9 (7B)|32.8s/it|20G|
|torch.bfloat16|0|42.6s/it|5G|



## 🧩 Community Works
We ❤️ contributions from the open-source community! If your work has improved FantasyPortrait, please inform us.
Or you can directly e-mail [frank.jf@alibaba-inc.com](mailto://frank.jf@alibaba-inc.com). We are happy to reference your project for everyone's convenience.

## 🔗Citation
If you find this repository useful, please consider giving a star ⭐ and citation
```
@article{wang2025fantasyportrait,
  title={FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers},
  author={Wang, Qiang and Wang, Mengchao and Jiang, Fan and Fan, Yaqi and Qi, Yonggang and Xu, Mu},
  journal={arXiv preprint arXiv:2507.12956},
  year={2025}
}
```

## Acknowledgments
Thanks to [Wan2.1](https://github.com/Wan-Video/Wan2.1), [PD-FGC](https://github.com/Dorniwang/PD-FGC-inference) and [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) for open-sourcing their models and code, which provided valuable references and support for this project. Their contributions to the open-source community are truly appreciated.