meteorite2023 committed on
Commit
253526b
·
verified ·
1 Parent(s): ef59a70

Update README.md

Files changed (1)
  1. README.md +98 -91
README.md CHANGED
@@ -1,91 +1,98 @@
1
-
2
- # DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
3
-
4
- <div align="center">
5
-
6
- ### [ICCV 2025]
7
-
8
- [Xirui Hu](https://openreview.net/profile?id=~Xirui_Hu1),
9
- [Jiahao Wang](https://openreview.net/profile?id=~Jiahao_Wang14),
10
- [Hao Chen](https://openreview.net/profile?id=~Hao_chen100),
11
- [Weizhan Zhang](https://openreview.net/profile?id=~Weizhan_Zhang1),
12
- [Benqi Wang](https://openreview.net/profile?id=~Benqi_Wang2),
13
- [Yikun Li](https://openreview.net/profile?id=~Yikun_Li1),
14
- [Haishun Nan](https://openreview.net/profile?id=~Haishun_Nan1)
15
-
16
- [![arXiv](https://img.shields.io/badge/arXiv-2503.06505-b31b1b.svg)](https://arxiv.org/abs/2503.06505)
17
- [![GitHub](https://img.shields.io/badge/GitHub-MTVCrafter-blue?logo=github)](https://github.com/ByteCat-bot/DynamicID)
18
- </div>
19
-
20
- ---
21
- This is the official implementation of DynamicID, a framework that generates visually harmonious images featuring **multiple individuals**. Each person in the image can be specified through user-provided reference images, and, most notably, our method enables **independent control of each individual's facial expression** via text prompts. We hope you have fun with this demo!
22
-
23
- ---
24
-
25
- ## 🔍 Abstract
26
-
27
- Recent advancements in text-to-image generation have spurred interest in personalized human image generation. Although existing methods achieve high-fidelity identity preservation, they often struggle with **limited multi-ID usability** and **inadequate facial editability**.
28
-
29
- We present DynamicID, a tuning-free framework that inherently facilitates both single-ID and multi-ID personalized generation with high fidelity and flexible facial editability. Our key innovations include:
30
-
31
- - Semantic-Activated Attention (SAA), which employs query-level activation gating to minimize disruption to the original model when injecting ID features and achieve multi-ID personalization without requiring multi-ID samples during training.
32
-
33
- - Identity-Motion Reconfigurator (IMR), which applies feature-space manipulation to effectively disentangle and reconfigure facial motion and identity features, supporting flexible facial editing.
34
-
35
- - A task-decoupled training paradigm that reduces data dependency.
36
-
37
- - A curated VariFace-10k facial dataset, comprising 10k unique individuals, each represented by 35 distinct facial images.
38
-
39
- Experimental results demonstrate that DynamicID outperforms state-of-the-art methods in identity fidelity, facial editability, and multi-ID personalization capability.
40
-
41
- ## 💡 Method
42
-
43
- <div align="center">
44
- <img src="assets/pipeline.jpg" width="1000">
45
- </div>
46
-
47
- The proposed framework is built around two core components: SAA and IMR. (a) In the anchoring stage, we jointly optimize the SAA and a face encoder to establish robust single-ID and multi-ID personalized generation capabilities. (b) Subsequently, in the reconfiguration stage, we freeze these optimized components and use them to train the IMR for flexible and fine-grained facial editing.
48
-
49
- ## 🚀 Checkpoint
50
-
51
- 1. Download the pretrained Stable Diffusion v1.5 checkpoint from [Stable Diffusion v1.5 on Hugging Face](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5).
52
-
53
- 2. Download our SAA-related and IMR-related checkpoints from [DynamicID Checkpoints on Hugging Face](https://huggingface.co/meteorite2023/DynamicID).
54
-
55
-
56
- ## 🌈 Gallery
57
-
58
- <div align="center">
59
- <img src="assets/teaser.jpg" width="900">
60
- <br><br><br>
61
- <img src="assets/single.jpg" width="900">
62
- <br><br><br>
63
- <img src="assets/multi.jpg" width="900">
64
- </div>
65
-
66
- ## 📌 ToDo List
67
-
68
- - [x] Release technical report
69
- - [x] Release **training and inference code**
70
- - [x] Release **Dynamic-sd** (based on *Stable Diffusion v1.5*)
71
- - [ ] Release **Dynamic-flux** (based on *Flux-dev*)
72
- - [ ] Release a Hugging Face Demo Space
73
-
74
- ## 📖 Citation
75
- If you are inspired by our work, please cite our paper.
76
- ```bibtex
77
- @inproceedings{dynamicid,
78
- title={DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability},
79
- author={Xirui Hu and
80
- Jiahao Wang and
81
- Hao Chen and
82
- Weizhan Zhang and
83
- Benqi Wang and
84
- Yikun Li and
85
- Haishun Nan
86
- },
87
- booktitle={International Conference on Computer Vision},
88
- year={2025}
89
- }
90
-
91
- ```
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ base_model:
4
+ - stable-diffusion-v1-5/stable-diffusion-v1-5
5
+ ---
6
+ <meta name="google-site-verification" content="-XQC-POJtlDPD3i2KSOxbFkSBde_Uq9obAIh_4mxTkM" />
7
+
8
+ # DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
9
+
10
+
11
+ <div align="center">
12
+
13
+ ### [ICCV 2025]
14
+
15
+ [Xirui Hu](https://openreview.net/profile?id=~Xirui_Hu1),
16
+ [Jiahao Wang](https://openreview.net/profile?id=~Jiahao_Wang14),
17
+ [Hao Chen](https://openreview.net/profile?id=~Hao_chen100),
18
+ [Weizhan Zhang](https://openreview.net/profile?id=~Weizhan_Zhang1),
19
+ [Benqi Wang](https://openreview.net/profile?id=~Benqi_Wang2),
20
+ [Yikun Li](https://openreview.net/profile?id=~Yikun_Li1),
21
+ [Haishun Nan](https://openreview.net/profile?id=~Haishun_Nan1)
22
+
23
+ [![arXiv](https://img.shields.io/badge/arXiv-2503.06505-b31b1b.svg)](https://arxiv.org/abs/2503.06505)
24
+ [![GitHub](https://img.shields.io/badge/GitHub-MTVCrafter-blue?logo=github)](https://github.com/ByteCat-bot/DynamicID)
25
+ </div>
26
+
27
+ ---
28
+ This is the official implementation of DynamicID, a framework that generates visually harmonious images featuring **multiple individuals**. Each person in the image can be specified through user-provided reference images, and, most notably, our method enables **independent control of each individual's facial expression** via text prompts. We hope you have fun with this demo!
29
+
30
+ ---
31
+
32
+ ## 🔍 Abstract
33
+
34
+ Recent advancements in text-to-image generation have spurred interest in personalized human image generation. Although existing methods achieve high-fidelity identity preservation, they often struggle with **limited multi-ID usability** and **inadequate facial editability**.
35
+
36
+ We present DynamicID, a tuning-free framework that inherently facilitates both single-ID and multi-ID personalized generation with high fidelity and flexible facial editability. Our key innovations include:
37
+
38
+ - Semantic-Activated Attention (SAA), which employs query-level activation gating to minimize disruption to the original model when injecting ID features and achieve multi-ID personalization without requiring multi-ID samples during training.
39
+
40
+ - Identity-Motion Reconfigurator (IMR), which applies feature-space manipulation to effectively disentangle and reconfigure facial motion and identity features, supporting flexible facial editing.
41
+
42
+ - A task-decoupled training paradigm that reduces data dependency.
43
+
44
+ - A curated VariFace-10k facial dataset, comprising 10k unique individuals, each represented by 35 distinct facial images.
45
+
46
+ Experimental results demonstrate that DynamicID outperforms state-of-the-art methods in identity fidelity, facial editability, and multi-ID personalization capability.
47
+
48
+ ## 💡 Method
49
+
50
+ <div align="center">
51
+ <img src="assets/pipeline.jpg" width="1000">
52
+ </div>
53
+
54
+ The proposed framework is built around two core components: SAA and IMR. (a) In the anchoring stage, we jointly optimize the SAA and a face encoder to establish robust single-ID and multi-ID personalized generation capabilities. (b) Subsequently, in the reconfiguration stage, we freeze these optimized components and use them to train the IMR for flexible and fine-grained facial editing.
55
+
56
+ ## 🚀 Checkpoint
57
+
58
+ 1. Download the pretrained Stable Diffusion v1.5 checkpoint from [Stable Diffusion v1.5 on Hugging Face](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5).
59
+
60
+ 2. Download our SAA-related and IMR-related checkpoints from [DynamicID Checkpoints on Hugging Face](https://huggingface.co/meteorite2023/DynamicID).
61
+
62
+
63
+ ## 🌈 Gallery
64
+
65
+ <div align="center">
66
+ <img src="assets/teaser.jpg" width="900">
67
+ <br><br><br>
68
+ <img src="assets/single.jpg" width="900">
69
+ <br><br><br>
70
+ <img src="assets/multi.jpg" width="900">
71
+ </div>
72
+
73
+ ## 📌 ToDo List
74
+
75
+ - [x] Release technical report
76
+ - [x] Release **training and inference code**
77
+ - [x] Release **Dynamic-sd** (based on *Stable Diffusion v1.5*)
78
+ - [ ] Release **Dynamic-flux** (based on *Flux-dev*)
79
+ - [ ] Release a Hugging Face Demo Space
80
+
81
+ ## 📖 Citation
82
+ If you are inspired by our work, please cite our paper.
83
+ ```bibtex
84
+ @inproceedings{dynamicid,
85
+ title={DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability},
86
+ author={Xirui Hu and
87
+ Jiahao Wang and
88
+ Hao Chen and
89
+ Weizhan Zhang and
90
+ Benqi Wang and
91
+ Yikun Li and
92
+ Haishun Nan
93
+ },
94
+ booktitle={International Conference on Computer Vision},
95
+ year={2025}
96
+ }
97
+
98
+ ```
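
The two download steps in the README's Checkpoint section could be scripted roughly as follows. This is a minimal sketch, not part of the commit: the repository IDs come from the links in the README, while the `checkpoints/` directory layout and the helper names `plan_downloads` and `download_all` are assumptions made for illustration. It relies on `snapshot_download` from the `huggingface_hub` package, which must be installed separately.

```python
from pathlib import Path

# Repository IDs taken from the links in the README's Checkpoint section.
BASE_MODEL_REPO = "stable-diffusion-v1-5/stable-diffusion-v1-5"
DYNAMICID_REPO = "meteorite2023/DynamicID"


def plan_downloads(root="checkpoints"):
    """Map each repo ID to a local target directory.

    The directory layout is a hypothetical choice for this sketch;
    the DynamicID code may expect a different one.
    """
    root = Path(root)
    return {
        BASE_MODEL_REPO: root / "stable-diffusion-v1-5",
        DYNAMICID_REPO: root / "DynamicID",
    }


def download_all(root="checkpoints"):
    """Fetch both repositories (needs network access and huggingface_hub)."""
    # Imported lazily so the planning helper works without the package.
    from huggingface_hub import snapshot_download

    paths = {}
    for repo_id, target in plan_downloads(root).items():
        paths[repo_id] = snapshot_download(repo_id=repo_id, local_dir=str(target))
    return paths


if __name__ == "__main__":
    for repo, path in download_all().items():
        print(f"{repo} -> {path}")
```

Running the script fetches every file in both repositories, which can be large; `snapshot_download` also accepts an `allow_patterns` argument if only some files are needed.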