---
base_model:
- stable-diffusion-v1-5/stable-diffusion-v1-5
license: apache-2.0
pipeline_tag: text-to-image
library_name: diffusers
---

<meta name="google-site-verification" content="-XQC-POJtlDPD3i2KSOxbFkSBde_Uq9obAIh_4mxTkM" />




<div align="center">
  
<h2>DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability</h2>
<h3>[ICCV 2025]</h3>

[Xirui Hu](https://openreview.net/profile?id=~Xirui_Hu1),
[Jiahao Wang](https://openreview.net/profile?id=~Jiahao_Wang14),
[Hao Chen](https://openreview.net/profile?id=~Hao_chen100),
[Weizhan Zhang](https://openreview.net/profile?id=~Weizhan_Zhang1),
[Benqi Wang](https://openreview.net/profile?id=~Benqi_Wang2),
[Yikun Li](https://openreview.net/profile?id=~Yikun_Li1),
[Haishun Nan](https://openreview.net/profile?id=~Haishun_Nan1)

[![arXiv](https://img.shields.io/badge/arXiv-2503.06505-b31b1b.svg)](https://arxiv.org/abs/2503.06505)
[![GitHub](https://img.shields.io/badge/GitHub-DynamicID-blue?logo=github)](https://github.com/ByteCat-bot/DynamicID)
</div>

---
This is the official implementation of DynamicID, a framework that generates visually harmonious images featuring **multiple individuals**. Each person in the image can be specified through user-provided reference images, and most notably, our method enables **independent control of each individual's facial expression** via text prompts. Hope you have fun with this demo!

---

## πŸ” Abstract

Recent advancements in text-to-image generation have spurred interest in personalized human image generation. Although existing methods achieve high-fidelity identity preservation, they often struggle with **limited multi-ID usability** and **inadequate facial editability**. 

We present DynamicID, a tuning-free framework that inherently facilitates both single-ID and multi-ID personalized generation with high fidelity and flexible facial editability. Our key innovations include: 

- Semantic-Activated Attention (SAA), which employs query-level activation gating to minimize disruption to the original model when injecting ID features and achieve multi-ID personalization without requiring multi-ID samples during training. 

- Identity-Motion Reconfigurator (IMR), which applies feature-space manipulation to effectively disentangle and reconfigure facial motion and identity features, supporting flexible facial editing.

- A task-decoupled training paradigm that reduces data dependency.

- A curated VariFace-10k facial dataset, comprising 10k unique individuals, each represented by 35 distinct facial images. 

Experimental results demonstrate that DynamicID outperforms state-of-the-art methods in identity fidelity, facial editability, and multi-ID personalization capability.

## πŸ’‘ Method

<div align="center">
    <img src="assets/pipeline.jpg" width="1000">
</div>

The proposed framework is architected around two core components: SAA and IMR. (a) In the anchoring stage, we jointly optimize the SAA and a face encoder to establish robust single-ID and multi-ID personalized generation capabilities. (b) Subsequently in the reconfiguration stage, we freeze these optimized components and leverage them to train the IMR for flexible and fine-grained facial editing.

## πŸš€ Checkpoint

1. Download the pretrained Stable Diffusion v1.5 checkpoint from [Stable Diffusion v1.5 on Hugging Face](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5).

2. Download our SAA-related and IMR-related checkpoints from  [DynamicID Checkpoints on Hugging Face](https://huggingface.co/meteorite2023/DynamicID).
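Both downloads can also be done programmatically with `huggingface_hub` (a sketch; the local directory names below are arbitrary choices, not paths required by the code):

```python
from huggingface_hub import snapshot_download

# 1. Base Stable Diffusion v1.5 weights (repo id as listed in this card's metadata)
sd_dir = snapshot_download(
    repo_id="stable-diffusion-v1-5/stable-diffusion-v1-5",
    local_dir="checkpoints/sd15",  # arbitrary local path
)

# 2. DynamicID SAA- and IMR-related checkpoints
dynamicid_dir = snapshot_download(
    repo_id="meteorite2023/DynamicID",
    local_dir="checkpoints/DynamicID",  # arbitrary local path
)
```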

## ⚑ Sample Usage (Diffusers)

The official inference code is available in the [GitHub repository](https://github.com/ByteCat-bot/DynamicID), which provides detailed instructions for running the model. A typical usage with the `diffusers` library would involve loading the base Stable Diffusion pipeline and then integrating the DynamicID specific weights.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the base Stable Diffusion pipeline
# Ensure you have downloaded the base model locally or from Hugging Face Hub
pipeline = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load DynamicID specific weights (e.g., LoRAs or custom UNet modifications)
# The precise method for loading these weights will be detailed in the official repository.
# For conceptual understanding, it might involve:
# pipeline.load_lora_weights("path/to/DynamicID/weights")
# Or integrating custom UNet/attention layers as per the DynamicID implementation.

# Refer to the official GitHub repository for the exact loading and inference pipeline.
# You would then pass your text prompt and identity reference images to the pipeline.
# Example (conceptual):
# prompt = "a photo of [person1] with a big smile and [person2] looking thoughtful"
# generated_image = pipeline(
#     prompt=prompt,
#     identity_references=[id_image_1, id_image_2], # Placeholder for identity images
#     # Add other parameters as specified in the DynamicID code
# ).images[0]
```

## 🌈 Gallery

<div align="center">
    <img src="assets/teaser.jpg" width="900">
    <br><br><br>
    <img src="assets/single.jpg" width="900">
    <br><br><br>
    <img src="assets/multi.jpg" width="900">
</div>

## πŸ“Œ ToDo List

- [x] Release technical report
- [x] Release **training and inference code**
- [x] Release **Dynamic-sd** (based on *stable diffusion v1.5*)  
- [ ] Release **Dynamic-flux** (based on *Flux-dev*)
- [ ] Release a Hugging Face Demo Space

## πŸ“– Citation
If you find our work helpful, please cite our paper.
```bibtex
@inproceedings{dynamicid,
  title={DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability},
  author={Hu, Xirui and Wang, Jiahao and Chen, Hao and Zhang, Weizhan and Wang, Benqi and Li, Yikun and Nan, Haishun},
  booktitle={International Conference on Computer Vision},
  year={2025}
}
```