Update README.md
README.md
CHANGED
|
@@ -1,11 +1,5 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: mit
|
| 3 |
-
language:
|
| 4 |
-
- ak
|
| 5 |
-
library_name: diffusers
|
| 6 |
-
---
|
| 7 |
<p align="center">
|
| 8 |
-
<img src="https://github.com/JackAILab/ConsistentID/assets/135965025/c0594480-d73d-4268-95ca-5494ca2a61e4" height=
|
| 9 |
|
| 10 |
</p>
|
| 11 |
|
|
@@ -13,9 +7,10 @@ library_name: diffusers
|
|
| 13 |
|
| 14 |
<div align="center">
|
| 15 |
|
| 16 |
-
## ConsistentID : Portrait Generation with Multimodal Fine-Grained Identity Preserving []()
|
| 17 |
[[Paper](https://arxiv.org/abs/2404.16771)]   [🚩[Project Page](https://ssugarwh.github.io/consistentid.github.io/)]   [🖼[Gradio Demo](http://consistentid.natapp1.cc/)] <br>
|
| 18 |
|
|
|
|
| 19 |
|
| 20 |
</div>
|
| 21 |
|
|
@@ -39,10 +34,9 @@ library_name: diffusers
|
|
| 39 |
|
| 40 |
## 🚩 To-Do List
|
| 41 |
Your star will help facilitate the process.
|
| 42 |
-
- [x] Release training, evaluation code, and demo!
|
| 43 |
-
- [ ]
|
| 44 |
-
- [ ] Release
|
| 45 |
-
- [ ] Optimize training and inference structures to further improve text following and ID decoupling capabilities.
|
| 46 |
|
| 47 |
## 🏷️ Abstract
|
| 48 |
|
|
@@ -55,10 +49,17 @@ Finally, a large number of experiments were conducted in this article, and Consi
|
|
| 55 |
|
| 56 |
## 🔧 Requirements
|
| 57 |
|
| 58 |
-
|
| 59 |
|
| 60 |
-
```
|
| 61 |
-
|
| 62 |
```
|
| 63 |
|
| 64 |
## 📦️ Data Preparation
|
|
@@ -75,12 +76,11 @@ The .json file should be like
|
|
| 75 |
```
|
| 76 |
[
|
| 77 |
{
|
| 78 |
-
"
|
| 79 |
-
"parsing_color_IMG": "...",
|
| 80 |
"parsing_mask_IMG": "...",
|
| 81 |
"vqa_llva": "...",
|
| 82 |
"id_embed_file_resize": "...",
|
| 83 |
-
"
|
| 84 |
},
|
| 85 |
...
|
| 86 |
]
|
|
@@ -109,13 +109,11 @@ The pre-trained model parameters of the model can now be downloaded on [Google D
|
|
| 109 |
## Acknowledgement
|
| 110 |
* Inspired by many excellent demos and repos, including [IPAdapter](https://github.com/tencent-ailab/IP-Adapter), [FastComposer](https://github.com/mit-han-lab/fastcomposer), [PhotoMaker](https://github.com/TencentARC/PhotoMaker). Thanks for their great work!
|
| 111 |
* Thanks to the open-source contributions of the following projects: [face-parsing.PyTorch](https://github.com/zllrunning/face-parsing.PyTorch), [LLaVA](https://github.com/haotian-liu/LLaVA), [insightface](https://github.com/deepinsight/insightface), [FFHQ](https://github.com/NVlabs/ffhq-dataset), [CelebA](https://github.com/switchablenorms/CelebAMask-HQ), [SFHQ](https://github.com/SelfishGene/SFHQ-dataset).
|
| 112 |
-
* Thanks to the [
|
| 113 |
-
|
| 114 |
|
| 115 |
## Disclaimer
|
| 116 |
This project strives to impact the domain of AI-driven image generation positively. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it responsibly. The developers do not assume any responsibility for potential misuse by users.
|
| 117 |
|
| 118 |
-
|
| 119 |
## Citation
|
| 120 |
If you found this code helpful, please consider citing:
|
| 121 |
~~~
|
|
@@ -128,7 +126,3 @@ If you found this code helpful, please consider citing:
|
|
| 128 |
~~~
|
| 129 |
|
| 130 |
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
|
| 1 |
<p align="center">
|
| 2 |
+
<img src="https://github.com/JackAILab/ConsistentID/assets/135965025/c0594480-d73d-4268-95ca-5494ca2a61e4" height=30>
|
| 3 |
|
| 4 |
</p>
|
| 5 |
|
|
|
|
| 7 |
|
| 8 |
<div align="center">
|
| 9 |
|
| 10 |
+
## ConsistentID : Portrait Generation with Multimodal Fine-Grained Identity Preserving [](https://arxiv.org/abs/2404.16771)
|
| 11 |
[[Paper](https://arxiv.org/abs/2404.16771)]   [🚩[Project Page](https://ssugarwh.github.io/consistentid.github.io/)]   [🖼[Gradio Demo](http://consistentid.natapp1.cc/)] <br>
|
| 12 |
|
| 13 |
+
[🤗[Faster Demo](https://huggingface.co/spaces/JackAILab/ConsistentID)] <br>
|
| 14 |
|
| 15 |
</div>
|
| 16 |
|
|
|
|
| 34 |
|
| 35 |
## 🚩 To-Do List
|
| 36 |
Your star will help facilitate the process.
|
| 37 |
+
- [x] Release ConsistentID training, evaluation code, and demo!
|
| 38 |
+
- [ ] Release the SDXL model trained with more data, with enhanced resolution and generalizability.
|
| 39 |
+
- [ ] Release the multi-ID input version to guide the improvement of ID diversity.
|
|
|
|
| 40 |
|
| 41 |
## 🏷️ Abstract
|
| 42 |
|
|
|
|
| 49 |
|
| 50 |
## 🔧 Requirements
|
| 51 |
|
| 52 |
+
- Python >= 3.8 (we recommend [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
|
| 53 |
+
- [PyTorch >= 2.0.0](https://pytorch.org/)
|
| 54 |
+
- CUDA == 11.8
|
| 55 |
|
| 56 |
+
```bash
|
| 57 |
+
conda create --name ConsistentID python=3.8.10
|
| 58 |
+
conda activate ConsistentID
|
| 59 |
+
pip install -U pip
|
| 60 |
+
|
| 61 |
+
# Install requirements
|
| 62 |
+
pip install -r requirements.txt
|
| 63 |
```
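
After installing, the three requirements listed above (Python >= 3.8, PyTorch >= 2.0.0, CUDA 11.8) can be checked with a short script. This is a minimal sketch and not part of the repository: the helper names `parse_version` and `check_environment` are hypothetical, and it assumes the CUDA requirement refers to the CUDA version PyTorch was built against, as reported by `torch.version.cuda`.

```python
import sys


def parse_version(text):
    """Turn a version string such as '2.1.0+cu118' into a 3-tuple like (2, 1, 0)."""
    core = text.split("+", 1)[0]  # drop local build suffixes such as '+cu118'
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '11.8' compares as (11, 8, 0)
    return tuple((parts + [0, 0, 0])[:3])


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python >= 3.8 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if parse_version(torch.__version__) < (2, 0, 0):
        problems.append("PyTorch >= 2.0.0 is required, found " + str(torch.__version__))
    cuda = torch.version.cuda  # None on CPU-only builds
    if cuda is None or parse_version(cuda)[:2] != (11, 8):
        problems.append("CUDA 11.8 build of PyTorch expected, found " + str(cuda))
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment matches the listed requirements."]:
        print(line)
```

Running the script inside the activated `ConsistentID` conda environment prints one line per mismatch, or a single confirmation line when nothing is flagged.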
|
| 64 |
|
| 65 |
## 📦️ Data Preparation
|
|
|
|
| 76 |
```
|
| 77 |
[
|
| 78 |
{
|
| 79 |
+
"IMG": "Path of image...",
|
|
|
|
| 80 |
"parsing_mask_IMG": "...",
|
| 81 |
"vqa_llva": "...",
|
| 82 |
"id_embed_file_resize": "...",
|
| 83 |
+
"vqa_llva_facial": "..."
|
| 84 |
},
|
| 85 |
...
|
| 86 |
]
|
|
|
|
| 109 |
## Acknowledgement
|
| 110 |
* Inspired by many excellent demos and repos, including [IPAdapter](https://github.com/tencent-ailab/IP-Adapter), [FastComposer](https://github.com/mit-han-lab/fastcomposer), [PhotoMaker](https://github.com/TencentARC/PhotoMaker). Thanks for their great work!
|
| 111 |
* Thanks to the open-source contributions of the following projects: [face-parsing.PyTorch](https://github.com/zllrunning/face-parsing.PyTorch), [LLaVA](https://github.com/haotian-liu/LLaVA), [insightface](https://github.com/deepinsight/insightface), [FFHQ](https://github.com/NVlabs/ffhq-dataset), [CelebA](https://github.com/switchablenorms/CelebAMask-HQ), [SFHQ](https://github.com/SelfishGene/SFHQ-dataset).
|
| 112 |
+
* 🤗 Thanks to the Hugging Face Gradio team ([ZeroGPUs](https://github.com/huggingface)) for their free GPU support!
|
|
|
|
| 113 |
|
| 114 |
## Disclaimer
|
| 115 |
This project strives to impact the domain of AI-driven image generation positively. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it responsibly. The developers do not assume any responsibility for potential misuse by users.
|
| 116 |
|
|
|
|
| 117 |
## Citation
|
| 118 |
If you found this code helpful, please consider citing:
|
| 119 |
~~~
|
|
|
|
| 126 |
~~~
|
| 127 |
|
| 128 |
|