@@ -1,11 +1,5 @@
----
-license: mit
-language:
-- ak
-library_name: diffusers
----
 <p align="center">
-  <img src="https://github.com/JackAILab/ConsistentID/assets/135965025/c0594480-d73d-4268-95ca-5494ca2a61e4" height=20>
 </p>
@@ -13,9 +7,10 @@ library_name: diffusers
 <div align="center">
-## ConsistentID : Portrait Generation with Multimodal Fine-Grained Identity Preserving  [![Paper page](https://huggingface.co/datasets/huggingface/badges/resolve/main/paper-page-md-dark.svg)]()
 [📄[Paper](https://arxiv.org/abs/2404.16771)] &emsp; [🚩[Project Page](https://ssugarwh.github.io/consistentid.github.io/)] &emsp; [🖼[Gradio Demo](http://consistentid.natapp1.cc/)] <br>
 </div>
@@ -39,10 +34,9 @@ library_name: diffusers
 ## 🚩 To-Do List
 Your star will help facilitate the process.
-- [x] Release training, evaluation code, and demo!
-- [ ] Retrain with more data and the SDXL base model to enhance aesthetics and generalization.
-- [ ] Release a multi-ID input version to guide the improvement of ID diversity.
-- [ ] Optimize training and inference structures to further improve text following and ID decoupling capabilities.
 ## 🏷️ Abstract
@@ -55,10 +49,17 @@ Finally, a large number of experiments were conducted in this article, and Consi
 ## 🔧 Requirements
-To install requirements:
-```setup
-pip3 install -r requirements.txt
 ```
 ## 📦️ Data Preparation
@@ -75,12 +76,11 @@ The .json file should be like
 ```
 [
     {
-        "resize_IMG": "Path to resized image...",
-        "parsing_color_IMG": "...",
         "parsing_mask_IMG": "...",
         "vqa_llva": "...",
         "id_embed_file_resize": "...",
-        "vqa_llva_more_face_detail": "..."
     },
     ...
 ]
@@ -109,13 +109,11 @@ The pre-trained model parameters of the model can now be downloaded on [Google D
 ## Acknowledgement
 * Inspired from many excellent demos and repos, including [IPAdapter](https://github.com/tencent-ailab/IP-Adapter), [FastComposer](https://github.com/mit-han-lab/fastcomposer), [PhotoMaker](https://github.com/TencentARC/PhotoMaker). Thanks for their great works!
 * Thanks to the open source contributions of the following work: [face-parsing.PyTorch](https://github.com/zllrunning/face-parsing.PyTorch), [LLaVA](https://github.com/haotian-liu/LLaVA), [insightface](https://github.com/deepinsight/insightface), [FFHQ](https://github.com/NVlabs/ffhq-dataset), [CelebA](https://github.com/switchablenorms/CelebAMask-HQ), [SFHQ](https://github.com/SelfishGene/SFHQ-dataset).
-* Thanks to the [HuggingFace](https://github.com/huggingface) gradio team for their free GPU support!
 ## Disclaimer
 This project strives to impact the domain of AI-driven image generation positively. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it responsibly. The developers do not assume any responsibility for potential misuse by users.
 ## Citation
 If you found this code helpful, please consider citing:
 ~~~
@@ -128,7 +126,3 @@ If you found this code helpful, please consider citing:
 ~~~

 <p align="center">
+  <img src="https://github.com/JackAILab/ConsistentID/assets/135965025/c0594480-d73d-4268-95ca-5494ca2a61e4" height=30>
 </p>
 <div align="center">
+## ConsistentID : Portrait Generation with Multimodal Fine-Grained Identity Preserving  [![Paper page](https://huggingface.co/datasets/huggingface/badges/resolve/main/paper-page-md-dark.svg)](https://arxiv.org/abs/2404.16771)
 [📄[Paper](https://arxiv.org/abs/2404.16771)] &emsp; [🚩[Project Page](https://ssugarwh.github.io/consistentid.github.io/)] &emsp; [🖼[Gradio Demo](http://consistentid.natapp1.cc/)] <br>
+[🤗[Faster Demo](https://huggingface.co/spaces/JackAILab/ConsistentID)] &emsp; <br>
 </div>
 ## 🚩 To-Do List
 Your star will help facilitate the process.
+- [x] Release ConsistentID training, evaluation code, and demo!
+- [ ] Release the SDXL model trained with more data, with enhanced resolution and generalizability.
+- [ ] Release the multi-ID input version to guide the improvement of ID diversity.
 ## 🏷️ Abstract
 ## 🔧 Requirements
+- Python >= 3.8 ([Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html) is recommended)
+- [PyTorch >= 2.0.0](https://pytorch.org/)
+- CUDA == 11.8
+```bash
+conda create --name ConsistentID python=3.8.10
+conda activate ConsistentID
+pip install -U pip
+# Install requirements
+pip install -r requirements.txt
 ```
 ## 📦️ Data Preparation
 ```
 [
     {
+        "IMG": "Path of image...",
         "parsing_mask_IMG": "...",
         "vqa_llva": "...",
         "id_embed_file_resize": "...",
+        "vqa_llva_facial": "..."
     },
     ...
 ]
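
As a rough illustration of the entry layout above, the following sketch loads such a `.json` file and reports entries that lack any of the keys shown in the updated example. The helper names `load_annotations` and `find_incomplete_entries` are hypothetical and not part of the ConsistentID code base, and treating every listed key as required is an assumption; the README excerpt only shows what an entry looks like.

```python
import json
from pathlib import Path

# Keys copied from the example entry above. Whether the training code
# requires all of them is an assumption made for this sketch.
EXPECTED_KEYS = {
    "IMG",
    "parsing_mask_IMG",
    "vqa_llva",
    "id_embed_file_resize",
    "vqa_llva_facial",
}


def load_annotations(json_path):
    """Load the top-level list of entries from an annotation .json file."""
    with Path(json_path).open(encoding="utf-8") as handle:
        entries = json.load(handle)
    if not isinstance(entries, list):
        raise ValueError("expected a top-level JSON list of entries")
    return entries


def find_incomplete_entries(entries):
    """Return (index, missing_keys) pairs for entries lacking expected keys."""
    problems = []
    for index, entry in enumerate(entries):
        missing = sorted(EXPECTED_KEYS - set(entry))
        if missing:
            problems.append((index, missing))
    return problems
```

Running `find_incomplete_entries(load_annotations(path))` on a data file before training would surface entries that still use the older key names removed in this change, such as `resize_IMG` or `vqa_llva_more_face_detail`.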
 ## Acknowledgement
 * Inspired from many excellent demos and repos, including [IPAdapter](https://github.com/tencent-ailab/IP-Adapter), [FastComposer](https://github.com/mit-han-lab/fastcomposer), [PhotoMaker](https://github.com/TencentARC/PhotoMaker). Thanks for their great works!
 * Thanks to the open source contributions of the following work: [face-parsing.PyTorch](https://github.com/zllrunning/face-parsing.PyTorch), [LLaVA](https://github.com/haotian-liu/LLaVA), [insightface](https://github.com/deepinsight/insightface), [FFHQ](https://github.com/NVlabs/ffhq-dataset), [CelebA](https://github.com/switchablenorms/CelebAMask-HQ), [SFHQ](https://github.com/SelfishGene/SFHQ-dataset).
+* 🤗 Thanks to the Hugging Face Gradio team for the free GPU support via [ZeroGPUs](https://github.com/huggingface)!
 ## Disclaimer
 This project strives to impact the domain of AI-driven image generation positively. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it responsibly. The developers do not assume any responsibility for potential misuse by users.
 ## Citation
 If you found this code helpful, please consider citing:
 ~~~
 ~~~