EndlessSora committed on
Commit b1a623b · 1 Parent(s): 035faf6

update README.md

Files changed (1)
  1. README.md +6 -8
README.md CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: text-to-image

  tags:
  - Text-to-Image
- - Flux.1-dev
+ - FLUX.1-dev
  - image-generation
  - Diffusion-Transformer
  - subject-personalization
@@ -18,13 +18,11 @@ base_model: black-forest-labs/FLUX.1-dev

  # InfiniteYou Model Card

- <div align="center">
-
+ <div style="display:flex;justify-content: center">
  <a href="https://bytedance.github.io/InfiniteYou"><img src="https://img.shields.io/static/v1?label=Project&message=Page&color=blue&logo=github-pages"></a> &ensp;
  <a href="https://arxiv.org/abs/2503.xxxxx"><img src="https://img.shields.io/static/v1?label=Arxiv&message=InfiniteYou&color=darkred&logo=arxiv"></a> &ensp;
  <a href="https://github.com/bytedance/InfiniteYou"><img src="https://img.shields.io/static/v1?label=GitHub&message=Code&color=green&logo=github"></a> &ensp;
  <a href=""><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Demo&color=orange"></a> &ensp;
-
  </div>

  ![teaser](./assets/teaser.jpg)
@@ -40,12 +38,12 @@ This repository provides the official models for the following paper:
  [Xin Lu](https://scholar.google.com/citations?user=mFC0wp8AAAAJ)<br />
  ByteDance Intelligent Creation

- > **Abstract:** *Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce **InfiniteYou (InfU)**, one of the earliest robust frameworks leveraging DiTs for this task. InfU addresses significant issues of existing methods, such as insufficient identity similarity, poor text-image alignment, and low generation quality and aesthetics. Central to InfU is InfuseNet, a component that injects identity features into the DiT base model via residual connections, enhancing identity similarity while maintaining generation capabilities. A multi-stage training strategy, including pretraining and supervised fine-tuning (SFT) with synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment, ameliorates image quality, and alleviates face copy-pasting. Extensive experiments demonstrate that InfU achieves state-of-the-art performance, surpassing existing baselines. In addition, the plug-and-play design of InfU ensures compatibility with various existing methods, offering a valuable contribution to the broader community.*
+ > **Abstract:** Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce **InfiniteYou (InfU)**, one of the earliest robust frameworks leveraging DiTs for this task. InfU addresses significant issues of existing methods, such as insufficient identity similarity, poor text-image alignment, and low generation quality and aesthetics. Central to InfU is InfuseNet, a component that injects identity features into the DiT base model via residual connections, enhancing identity similarity while maintaining generation capabilities. A multi-stage training strategy, including pretraining and supervised fine-tuning (SFT) with synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment, ameliorates image quality, and alleviates face copy-pasting. Extensive experiments demonstrate that InfU achieves state-of-the-art performance, surpassing existing baselines. In addition, the plug-and-play design of InfU ensures compatibility with various existing methods, offering a valuable contribution to the broader community.


  ## 🔧 Installation and Usage

- Please clone our [GitHub code repository](https://github.com/bytedance/InfiniteYou) and follow the instructions to use the released models for local inference.
+ Please clone our [GitHub code repository](https://github.com/bytedance/InfiniteYou) and follow the detailed instructions to use the released models for local inference.

  <!-- We appreciate the GPU grant from the Hugging Face team. -->
  You can also try our [InfiniteYou-FLUX Hugging Face demo]() online.
@@ -59,10 +57,10 @@ You can also try our [InfiniteYou-FLUX Hugging Face demo]() online.

  - We also provided two LoRAs ([Realism](https://civitai.com/models/631986?modelVersionId=706528) and [Anti-blur](https://civitai.com/models/675581/anti-blur-flux-lora)) to enable additional usage flexibility. They are *entirely optional*, which are examples to facilitate users to try but are NOT used in our paper.

- - If the generated gender does not align with your preferences, try adding specific words in the text prompt, such as 'a man', 'a woman', *etc*. We encourage users to use inclusive and respectful language.
+ - If the generated gender does not align with your preferences, try adding specific words in the text prompt, such as 'a man', 'a woman', *etc*. We encourage using inclusive and respectful language.


- ## :european_castle: Model Zoo
+ ## 🏰 Model Zoo

  | InfiniteYou Version | Model Version | Base Model Trained with | Description |
  | :---: | :---: | :---: | :---: |