ByteDance/InfiniteYou

@@ -8,7 +8,7 @@ pipeline_tag: text-to-image
 tags:
 - Text-to-Image
-- Flux.1-dev
 - image-generation
 - Diffusion-Transformer
 - subject-personalization
@@ -18,13 +18,11 @@ base_model: black-forest-labs/FLUX.1-dev
 # InfiniteYou Model Card
-<div align="center">
 <a href="https://bytedance.github.io/InfiniteYou"><img src="https://img.shields.io/static/v1?label=Project&message=Page&color=blue&logo=github-pages"></a> &ensp;
 <a href="https://arxiv.org/abs/2503.xxxxx"><img src="https://img.shields.io/static/v1?label=Arxiv&message=InfiniteYou&color=darkred&logo=arxiv"></a> &ensp;
 <a href="https://github.com/bytedance/InfiniteYou"><img src="https://img.shields.io/static/v1?label=GitHub&message=Code&color=green&logo=github"></a> &ensp;
 <a href=""><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Demo&color=orange"></a> &ensp;
 </div>
 ![teaser](./assets/teaser.jpg)
@@ -40,12 +38,12 @@ This repository provides the official models for the following paper:
 [Xin Lu](https://scholar.google.com/citations?user=mFC0wp8AAAAJ)<br />
 ByteDance Intelligent Creation
-> **Abstract:** *Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce **InfiniteYou (InfU)**, one of the earliest robust frameworks leveraging DiTs for this task. InfU addresses significant issues of existing methods, such as insufficient identity similarity, poor text-image alignment, and low generation quality and aesthetics. Central to InfU is InfuseNet, a component that injects identity features into the DiT base model via residual connections, enhancing identity similarity while maintaining generation capabilities. A multi-stage training strategy, including pretraining and supervised fine-tuning (SFT) with synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment, ameliorates image quality, and alleviates face copy-pasting. Extensive experiments demonstrate that InfU achieves state-of-the-art performance, surpassing existing baselines. In addition, the plug-and-play design of InfU ensures compatibility with various existing methods, offering a valuable contribution to the broader community.*
 ## 🔧 Installation and Usage
-Please clone our [GitHub code repository](https://github.com/bytedance/InfiniteYou) and follow the instructions to use the released models for local inference.
 <!-- We appreciate the GPU grant from the Hugging Face team.  -->
 You can also try our [InfiniteYou-FLUX Hugging Face demo]() online.
@@ -59,10 +57,10 @@ You can also try our [InfiniteYou-FLUX Hugging Face demo]() online.
 - We also provide two LoRAs ([Realism](https://civitai.com/models/631986?modelVersionId=706528) and [Anti-blur](https://civitai.com/models/675581/anti-blur-flux-lora)) for additional usage flexibility. They are *entirely optional* examples for users to try and are NOT used in our paper.
-- If the generated gender does not align with your preferences, try adding specific words in the text prompt, such as 'a man', 'a woman', *etc*. We encourage users to use inclusive and respectful language.
-## :european_castle: Model Zoo
 | InfiniteYou Version | Model Version | Base Model Trained with | Description |
 | :---: | :---: | :---: | :---: |

 tags:
 - Text-to-Image
+- FLUX.1-dev
 - image-generation
 - Diffusion-Transformer
 - subject-personalization
 # InfiniteYou Model Card
+<div style="display:flex;justify-content: center">
 <a href="https://bytedance.github.io/InfiniteYou"><img src="https://img.shields.io/static/v1?label=Project&message=Page&color=blue&logo=github-pages"></a> &ensp;
 <a href="https://arxiv.org/abs/2503.xxxxx"><img src="https://img.shields.io/static/v1?label=Arxiv&message=InfiniteYou&color=darkred&logo=arxiv"></a> &ensp;
 <a href="https://github.com/bytedance/InfiniteYou"><img src="https://img.shields.io/static/v1?label=GitHub&message=Code&color=green&logo=github"></a> &ensp;
 <a href=""><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Demo&color=orange"></a> &ensp;
 </div>
 ![teaser](./assets/teaser.jpg)
 [Xin Lu](https://scholar.google.com/citations?user=mFC0wp8AAAAJ)<br />
 ByteDance Intelligent Creation
+> **Abstract:** Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce **InfiniteYou (InfU)**, one of the earliest robust frameworks leveraging DiTs for this task. InfU addresses significant issues of existing methods, such as insufficient identity similarity, poor text-image alignment, and low generation quality and aesthetics. Central to InfU is InfuseNet, a component that injects identity features into the DiT base model via residual connections, enhancing identity similarity while maintaining generation capabilities. A multi-stage training strategy, including pretraining and supervised fine-tuning (SFT) with synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment, ameliorates image quality, and alleviates face copy-pasting. Extensive experiments demonstrate that InfU achieves state-of-the-art performance, surpassing existing baselines. In addition, the plug-and-play design of InfU ensures compatibility with various existing methods, offering a valuable contribution to the broader community.
 ## 🔧 Installation and Usage
+Please clone our [GitHub code repository](https://github.com/bytedance/InfiniteYou) and follow the detailed instructions to use the released models for local inference.
 <!-- We appreciate the GPU grant from the Hugging Face team.  -->
 You can also try our [InfiniteYou-FLUX Hugging Face demo]() online.
 - We also provide two LoRAs ([Realism](https://civitai.com/models/631986?modelVersionId=706528) and [Anti-blur](https://civitai.com/models/675581/anti-blur-flux-lora)) for additional usage flexibility. They are *entirely optional* examples for users to try and are NOT used in our paper.
+- If the generated gender does not align with your preferences, try adding specific words in the text prompt, such as 'a man', 'a woman', *etc*. We encourage using inclusive and respectful language.
+## 🏰 Model Zoo
 | InfiniteYou Version | Model Version | Base Model Trained with | Description |
 | :---: | :---: | :---: | :---: |