Text-to-Image
Diffusers
CoreloneH committed on
Commit c80a977 · verified · 1 Parent(s): c8292a2

Create README.md

Files changed (1)
  1. README.md +75 -0
README.md CHANGED
@@ -0,0 +1,75 @@
+ ---
+ license: cc-by-nc-nd-4.0
+ ---
+
+ # RealCustom Series
+ <p align="center">
+ <img src="./assets/teaser.svg" width=95% height=95%
+ class="center">
+ </p>
+
+ ## 📖 Introduction
+
+ Existing text-to-image customization methods (i.e., subject-driven generation) face a fundamental challenge: the influences of the visual and textual conditions are entangled. This inherent conflict forces a trade-off between subject fidelity and textual controllability, so the two objectives cannot be optimized at the same time. We present RealCustom, which disentangles subject similarity from text controllability and thereby allows both to be optimized simultaneously without conflict. The core idea of RealCustom is to represent the given subject as real words that can be seamlessly integrated with the given text, and to further leverage the relevance between real words and image regions to disentangle the visual condition from the text condition.
+
+ <p align="center">
+ <img src="./assets/process.svg" width=95% height=95%
+ class="center">
+ </p>
+
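As a rough way to picture the core idea described in the Introduction, the toy sketch below uses per-region relevance scores for a real word to decide where the subject (visual) signal is applied, leaving the remaining regions to the text alone. This is not the RealCustom implementation: the function names, the top-fraction rule, and the additive blend are all assumptions made for illustration.

```python
# Illustrative sketch only: every name below is hypothetical and
# is not taken from the RealCustom code base.

def top_fraction_mask(relevance, keep_fraction):
    """Mark the most word-relevant regions with 1.0 and the rest with 0.0."""
    keep = max(1, round(len(relevance) * keep_fraction))
    # Region indices ordered from highest to lowest relevance score.
    ranked = sorted(range(len(relevance)), key=lambda i: relevance[i], reverse=True)
    selected = set(ranked[:keep])
    return [1.0 if i in selected else 0.0 for i in range(len(relevance))]


def blend_conditions(text_features, visual_features, mask):
    """Add the visual (subject) signal only inside the masked regions."""
    return [t + m * v for t, v, m in zip(text_features, visual_features, mask)]


# Toy example: four image regions scored against one real word (assumed values).
relevance = [0.05, 0.60, 0.30, 0.05]
mask = top_fraction_mask(relevance, keep_fraction=0.5)
blended = blend_conditions([1.0, 1.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.5], mask)
print(mask)
print(blended)
```

In this toy setup only the two regions most related to the word receive the subject signal, so the text condition alone governs the rest of the image.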
+ ## ⚡️ Quick Start
+
+ ### 🔧 Requirements and Installation
+
+ Install the requirements:
+ ```bash
+ bash envs/init.sh
+ ```
+
+ ### ✍️ Inference
+ ```bash
+ bash inference/inference_single_image.sh
+ ```
+
+ ### 🌟 Gradio Demo
+ ```bash
+ python inference/app.py
+ ```
+
+ ### 🎨 Enjoy on [Dreamina](https://jimeng.jianying.com/ai-tool/home)
+ RealCustom was previously deployed commercially in ByteDance's Dreamina and Doubao. You can also try the more advanced customization algorithm in Dreamina!
+
+ #### Step 1: Create a Character
+ Create a character image and a corresponding appearance description from prompt descriptions or uploaded reference images. Specifically:
+ 1. **Character Image**: Works best with a clean background, a close-up view, a prominent subject, and high resolution.
+ 2. **Character Description**: Keep it brief and include the subject and its key appearance elements.
+ <p align="center">
+ <img src="./assets/dreamina_character.jpg" width=50% height=50%
+ class="center">
+ </p>
+
+ #### Step 2: Character-Driven Generation
+ Enter a prompt in which the subject is replaced by the selected character. The prompt guides the character through the corresponding changes, such as style, actions, expressions, scenes, and modifiers.
+ There is no need to describe the subject in the prompt. "Face Reference Strength" is the weight for ID retention, and "Body Reference Strength" is the weight for IP retention.
+ <p align="center">
+ <img src="./assets/dreamina_generation.jpg" width=50% height=50%
+ class="center">
+ </p>
+
+ ## Citation
+ If you find this project useful for your research, please consider citing our papers:
+ ```bibtex
+ @inproceedings{huang2024realcustom,
+ title={RealCustom: narrowing real text word for real-time open-domain text-to-image customization},
+ author={Huang, Mengqi and Mao, Zhendong and Liu, Mingcong and He, Qian and Zhang, Yongdong},
+ booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
+ pages={7476--7485},
+ year={2024}
+ }
+ @article{mao2024realcustom++,
+ title={RealCustom++: Representing images as real-word for real-time customization},
+ author={Mao, Zhendong and Huang, Mengqi and Ding, Fei and Liu, Mingcong and He, Qian and Zhang, Yongdong},
+ journal={arXiv preprint arXiv:2408.09744},
+ year={2024}
+ }
+ ```