oahzxl committed
Commit 9ae1e47 (verified) · 1 Parent(s): 7d3f802

update readme

.gitattributes CHANGED
@@ -33,3 +33,14 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ assets/animation.gif filter=lfs diff=lfs merge=lfs -text
37
+ assets/gsb.png filter=lfs diff=lfs merge=lfs -text
38
+ assets/input_1_1.png filter=lfs diff=lfs merge=lfs -text
39
+ assets/input_1_2.png filter=lfs diff=lfs merge=lfs -text
40
+ assets/showcase1.png filter=lfs diff=lfs merge=lfs -text
41
+ assets/showcase2.png filter=lfs diff=lfs merge=lfs -text
42
+ assets/showcase3.png filter=lfs diff=lfs merge=lfs -text
43
+ assets/showcase4.png filter=lfs diff=lfs merge=lfs -text
44
+ assets/showcase5.png filter=lfs diff=lfs merge=lfs -text
45
+ assets/teaser.png filter=lfs diff=lfs merge=lfs -text
46
+ assets/workflow.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,5 +1,204 @@
1
  ---
2
- base_model:
3
- - tencent/HunyuanImage-3.0-Instruct
4
- pipeline_tag: image-to-image
5
- ---
1
+ <div align="center">
2
+ <img src="./assets/tencent-hy-wu-logo.svg" alt="HY-WU Logo" width="600">
3
+
4
+ # HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing
5
+ </div>
6
+
7
+ <div align="center">
8
+ <img src="./assets/teaser.png" alt="HY-WU Teaser" width="800">
9
+ </div>
10
+
11
+ <div align="center">
12
+ <a href=https://tencent-hy-wu.github.io/ target="_blank"><img src=https://img.shields.io/badge/🌐%20Demo-4285F4.svg height=22px></a>
13
+ <a href=https://huggingface.co/tencent/HY-WU target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-d96902.svg height=22px></a>
14
+ <a href=https://github.com/Tencent-Hunyuan/HY-WU target="_blank"><img src=https://img.shields.io/badge/GitHub-181717.svg?logo=github height=22px></a>
15
+ <a href=https://github.com/Tencent-Hunyuan/HY-WU/assets/report.pdf target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
16
+ <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
17
+ <a href=https://docs.qq.com/doc/DUVVadmhCdG9qRXBU target="_blank"><img src=https://img.shields.io/badge/📚-PromptHandBook-grey.svg?logo=book height=22px></a>
18
+ </div>
19
+
20
+ <!-- <p align="center">
21
+ 👏 Join our <a href="./assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/ehjWMqF5wY">Discord</a> |
22
+ 💻 <a href="https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct">Official website(官网) Try our model!</a>&nbsp&nbsp
23
+ </p> -->
24
+
25
+ ## 🔥 News
26
+
27
+ - **March 6, 2025**: 🎉 **[HY-WU](https://github.com/Tencent-Hunyuan/HY-WU)** is open-sourced: inference code and model weights are publicly available.
28
+
29
+ ## 🗂️ Contents
30
+ - [🔥 News](#-news)
31
+ - [📖 Introduction](#-introduction)
32
+ - [✨ Key Features](#-key-features)
33
+ - [🖼 Showcases](#-showcases)
34
+ - [📑 Open-Source Plan](#-open-source-plan)
35
+ - [🚀 Usage](#-usage)
36
+ - [🧱 Memory Requirement](#-memory-requirement)
37
+ - [📊 Evaluation](#-evaluation)
38
+ - [📚 Citation](#-citation)
39
+
40
  ---
41
+
42
+ ## 📖 Introduction
43
+
44
+ We propose HY-WU: a scalable framework for on-the-fly conditional generation of low-rank (LoRA) updates.
45
+ HY-WU synthesizes instance-conditioned adapter weights from hybrid image–instruction representations and injects them into a frozen backbone during the forward pass, producing instance-specific operators without test-time optimization.
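The mechanism described above can be illustrated with a minimal sketch. It assumes a toy linear generator (the hypothetical `gen_A` and `gen_B` matrices) standing in for HY-WU's actual weight generator, and a plain condition vector standing in for the hybrid image and instruction representation; none of these names come from the HY-WU codebase. The point is only the data flow: the condition produces low-rank factors `A` and `B`, and the frozen weight `W` is applied as `W x + B (A x)` without being modified.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def generate_lora(cond, gen_A, gen_B, d_in, d_out, rank):
    """Map a condition vector to LoRA factors A (rank x d_in) and B (d_out x rank).

    Assumption: a single linear map replaces the real generator network.
    """
    flat_A = matvec(gen_A, cond)
    flat_B = matvec(gen_B, cond)
    A = [flat_A[i * d_in:(i + 1) * d_in] for i in range(rank)]
    B = [flat_B[i * rank:(i + 1) * rank] for i in range(d_out)]
    return A, B

def adapted_forward(W, A, B, x):
    """Frozen layer plus a per-instance low-rank update: W x + B (A x)."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + d for b, d in zip(base, delta)]

rng = random.Random(0)
d_in, d_out, rank, d_cond = 4, 3, 2, 5

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

W = rand_matrix(d_out, d_in)              # frozen backbone weight, never updated
gen_A = rand_matrix(rank * d_in, d_cond)  # hypothetical generator parameters
gen_B = rand_matrix(d_out * rank, d_cond)
x = [rng.uniform(-1, 1) for _ in range(d_in)]

# Two different conditions yield two different effective operators on the same input.
cond_1 = [rng.uniform(-1, 1) for _ in range(d_cond)]
cond_2 = [rng.uniform(-1, 1) for _ in range(d_cond)]
y_1 = adapted_forward(W, *generate_lora(cond_1, gen_A, gen_B, d_in, d_out, rank), x)
y_2 = adapted_forward(W, *generate_lora(cond_2, gen_A, gen_B, d_in, d_out, rank), x)
print(y_1)
print(y_2)
```

No gradient step is taken at any point: the per-instance behavior comes entirely from the generated factors, which is what "without test-time optimization" refers to.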
46
+
47
+ <div align="center">
48
+ <img src="./assets/animation.gif" alt="HY-WU Animation" width="800">
49
+ </div>
50
+
51
+ ## ✨ Key Features
52
+
53
+ * 🧠 **Functional Neural Memory:**
54
+ Introduces a lightweight “neural memory” for AI. It generates a conditioned model adapter per request (without fine-tuning!), enabling instance-level personalization while preserving the base model’s general capability.
55
+
56
+ * 🏆 **Scalable for Large Models:**
57
+ HY-WU remains practical for large foundation models (even at 80B parameters!). With structured parameter tokenization, the method is naturally compatible with large-scale architectures.
58
+
59
+ * 🎨 **Strong Human Preference:**
60
+ HY-WU achieves high human preference win-rates against open-source models, exceeds strong closed-source baselines, and remains close to the latest Nano-Banana series.
61
+
62
+ ## 🖼 Showcases
63
+
64
+ **Showcase 1: Cross-Domain Clothing Fusion**
65
+
66
+ <div align="center">
67
+ <img src="./assets/showcase1.png" width="90%">
68
+ </div>
69
+
70
+ **Showcase 2: Creative Cosplay and Character Outfit Migration**
71
+
72
+ <div align="center">
73
+ <img src="./assets/showcase2.png" width="90%">
74
+ </div>
75
+
76
+ **Showcase 3: High-Fidelity Face Identity Transfer**
77
+
78
+ <div align="center">
79
+ <img src="./assets/showcase3.png" width="90%">
80
+ </div>
81
+
82
+ **Showcase 4: Seamless Outfit Transfer and Virtual Try-on**
83
+
84
+ <div align="center">
85
+ <img src="./assets/showcase4.png" width="90%">
86
+ </div>
87
+
88
+ **Showcase 5: High-Quality Texture Synthesis**
89
+
90
+ <div align="center">
91
+ <img src="./assets/showcase5.png" width="90%">
92
+ </div>
93
+
94
+ ## 📑 Open-Source Plan
95
+
96
+ - HY-WU
97
+ - [x] Inference
98
+ - [x] HY-Image-3.0-Instruct's checkpoint
99
+ - [ ] Distilled checkpoint
100
+ - [ ] Other models' checkpoint
101
+
102
+
103
+ ## 🚀 Usage
104
+
105
+ #### 🏠 Clone the repository
106
+
107
+ ```bash
108
+ git clone https://github.com/Tencent-Hunyuan/HY-WU.git
109
+ cd HY-WU
110
+ ```
111
+
112
+ #### 📥 Install dependencies
113
+
114
+ ```bash
115
+ pip install -r requirements.txt
116
+ ```
117
+
118
+ #### 🔥 Play with the code
119
+
120
+ Run `infer.py` directly:
121
+
122
+ ```bash
123
+ python infer.py
124
+ ```
125
+
126
+ Or use the code below:
127
+
128
+ ```python
129
+ from wu import WUPipeline
130
+
131
+ base_model_path = "tencent/HunyuanImage-3.0-Instruct"
132
+ pg_model_path = "tencent/HY-WU"
133
+
134
+ pipeline = WUPipeline(
135
+ base_model_path=base_model_path,
136
+ pg_model_path=pg_model_path,
137
+ device_map="auto",
138
+ moe_impl="eager",
139
+ moe_drop_tokens=False,
140
+ )
141
+
142
+ prompt = "以图1为底图,将图2公仔穿的衣物换到图1人物身上;保持图1人物、姿态和背景不变,自然贴合并融合。"
143
+ # English translation of the prompt: "Using Figure 1 as the base image, replace the clothing on the character in Figure 1 with the outfit worn by the figurine in Figure 2. Keep the character, pose, and background of Figure 1 unchanged, ensuring the new clothing fits naturally and blends seamlessly."
144
+ imgs_input = ["./assets/input_1_1.png", "./assets/input_1_2.png"]
145
+
146
+ sample = pipeline.generate(prompt=prompt, imgs_input=imgs_input, diff_infer_steps=50, seed=42, verbose=2)
147
+
148
+ sample.save("./output.png")
149
+
150
+ ```
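To compare several seeds, the call above can be wrapped in a small helper. This is a sketch: `run_seeds` and the `outputs` directory are hypothetical names, and the only pipeline API it relies on is the `generate(...)` call and the `save(...)` method shown in the snippet above.

```python
from pathlib import Path

def run_seeds(pipeline, prompt, imgs_input, seeds, out_dir="outputs", steps=50):
    """Generate one image per seed and return the list of saved paths."""
    # Fail early on a bad path instead of after the model has been loaded and run.
    missing = [p for p in imgs_input if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"Input image(s) not found: {missing}")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    saved = []
    for seed in seeds:
        # Same keyword arguments as in the README example; only the seed varies.
        sample = pipeline.generate(
            prompt=prompt,
            imgs_input=imgs_input,
            diff_infer_steps=steps,
            seed=seed,
            verbose=2,
        )
        path = out / f"seed_{seed}.png"
        sample.save(str(path))
        saved.append(path)
    return saved
```

With the `pipeline`, `prompt`, and `imgs_input` objects from the example above, `run_seeds(pipeline, prompt, imgs_input, seeds=[0, 1, 42])` would write one PNG per seed into `outputs/`.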
151
+
152
+ #### 🎨 Interactive Gradio Demo
153
+
154
+ Launch an interactive web interface for easy image-to-image generation.
155
+
156
+ ```bash
157
+ pip install "gradio>=4.21.0"
158
+
159
+ python gradio/app.py
160
+ ```
161
+
162
+ > 🌐 **Web Interface:** Open your browser and navigate to `http://localhost:7680` or the shared link.
163
+
165
+
166
+ ## 🧱 Memory Requirement
167
+
168
+ | Base model params | HY-WU params | Recommended VRAM |
169
+ | ----------------- | ------------ | ---------------- |
170
+ | 80B (13B active) | 8B | ≥ 8 × 40 GB or 4 × 80 GB |
171
+
172
+ Notes:
173
+ - Multi‑GPU inference is required for the base model.
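A rough back-of-the-envelope check of the table above, as a sketch under stated assumptions: weights are stored in a 16-bit format (2 bytes per parameter, which the README does not state), and all 80B base-model parameters are resident in GPU memory even though only 13B are active per token. The function name `weights_gb` is illustrative.

```python
def weights_gb(n_params, bytes_per_param=2):
    """Weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

base_params = 80e9  # assumption: every expert is loaded, not only the 13B active parameters
wu_params = 8e9     # HY-WU parameter count from the table

total_weights = weights_gb(base_params + wu_params)

# Both recommended configurations from the table provide the same total VRAM.
for n_gpus, gb_per_gpu in [(8, 40), (4, 80)]:
    total_vram = n_gpus * gb_per_gpu
    headroom = total_vram - total_weights
    print(f"{n_gpus} x {gb_per_gpu} GB: {total_vram} GB total, "
          f"{total_weights:.0f} GB weights, {headroom:.0f} GB left for activations and buffers")
```

Under these assumptions the weights alone come to about 176 GB, more than any single 40 GB or 80 GB device holds, which is consistent with the note that multi-GPU inference is required.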
174
+
175
+ ## 📊 Evaluation
176
+
177
+ ### 👥 **GSB (Human Evaluation)**
178
+
179
+ HY-WU substantially outperforms leading open-source models and remains competitive with top-tier closed-source commercial systems.
180
+ While Nano Banana 2 and Nano Banana Pro achieve slightly higher overall scores (52.4% and 53.8%, respectively), the margin remains modest.
181
+
182
+ Given that these commercial systems are likely trained with substantially larger backbones and proprietary data, the modest gap suggests that our operator-level conditional adaptation remains effective even at a more constrained model scale.
183
+
184
+ <p align="center">
185
+ <img src="./assets/gsb.png" width=70% alt="Human Evaluation with Other Models">
186
+ </p>
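For readers unfamiliar with GSB, commonly expanded as Good/Same/Bad, the sketch below shows how pairwise human judgments are typically aggregated. The README does not say which aggregate the reported percentages use, so both conventions here (ties counted as half a win, and net preference) are assumptions, and the counts are made up for illustration rather than taken from the evaluation.

```python
def gsb_summary(good, same, bad):
    """Aggregate pairwise Good/Same/Bad judgments into common summary rates."""
    total = good + same + bad
    if total == 0:
        raise ValueError("at least one judgment is required")
    return {
        "win_rate": good / total,
        "tie_rate": same / total,
        "loss_rate": bad / total,
        # One common convention: a tie counts as half a win.
        "win_rate_half_ties": (good + 0.5 * same) / total,
        # Another: net preference, (G - B) / (G + S + B).
        "net_preference": (good - bad) / total,
    }

# Hypothetical counts, NOT data from the HY-WU evaluation.
summary = gsb_summary(good=40, same=30, bad=30)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```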
187
+
188
+
189
+
190
+ ## 📚 Citation
191
+
192
+ If you find HY-WU useful in your research, please cite our work:
193
+
194
+ ```bibtex
195
+ @misc{wu2026hy-wu,
196
+ author = {{Tencent HY Team} and Mengxuan Wu and Xuanlei Zhao and Ziqiao Wang and Ruichfeng Feng and Atlas Wang and Qinglin Lu and Kai Wang},
197
+ title = {HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing},
198
+ year = {2026},
199
+ publisher = {GitHub},
200
+ journal = {GitHub repository},
201
+ howpublished = {\url{https://github.com/Tencent-Hunyuan/HY-WU}},
202
+ note = {Preprint}
203
+ }
204
+ ```
assets/animation.gif ADDED

Git LFS Details

  • SHA256: c12ddb417b09478fd4f41e51fae8a24ae491b089c6812ec8cc40aef831bfb234
  • Pointer size: 132 Bytes
  • Size of remote file: 1.67 MB
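
The pointer size reported above follows from the Git LFS pointer format: a three-line text file holding the spec version, the SHA-256 object ID, and the file size in bytes. A minimal sketch follows; the exact byte count of `assets/animation.gif` is not shown on this page (only the rounded 1.67 MB), so the `size` value below is an assumed placeholder with the same number of digits, and `lfs_pointer` is an illustrative helper name.

```python
def lfs_pointer(oid_hex, size):
    """Build the text of a Git LFS pointer file (spec v1)."""
    if len(oid_hex) != 64:
        raise ValueError("expected a 64-character SHA-256 hex digest")
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid_hex}\n"
        f"size {size}\n"
    )

oid = "c12ddb417b09478fd4f41e51fae8a24ae491b089c6812ec8cc40aef831bfb234"
pointer = lfs_pointer(oid, 1_670_000)  # assumed size: any 7-digit byte count gives the same length
print(pointer)
print(len(pointer.encode("utf-8")), "bytes")
```

Only the `size` line varies in length between pointers, which would explain why the files in this commit show pointer sizes of 131, 132, or 133 bytes depending on whether the file size has six, seven, or eight digits.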
assets/gsb.png ADDED

Git LFS Details

  • SHA256: 218b693f6b1ae83d86cdab16f6571ac596509f4d8eb1631f02633a8f471e41cb
  • Pointer size: 131 Bytes
  • Size of remote file: 195 kB
assets/input_1_1.png ADDED

Git LFS Details

  • SHA256: 77a0d5ff660e986e445b91c95a2684dffdd0f9412f82c242037d8c5373bfd0f0
  • Pointer size: 132 Bytes
  • Size of remote file: 2.89 MB
assets/input_1_2.png ADDED

Git LFS Details

  • SHA256: 0c865de97a6e8b9a1d199a6a64b746dd139dc0b06b30ffec230d7f2315f65172
  • Pointer size: 132 Bytes
  • Size of remote file: 1.16 MB
assets/showcase1.png ADDED

Git LFS Details

  • SHA256: fb9409ba4b61fde745b9e695526c7b1bfd6b1a074dc0ae5d4b5c68348e920857
  • Pointer size: 132 Bytes
  • Size of remote file: 6.67 MB
assets/showcase2.png ADDED

Git LFS Details

  • SHA256: d5d4e449c9cc0885836688ca8087224b8bbe5966fd2a8f480ad9fa67ef09fe07
  • Pointer size: 133 Bytes
  • Size of remote file: 14.8 MB
assets/showcase3.png ADDED

Git LFS Details

  • SHA256: 8990695f4f192cf1ef14cd756bc4e64332b903984469036a6f8d4499b85dbe8e
  • Pointer size: 132 Bytes
  • Size of remote file: 8.5 MB
assets/showcase4.png ADDED

Git LFS Details

  • SHA256: d65340e29efebf13b4d21b629b3eb383cca567827fa27874930cfe4b85b127aa
  • Pointer size: 133 Bytes
  • Size of remote file: 12.8 MB
assets/showcase5.png ADDED

Git LFS Details

  • SHA256: 5ededcca38444d015100b26b4a6a2e8fa4fcd40f09599af07d2363a6464ffa64
  • Pointer size: 133 Bytes
  • Size of remote file: 20.6 MB
assets/teaser.png ADDED

Git LFS Details

  • SHA256: 7fbe4c7cb7ed5f6d5761a43f74d6b480d6f1db3bb03f12278201e09608bc214c
  • Pointer size: 132 Bytes
  • Size of remote file: 5.25 MB
assets/tencent-hy-wu-logo.svg ADDED
assets/workflow.png ADDED

Git LFS Details

  • SHA256: 58d84a487495f0a2505131ce9a1e950def1eebf186fb0e37b6291a08cbee21ba
  • Pointer size: 131 Bytes
  • Size of remote file: 214 kB