update readme

- .gitattributes +11 -0
- README.md +203 -4
- assets/animation.gif +3 -0
- assets/gsb.png +3 -0
- assets/input_1_1.png +3 -0
- assets/input_1_2.png +3 -0
- assets/showcase1.png +3 -0
- assets/showcase2.png +3 -0
- assets/showcase3.png +3 -0
- assets/showcase4.png +3 -0
- assets/showcase5.png +3 -0
- assets/teaser.png +3 -0
- assets/tencent-hy-wu-logo.svg +30 -0
- assets/workflow.png +3 -0
.gitattributes
CHANGED

@@ -33,3 +33,14 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/animation.gif filter=lfs diff=lfs merge=lfs -text
+assets/gsb.png filter=lfs diff=lfs merge=lfs -text
+assets/input_1_1.png filter=lfs diff=lfs merge=lfs -text
+assets/input_1_2.png filter=lfs diff=lfs merge=lfs -text
+assets/showcase1.png filter=lfs diff=lfs merge=lfs -text
+assets/showcase2.png filter=lfs diff=lfs merge=lfs -text
+assets/showcase3.png filter=lfs diff=lfs merge=lfs -text
+assets/showcase4.png filter=lfs diff=lfs merge=lfs -text
+assets/showcase5.png filter=lfs diff=lfs merge=lfs -text
+assets/teaser.png filter=lfs diff=lfs merge=lfs -text
+assets/workflow.png filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED

@@ -1,5 +1,204 @@ (the 5-line YAML front-matter block was removed)

<div align="center">
<img src="./assets/tencent-hy-wu-logo.svg" alt="HY-WU Logo" width="600">

# HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing
</div>

<div align="center">
<img src="./assets/teaser.png" alt="HY-WU Teaser" width="800">
</div>

<div align="center">
<a href="https://tencent-hy-wu.github.io/" target="_blank"><img src="https://img.shields.io/badge/🌐%20Demo-4285F4.svg" height="22px"></a>
<a href="https://huggingface.co/tencent/HY-WU" target="_blank"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-d96902.svg" height="22px"></a>
<a href="https://github.com/Tencent-Hunyuan/HY-WU" target="_blank"><img src="https://img.shields.io/badge/GitHub-181717.svg?logo=github" height="22px"></a>
<a href="https://github.com/Tencent-Hunyuan/HY-WU/assets/report.pdf" target="_blank"><img src="https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv" height="22px"></a>
<a href="https://x.com/TencentHunyuan" target="_blank"><img src="https://img.shields.io/badge/Hunyuan-black.svg?logo=x" height="22px"></a>
<a href="https://docs.qq.com/doc/DUVVadmhCdG9qRXBU" target="_blank"><img src="https://img.shields.io/badge/📚-PromptHandBook-grey.svg?logo=book" height="22px"></a>
</div>

<!-- <p align="center">
👏 Join our <a href="./assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/ehjWMqF5wY">Discord</a> |
💻 <a href="https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct">Official website: try our model!</a>
</p> -->

## 🔥 News

- **March 6, 2025**: 🎉 **[HY-WU](https://github.com/Tencent-Hunyuan/HY-WU)** open-sourced: inference code and model weights are publicly available.

## 🗂️ Contents
- [🔥 News](#-news)
- [📖 Introduction](#-introduction)
- [✨ Key Features](#-key-features)
- [🖼 Showcases](#-showcases)
- [📑 Open-Source Plan](#-open-source-plan)
- [🚀 Usage](#-usage)
- [🧱 Memory Requirement](#-memory-requirement)
- [📊 Evaluation](#-evaluation)
- [📚 Citation](#-citation)

---

## 📖 Introduction

We propose HY-WU, a scalable framework for on-the-fly conditional generation of low-rank adaptation (LoRA) updates.
HY-WU synthesizes instance-conditioned adapter weights from hybrid image–instruction representations and injects them into a frozen backbone during the forward pass, producing instance-specific operators without any test-time optimization.

<div align="center">
<img src="./assets/animation.gif" alt="HY-WU Animation" width="800">
</div>
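Conceptually, the mechanism above amounts to a small hypernetwork that maps a per-request conditioning embedding to LoRA factors, which are then added to a frozen weight during the forward pass. The sketch below is purely illustrative (toy shapes, NumPy stand-ins, made-up names), not the actual HY-WU implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, c = 16, 4, 8  # hidden size, LoRA rank, conditioning dim (toy values)

W = rng.standard_normal((d, d))               # frozen base weight (never updated)
G_a = rng.standard_normal((c, r * d)) * 0.01  # generator head producing factor A
G_b = rng.standard_normal((c, d * r)) * 0.01  # generator head producing factor B

def generate_adapter(cond):
    """Map a fused image+instruction embedding to LoRA factors (A, B)."""
    A = (cond @ G_a).reshape(r, d)
    B = (cond @ G_b).reshape(d, r)
    return A, B

def forward(x, cond, alpha=1.0):
    """Frozen layer augmented with an instance-specific low-rank update."""
    A, B = generate_adapter(cond)
    W_eff = W + (alpha / r) * (B @ A)  # rank-r delta, synthesized per request
    return x @ W_eff.T

cond = rng.standard_normal(c)  # per-request conditioning embedding
x = rng.standard_normal(d)
y = forward(x, cond)           # no gradient steps, no test-time optimization
```

Each request gets its own operator `W_eff` while `W` itself stays frozen, which is what preserves the base model's general capability.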

## ✨ Key Features

* 🧠 **Functional Neural Memory:**
  Introduces a lightweight "neural memory" for AI: a conditioned model adapter is generated per request (without fine-tuning!), enabling instance-level personalization while preserving the base model's general capability.

* 🏆 **Scalable for Large Models:**
  HY-WU remains practical for large foundation models (even at 80B parameters!). With structured parameter tokenization, the method is naturally compatible with large-scale architectures.

* 🎨 **Strong Human Preference:**
  HY-WU achieves high human-preference win rates against open-source models, exceeds strong closed-source baselines, and remains close to the latest Nano-Banana series.
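The "structured parameter tokenization" mentioned above can be illustrated with a minimal sketch: chunk a weight tensor into fixed-width tokens that a sequence model can emit, then reassemble them. The function names, `token_dim`, and padding scheme here are assumptions for illustration; the real scheme is more elaborate:

```python
import numpy as np

def tokenize_params(W, token_dim):
    """Flatten a weight tensor into fixed-width 'parameter tokens'."""
    flat = W.reshape(-1)
    pad = (-flat.size) % token_dim          # zero-pad so length divides evenly
    flat = np.pad(flat, (0, pad))
    return flat.reshape(-1, token_dim)

def detokenize_params(tokens, shape):
    """Reassemble parameter tokens into the original weight shape."""
    n = int(np.prod(shape))
    return tokens.reshape(-1)[:n].reshape(shape)

W = np.arange(24, dtype=np.float32).reshape(4, 6)
tokens = tokenize_params(W, token_dim=5)    # 5 tokens of width 5 (last padded)
W2 = detokenize_params(tokens, W.shape)     # round-trips losslessly
```

Because the adapter is produced token by token, the generator's cost scales with the (small) adapter size rather than with the backbone's full parameter count.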

## 🖼 Showcases

**Showcase 1: Cross-Domain Clothing Fusion**

<div align="center">
<img src="./assets/showcase1.png" width="90%">
</div>

**Showcase 2: Creative Cosplay and Character Outfit Migration**

<div align="center">
<img src="./assets/showcase2.png" width="90%">
</div>

**Showcase 3: High-Fidelity Face Identity Transfer**

<div align="center">
<img src="./assets/showcase3.png" width="90%">
</div>

**Showcase 4: Seamless Outfit Transfer and Virtual Try-on**

<div align="center">
<img src="./assets/showcase4.png" width="90%">
</div>

**Showcase 5: High-Quality Texture Synthesis**

<div align="center">
<img src="./assets/showcase5.png" width="90%">
</div>

## 📑 Open-Source Plan

- HY-WU
  - [x] Inference
  - [x] HY-Image-3.0-Instruct's checkpoint
  - [ ] Distilled checkpoint
  - [ ] Other models' checkpoints

## 🚀 Usage

#### 🏠 Clone the repository

```bash
git clone https://github.com/Tencent-Hunyuan/HY-WU.git
cd HY-WU
```

#### 📥 Install dependencies

```bash
pip install -r requirements.txt
```

#### 🔥 Play with the code

Run `infer.py` directly:

```bash
python infer.py
```

Or use the code below:

```python
from wu import WUPipeline

base_model_path = "tencent/HunyuanImage-3.0-Instruct"
pg_model_path = "tencent/HY-WU"

pipeline = WUPipeline(
    base_model_path=base_model_path,
    pg_model_path=pg_model_path,
    device_map="auto",
    moe_impl="eager",
    moe_drop_tokens=False,
)

prompt = "以图1为底图,将图2公仔穿的衣物换到图1人物身上;保持图1人物、姿态和背景不变,自然贴合并融合。"
# English: Using Figure 1 as the base image, replace the clothing on the character
# in Figure 1 with the outfit worn by the figurine in Figure 2. Keep the character,
# pose, and background of Figure 1 unchanged, ensuring the new clothing fits
# naturally and blends seamlessly.
imgs_input = ["./assets/input_1_1.png", "./assets/input_1_2.png"]

sample = pipeline.generate(prompt=prompt, imgs_input=imgs_input, diff_infer_steps=50, seed=42, verbose=2)
sample.save("./output.png")
```

#### 🎨 Interactive Gradio Demo

Launch an interactive web interface for easy image-to-image generation.

```bash
pip install "gradio>=4.21.0"
python gradio/app.py
```

> 🌐 **Web Interface:** Open your browser and navigate to `http://localhost:7680`, or use the shared link.

## 🧱 Memory Requirement

| Base model param | HY-WU param | Recommended VRAM         |
| ---------------- | ----------- | ------------------------ |
| 80B (13B active) | 8B          | ≥ 8 × 40 GB or 4 × 80 GB |

Notes:
- Multi-GPU inference is required for the base model.
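The recommendation in the table is consistent with a back-of-envelope weight-memory estimate (assuming bf16 weights at 2 bytes per parameter, and ignoring activations, the KV cache, and framework overhead):

```python
# Rough weight-memory estimate; all figures are approximations.
base_params = 80e9        # base model parameter count
adapter_gen_params = 8e9  # HY-WU parameter-generator count
bytes_per_param = 2       # bf16 / fp16

weights_gb = (base_params + adapter_gen_params) * bytes_per_param / 1e9
print(weights_gb)  # 176.0 GB of weights alone; both 8 x 40 GB and 4 x 80 GB
                   # give 320 GB total, leaving headroom for activations.
```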

## 📊 Evaluation

### 👥 **GSB (Human Evaluation)**

HY-WU substantially outperforms leading open-source models and remains competitive with top-tier closed-source commercial systems.
While Nano Banana 2 and Nano Banana Pro achieve slightly higher overall scores (52.4% and 53.8%, respectively), the margin remains modest.

Given that these commercial systems are likely trained with substantially larger backbones and proprietary data, the modest performance gap suggests that our operator-level conditional adaptation remains effective even at a more constrained model scale.

<p align="center">
  <img src="./assets/gsb.png" width="70%" alt="Human Evaluation with Other Models">
</p>
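For context, GSB stands for Good/Same/Bad: annotators compare two systems' outputs and vote. One common way to aggregate such votes into a win rate splits ties evenly between the two systems; note this convention is an assumption for illustration and may differ from the exact protocol behind the numbers above:

```python
def gsb_score(good, same, bad):
    """Aggregate Good/Same/Bad votes into a win rate, splitting ties evenly
    (a common convention; not necessarily the report's exact protocol)."""
    total = good + same + bad
    return (good + 0.5 * same) / total

# Hypothetical tally: 40 Good, 35 Same, 25 Bad over 100 pairwise comparisons.
print(f"{gsb_score(40, 35, 25):.1%}")  # 57.5%
```

Under this convention, a score just above 50% (like the 52.4% and 53.8% cited) means the competing system wins only slightly more often than it loses.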

## 📚 Citation

If you find HY-WU useful in your research, please cite our work:

```bibtex
@misc{wu2026hy-wu,
  author       = {Tencent HY Team and Mengxuan Wu and Xuanlei Zhao and Ziqiao Wang and Ruichfeng Feng and Atlas Wang and Qinglin Lu and Kai Wang},
  title        = {HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing},
  year         = {2026},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/Tencent-Hunyuan/HY-WU}},
  note         = {Preprint}
}
```
assets/animation.gif           ADDED (Git LFS)
assets/gsb.png                 ADDED (Git LFS)
assets/input_1_1.png           ADDED (Git LFS)
assets/input_1_2.png           ADDED (Git LFS)
assets/showcase1.png           ADDED (Git LFS)
assets/showcase2.png           ADDED (Git LFS)
assets/showcase3.png           ADDED (Git LFS)
assets/showcase4.png           ADDED (Git LFS)
assets/showcase5.png           ADDED (Git LFS)
assets/teaser.png              ADDED (Git LFS)
assets/tencent-hy-wu-logo.svg  ADDED
assets/workflow.png            ADDED (Git LFS)