donghao-zhou committed
Commit 5283d91 · verified · 1 Parent(s): bd84a70

Upload README.md

Files changed (1)
  1. README.md +116 -0
README.md ADDED
@@ -0,0 +1,116 @@
+ ---
+ license: cc-by-nc-4.0
+ pipeline_tag: image-to-image
+ library_name: diffusers
+ tags:
+ - image-generation
+ - image-inpainting
+ - reference-based-inpainting
+ - human-product-images
+ - lora
+ - hifi-inpaint
+ ---
+
+ <h1 align="center" style="line-height: 50px;">
+ HiFi-Inpaint: High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images
+ </h1>
+
+ <div align="center">
+ Yichen Liu<sup>1,*</sup>, Donghao Zhou<sup>2,*</sup>, Jie Wang<sup>3</sup>, Xin Gao<sup>3</sup>, Guisheng Liu<sup>3</sup>, Jiatong Li<sup>3,†</sup>, Quanwei Zhang<sup>4</sup>,<br>
+ Qiang Lyu<sup>1</sup>, Lanqing Guo<sup>5</sup>, Shilei Wen<sup>3,§</sup>, Weiqiang Wang<sup>1,§</sup>, Pheng-Ann Heng<sup>2,§</sup>
+ </div>
+
+ <br>
+
+ <div align="center">
+ <sup>1</sup>University of Chinese Academy of Sciences, <sup>2</sup>The Chinese University of Hong Kong, <sup>3</sup>ByteDance,<br>
+ <sup>4</sup>Zhejiang University, <sup>5</sup>UT Austin
+ </div>
+
+ <br>
+
+ <div align="center">
+ <sup>*</sup>Equal contribution, <sup>†</sup>Project lead, <sup>§</sup>Corresponding author
+ </div>
+
+ <br>
+
+ ## 🌍 Useful Links
+
+ - Project Page: https://correr-zhou.github.io/HiFi-Inpaint/
+ - Paper: https://arxiv.org/pdf/2603.02210
+ - Code: https://github.com/Correr-Zhou/HiFi-Inpaint
+ - Training Dataset: https://huggingface.co/datasets/donghao-zhou/HP-Image-40K
+
+ ---
+
+ ## 📌 Model Summary
+
+ **HiFi-Inpaint** is a reference-based inpainting model for generating detail-preserving human-product images. Given a product reference image, a masked condition image, and a text prompt (caption), the model reconstructs the masked region while preserving the fine-grained appearance of the product.
+
+ This repository contains the released HiFi-Inpaint model weights, intended for research and model development on high-fidelity reference-guided inpainting.
+
+ ## 🗂️ Repository Files
+
+ ```text
+ HiFi-Inpaint/
+ ├── README.md
+ ├── alpha_blocks.pt
+ └── pytorch_lora_weights.safetensors
+ ```
+
+ - `pytorch_lora_weights.safetensors`: LoRA weights for the HiFi-Inpaint model.
+ - `alpha_blocks.pt`: auxiliary alpha-block weights used by the HiFi-Inpaint model pipeline.
+
+ ## 🎯 Intended Uses
+
+ HiFi-Inpaint is intended for **research and model development** on reference-based human-product generation and inpainting. Typical use cases include:
+
+ - Product-reference-guided image inpainting.
+ - Generating detail-preserving human-product images.
+ - Fine-tuning or analyzing reference-conditioned image generation pipelines.
+ - Studying product appearance preservation under masked-image reconstruction settings.
+
+ This model is released as research weights and is not intended for deceptive, harmful, privacy-violating, or otherwise unlawful applications.
+
+ ## 💻 How to Use
+
+ Please refer to the official code repository for installation, pipeline construction, and inference scripts:
+
+ https://github.com/Correr-Zhou/HiFi-Inpaint
+
+ A typical setup downloads this repository's weights and loads:
+
+ - `pytorch_lora_weights.safetensors` as the LoRA checkpoint.
+ - `alpha_blocks.pt` as the auxiliary alpha-block checkpoint required by the inference pipeline.
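
As a minimal sketch of the setup described above, the snippet below locates the two checkpoint files in a local copy of this repository and fails early if either one is missing. The helper name `resolve_checkpoints` and the choice of local directory are illustrative assumptions, not part of the official code; pipeline construction and the actual weight loading are handled by the official repository.

```python
from pathlib import Path

# File names as listed in this repository.
LORA_FILE = "pytorch_lora_weights.safetensors"
ALPHA_FILE = "alpha_blocks.pt"


def resolve_checkpoints(weights_dir):
    """Return paths to the LoRA and alpha-block checkpoints in weights_dir.

    Hypothetical helper for illustration only. Raises FileNotFoundError
    when one of the expected files is absent.
    """
    root = Path(weights_dir)
    paths = {"lora": root / LORA_FILE, "alpha_blocks": root / ALPHA_FILE}
    missing = [p.name for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError(
            f"Missing checkpoint file(s) in {root}: {', '.join(missing)}"
        )
    return paths
```

One way to obtain the local copy is `huggingface_hub.snapshot_download(repo_id=...)`, which returns the download directory; the repo id of this model repository is not stated here, so it has to be filled in by the user. The returned directory can then be passed to `resolve_checkpoints`, and the resulting paths handed to the inference scripts from the official repository.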
+
+ ## 📚 Training Data
+
+ The model is associated with **HP-Image-40K**, a training dataset for high-fidelity reference-based human-product image inpainting. The dataset contains **43,632** aligned training samples with product reference images, ground-truth target images, masked condition images, binary masks, and captions.
+
+ Dataset repository: https://huggingface.co/datasets/donghao-zhou/HP-Image-40K
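
To illustrate how the components of a training sample could relate to one another, the sketch below builds a masked condition image from a target image and a binary mask. It assumes that nonzero mask values mark the region to inpaint and that masked pixels are filled with zeros; neither convention is confirmed by this card, and `make_masked_condition` is a hypothetical helper, so the dataset repository remains the reference for the actual format.

```python
import numpy as np


def make_masked_condition(target, mask, fill_value=0):
    """Blank out the masked region of a target image.

    target: (H, W, C) image array.
    mask:   (H, W) array; nonzero marks the region to inpaint (assumed).
    """
    target = np.asarray(target)
    hole = np.asarray(mask).astype(bool)
    if hole.shape != target.shape[:2]:
        raise ValueError("mask must match the image height and width")
    masked = target.copy()      # leave the ground-truth target untouched
    masked[hole] = fill_value   # assumed fill convention for masked pixels
    return masked


# Toy 2x2 RGB image with only the top-left pixel masked out.
target = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0], [0, 0]], dtype=np.uint8)
masked = make_masked_condition(target, mask)
```

In this toy case only the top-left pixel of `masked` is zeroed, while `target` itself is left unchanged.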
+
+ ## ⚖️ Usage Note
+
+ This model is released for **research and model development** purposes.
+
+ - Users should ensure that downstream use complies with the model license, dataset license, and applicable regulations.
+ - The model should not be used for deceptive, harmful, or privacy-violating applications.
+ - Generated outputs should be reviewed before public or commercial use.
+
+ ## 🔗 Citation
+
+ If you find this model useful in your research, please cite:
+
+ ```bibtex
+ @article{liu2026hifiinpaint,
+   title={HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images},
+   author={Liu, Yichen and Zhou, Donghao and Wang, Jie and Gao, Xin and Liu, Guisheng and Li, Jiatong and Zhang, Quanwei and Lyu, Qiang and Guo, Lanqing and Wen, Shilei and Wang, Weiqiang and Heng, Pheng-Ann},
+   journal={arXiv preprint arXiv:2603.02210},
+   year={2026}
+ }
+ ```
+
+ ## 📬 Contact
+
+ For questions about the model or dataset, please contact Donghao Zhou: [dhzhou@link.cuhk.edu.hk](mailto:dhzhou@link.cuhk.edu.hk).