How to use from the Diffusers library
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("fill-in-base-model", dtype=torch.bfloat16, device_map="cuda")
pipe.load_lora_weights("donghao-zhou/HiFi-Inpaint")

prompt = "Turn this cat into a dog"
input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")

image = pipe(image=input_image, prompt=prompt).images[0]

HiFi-Inpaint: High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images

Yichen Liu1,*, Donghao Zhou2,*, Jie Wang3, Xin Gao3, Guisheng Liu3, Jiatong Li3,†, Quanwei Zhang4,
Qiang Lyu1, Lanqing Guo5, Shilei Wen3,§, Weiqiang Wang1,§, Pheng-Ann Heng2,§

1University of Chinese Academy of Sciences, 2The Chinese University of Hong Kong, 3ByteDance,
4Zhejiang University, 5UT Austin

*Equal contribution, †Project lead, §Corresponding author

🌍 Useful Links


📌 Model Summary

HiFi-Inpaint is a reference-based human-product image inpainting model for generating detail-preserving human-product images. Given a product reference image, a masked condition image, and a text prompt/caption, the model is designed to reconstruct the missing region while preserving fine-grained product appearance.

This repository contains the released model weights for HiFi-Inpaint, intended for research and model development on high-fidelity reference-guided inpainting.

🗂️ Repository Files

HiFi-Inpaint/
├── README.md
├── alpha_blocks.pt
└── pytorch_lora_weights.safetensors
  • pytorch_lora_weights.safetensors: LoRA weights for the HiFi-Inpaint model.
  • alpha_blocks.pt: auxiliary alpha-block weights used by the HiFi-Inpaint model pipeline.

🎯 Intended Uses

HiFi-Inpaint is intended for research and model development on reference-based human-product generation and inpainting. Typical use cases include:

  • Product-reference-guided image inpainting.
  • Generating detail-preserving human-product images.
  • Fine-tuning or analyzing reference-conditioned image generation pipelines.
  • Studying product appearance preservation under masked-image reconstruction settings.

This model is released as research weights and is not intended for deceptive, harmful, privacy-violating, or otherwise unlawful applications.

💻 How to Use

Please refer to the official code repository for installation, pipeline construction, and inference scripts:

https://github.com/Correr-Zhou/HiFi-Inpaint

A typical setup should download this repository's weights and load:

  • pytorch_lora_weights.safetensors as the LoRA checkpoint.
  • alpha_blocks.pt as the auxiliary alpha-block checkpoint required by the inference pipeline.
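The download step for the two files listed above can be sketched as follows. This is only a minimal illustration, assuming the `huggingface_hub` package is installed; `fetch_weights`, `WEIGHT_FILES`, and the `downloader` parameter are hypothetical names introduced here, not part of the official code. How the returned paths are passed to the inference pipeline is defined by the official repository and is not shown.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

REPO_ID = "donghao-zhou/HiFi-Inpaint"

# File names as listed in this model card; the role keys are hypothetical labels.
WEIGHT_FILES = {
    "lora": "pytorch_lora_weights.safetensors",
    "alpha_blocks": "alpha_blocks.pt",
}


def fetch_weights(
    repo_id: str = REPO_ID,
    downloader: Optional[Callable[..., str]] = None,
) -> Dict[str, Path]:
    """Fetch both checkpoints and return a mapping from role to local path."""
    if downloader is None:
        # Imported lazily so the helper can be exercised with a stub downloader.
        from huggingface_hub import hf_hub_download

        downloader = hf_hub_download
    return {
        role: Path(downloader(repo_id=repo_id, filename=name))
        for role, name in WEIGHT_FILES.items()
    }
```

Calling `fetch_weights()` with no arguments downloads both files into the local Hugging Face cache and returns their paths, which can then be handed to the inference scripts in the official repository.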

📚 Training Data

The model is associated with HP-Image-40K, a training dataset for high-fidelity reference-based human-product image inpainting. The dataset contains 43,632 aligned training samples with product reference images, ground-truth target images, masked condition images, binary masks, and captions.

Dataset repository: https://huggingface.co/datasets/donghao-zhou/HP-Image-40K
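To make the relationship between the sample components concrete, the sketch below shows one plausible way a masked condition image could be derived from a ground-truth target image and a binary mask. It is an illustration under stated assumptions, not the dataset's documented preprocessing: the mask convention (nonzero marks the region to inpaint), the fill value of 0, and the function name `make_masked_condition` are all assumptions.

```python
import numpy as np


def make_masked_condition(
    target: np.ndarray, mask: np.ndarray, fill: int = 0
) -> np.ndarray:
    """Blank out the masked region of a target image.

    target: H x W x 3 uint8 image.
    mask:   H x W array; nonzero marks the region to inpaint
            (assumed convention, check the dataset before relying on it).
    """
    if target.shape[:2] != mask.shape:
        raise ValueError("target and mask must share height and width")
    hole = mask.astype(bool)
    masked = target.copy()  # leave the ground-truth image untouched
    masked[hole] = fill     # the 2-D boolean index applies to all channels
    return masked
```

If the released masks use the opposite convention (nonzero marks the region to keep), invert the mask before calling the function.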

⚖️ Usage Note

This model is released for research and model development purposes.

  • Users should ensure that downstream use complies with the model license, dataset license, and applicable regulations.
  • The model should not be used for deceptive, harmful, or privacy-violating applications.
  • Generated outputs should be reviewed before public or commercial use.

🔗 Citation

If you find this model useful in your research, please cite:

@article{liu2026hifiinpaint,
  title={HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images},
  author={Liu, Yichen and Zhou, Donghao and Wang, Jie and Gao, Xin and Liu, Guisheng and Li, Jiatong and Zhang, Quanwei and Lyu, Qiang and Guo, Lanqing and Wen, Shilei and Wang, Weiqiang and Heng, Pheng-Ann},
  journal={arXiv preprint arXiv:2603.02210},
  year={2026}
}

📬 Contact

For questions about the model or dataset, please contact Donghao Zhou: dhzhou@link.cuhk.edu.hk.
