---
license: apache-2.0
pipeline_tag: image-to-image
---

# RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details

RefineAnything targets region-specific image refinement: given an input image and a user-specified region (e.g., scribble mask or bounding box), it restores fine-grained details—text, logos, thin structures—while keeping all non-edited pixels unchanged. It supports both reference-based and reference-free refinement.

## Highlights

- **Region-accurate refinement:** Explicit region cues (scribbles or boxes) steer edits to the target area.
- **Reference-based and reference-free:** An optional reference image guides local detail recovery.
- **Strict background preservation:** Edits stay inside the target region; training emphasizes seamless boundaries.
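The background-preservation guarantee above amounts to a masked composite: the refined pixels replace the originals only where the mask is set. A minimal pure-Python sketch of that idea (illustrative only, not the repository's implementation):

```python
def composite(input_px, refined_px, mask_px):
    """Blend per pixel: where the mask is 1, take the refined pixel;
    elsewhere, keep the original input pixel unchanged."""
    return [r if m else i for i, r, m in zip(input_px, refined_px, mask_px)]

inp     = [10, 20, 30, 40]   # original pixels
refined = [99, 98, 97, 96]   # model output
mask    = [0, 1, 1, 0]       # 1 = region to refine

print(composite(inp, refined, mask))  # [10, 98, 97, 40]
```

Pixels outside the mask are bit-identical to the input, which is what "strict background preservation" means here.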

## Usage

To use RefineAnything, you need an input image, a binary mask (where white indicates the region to refine), and a text prompt.
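If you do not already have a mask, a simple way to produce one is to paint the target region white on a black canvas. A minimal sketch that builds such a mask from a hypothetical bounding box (the sizes and box here are made up for illustration):

```python
def box_mask(width, height, box):
    """Binary mask as nested lists: 255 (white) inside the box marks
    the region to refine; 0 (black) everywhere else is preserved."""
    x0, y0, x1, y1 = box
    return [[255 if x0 <= x < x1 and y0 <= y < y1 else 0
             for x in range(width)]
            for y in range(height)]

# 8x6 mask with a white box covering x in [2, 6) and y in [1, 4)
mask = box_mask(8, 6, (2, 1, 6, 4))
```

In practice you would save this array as a grayscale PNG (e.g. with Pillow) and pass it via `--mask`.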

### Installation

```bash
pip install -r requirement.txt
```

### Reference-based Logo Refinement

Refine a blurry logo on a pillow using a reference image:

```bash
python scripts/fast_inference.py \
    --input  src/input1.png \
    --mask   src/mask1.png \
    --prompt "Refine the LOGO." \
    --ref    src/ref1.png \
    --output output/demo1.png
```

### Reference-free Text Refinement

Refine blurry text on a building sign without a reference image:

```bash
python scripts/fast_inference.py \
    --input  src/input2.png \
    --mask   src/mask2.png \
    --prompt "refine the text '鼎好商城'" \
    --output output/demo2.png
```
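For batch jobs, the same CLI can be driven from Python. A sketch of a helper that assembles the command line (the flags mirror the examples above; the helper name is my own, and the final `subprocess.run` call is commented out so the snippet stands alone):

```python
import subprocess  # used to launch the CLI when uncommented below

def refine_cmd(input_path, mask_path, prompt, output_path, ref=None):
    """Build the fast_inference.py command; --ref is appended only
    when a reference image is supplied."""
    cmd = ["python", "scripts/fast_inference.py",
           "--input", input_path,
           "--mask", mask_path,
           "--prompt", prompt,
           "--output", output_path]
    if ref is not None:
        cmd += ["--ref", ref]
    return cmd

# Reference-free call mirroring the sign example above:
cmd = refine_cmd("src/input2.png", "src/mask2.png",
                 "refine the text '鼎好商城'", "output/demo2.png")
# subprocess.run(cmd, check=True)  # uncomment to actually run inference
```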

## Citation

```bibtex
@article{zhou2026refineanything,
  title={RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details},
  author={Zhou, Dewei and Li, You and Yang, Zongxin and Yang, Yi},
  journal={arXiv preprint arXiv:2604.06870},
  year={2026}
}
```