Add model card and metadata
#1
by nielsr HF Staff - opened
README.md
CHANGED
|
@@ -1,3 +1,65 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
--
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
pipeline_tag: image-to-image
|
| 4 |
+
---
|
| 5 |
+
|
| 6 |
+
# RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details
|
| 7 |
+
|
| 8 |
+
RefineAnything targets **region-specific image refinement**: given an input image and a user-specified region (e.g., scribble mask or bounding box), it restores fine-grained details—text, logos, thin structures—while keeping **all non-edited pixels unchanged**. It supports both **reference-based** and **reference-free** refinement.
|
| 9 |
+
|
| 10 |
+
- **Paper:** [2604.06870](https://huggingface.co/papers/2604.06870)
|
| 11 |
+
- **Project Page:** [https://limuloo.github.io/RefineAnything/](https://limuloo.github.io/RefineAnything/)
|
| 12 |
+
- **GitHub:** [https://github.com/limuloo/RefineAnything](https://github.com/limuloo/RefineAnything)
|
| 13 |
+
- **Demo:** [Hugging Face Space](https://huggingface.co/spaces/limuloo1999/RefineAnything)
|
| 14 |
+
|
| 15 |
+
## Highlights
|
| 16 |
+
|
| 17 |
+
- **Region-accurate refinement** — Explicit region cues (scribbles or boxes) steer edits to the target area.
|
| 18 |
+
- **Reference-based and reference-free** — Optional reference image for guided local detail recovery.
|
| 19 |
+
- **Strict background preservation** — Edits stay inside the target region; training emphasizes seamless boundaries.
|
| 20 |
+
|
| 21 |
+
## Usage
|
| 22 |
+
|
| 23 |
+
To use RefineAnything, you need an input image, a binary mask (where white indicates the region to refine), and a text prompt. For reference-based refinement, you also supply a reference image.
|
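If you only have a bounding box for the region, a binary mask can be generated programmatically. The following is a minimal sketch, not part of the RefineAnything code base: the helper names `box_mask_rows` and `write_gray_png` are hypothetical, and it assumes the inference script accepts a single-channel 8-bit PNG with the same dimensions as the input image. It uses only the Python standard library; Pillow or OpenCV would do the same job in fewer lines.

```python
import struct
import zlib


def box_mask_rows(width, height, box):
    """Return rows of 8-bit pixels: 255 inside the box, 0 elsewhere.

    `box` is (left, top, right, bottom) in pixels; right and bottom
    are exclusive. The box is clamped to the image bounds.
    """
    left, top, right, bottom = box
    left = min(max(left, 0), width)
    right = min(max(right, left), width)
    top = min(max(top, 0), height)
    bottom = min(max(bottom, top), height)
    # One white-in-the-middle row, reused for every row inside the box.
    inside = bytes(left) + b"\xff" * (right - left) + bytes(width - right)
    outside = bytes(width)
    return [inside if top <= y < bottom else outside for y in range(height)]


def _chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC-32 of type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def write_gray_png(path, rows):
    """Write rows of 8-bit grayscale pixels as a non-interlaced PNG."""
    height, width = len(rows), len(rows[0])
    # IHDR: width, height, bit depth 8, colour type 0 (grayscale),
    # compression 0, filter method 0, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Every scanline is prefixed with filter type 0 (None).
    raw = b"".join(b"\x00" + row for row in rows)
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n")
        f.write(_chunk(b"IHDR", ihdr))
        f.write(_chunk(b"IDAT", zlib.compress(raw)))
        f.write(_chunk(b"IEND", b""))
```

For example, `write_gray_png("my_mask.png", box_mask_rows(512, 512, (100, 200, 300, 260)))` writes a 512x512 mask whose white rectangle covers columns 100 to 299 and rows 200 to 259; the file name and coordinates are placeholders.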
| 24 |
+
|
| 25 |
+
### Installation
|
| 26 |
+
|
| 27 |
+
```bash
|
| 28 |
+
pip install -r requirement.txt
|
| 29 |
+
```
|
| 30 |
+
|
| 31 |
+
### Reference-based Logo Refinement
|
| 32 |
+
|
| 33 |
+
Refine a blurry logo on a pillow using a reference image:
|
| 34 |
+
|
| 35 |
+
```bash
|
| 36 |
+
python scripts/fast_inference.py \
|
| 37 |
+
--input src/input1.png \
|
| 38 |
+
--mask src/mask1.png \
|
| 39 |
+
--prompt "Refine the LOGO." \
|
| 40 |
+
--ref src/ref1.png \
|
| 41 |
+
--output output/demo1.png
|
| 42 |
+
```
|
| 43 |
+
|
| 44 |
+
### Reference-free Text Refinement
|
| 45 |
+
|
| 46 |
+
Refine blurry text on a building sign without a reference image:
|
| 47 |
+
|
| 48 |
+
```bash
|
| 49 |
+
python scripts/fast_inference.py \
|
| 50 |
+
--input src/input2.png \
|
| 51 |
+
--mask src/mask2.png \
|
| 52 |
+
--prompt "refine the text '鼎好商城'" \
|
| 53 |
+
--output output/demo2.png
|
| 54 |
+
```
|
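The two commands above differ only in whether `--ref` is passed. When refining several images, the argument list can be assembled in Python. This is a sketch under stated assumptions: `build_refine_command` is a hypothetical helper, not something shipped with the repository, and it uses only the flags shown in the examples above.

```python
import shlex


def build_refine_command(input_path, mask_path, prompt, output_path,
                         ref_path=None,
                         script="scripts/fast_inference.py"):
    """Return the argument list for one fast_inference.py run.

    `ref_path` is optional; when given, `--ref` is added, which
    corresponds to the reference-based example above.
    """
    cmd = ["python", script,
           "--input", input_path,
           "--mask", mask_path,
           "--prompt", prompt]
    if ref_path is not None:
        cmd += ["--ref", ref_path]
    cmd += ["--output", output_path]
    return cmd


# The two jobs mirror the README examples.
jobs = [
    dict(input_path="src/input1.png", mask_path="src/mask1.png",
         prompt="Refine the LOGO.", ref_path="src/ref1.png",
         output_path="output/demo1.png"),
    dict(input_path="src/input2.png", mask_path="src/mask2.png",
         prompt="refine the text '鼎好商城'",
         output_path="output/demo2.png"),
]

for job in jobs:
    # Print a shell-quoted command line for inspection.
    print(shlex.join(build_refine_command(**job)))
```

To execute a job rather than print it, pass the list to `subprocess.run(cmd, check=True)` from the repository root. Passing a list avoids shell-quoting problems with prompts that contain quotes or non-ASCII text.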
| 55 |
+
|
| 56 |
+
## Citation
|
| 57 |
+
|
| 58 |
+
```bibtex
|
| 59 |
+
@article{zhou2026refineanything,
|
| 60 |
+
title={RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details},
|
| 61 |
+
author={Zhou, Dewei and Li, You and Yang, Zongxin and Yang, Yi},
|
| 62 |
+
journal={arXiv preprint arXiv:2604.06870},
|
| 63 |
+
year={2026}
|
| 64 |
+
}
|
| 65 |
+
```
|