nielsr (HF Staff) committed
Commit 234c7ca · verified · Parent: 2131cfd

Enhance model card for UniREdit-Bagel: Add metadata, links, and usage


This PR significantly improves the model card for `UniREdit-Bagel` by:
- Adding the `pipeline_tag: image-to-image` to improve discoverability for image editing models.
- Specifying `library_name: transformers` based on evidence in the configuration files, enabling automated usage examples on the Hub.
- Including direct links to the project page and the GitHub repository for easy access to more resources.
- Adding comprehensive "Introduction" and "Highlights" sections derived from the paper abstract and GitHub README.
- Providing a detailed "Sample Usage" section with environment setup, checkpoint preparation, and inference steps, directly copied from the GitHub repository.
- Incorporating relevant images from the GitHub repository to visualize the model's capabilities and the benchmark.
- Adding a correct BibTeX citation for the paper.
- Including contact information.

Please review and merge this PR.

Files changed (1):
  1. README.md (+113 −1)

README.md CHANGED
@@ -1,5 +1,117 @@

Before:

---
license: apache-2.0
---

[UniREditBench: A Unified Reasoning-based Image Editing Benchmark](https://arxiv.org/abs/2511.01295)
After:

---
license: apache-2.0
pipeline_tag: image-to-image
library_name: transformers
---
# UniREdit-Bagel: A Unified Reasoning-based Image Editing Model

This repository hosts **UniREdit-Bagel**, a model developed as part of the research presented in the paper:
[UniREditBench: A Unified Reasoning-based Image Editing Benchmark](https://arxiv.org/abs/2511.01295)

**Project Page**: [https://maplebb.github.io/UniREditBench/](https://maplebb.github.io/UniREditBench/)  
**Code Repository**: [https://github.com/Maplebb/UniREditBench](https://github.com/Maplebb/UniREditBench)

<div align="center">
<img alt="image" src="https://github.com/Maplebb/UniREditBench/raw/main/docs/static/images/teaser.png" />
</div>
## Introduction

We propose **UniREditBench**, a unified benchmark for reasoning-based image editing evaluation with broader evaluation dimension coverage and a robust evaluation pipeline. We also design an automated multi-scenario data synthesis pipeline and construct **UniREdit-Data-100K**, a large-scale synthetic dataset with high-quality chain-of-thought (CoT) reasoning annotations. We fine-tune Bagel on this dataset and develop **UniREdit-Bagel**, demonstrating substantial improvements in both in-domain and out-of-distribution settings.

<div align="center">
<img alt="image" src="https://github.com/Maplebb/UniREditBench/raw/main/docs/static/images/radar.png" />
</div>
### ✨ Highlights

- **Broader Scenario and Reasoning Dimension Coverage**: It contains 2,700 high-quality samples organized into 8 primary reasoning dimensions and 18 sub-categories, spanning both real-world and game-world image editing tasks.
- **Reliable Dual-Reference Evaluation**: For each sample, we design both a textual reference and a ground-truth (GT) image reference. This multi-modal reference enables vision-language model (VLM) evaluators to perform direct, fine-grained comparisons with the generated images at both the textual and visual levels, leading to more reliable evaluation.
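The dual-reference idea above can be sketched as combining two VLM judgments. This is a hypothetical scoring function: the equal weighting and the [0, 1] scale are assumptions, as the actual evaluation pipeline lives in the GitHub repository.

```python
# Hypothetical sketch of dual-reference evaluation: a VLM scores the edited
# image against the textual reference and against the GT image reference,
# then the two scores are combined. Equal weighting and a [0, 1] scale are
# assumptions; the real pipeline is in the UniREditBench repository.
def dual_reference_score(text_ref_score: float, image_ref_score: float) -> float:
    for s in (text_ref_score, image_ref_score):
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores are assumed to lie in [0, 1]")
    return 0.5 * (text_ref_score + image_ref_score)
```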
<div align="center">
<img alt="image" src="https://github.com/Maplebb/UniREditBench/raw/main/docs/static/images/motivation_tab.png" />
</div>
<div align="center">
<img alt="image" src="https://github.com/Maplebb/UniREditBench/raw/main/docs/static/images/motivation_fig.png" />
</div>
<div align="center">
<img alt="image" src="https://github.com/Maplebb/UniREditBench/raw/main/docs/static/images/testpoint_cases.png" />
</div>
## 🚀 Sample Usage

To perform image editing with reasoning using UniREdit-Bagel, follow the steps below. This section is adapted from the [official GitHub repository](https://github.com/Maplebb/UniREditBench).

### 1. Set Up Environment

```bash
conda create -n uniredit python=3.10 -y
conda activate uniredit
pip install -r requirements.txt
pip install flash_attn==2.7.0.post1 --no-build-isolation
```

Alternatively, install a prebuilt `flash_attn` wheel:

```bash
# for cuda11 torch2.5.x
pip install "https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.0.post1/flash_attn-2.7.0.post1+cu11torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"

# for cuda12 torch2.5.x
pip install "https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.0.post1/flash_attn-2.7.0.post1+cu12torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"
```
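The two wheel URLs above differ only in the CUDA major version. A small helper (hypothetical, covering only the prebuilt Python 3.10 / torch 2.5.x wheels listed here) can select the right one:

```python
# Hypothetical helper: pick the prebuilt flash_attn 2.7.0.post1 wheel URL for
# the two variants listed above (Python 3.10, torch 2.5.x, Linux x86_64).
# Only CUDA majors 11 and 12 have prebuilt wheels in that release.
_WHEEL_URL = (
    "https://github.com/Dao-AILab/flash-attention/releases/download/"
    "v2.7.0.post1/flash_attn-2.7.0.post1+cu{cuda}torch2.5"
    "cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"
)

def flash_attn_wheel_url(cuda_major: int) -> str:
    """Return the wheel URL for CUDA major version 11 or 12."""
    if cuda_major not in (11, 12):
        raise ValueError("prebuilt wheels above cover CUDA 11 and 12 only")
    return _WHEEL_URL.format(cuda=cuda_major)
```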
### 2. Benchmark and Checkpoint Preparation

First, prepare the UniREditBench benchmark dataset:

```bash
huggingface-cli download --resume-download maplebb/UniREditBench --local-dir ./UniREditBench
cd UniREditBench
unzip original_image.zip
unzip reference_image.zip
cd ..
```
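After unzipping, a quick sanity check can confirm the expected entries exist. The layout below (`data.json` plus the two unzipped image folders) is inferred from the commands above and the `--metadata_file` path used at inference time, so treat it as an assumption:

```python
from pathlib import Path

# Assumed benchmark layout, inferred from the unzip commands above and the
# --metadata_file flag used at inference time; not an official spec.
EXPECTED = ("data.json", "original_image", "reference_image")

def missing_entries(root: str) -> list:
    """Return the expected benchmark entries that are absent under `root`."""
    base = Path(root)
    return [name for name in EXPECTED if not (base / name).exists()]
```

`missing_entries("./UniREditBench")` returning an empty list means the download and unzip steps completed as expected.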
Then, prepare the UniREdit-Bagel checkpoint:

```bash
huggingface-cli download --resume-download maplebb/UniREdit-Bagel --local-dir ./ckpt

pip install safetensors

python merge_ckpt.py
```

*(Note: The `merge_ckpt.py` script is part of the UniREditBench GitHub repository and should be run from its root directory after cloning the repo and downloading the checkpoint.)*
### 3. Inference

Once the environment and checkpoints are prepared, you can run inference:

```bash
GPUS=8
model_path=./ckpt
input_path=./UniREditBench
output_path=./output_images

# Image Editing with Reasoning
torchrun \
  --nnodes=1 \
  --nproc_per_node=$GPUS \
  gen_images_mp_uniredit.py \
  --input_dir $input_path \
  --output_dir $output_path \
  --metadata_file ./UniREditBench/data.json \
  --max_latent_size 64 \
  --model-path $model_path \
  --think
```

This command generates edited images from the benchmark instructions and saves them to the specified `output_images` directory. The `--think` flag enables the model's reasoning capabilities.
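To vary the GPU count or paths without editing the shell snippet, the launch can also be assembled programmatically. This is a hypothetical stdlib-only helper that mirrors the flags above; the flag names and defaults come straight from that command:

```python
import shlex

# Hypothetical helper mirroring the torchrun invocation above, so the GPU
# count and paths can be changed in one place (e.g. gpus=1 for a quick
# single-GPU trial).
def build_inference_cmd(gpus=8, model_path="./ckpt",
                        input_path="./UniREditBench",
                        output_path="./output_images"):
    return [
        "torchrun", "--nnodes=1", f"--nproc_per_node={gpus}",
        "gen_images_mp_uniredit.py",
        "--input_dir", input_path,
        "--output_dir", output_path,
        "--metadata_file", f"{input_path}/data.json",
        "--max_latent_size", "64",
        "--model-path", model_path,
        "--think",
    ]

# Print a shell-ready command line for inspection before launching.
print(shlex.join(build_inference_cmd(gpus=1)))
```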
## 📧 Contact

If you have any comments or questions, please open an issue on the [GitHub repository](https://github.com/Maplebb/UniREditBench) or contact [Feng Han](mailto:fhan25@m.fudan.edu.cn) and [Yibin Wang](https://codegoat24.github.io).
## 📖 Citation

If you find our work helpful or inspiring, please consider citing it:

```bibtex
@article{han2025unireditbench,
  title={UniREditBench: A Unified Reasoning-based Image Editing Benchmark},
  author={Han, Feng and Wang, Yibin and Li, Chenglin and Liang, Zheming and Wang, Dianyi and Jiao, Yang and Wei, Zhipeng and Gong, Chao and Jin, Cheng and Chen, Jingjing and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2511.01295},
  year={2025}
}
```