Update pipeline tag, add paper ID, abstract, and GitHub link
#1 opened by nielsr (HF Staff)

README.md CHANGED
````diff
@@ -1,7 +1,8 @@
 ---
 base_model: Qwen/Qwen2.5-VL-7B-Instruct
 library_name: peft
-pipeline_tag: text-
+pipeline_tag: image-text-to-text
+paper: 2509.23909
 tags:
 - base_model:adapter:Qwen/Qwen2.5-VL-7B-Instruct
 - lora
@@ -29,6 +30,16 @@ tags:
 </h4>
 
 **EditScore** is a series of state-of-the-art open-source reward models (7B–72B) designed to evaluate and enhance instruction-guided image editing.
+
+## Paper
+[EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling](https://huggingface.co/papers/2509.23909)
+
+### Abstract
+Instruction-guided image editing has achieved remarkable progress, yet current models still face challenges with complex instructions and often require multiple samples to produce a desired result. Reinforcement Learning (RL) offers a promising solution, but its adoption in image editing has been severely hindered by the lack of a high-fidelity, efficient reward signal. In this work, we present a comprehensive methodology to overcome this barrier, centered on the development of a state-of-the-art, specialized reward model. We first introduce EditReward-Bench, a comprehensive benchmark to systematically evaluate reward models on editing quality. Building on this benchmark, we develop EditScore, a series of reward models (7B-72B) for evaluating the quality of instruction-guided image editing. Through meticulous data curation and filtering, EditScore effectively matches the performance of leading proprietary VLMs. Furthermore, coupled with an effective self-ensemble strategy tailored for the generative nature of EditScore, our largest variant even surpasses GPT-5 on the benchmark. We then demonstrate that a high-fidelity reward model is the key to unlocking online RL for image editing. Our experiments show that, while even the largest open-source VLMs fail to provide an effective learning signal, EditScore enables efficient and robust policy optimization. Applying our framework to a strong base model, OmniGen2, results in a final model that shows a substantial and consistent performance uplift. Overall, this work provides the first systematic path from benchmarking to reward modeling to RL training in image editing, showing that a high-fidelity, domain-specialized reward model is the key to unlocking the full potential of RL in this domain.
+
+## Code Repository
+The official code can be found on GitHub: [https://github.com/VectorSpaceLab/EditScore](https://github.com/VectorSpaceLab/EditScore)
+
 ## ✨ Highlights
 - **State-of-the-Art Performance**: Effectively matches the performance of leading proprietary VLMs. With a self-ensembling strategy, **our largest model surpasses even GPT-5** on our comprehensive benchmark, **EditReward-Bench**.
 - **A Reliable Evaluation Standard**: We introduce **EditReward-Bench**, the first public benchmark specifically designed for evaluating reward models in image editing, featuring 13 subtasks, 11 state-of-the-art editing models (*including proprietary models*), and expert human annotations.
@@ -165,4 +176,4 @@ If you find this repository or our work useful, please consider giving a star
 journal={arXiv preprint arXiv:2509.23909},
 year={2025}
 }
-```
+```
````
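The self-ensemble strategy mentioned in the added abstract (drawing several scores from the generative reward model and aggregating them to reduce variance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the EditScore API: `score_once`, the 0–10 score range, and the Gaussian noise model are hypothetical stand-ins for one stochastic scoring pass of a VLM reward model.

```python
import random
import statistics

def score_once(instruction: str, seed: int) -> float:
    """Hypothetical stand-in for one stochastic pass of a generative reward model.

    A real scorer would sample a score from the VLM given the edit instruction,
    the source image, and the edited image; here we just simulate noisy scores
    so the aggregation logic is runnable.
    """
    rng = random.Random(hash((instruction, seed)) & 0xFFFFFFFF)
    return max(0.0, min(10.0, rng.gauss(7.0, 1.0)))

def self_ensemble_score(instruction: str, k: int = 8) -> float:
    """Average k independent samples of the generative reward.

    Averaging reduces the variance of the stochastic scores, which is the
    self-ensembling idea the abstract credits for the benchmark gains.
    """
    samples = [score_once(instruction, seed) for seed in range(k)]
    return statistics.mean(samples)

score = self_ensemble_score("replace the red car with a bicycle", k=8)
```

In an online RL loop, `self_ensemble_score` would play the role of the reward function applied to each sampled edit; `k` trades compute for a lower-variance learning signal.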