nielsr (HF Staff) committed on Oct 1, 2025

Commit 5a3ea83 (verified) · 1 parent: dcce052

Update pipeline tag, add paper ID, abstract, and GitHub link

This PR updates the model card for the EditScore model to improve its discoverability and provide more comprehensive information for users.

Key changes include:
* **Updated `pipeline_tag`**: Changed from `text-generation` to `image-text-to-text` to accurately reflect the model's functionality as a reward model for instruction-guided image editing, which takes both images and text as input to produce a textual score.
* **Added `paper` metadata**: Included the Hugging Face paper ID `2509.23909` in the metadata for better integration with the Hugging Face Hub.
* **Added Paper Abstract**: Incorporated the paper's abstract into a dedicated section to give users a quick overview of the model's purpose and methodology.
* **Added Code Repository Link**: Provided a direct link to the official GitHub repository for easy access to the source code and further resources.

These changes enhance the model card's clarity and ensure it meets best practices for documentation on the Hugging Face Hub.
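
The two metadata changes listed above can be sketched as a small script. This is only an illustration, not necessarily how this PR was produced: it assumes the front matter is the flat `key: value` YAML shown in this commit, and `update_front_matter` is a hypothetical helper, not part of any Hugging Face library. The README body in the example is an abbreviated stand-in.

```python
def update_front_matter(readme: str, updates: dict) -> str:
    """Set top-level `key: value` pairs in a README's YAML front matter.

    Hypothetical helper: it handles only flat scalar keys, which is all
    this commit touches. Existing keys are rewritten in place; new keys
    are inserted before `tags:` if present, else at the end of the header.
    """
    lines = readme.split("\n")
    if not lines or lines[0] != "---":
        raise ValueError("README has no YAML front matter")
    end = lines.index("---", 1)  # closing delimiter of the front matter
    header = lines[1:end]

    pending = dict(updates)
    for i, line in enumerate(header):
        key = line.split(":", 1)[0]
        if key in pending:
            # Key already exists: overwrite its value in place.
            header[i] = f"{key}: {pending.pop(key)}"

    # Whatever is left in `pending` is a new key.
    insert_at = header.index("tags:") if "tags:" in header else len(header)
    header[insert_at:insert_at] = [f"{k}: {v}" for k, v in pending.items()]

    return "\n".join(["---", *header, "---", *lines[end + 1:]])


# Abbreviated stand-in for the README before this commit.
OLD_README = """---
base_model: Qwen/Qwen2.5-VL-7B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:Qwen/Qwen2.5-VL-7B-Instruct
- lora
---
**EditScore** is a series of reward models.
"""

new_readme = update_front_matter(
    OLD_README,
    {"pipeline_tag": "image-text-to-text", "paper": "2509.23909"},
)
print(new_readme)
```

Rewriting `pipeline_tag` in place and adding `paper` just before `tags:` reproduces the key order of the front matter in this commit's diff.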

Files changed (1)

README.md +13 -2

@@ -1,7 +1,8 @@
 ---
 base_model: Qwen/Qwen2.5-VL-7B-Instruct
 library_name: peft
-pipeline_tag: text-generation
+pipeline_tag: image-text-to-text
+paper: 2509.23909
 tags:
 - base_model:adapter:Qwen/Qwen2.5-VL-7B-Instruct
 - lora
@@ -29,6 +30,16 @@ tags:
 </h4>
 **EditScore** is a series of state-of-the-art open-source reward models (7B–72B) designed to evaluate and enhance instruction-guided image editing.
+## Paper
+[EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling](https://huggingface.co/papers/2509.23909)
+### Abstract
+Instruction-guided image editing has achieved remarkable progress, yet current models still face challenges with complex instructions and often require multiple samples to produce a desired result. Reinforcement Learning (RL) offers a promising solution, but its adoption in image editing has been severely hindered by the lack of a high-fidelity, efficient reward signal. In this work, we present a comprehensive methodology to overcome this barrier, centered on the development of a state-of-the-art, specialized reward model. We first introduce EditReward-Bench, a comprehensive benchmark to systematically evaluate reward models on editing quality. Building on this benchmark, we develop EditScore, a series of reward models (7B-72B) for evaluating the quality of instruction-guided image editing. Through meticulous data curation and filtering, EditScore effectively matches the performance of leading proprietary VLMs. Furthermore, coupled with an effective self-ensemble strategy tailored for the generative nature of EditScore, our largest variant even surpasses GPT-5 in the benchmark. We then demonstrate that a high-fidelity reward model is the key to unlocking online RL for image editing. Our experiments show that, while even the largest open-source VLMs fail to provide an effective learning signal, EditScore enables efficient and robust policy optimization. Applying our framework to a strong base model, OmniGen2, results in a final model that shows a substantial and consistent performance uplift. Overall, this work provides the first systematic path from benchmarking to reward modeling to RL training in image editing, showing that a high-fidelity, domain-specialized reward model is the key to unlocking the full potential of RL in this domain.
+## Code Repository
+The official code can be found on GitHub: [https://github.com/VectorSpaceLab/EditScore](https://github.com/VectorSpaceLab/EditScore)
 ## ✨ Highlights
 - **State-of-the-Art Performance**: Effectively matches the performance of leading proprietary VLMs. With a self-ensembling strategy, **our largest model surpasses even GPT-5** on our comprehensive benchmark, **EditReward-Bench**.
 - **A Reliable Evaluation Standard**: We introduce **EditReward-Bench**, the first public benchmark specifically designed for evaluating reward models in image editing, featuring 13 subtasks, 11 state-of-the-art editing models (*including proprietary models*) and expert human annotations.
@@ -165,4 +176,4 @@ If you find this repository or our work useful, please consider giving a star
   journal={arXiv preprint arXiv:2509.23909},
   year={2025}
 }
-```
+```