Upload README.md with huggingface_hub
Browse files

README.md CHANGED

@@ -8,29 +8,39 @@ library_name: diffusers
  ---
  
  ## 🔥🔥🔥 News!!
- *
-
-
-
-
-
-
-
-
-
-
-
-
-
  
- <div align="center">
- <img width="720" alt="demo" src="assets/image_edit_demo.gif">
- <p><b>Step1X-Edit:</b> a unified image editing model that performs impressively on various genuine user instructions.</p>
- </div>
  
  
- ##
  Install the `diffusers` package with the following command:
  ```bash
  git clone -b dev/MergeV1-2 https://github.com/Peyton-Chen/diffusers.git

@@ -74,28 +84,18 @@ The results will look like:

  </div>
  
  
-
-
- <div align="center">
- <img width="720" alt="demo" src="assets/arch.png">
- </div>
-
- Framework of Step1X-Edit. Step1X-Edit leverages the image understanding capabilities
- of MLLMs to parse editing instructions and generate editing tokens, which are then decoded into
- images using a DiT-based network. For more details, please refer to our [technical report](https://arxiv.org/abs/2504.17761).
-
-
- We release [GEdit-Bench](https://huggingface.co/datasets/stepfun-ai/GEdit-Bench), a new benchmark grounded in real-world usage, carefully curated to reflect actual user editing needs across a wide range of editing scenarios, enabling more authentic and comprehensive evaluation of image editing models. Partial results on the benchmark are shown below:
  <div align="center">
- <img width="1080" alt="results" src="assets/
  </div>
  
  ## Citation
  ```
  @article{liu2025step1x-edit,
-
-
-
-
  }
  ```

  ---
  
  ## 🔥🔥🔥 News!!
+ * Nov 26, 2025: 👋 We release [Step1X-Edit-v1p2](https://huggingface.co/stepfun-ai/Step1X-Edit-v1p2), a native reasoning edit model with better performance on KRIS-Bench and GEdit-Bench. <!-- technical report can be found [here](). -->
+ <table>
+ <thead>
+ <tr>
+ <th rowspan="2">Models</th>
+ <th colspan="3"> <div align="center">GEdit-Bench</div> </th>
+ <th colspan="4"> <div align="center">KRIS-Bench</div> </th>
+ </tr>
+ <tr>
+ <th>G_SC⬆️</th> <th>G_PQ⬆️</th> <th>G_O⬆️</th> <th>FK⬆️</th> <th>CK⬆️</th> <th>PK⬆️</th> <th>Overall⬆️</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>Step1X-Edit v1.1</td> <td>7.66</td> <td>7.35</td> <td>6.97</td> <td>53.05</td> <td>54.34</td> <td>44.66</td> <td>51.59</td>
+ </tr>
+ <tr>
+ <td>Step1X-Edit-v1p2-preview</td> <td>8.14</td> <td>7.55</td> <td>7.42</td> <td>60.49</td> <td>58.81</td> <td>41.77</td> <td>52.51</td>
+ </tr>
+ <tr>
+ <td>Step1X-Edit-v1p2 (base)</td> <td>8.14</td> <td>7.55</td> <td>7.42</td> <td>60.49</td> <td>58.81</td> <td>41.77</td> <td>52.51</td>
+ </tr>
+ <tr>
+ <td>Step1X-Edit-v1p2 (thinking)</td> <td>8.14</td> <td>7.55</td> <td>7.42</td> <td>60.49</td> <td>58.81</td> <td>41.77</td> <td>52.51</td>
+ </tr>
+ <tr>
+ <td>Step1X-Edit-v1p2 (thinking + reflection)</td> <td>8.14</td> <td>7.55</td> <td>7.42</td> <td>60.49</td> <td>58.81</td> <td>41.77</td> <td>52.51</td>
+ </tr>
+ </tbody>
+ </table>
  
  
  
+ ## ⚡️ Model Usages
  Install the `diffusers` package with the following command:
  ```bash
  git clone -b dev/MergeV1-2 https://github.com/Peyton-Chen/diffusers.git

  </div>
  
  
+ ## 📖 Introduction
+ Step1X-Edit-v1p2 represents a step towards reasoning-enhanced image editing models. We show that unlocking the reasoning capabilities of MLLMs can further expand the limits of instruction-based editing. Specifically, we introduce two complementary reasoning mechanisms, thinking and reflection, to improve instruction comprehension and editing accuracy. Building on these mechanisms, our framework performs editing in a thinking–editing–reflection loop: **the thinking stage** leverages MLLM world knowledge to interpret abstract instructions, while **the reflection stage** reviews the edited outputs, corrects unintended changes, and determines when to stop. For more details, please refer to our technical report.
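
The loop described above can be illustrated with a minimal, self-contained control-flow sketch. Everything here is a hypothetical stand-in: `think`, `edit`, `reflect`, and `edit_with_reasoning` are stub names invented for illustration, not functions from the Step1X-Edit codebase or the diffusers API.

```python
# Hypothetical sketch of a thinking-editing-reflection loop.
# The three stage functions are stubs standing in for the MLLM and
# DiT components; they only illustrate the control flow.

def think(instruction: str) -> str:
    """Thinking stage: turn an abstract instruction into a concrete
    editing plan (stubbed)."""
    return f"plan: {instruction}"

def edit(image, plan: str):
    """Editing stage: apply the plan to the image (stubbed as a pair)."""
    return (image, plan)

def reflect(edited):
    """Reflection stage: review the edited output and decide whether
    to stop; returns (accept, feedback). Stubbed to always accept."""
    return True, "no unintended changes"

def edit_with_reasoning(image, instruction: str, max_rounds: int = 3):
    plan = think(instruction)                # thinking stage
    edited = None
    for _ in range(max_rounds):
        edited = edit(image, plan)           # editing stage
        accept, feedback = reflect(edited)   # reflection stage
        if accept:                           # reflection decides when to stop
            break
        plan = f"{plan}; fix: {feedback}"    # fold corrections into the plan
    return edited

result = edit_with_reasoning("input.png", "make the scene snowy")
```

In the real model the reflection stage would compare the edit against the source image and instruction; here it simply accepts on the first round, so the loop runs once.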
  <div align="center">
+ <img width="1080" alt="results" src="assets/ReasonEdit_intro.jpg">
  </div>
  
  ## Citation
  ```
  @article{liu2025step1x-edit,
+   title={Step1X-Edit: A Practical Framework for General Image Editing},
+   author={Shiyu Liu and Yucheng Han and Peng Xing and Fukun Yin and Rui Wang and Wei Cheng and Jiaqi Liao and Yingming Wang and Honghao Fu and Chunrui Han and Guopeng Li and Yuang Peng and Quan Sun and Jingwei Wu and Yan Cai and Zheng Ge and Ranchen Ming and Lei Xia and Xianfang Zeng and Yibo Zhu and Binxing Jiao and Xiangyu Zhang and Gang Yu and Daxin Jiang},
+   journal={arXiv preprint arXiv:2504.17761},
+   year={2025}
  }
  ```