Add model card and metadata for FIRM
#1 by nielsr HF Staff - opened
README.md CHANGED
@@ -1,3 +1,36 @@
----
-license: apache-2.0
-
+---
+license: apache-2.0
+library_name: diffusers
+pipeline_tag: image-to-image
+---
+
+# FIRM-Qwen-Edit
+
+This repository contains the weights for the model presented in the paper [Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation](https://huggingface.co/papers/2603.12247).
+
+[**Project Page**](https://firm-reward.github.io/) | [**GitHub Repository**](https://github.com/VisionXLab/FIRM-Reward)
+
+## Introduction
+
+FIRM (Faithful Image Reward Modeling) is a comprehensive framework that develops robust reward models to provide accurate and reliable guidance for faithful image generation and editing. It addresses the common issue of reward models suffering from hallucinations and noisy scores during reinforcement learning (RL).
+
+This model, **FIRM-Qwen-Edit**, is an image editing model trained using the FIRM framework. It leverages a novel "Base-and-Bonus" reward strategy called Consistency-Modulated Execution (CME) to balance the competing objectives of instruction following and visual consistency.
+
+## Repository Layout
+
+The official implementation in the GitHub repository is organized as follows:
+- `generation/`: GenerationRL training and reward serving.
+- `editing/`: EditRL training, reward serving, and reproduction scripts.
+
+## Citation
+
+If you find this work useful, please cite the following paper:
+
+```bibtex
+@article{zhao2025trust,
+title={Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation},
+author={Zhao, Xiangyu and Zhang, Peiyuan and Lin, Junming and Liang, Tianhao and Duan, Yuchen and Ding, Shengyuan and Tian, Changyao and Zang, Yuhang and Yan, Junchi and Yang, Xue},
+journal={arXiv preprint arXiv:2603.12247},
+year={2025}
+}
+```
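
The metadata this change adds at the top of README.md is a block of flat `key: value` pairs between two `---` delimiters. As a minimal sketch of how such a block can be read back, the snippet below uses a hypothetical helper, `parse_front_matter`, which is not part of any library and is not the Hub's own parser. It assumes flat string values only, as in this card, and does not handle general YAML.

```python
def parse_front_matter(text: str) -> dict[str, str]:
    """Hypothetical helper: read flat key/value pairs from a leading '---' block."""
    lines = text.splitlines()
    # A front matter block must start on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not 'key: value'
            metadata[key.strip()] = value.strip()
    return {}  # block was never closed, so treat it as absent


# The metadata block from this pull request, followed by the card title.
CARD = """---
license: apache-2.0
library_name: diffusers
pipeline_tag: image-to-image
---

# FIRM-Qwen-Edit
"""

metadata = parse_front_matter(CARD)
print(metadata)
```

Real model cards can carry lists and nested values in this block, so a full YAML parser would be the safer choice for anything beyond flat cards like this one.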