license: apache-2.0
library_name: diffusers
pipeline_tag: image-to-image
FIRM-Qwen-Edit
This repository contains the weights for the model presented in the paper "Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation".
Project Page | GitHub Repository
Introduction
FIRM (Faithful Image Reward Modeling) is a framework for building robust reward models that give accurate, reliable guidance for faithful image generation and editing. It targets a common failure mode in reinforcement learning (RL): reward models that hallucinate and produce noisy scores.
This model, FIRM-Qwen-Edit, is an image editing model trained using the FIRM framework. It leverages a novel "Base-and-Bonus" reward strategy called Consistency-Modulated Execution (CME) to balance the competing objectives of instruction following and visual consistency.
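To make the "Base-and-Bonus" idea concrete, here is a minimal Python sketch of one way such a reward could be shaped. It is not the paper's formula: the function name `cme_style_reward`, the `base_weight` parameter, its default of 0.5, and the assumption that both scores lie in [0, 1] are all hypothetical and chosen for illustration only. The sketch grants a base reward for executing the instruction and a bonus that grows with visual consistency, so an edit that preserves the image but ignores the instruction earns nothing.

```python
def cme_style_reward(execution: float, consistency: float,
                     base_weight: float = 0.5) -> float:
    """Illustrative base-and-bonus reward (hypothetical, not the paper's formula).

    execution:   score in [0, 1] for how well the edit follows the instruction.
    consistency: score in [0, 1] for how well unedited content is preserved.
    base_weight: share of the reward granted for execution alone (assumed value).
    """
    for name, value in (("execution", execution),
                        ("consistency", consistency),
                        ("base_weight", base_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    bonus_weight = 1.0 - base_weight
    # Base: credit for carrying out the instruction at all.
    base = base_weight * execution
    # Bonus: extra credit scaled by consistency, earned only if the edit was executed.
    bonus = bonus_weight * execution * consistency
    return base + bonus


if __name__ == "__main__":
    # A well-executed edit that damages the rest of the image keeps only the base.
    print(cme_style_reward(execution=1.0, consistency=0.0))
    # An untouched image (perfect consistency, no execution) earns no reward.
    print(cme_style_reward(execution=0.0, consistency=1.0))
```

Because consistency multiplies the execution score in this sketch, a policy cannot raise its reward by returning the input image unchanged, which is the trade-off the paragraph above describes. The paper and the GitHub repository remain the reference for the actual CME definition.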
Repository Layout
The official implementation in the GitHub repository is organized as follows:
generation/: GenerationRL training and reward serving.
editing/: EditRL training, reward serving, and reproduction scripts.
Citation
If you find this work useful, please cite the following paper:
@article{zhao2025trust,
title={Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation},
author={Zhao, Xiangyu and Zhang, Peiyuan and Lin, Junming and Liang, Tianhao and Duan, Yuchen and Ding, Shengyuan and Tian, Changyao and Zang, Yuhang and Yan, Junchi and Yang, Xue},
journal={arXiv preprint arXiv:2603.12247},
year={2025}
}