Improve model card for RealGen: Add pipeline tag, library name, links, and summary
#1 opened by nielsr (HF Staff)
README.md CHANGED
@@ -1,7 +1,29 @@
 ---
 license: apache-2.0
+pipeline_tag: text-to-image
+library_name: transformers
 ---
+
 # RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards
-
+
+The model presented in the paper [RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards](https://huggingface.co/papers/2512.00473) proposes a photorealistic text-to-image framework. RealGen integrates an LLM component for prompt optimization and a diffusion model for realistic image generation. It introduces a "Detector Reward" mechanism, which quantifies artifacts and assesses realism using both semantic-level and feature-level synthetic image detectors. This reward signal is leveraged with the GRPO algorithm to optimize the entire generation pipeline, significantly enhancing image realism and detail.
+
+Project Page: https://yejy53.github.io/RealGen/
+Code: https://github.com/yejy53/RealGen
+
+## Detection Models
+RealGen utilizes specialized detection models to guide its generation process:
 - **Semantic Detector**: Forensic-Chat, a generalizable and interpretable detector optimized from Qwen2.5-VL-7B. It assesses authenticity by analyzing image content (e.g., smooth greasy skin, artifacts in faces/hands, unnatural background blur).
-- **Feature Detector**: OmniAID achieves stable and accurate detection by being pre-trained on large-scale real and synthetic datasets. Feature-level artifacts are primarily associated with frequency artifacts and abnormal noise patterns.
+- **Feature Detector**: OmniAID achieves stable and accurate detection by being pre-trained on large-scale real and synthetic datasets. Feature-level artifacts are primarily associated with frequency artifacts and abnormal noise patterns.
+
+## Citation
+If you find our work helpful or inspiring, please feel free to cite it.
+
+```bib
+@article{ye2025realgen,
+  title={RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards},
+  author={Ye, Junyan and Zhu, Leqi and Guo, Yuncheng and Jiang, Dongzhi and Huang, Zilong and Zhang, Yifan and Yan, Zhiyuan and Fu, Haohuan and He, Conghui and Li, Weijia},
+  journal={arXiv preprint arXiv:2512.00473},
+  year={2025}
+}
+```
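
The summary added in this change says that a "Detector Reward" built from a semantic-level and a feature-level detector is used with GRPO. The snippet below is only a rough sketch of that idea, not the RealGen implementation. The function names, the equal weighting of the two detectors, and the example scores are all hypothetical. The one standard part is the group-relative step: GRPO standardizes rewards within the group of samples drawn for the same prompt. The choice of population standard deviation here is an assumption.

```python
from statistics import mean, pstdev


def detector_reward(semantic_real_prob, feature_real_prob, w_semantic=0.5):
    """Hypothetical combination: weighted average of two 'looks real' scores in [0, 1]."""
    return w_semantic * semantic_real_prob + (1.0 - w_semantic) * feature_real_prob


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize rewards within one prompt's sample group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Placeholder (semantic, feature) detector scores for four images from one prompt.
scores = [(0.9, 0.8), (0.4, 0.5), (0.7, 0.6), (0.2, 0.3)]
rewards = [detector_reward(s, f) for s, f in scores]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f} advantage={a:+.2f}")
```

Images that both detectors judge more realistic than the group average get a positive advantage and would be reinforced. Those below average get a negative one.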