nielsr HF Staff committed on
Commit 1818848 · verified · 1 Parent(s): 8dc2a40

Improve model card for RealGen: Add pipeline tag, library name, links, and summary

This PR enhances the model card for RealGen by:
- Updating metadata with `pipeline_tag: text-to-image` to accurately reflect its core functionality as a photorealistic text-to-image generation framework, as described in the paper and project details.
- Adding `library_name: transformers` based on explicit evidence from `config.json`, `tokenizer_config.json`, and `preprocessor_config.json`, ensuring compatibility with the Transformers library and enabling inference widgets.
- Adding a link to the official paper: [RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards](https://huggingface.co/papers/2512.00473).
- Providing links to the project page (https://yejy53.github.io/RealGen/) and the GitHub repository (https://github.com/yejy53/RealGen) for easy access to additional resources.
- Including a brief summary of RealGen's contributions to provide better context for users.
- Adding the BibTeX entry for proper citation.
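
The metadata change listed above can be sanity-checked locally. The snippet below is a minimal sketch, not part of this PR: `read_front_matter` and `README_HEAD` are hypothetical names introduced for illustration, and the parser assumes the front matter contains only flat `key: value` pairs (a real YAML parser would be the safer choice for anything more complex).

```python
def read_front_matter(readme_text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading block delimited by '---' lines."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the block as malformed


# Head of the README as it looks after this PR (copied from the diff).
README_HEAD = """---
license: apache-2.0
pipeline_tag: text-to-image
library_name: transformers
---

# RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards
"""

metadata = read_front_matter(README_HEAD)
print(metadata["pipeline_tag"], metadata["library_name"])
```

Running the same check against the pre-PR README would return only the `license` key, which is the gap this PR closes.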

Please review and merge this PR.

Files changed (1)
  1. README.md +24 -2
README.md CHANGED
@@ -1,7 +1,29 @@
  ---
  license: apache-2.0
+ pipeline_tag: text-to-image
+ library_name: transformers
  ---
+
  # RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards
- Detection Models:
+
+ The model presented in the paper [RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards](https://huggingface.co/papers/2512.00473) proposes a photorealistic text-to-image framework. RealGen integrates an LLM component for prompt optimization and a diffusion model for realistic image generation. It introduces a "Detector Reward" mechanism, which quantifies artifacts and assesses realism using both semantic-level and feature-level synthetic image detectors. This reward signal is leveraged with the GRPO algorithm to optimize the entire generation pipeline, significantly enhancing image realism and detail.
+
+ Project Page: https://yejy53.github.io/RealGen/
+ Code: https://github.com/yejy53/RealGen
+
+ ## Detection Models
+ RealGen utilizes specialized detection models to guide its generation process:
  - **Semantic Detector**: Forensic-Chat, a generalizable and interpretable detector optimized from Qwen2.5-VL-7B. It assesses authenticity by analyzing image content (e.g., smooth greasy skin, artifacts in faces/hands, unnatural background blur).
- - **Feature Detector**: OmniAID achieves stable and accurate detection by being pre-trained on large-scale real and synthetic datasets. Feature-level artifacts are primarily associated with frequency artifacts and abnormal noise patterns.
+ - **Feature Detector**: OmniAID achieves stable and accurate detection by being pre-trained on large-scale real and synthetic datasets. Feature-level artifacts are primarily associated with frequency artifacts and abnormal noise patterns.
+
+ ## Citation
+ If you find our work helpful or inspiring, please feel free to cite it.
+
+ ```bib
+ @article{ye2025realgen,
+ title={RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards},
+ author={Ye, Junyan and Zhu, Leqi and Guo, Yuncheng and Jiang, Dongzhi and Huang, Zilong and Zhang, Yifan and Yan, Zhiyuan and Fu, Haohuan and He, Conghui and Li, Weijia},
+ journal={arXiv preprint arXiv:2512.00473},
+ year={2025}
+ }
+ ```
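
The README summary in the diff says a "Detector Reward" from a semantic-level and a feature-level detector is used with GRPO. The sketch below illustrates only the general shape of that idea and is not RealGen's implementation: `detector_reward`, `group_relative_advantages`, the equal weighting, and the example scores are all assumptions made for illustration. The normalization follows GRPO as it is commonly described, where each sample's reward is standardized against the other samples generated for the same prompt.

```python
from statistics import mean, pstdev


def detector_reward(semantic_real_prob: float,
                    feature_real_prob: float,
                    semantic_weight: float = 0.5) -> float:
    """Blend two detector 'looks real' scores into one reward.

    The weighted average and the 0.5 default are assumptions for this sketch.
    """
    return (semantic_weight * semantic_real_prob
            + (1.0 - semantic_weight) * feature_real_prob)


def group_relative_advantages(rewards: list[float],
                              eps: float = 1e-8) -> list[float]:
    """Standardize rewards within one group of samples for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical (semantic, feature) scores for four images from one prompt.
scores = [(0.9, 0.7), (0.4, 0.6), (0.2, 0.2), (0.8, 0.9)]
rewards = [detector_reward(s, f) for s, f in scores]
advantages = group_relative_advantages(rewards)

for (s, f), a in zip(scores, advantages):
    print(f"semantic={s:.1f} feature={f:.1f} advantage={a:+.2f}")
```

Images that both detectors rate as more realistic than their group average receive a positive advantage, which is the signal a policy-gradient update would reinforce.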