---
license: apache-2.0
language:
- en
- zh
tags:
- text-to-image
- fake-image-detection
- unigendet
- bagel
base_model:
- ByteDance-Seed/BAGEL-7B-MoT
---

<h1 align="center">[CVPR 2026] UniGenDet: A Unified Generative-Discriminative Framework</h1>
|
|
<p align="center">
<b>
<a href="https://github.com/Zhangyr2022/">Yanran Zhang</a>,
<a href="https://wzzheng.net/#">Wenzhao Zheng</a><sup>†</sup>,
<a href="https://joeleelyf.github.io/">Yifei Li</a>,
<a href="https://yuby14.github.io/">Bingyao Yu</a>,
<a href="https://yzheng97.github.io/">Yu Zheng</a>,
<a href="https://leichenthu.github.io/">Lei Chen</a>,
<a href="https://scholar.google.com/citations?user=6a79aPwAAAAJ&hl=en">Jie Zhou</a><sup>*</sup>,
<a href="https://ivg.au.tsinghua.edu.cn/Jiwen_Lu/">Jiwen Lu</a>
</b>
<br/>
Department of Automation, Tsinghua University, China
<br/>
<sup>*</sup>Corresponding author <sup>†</sup>Project leader
</p>

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/661cfae9a853782abad2a495/lBHJD1nNztgmdwc_WqVli.png" width="100%" alt="UniGenDet Teaser"/>
</p>
|
|
**UniGenDet** is a unified co-evolutionary framework that jointly optimizes image generation and generated-image detection in a single loop. By bridging generation and authenticity understanding through symbiotic multimodal self-attention, UniGenDet turns the traditional "generator vs. detector" arms race into a closed-loop collaboration.
|
|
This repository hosts the fine-tuned model weights for UniGenDet.
|
|
### Links
- **GitHub Repository (Code & Detailed Instructions):** [Zhangyr2022/UniGenDet](https://github.com/Zhangyr2022/UniGenDet)
- **Paper (arXiv):** [2604.21904](https://arxiv.org/abs/2604.21904v1)
- **Project Website:** [UniGenDet Project Page](https://ivg-yanranzhang.github.io/UniGenDet/)
|
|
### Getting Started
|
|
The UniGenDet model supports two main tasks:
1. **Text-to-Image Generation (`t2i`)**
2. **AI-Generated Image Detection and Explanation (`detection`)**
|
|
To use these weights for generation, detection, or further fine-tuning, please refer to the official [GitHub repository](https://github.com/Zhangyr2022/UniGenDet). The repository provides a comprehensive `demo.py` script for interactive inference.
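
As a convenience, the weights can be fetched ahead of time with the `huggingface-cli` tool from `huggingface_hub`. The sketch below is illustrative only: the local directory layout is an assumption, and `<this-model-repo-id>` is a placeholder for the repo id shown at the top of this model card; only the base model id comes from this card's metadata.

```bash
# Install the Hugging Face CLI (ships with huggingface_hub)
pip install -U "huggingface_hub[cli]"

# Base BAGEL pretrained assets (listed as base_model in this card's metadata)
huggingface-cli download ByteDance-Seed/BAGEL-7B-MoT --local-dir ./weights/BAGEL-7B-MoT

# Fine-tuned UniGenDet weights hosted in this repository
# (replace <this-model-repo-id> with the id shown at the top of this page)
huggingface-cli download <this-model-repo-id> --local-dir ./weights/UniGenDet
```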
|
|
**Quick Inference Example Setup** (see the shell sketch below):
1. Clone the GitHub repository: `git clone https://github.com/Zhangyr2022/UniGenDet.git`
2. Install dependencies as outlined in the repo's `README.md`.
3. Download the base BAGEL pretrained assets.
4. Run `demo.py` pointing to this Hugging Face model directory.
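
A minimal shell sketch of these four steps follows, assuming the weights were downloaded into `./weights/` as in the earlier snippet; the dependency file name is an assumption, and the exact `demo.py` arguments are documented in the repository rather than here.

```bash
# 1. Clone the GitHub repository
git clone https://github.com/Zhangyr2022/UniGenDet.git
cd UniGenDet

# 2. Install dependencies (file name assumed; follow the repo README if it differs)
pip install -r requirements.txt

# 3. Base BAGEL assets and UniGenDet weights, e.g. already downloaded into
#    ./weights/BAGEL-7B-MoT and ./weights/UniGenDet as shown above

# 4. Run the interactive demo; see the repo README (or `python demo.py --help`)
#    for how to point it at ./weights/UniGenDet
python demo.py
```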
|
|
For complete installation, data preparation, training (GDUF/DIGA), and evaluation instructions, please consult the [main GitHub repository](https://github.com/Zhangyr2022/UniGenDet).
|
|
### Citation
|
|
```bibtex
@article{zhang2026unigendet,
  title   = {UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection},
  author  = {Zhang, Yanran and Zheng, Wenzhao and Li, Yifei and Yu, Bingyao and Zheng, Yu and Chen, Lei and Zhou, Jie and Lu, Jiwen},
  journal = {CoRR},
  volume  = {abs/2604.21904},
  year    = {2026},
  url     = {https://arxiv.org/abs/2604.21904},
}
```