---
license: apache-2.0
language:
- en
- zh
tags:
- text-to-image
- fake-image-detection
- unigendet
- bagel
base_model:
- ByteDance-Seed/BAGEL-7B-MoT
---

<h1 align="center">[CVPR 2026] UniGenDet: A Unified Generative-Discriminative Framework</h1>

<p align="center">
  <b>
    <a href="https://github.com/Zhangyr2022/">Yanran Zhang</a>,
    <a href="https://wzzheng.net/#">Wenzhao Zheng</a><sup>†</sup>,
    <a href="https://joeleelyf.github.io/">Yifei Li</a>,
    <a href="https://yuby14.github.io/">Bingyao Yu</a>,
    <a href="https://yzheng97.github.io/">Yu Zheng</a>,
    <a href="https://leichenthu.github.io/">Lei Chen</a>,
    <a href="https://scholar.google.com/citations?user=6a79aPwAAAAJ&hl=en">Jie Zhou</a><sup>*</sup>,
    <a href="https://ivg.au.tsinghua.edu.cn/Jiwen_Lu/">Jiwen Lu</a>
  </b>
  <br/>
  Department of Automation, Tsinghua University, China
  <br/>
  <sup>*</sup>Corresponding author &nbsp;&nbsp; <sup>†</sup>Project leader
</p>

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/661cfae9a853782abad2a495/lBHJD1nNztgmdwc_WqVli.png" width="100%" alt="UniGenDet Teaser"/>
</p>

**UniGenDet** is a unified co-evolutionary framework that jointly optimizes image generation and generated-image detection in a single loop. By bridging generation and authenticity understanding through symbiotic multimodal self-attention, UniGenDet turns the traditional "generator vs. detector" arms race into a closed-loop collaboration. 

This repository hosts the fine-tuned model weights for UniGenDet.

### 🔗 Links
- **GitHub Repository (Code & Detailed Instructions):** [Zhangyr2022/UniGenDet](https://github.com/Zhangyr2022/UniGenDet)
- **Paper (arXiv):** [2604.21904](https://arxiv.org/abs/2604.21904v1)
- **Project Website:** [UniGenDet Project Page](https://ivg-yanranzhang.github.io/UniGenDet/)

### 🚀 Getting Started

The UniGenDet model supports two main tasks:
1. **Text-to-Image Generation (`t2i`)**
2. **AI-Generated Image Detection and Explanation (`detection`)**

To use these weights for generation, detection, or further fine-tuning, please refer to the official [GitHub repository](https://github.com/Zhangyr2022/UniGenDet). The repository provides a comprehensive `demo.py` script for interactive inference.

**Quick Inference Example Setup:**
1. Clone the GitHub repository: `git clone https://github.com/Zhangyr2022/UniGenDet.git`
2. Install dependencies as outlined in the repo's `README.md`.
3. Download the base BAGEL pretrained assets.
4. Run `demo.py` pointing to this Hugging Face model directory.
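The steps above can be sketched as a shell session. Note that the exact `demo.py` flags (`--model_path`, `--mode`) and the dependency-install command are assumptions for illustration; check the repository's `README.md` for the authoritative invocation.

```shell
# 1. Clone the UniGenDet repository
git clone https://github.com/Zhangyr2022/UniGenDet.git
cd UniGenDet

# 2. Install dependencies (see the repo's README.md for the exact command;
#    a requirements file is assumed here)
pip install -r requirements.txt

# 3. Download this model's weights from the Hugging Face Hub
#    (the local target directory name is arbitrary)
huggingface-cli download Zhangyr2022/UniGenDet --local-dir ./unigendet-weights

# 4. Run the interactive demo, pointing it at the downloaded weights.
#    The flag names below are hypothetical placeholders.
python demo.py --model_path ./unigendet-weights --mode detection
```

Step 3 also requires the base BAGEL pretrained assets (`ByteDance-Seed/BAGEL-7B-MoT`), which can be fetched the same way.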

For complete installation, data preparation, training (GDUF/DIGA), and evaluation instructions, please consult the [main GitHub repository](https://github.com/Zhangyr2022/UniGenDet).

### Citation

```bibtex
@article{zhang2026unigendet,
  title   = {UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection},
  author  = {Zhang, Yanran and Zheng, Wenzhao and Li, Yifei and Yu, Bingyao and Zheng, Yu and Chen, Lei and Zhou, Jie and Lu, Jiwen},
  journal = {CoRR},
  volume  = {abs/2604.21904},
  year    = {2026},
  url     = {https://arxiv.org/abs/2604.21904},
}
```