---
title: DeForge AI
emoji: π
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 6.14.0
python_version: '3.13'
app_file: app.py
pinned: false
short_description: AI image detection benchmark including DeForge-AI.
---
# Is Artificial Intelligence Generated Image Detection a Solved Problem?

[Ziqiang Li](https://scholar.google.com/citations?user=mj5a8WgAAAAJ&hl=zh-CN)<sup>1</sup>, [Jiazhen Yan](https://scholar.google.com/citations?user=QkURh8EAAAAJ&hl=zh-CN)<sup>1</sup>, [Ziwen He](https://scholar.google.com/citations?user=PjkDK9cAAAAJ&hl=zh-CN)<sup>1</sup>, [Kai Zeng](https://scholar.google.com.hk/citations?user=TsI93SIAAAAJ&hl=zh-CN)<sup>2</sup>, [Weiwei Jiang](https://scholar.google.co.jp/citations?user=mbPN0hgAAAAJ&hl=zh-CN)<sup>1</sup>, [Lizhi Xiong](https://scholar.google.com/citations?user=-FzrEP4AAAAJ&hl=zh-CN)<sup>1</sup>, [Zhangjie Fu](https://scholar.google.com/citations?user=fO9NmagAAAAJ&hl=zh-CN)<sup>1‡</sup>

<sup>1</sup>Nanjing University of Information Science and Technology, <sup>2</sup>University of Siena

## 🔥 News
* [2025-09-19] AIGIBench was accepted to the NeurIPS 2025 Datasets and Benchmarks track.
**This repository is the official repository of AIGIBench.**
> [!NOTE]
> This is a **modified version** of the original [AIGIBench](https://github.com/HorizonTEL/AIGIBench) repository. In addition to the original dataset and methods, it includes my custom detection solutions: **DeForge-AI** and **C2P-DINOv2** (intermediary solution).
**This repository contains the AIGIBench dataset and the evaluated methods.**
The **AIGIBench** dataset contains two training settings and 25 test subsets. It has the following advantages:
- Comprehensive generation types: GAN-based Noise-to-Image Generation, Diffusion for Text-to-Image Generation, GANs for Deepfake, Diffusion for Personalized Generation, and Open-source Platforms.
- State-of-the-art generators: MidjourneyV6, Stable Diffusion 3, Imagen, DALLE3, InstantID, FaceSwap, StyleGAN-XL, and so on.
- Completely unknown generation methods: images crawled from communities and social media form the CommunityAI and SocialRF subsets, which makes detection more challenging.

If this project helps you, please fork, watch, and give a star to this repository.
## Dataset
The training and test sets used in the paper can be downloaded from [Huggingface](https://huggingface.co/datasets/HorizonTEL/AIGIBench)/[Baidu Netdisk](https://pan.baidu.com/s/1XTwfXlfqkGxAwYLxXuZbfA?pwd=sm6v).
Each folder contains compressed files. After unzipping them, the files under the data root directory are organized as follows.
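
As a minimal sketch of the unzip step, the helper below extracts every archive found under a download folder into a single data root. The function name `unpack_all`, both directory arguments, and the list of archive suffixes are assumptions for illustration; the README does not state which archive format the downloads use.

```python
import shutil
from pathlib import Path

# Assumption: the downloads are zip or tar archives, which shutil can unpack natively.
ARCHIVE_SUFFIXES = (".zip", ".tar", ".tar.gz", ".tgz")


def unpack_all(download_dir: str, data_root: str) -> list[Path]:
    """Extract every archive found under download_dir into data_root."""
    out = Path(data_root)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    # sorted() materializes the listing first, so newly extracted files are not revisited.
    for archive in sorted(Path(download_dir).rglob("*")):
        if archive.is_file() and archive.name.endswith(ARCHIVE_SUFFIXES):
            shutil.unpack_archive(str(archive), str(out))
            extracted.append(archive)
    return extracted
```

For example, `unpack_all("downloads/AIGIBench", "data")` would place the extracted folders under `data/` (both paths are placeholders).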
### Train
AIGIBench introduces two training settings: **(i) Setting-I:** training on 144K images generated by ProGAN across four object categories: car, cat, chair, and horse. **(ii) Setting-II:** training on 144K images generated by both SD-v1.4 and ProGAN, covering the same four object categories. The ProGAN data comes from ForenSynths, and the SD-v1.4 data comes from GenImage. To keep the training data balanced, we randomly sample SD-v1.4 training images from GenImage so that their number matches the ProGAN images, and then merge the two sets. The directory layout is as follows:
```
├── train
│   ├── car
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── cat
│   │   ├── ...
│   ├── chair
│   │   ├── ...
│   ├── horse
│   │   ├── ...
│   ├── sdv1.4
│   │   ├── 0_real
│   │   ├── 1_fake
├── val
│   ├── ...
│   │   ├── 0_real
│   │   ├── 1_fake
│   │   ...
```
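
The Setting-II merge described above (randomly sample SD-v1.4 images to match the ProGAN count, then merge) can be sketched as follows. This is an illustration only, not the authors' preprocessing script: the function name `build_setting2`, the `seed` parameter, and the use of plain path lists are assumptions.

```python
import random


def build_setting2(progan_items: list[str], sd_items: list[str], seed: int = 0) -> list[str]:
    """Subsample SD-v1.4 items to the ProGAN count, then merge both lists."""
    if len(sd_items) < len(progan_items):
        raise ValueError("need at least as many SD-v1.4 items as ProGAN items")
    rng = random.Random(seed)  # fixed seed so the random selection is reproducible
    sd_subset = rng.sample(sd_items, k=len(progan_items))  # sampling without replacement
    return progan_items + sd_subset
```

A fixed seed is used here so that repeated runs select the same SD-v1.4 images; the README does not say how the original selection was seeded.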
### Test
To test detectors comprehensively, AIGIBench builds its test set from five perspectives: GAN-based Noise-to-Image Generation, Diffusion for Text-to-Image Generation, GANs for Deepfake, Diffusion for Personalized Generation, and Open-source Platforms. The directory layout is as follows:
```
├── test
│   ├── ProGAN
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── R3GAN
│   │   ├── ...
│   │   ...
│   ├── BlendFace
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── InSwap
│   │   ├── ...
│   │   ...
│   ├── FLUX1-dev
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── Midjourney-V6
│   │   ├── ...
│   │   ...
│   ├── BLIP
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── Infinite-ID
│   │   ├── ...
│   │   ...
│   ├── CommunityAI
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── SocialRF
│   │   ├── ...
```
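
Given the layout above, a per-subset index of labeled images can be built with a short directory walk. The sketch below assumes that `0_real` holds real images (label 0) and `1_fake` holds generated images (label 1), as the folder names suggest; the function name and the accepted file extensions are assumptions, since the README does not list the image formats.

```python
from pathlib import Path

# Label folders taken from the directory layout: 0_real -> 0, 1_fake -> 1.
LABEL_DIRS = {"0_real": 0, "1_fake": 1}
# Assumption: the image formats are not specified, so common ones are accepted.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def index_test_subsets(test_root: str) -> dict[str, list[tuple[Path, int]]]:
    """Map each subset folder (e.g. ProGAN) to its (image_path, label) pairs."""
    index = {}
    for subset_dir in sorted(p for p in Path(test_root).iterdir() if p.is_dir()):
        samples = []
        for folder, label in LABEL_DIRS.items():
            class_dir = subset_dir / folder
            if not class_dir.is_dir():
                continue  # tolerate subsets that lack one of the two folders
            for img in sorted(class_dir.rglob("*")):
                if img.is_file() and img.suffix.lower() in IMAGE_SUFFIXES:
                    samples.append((img, label))
        index[subset_dir.name] = samples
    return index
```

Keeping the subsets separate makes it straightforward to report results per generator before averaging.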
*Note: The test set counts reported in the paper contained some errors, which are corrected here. Each subset contains the same number of real and generated images, so only the number of generated images is listed below.*
| Generator | Number |
|:------: |:---------:|
| CommunityAI | 6000 |
| SocialRF | 3000 |
| FaceSwap | 4000 |
| ImSwap | 4000 |
| WFIR | 1000 |
## Detection Methods
We use the official code for every detection method and apply unified modifications to the input and output. The code we use for training in Setting-II is available in this repository, and the corresponding pre-trained checkpoints are available on [Huggingface](https://huggingface.co/HorizonTEL/AIGIBench). If you need the original code from each paper, the corresponding detection repositories are listed below:
- [ResNet-50](https://github.com/huggingface/pytorch-image-models/tree/v0.6.12/timm): Deep Residual Learning for Image Recognition
- [CNNDetection](https://github.com/PeterWang512/CNNDetection): CNN-generated images are surprisingly easy to spot...for now
- [GramNet](https://github.com/liuzhengzhe/Global_Texture_Enhancement_for_Fake_Face_Detection_in_the-Wild): Global Texture Enhancement for Fake Face Detection in the Wild
- [LGrad](https://github.com/chuangchuangtan/LGrad): Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection
- [CLIPDetection](https://github.com/WisconsinAIVision/UniversalFakeDetect): Towards Universal Fake Image Detectors that Generalize Across Generative Models
- [FreqNet](https://github.com/chuangchuangtan/FreqNet-DeepfakeDetection): Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning
- [NPR](https://github.com/chuangchuangtan/NPR-DeepfakeDetection): Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection
- [DFFreq](https://github.com/HorizonTEL/DFFreq-main): Dual Frequency Branch Framework with Reconstructed Sliding Windows Attention for AI-Generated Image Detection
- [LaDeDa](https://github.com/barcavia/RealTime-DeepfakeDetection-in-the-RealWorld): Real-Time Deepfake Detection in the Real-World
- [AIDE](https://github.com/shilinyan99/AIDE): A Sanity Check for AI-generated Image Detection
- [SAFE](https://github.com/Ouxiang-Li/SAFE): Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective
- [Effort](https://github.com/YZY-stack/Effort-AIGI-Detection): Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection
## ⏳ Detection Results (Continuously Updated)
**To ensure a fair comparison, we retrain all baseline methods on Setting-II of AIGIBench.**
_If your retrained results differ significantly from those shown, please contact us._
| Method | Paper | Ref | R.Acc. | F.Acc. | Acc. | A.P. |
|:------: |:---------: |:---------:|:------:|:------:|:----:|:----:|
| CNNDetection | CNN-generated images are surprisingly easy to spot... for now | CVPR 2020 |**98.2**| 11.6 | 54.9 | 67.0 |
| Gram-Net | Global Texture Enhancement for Fake Face Detection In the Wild | CVPR 2020 | 90.5 | 26.6 | 58.6 | 62.4 |
| LGrad | Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection | CVPR 2023 | 85.8 | 39.6 | 62.9 | 66.6 |
| UniFD | Towards Universal Fake Image Detectors that Generalize Across Generative Models | CVPR 2023 | 73.3 | 71.5 | 72.5 | 75.6 |
| FreqNet | Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning | AAAI 2024 | 65.9 | 66.4 | 66.2 | 70.1 |
| NPR | Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection | CVPR 2024 | 93.8 | 41.9 | 67.9 | 73.9 |
| LaDeDa | Real-Time Deepfake Detection in the Real-World | arXiv 2024 | 91.7 | 54.9 | 73.4 | 79.3 |
| DFFreq | Dual Frequency Branch Framework with Reconstructed Sliding Windows Attention for AI-Generated Image Detection | TIFS 2026 | 91.8 | 58.0 | 75.1 | 82.2 |
| C2P-CLIP* | C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection | AAAI 2025 | 93.8 | 49.8 | 71.8 | 82.2 |
| AIDE | A Sanity Check for AI-generated Image Detection | ICLR 2025 | 88.1 | 67.0 | 77.6 | 82.7 |
| SAFE | Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective | KDD 2025 | 89.0 | 66.6 | 78.1 | 83.6 |
| VIB-Net | Towards Universal AI-Generated Image Detection by Variational Information Bottleneck Network | CVPR 2025 | 60.6 |**78.1**| 69.3 | 70.9 |
| $D^3$ | $D^3$: Scaling Up Deepfake Detection by Learning from Discrepancy | CVPR 2025 | 81.0 | 46.4 | 63.7 | 68.9 |
| Effort | Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection | ICML 2025 | 96.9 | 57.1 | 77.1 |**87.2**|
| FerretNet | FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies | NeurIPS 2025 | 96.6 | 61.8 |**79.4**| 85.8 |
| LOTA | LOTA: Bit-Planes Guided AI-Generated Image Detection | ICCV 2025 | 89.3 | 65.1 | 77.4 | 83.1 |
| BSF | Beyond Semantic Features: Pixel-level Mapping for Generalized AI-Generated Image Detection | AAAI 2026 | 91.5 | 65.6 | 78.8 | 81.1 |
| LTD | Layer Consistency Matters: Elegant Latent Transition Discrepancy for Generalizable Synthetic Image Detection | CVPR 2026 | 82.0 | 67.7 | 74.9 | 77.6 |
**For the following method, we run inference directly with the official pre-trained weights instead of retraining.**
| Method | Paper | Ref | R.Acc. | F.Acc. | Acc. | A.P. |
|:------: |:---------: |:---------:|:------:|:------:|:----:|:----:|
| DDA | Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable | NeurIPS 2025 | 93.9 | 69.3 | 81.6 | 90.2 |
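
The column names are not defined in this README. The sketch below reads them as accuracy on real images (R.Acc.), accuracy on generated images (F.Acc.), overall accuracy (Acc.), and average precision (A.P.). This reading fits the tables, where Acc. is close to the mean of R.Acc. and F.Acc., as expected when each subset has equal numbers of real and generated images. The function names, the 0.5 decision threshold, and the convention that a score is the predicted probability of "fake" are assumptions; the official evaluation code may differ.

```python
def accuracy_metrics(labels: list[int], scores: list[float], threshold: float = 0.5) -> tuple[float, float, float]:
    """Return (R.Acc., F.Acc., Acc.) as fractions; labels: 0 = real, 1 = fake."""
    real = [s for y, s in zip(labels, scores) if y == 0]
    fake = [s for y, s in zip(labels, scores) if y == 1]
    if not real or not fake:
        raise ValueError("need at least one real and one fake sample")
    real_ok = sum(s < threshold for s in real)   # real images correctly kept below the threshold
    fake_ok = sum(s >= threshold for s in fake)  # fake images correctly flagged
    return real_ok / len(real), fake_ok / len(fake), (real_ok + fake_ok) / len(labels)


def average_precision(labels: list[int], scores: list[float]) -> float:
    """Mean of the precision values at the rank of each fake sample (ties not specially handled)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    if hits == 0:
        raise ValueError("average precision is undefined without fake samples")
    return total / hits
```

The functions return fractions, so multiply by 100 to compare with the percentages in the tables.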
## Citation
```
@inproceedings{li2025artificial,
title={Is Artificial Intelligence Generated Image Detection a Solved Problem?},
author={Li, Ziqiang and Yan, Jiazhen and He, Ziwen and Zeng, Kai and Jiang, Weiwei and Xiong, Lizhi and Fu, Zhangjie},
booktitle={Advances in Neural Information Processing Systems},
year={2025}
}
```
## Contact
If you have any questions about this project, please feel free to contact 247918horizon@gmail.com.