---
title: DeForge AI
emoji: 📊
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 6.14.0
python_version: '3.13'
app_file: app.py
pinned: false
short_description: AI image detection benchmark including DeForge-AI.
---

# Is Artificial Intelligence Generated Image Detection a Solved Problem?

[Ziqiang Li](https://scholar.google.com/citations?user=mj5a8WgAAAAJ&hl=zh-CN)<sup>1</sup>, [Jiazhen Yan](https://scholar.google.com/citations?user=QkURh8EAAAAJ&hl=zh-CN)<sup>1</sup>, [Ziwen He](https://scholar.google.com/citations?user=PjkDK9cAAAAJ&hl=zh-CN)<sup>1</sup>, [Kai Zeng](https://scholar.google.com.hk/citations?user=TsI93SIAAAAJ&hl=zh-CN)<sup>2</sup>, [Weiwei Jiang](https://scholar.google.co.jp/citations?user=mbPN0hgAAAAJ&hl=zh-CN)<sup>1</sup>, [Lizhi Xiong](https://scholar.google.com/citations?user=-FzrEP4AAAAJ&hl=zh-CN)<sup>1</sup>, [Zhangjie Fu](https://scholar.google.com/citations?user=fO9NmagAAAAJ&hl=zh-CN)<sup>1†</sup>

<sup>†</sup>Corresponding author

<sup>1</sup>Nanjing University of Information Science and Technology, <sup>2</sup>University of Siena

## 🔥 News

* [2025-09-19] 🎉🎉🎉 AIGIBench is accepted by NeurIPS 2025 Datasets and Benchmarks.

## **This repository is the official repository of AIGIBench.**

> [!NOTE]
> This is a **modified version** of the original [AIGIBench](https://github.com/HorizonTEL/AIGIBench) repository. In addition to the original dataset and methods, it includes my custom detection solutions: **DeForge-AI** and **C2P-DINOv2** (intermediary solution).

**This repository contains the AIGIBench dataset and the evaluated methods.**

The **AIGIBench** dataset contains two training settings and 25 test subsets. It has the following advantages:

- Comprehensive generation types: GAN-based Noise-to-Image Generation, Diffusion for Text-to-Image Generation, GANs for Deepfake, Diffusion for Personalized Generation, and Open-source Platforms.
- State-of-the-art generators: MidjourneyV6, Stable Diffusion 3, Imagen, DALLE3, InstantID, FaceSwap, StyleGAN-XL, and so on.
- Completely unknown generation methods: images crawled from communities and social media form the CommunityAI and SocialRF datasets, making detection more challenging.

![example](https://github.com/user-attachments/assets/36250270-6fc1-4919-8078-1865f80913c0)

If this project helps you, please fork, watch, and give a star to this repository.

## 📚Dataset

The training and test sets used in the paper can be downloaded from [Huggingface](https://huggingface.co/datasets/HorizonTEL/AIGIBench)/[Baidu Netdisk](https://pan.baidu.com/s/1XTwfXlfqkGxAwYLxXuZbfA?pwd=sm6v). Each folder contains compressed files. After unzipping them, the files under the data root directory can be organized as follows.

### Train

AIGIBench introduces two training dataset settings:

**(i) Setting-I:** Training on 144K images generated by ProGAN across four object categories: car, cat, chair, and horse.

**(ii) Setting-II:** Training on 144K images generated by both SD-v1.4 and ProGAN, covering the same four object categories.
The ProGAN data comes from ForenSynths, and the SD-v1.4 data comes from GenImage. To keep the training data fair, we randomly select SD-v1.4 training images from GenImage to match the number of ProGAN images, and then merge the two sets. The file directory is as follows:

```
├── train
│   ├── car
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── cat
│   │   ├── ...
│   ├── chair
│   │   ├── ...
│   ├── horse
│   │   ├── ...
│   ├── sdv1.4
│   │   ├── 0_real
│   │   ├── 1_fake
├── val
│   ├── ...
│   │   ├── 0_real
│   │   ├── 1_fake
│   │   ...
```

### Test

AIGIBench comprehensively tests detector performance with a test dataset built from five perspectives: GAN-based Noise-to-Image Generation, Diffusion for Text-to-Image Generation, GANs for Deepfake, Diffusion for Personalized Generation, and Open-source Platforms. The file directory is as follows:

```
├── test
│   ├── ProGAN
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── R3GAN
│   │   ├── ...
│   │   ...
│   ├── BlendFace
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── InSwap
│   │   ├── ...
│   │   ...
│   ├── FLUX1-dev
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── Midjourney-V6
│   │   ├── ...
│   │   ...
│   ├── BLIP
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── Infinite-ID
│   │   ├── ...
│   │   ...
│   ├── CommunityAI
│   │   ├── 0_real
│   │   ├── 1_fake
│   ├── SocialRF
│   │   ├── ...
```

*Note: The test set counts in the paper contained some errors, which we are correcting here.
Please note that the numbers of real and generated images are the same; only the number of generated images is listed below.*

| Generator | Number |
|:------:|:---------:|
| CommunityAI | 6000 |
| SocialRF | 3000 |
| FaceSwap | 4000 |
| ImSwap | 4000 |
| WFIR | 1000 |

## 🔍Detection Methods

We use the official code for every detection method and apply unified modifications to the input and output. The code we use for training in Setting-II is available in this repository, and the corresponding pre-trained checkpoints are available on [Huggingface](https://huggingface.co/HorizonTEL/AIGIBench). If you need the code from the original papers, the corresponding detection repositories are listed below:

- [ResNet-50](https://github.com/huggingface/pytorch-image-models/tree/v0.6.12/timm): Deep Residual Learning for Image Recognition
- [CNNDetection](https://github.com/PeterWang512/CNNDetection): CNN-generated images are surprisingly easy to spot... for now
- [GramNet](https://github.com/liuzhengzhe/Global_Texture_Enhancement_for_Fake_Face_Detection_in_the-Wild): Global Texture Enhancement for Fake Face Detection in the Wild
- [LGrad](https://github.com/chuangchuangtan/LGrad): Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection
- [CLIPDetection](https://github.com/WisconsinAIVision/UniversalFakeDetect): Towards Universal Fake Image Detectors that Generalize Across Generative Models
- [FreqNet](https://github.com/chuangchuangtan/FreqNet-DeepfakeDetection): Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning
- [NPR](https://github.com/chuangchuangtan/NPR-DeepfakeDetection): Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection
- [DFFreq](https://github.com/HorizonTEL/DFFreq-main): Dual Frequency Branch Framework with Reconstructed Sliding Windows Attention for AI-Generated Image Detection
- [LaDeDa](https://github.com/barcavia/RealTime-DeepfakeDetection-in-the-RealWorld): Real-Time Deepfake Detection in the Real-World
- [AIDE](https://github.com/shilinyan99/AIDE): A Sanity Check for AI-generated Image Detection
- [SAFE](https://github.com/Ouxiang-Li/SAFE): Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective
- [Effort](https://github.com/YZY-stack/Effort-AIGI-Detection): Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection

## ⏳Detection Results (Continuously updating)

**To ensure a fair comparison, we retrain all baseline methods on Setting-II of AIGIBench.**

_If your retrained results differ significantly from those shown, please contact us._

| Method | Paper | Ref | R.Acc. | F.Acc. | Acc. | A.P. |
|:------:|:---------:|:---------:|:------:|:------:|:----:|:----:|
| CNNDetection | CNN-generated images are surprisingly easy to spot... for now | CVPR 2020 | **98.2** | 11.6 | 54.9 | 67.0 |
| Gram-Net | Global Texture Enhancement for Fake Face Detection In the Wild | CVPR 2020 | 90.5 | 26.6 | 58.6 | 62.4 |
| LGrad | Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection | CVPR 2023 | 85.8 | 39.6 | 62.9 | 66.6 |
| UniFD | Towards Universal Fake Image Detectors that Generalize Across Generative Models | CVPR 2023 | 73.3 | 71.5 | 72.5 | 75.6 |
| FreqNet | Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning | AAAI 2024 | 65.9 | 66.4 | 66.2 | 70.1 |
| NPR | Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection | CVPR 2024 | 93.8 | 41.9 | 67.9 | 73.9 |
| LaDeDa | Real-Time Deepfake Detection in the Real-World | arXiv 2024 | 91.7 | 54.9 | 73.4 | 79.3 |
| DFFreq | Dual Frequency Branch Framework with Reconstructed Sliding Windows Attention for AI-Generated Image Detection | TIFS 2026 | 91.8 | 58.0 | 75.1 | 82.2 |
| C2P-CLIP* | C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection | AAAI 2025 | 93.8 | 49.8 | 71.8 | 82.2 |
| AIDE | A Sanity Check for AI-generated Image Detection | ICLR 2025 | 88.1 | 67.0 | 77.6 | 82.7 |
| SAFE | Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective | KDD 2025 | 89.0 | 66.6 | 78.1 | 83.6 |
| VIB-Net | Towards Universal AI-Generated Image Detection by Variational Information Bottleneck Network | CVPR 2025 | 60.6 | **78.1** | 69.3 | 70.9 |
| $D^3$ | $D^3$: Scaling Up Deepfake Detection by Learning from Discrepancy | CVPR 2025 | 81.0 | 46.4 | 63.7 | 68.9 |
| Effort | Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection | ICML 2025 | 96.9 | 57.1 | 77.1 | **87.2** |
| FerretNet | FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies | NeurIPS 2025 | 96.6 | 61.8 | **79.4** | 85.8 |
| LOTA | LOTA: Bit-Planes Guided AI-Generated Image Detection | ICCV 2025 | 89.3 | 65.1 | 77.4 | 83.1 |
| BSF | Beyond Semantic Features: Pixel-level Mapping for Generalized AI-Generated Image Detection | AAAI 2026 | 91.5 | 65.6 | 78.8 | 81.1 |
| LTD | Layer Consistency Matters: Elegant Latent Transition Discrepancy for Generalizable Synthetic Image Detection | CVPR 2026 | 82.0 | 67.7 | 74.9 | 77.6 |

**For specific reasons, we directly use the official pre-trained weights for inference with the following method.**

| Method | Paper | Ref | R.Acc. | F.Acc. | Acc. | A.P. |
|:------:|:---------:|:---------:|:------:|:------:|:----:|:----:|
| DDA | Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable | NeurIPS 2025 | 93.9 | 69.3 | 81.6 | 90.2 |

## Citation

```
@inproceedings{li2025artificial,
  title={Is Artificial Intelligence Generated Image Detection a Solved Problem?},
  author={Li, Ziqiang and Yan, Jiazhen and He, Ziwen and Zeng, Kai and Jiang, Weiwei and Xiong, Lizhi and Fu, Zhangjie},
  booktitle={Advances in Neural Information Processing Systems},
  year={2025}
}
```

## Contact

If you have any questions about this project, please feel free to contact 247918horizon@gmail.com.
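
The R.Acc., F.Acc., Acc., and A.P. columns in the Detection Results tables are not defined in this README. The sketch below assumes they denote accuracy on real images, accuracy on generated images, overall accuracy, and average precision, computed per test subset from the `0_real` / `1_fake` labels. The function name `detection_metrics` and the 0.5 decision threshold are hypothetical choices made for illustration; this is not the benchmark's official evaluation code.

```python
from typing import Dict, Sequence


def detection_metrics(labels: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Compute R.Acc., F.Acc., Acc., and A.P. for one test subset.

    labels: 0 for real (`0_real`), 1 for generated (`1_fake`).
    scores: detector's predicted probability that each image is generated.
    threshold: assumed decision threshold (not stated in the README).
    """
    real = [s for y, s in zip(labels, scores) if y == 0]
    fake = [s for y, s in zip(labels, scores) if y == 1]
    if not real or not fake:
        raise ValueError("need at least one real and one generated image")

    real_correct = sum(s <= threshold for s in real)  # real predicted as real
    fake_correct = sum(s > threshold for s in fake)   # fake predicted as fake

    # Average precision: mean of the precision measured at the rank of each
    # generated image, with images ranked by descending score. Tied scores
    # keep their input order, which can differ from threshold-based
    # implementations when many scores are equal.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank

    return {
        "R.Acc.": real_correct / len(real),
        "F.Acc.": fake_correct / len(fake),
        "Acc.": (real_correct + fake_correct) / (len(real) + len(fake)),
        "A.P.": precision_sum / len(fake),
    }


if __name__ == "__main__":
    # Toy scores for three real and three generated images.
    toy_labels = [0, 0, 0, 1, 1, 1]
    toy_scores = [0.1, 0.4, 0.7, 0.2, 0.8, 0.9]
    for name, value in detection_metrics(toy_labels, toy_scores).items():
        print(f"{name}: {100 * value:.1f}")
```

Because each test subset holds the same number of real and generated images, Acc. under this reading is the mean of R.Acc. and F.Acc. for a single subset, which matches rows such as CNNDetection (98.2 and 11.6 averaging to 54.9).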