---
license: mit
tags:
- computer-vision
- anomaly-detection
- deep-svdd
- ai-generated-images
- image-classification
- pytorch-lightning
datasets:
- cifar10
library_name: pytorch-lightning
pipeline_tag: image-classification
---

# 🔍 AI Image Detector - Deep SVDD

<div align="center">

**One-Class Deep Learning Model for Detecting AI-Generated Images**

[![Model](https://img.shields.io/badge/Model-Deep%20SVDD-blue)](https://huggingface.co/ash12321/ai-image-detector-deepsvdd)
[![Framework](https://img.shields.io/badge/Framework-PyTorch%20Lightning-red)](https://lightning.ai/)
[![Dataset](https://img.shields.io/badge/Dataset-CIFAR--10-green)](https://www.cs.toronto.edu/~kriz/cifar.html)

</div>

## 📖 Model Description

This model detects AI-generated images using **Deep Support Vector Data Description (SVDD)**, a one-class learning approach. It was trained exclusively on real images to learn what "real" looks like, allowing it to identify synthetic/AI-generated images as anomalies.
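
The one-class idea can be sketched as follows, assuming the standard Deep SVDD formulation in which an encoder maps each image to a latent vector and the training objective pulls real samples toward a fixed center. The names `squared_distance` and `svdd_loss`, the `center`, and the toy embeddings are illustrative placeholders, not the model's actual code.

```python
from typing import List, Sequence

Vector = Sequence[float]

def squared_distance(z: Vector, center: Vector) -> float:
    """Squared Euclidean distance between an embedding and the center."""
    return sum((zi - ci) ** 2 for zi, ci in zip(z, center))

def svdd_loss(embeddings: List[Vector], center: Vector) -> float:
    """One-class objective: mean squared distance of real samples to the center."""
    return sum(squared_distance(z, center) for z in embeddings) / len(embeddings)

# Toy 2-D embeddings standing in for encoder outputs of real images (assumed values).
center = [0.0, 0.0]
real_embeddings = [[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1]]
outlier = [1.0, 1.0]

print(svdd_loss(real_embeddings, center))
print(squared_distance(outlier, center))
```

An embedding that lands far from the center, like `outlier` here, gets a much larger distance than the training samples, which is the signal used to flag an image as anomalous.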

### Key Features

- ✅ **Enhanced Deep SVDD Architecture** with channel attention mechanisms
- ✅ **Trained on 35,000 real images** from the CIFAR-10 dataset
- ✅ **L4 GPU Optimized** with 16-bit mixed precision training
- ✅ **Advanced Augmentation**: Mixup, multi-scale, contrastive learning
- ✅ **Robust Evaluation**: 70/15/15 train/val/test split with unseen test data

## 🎯 Performance Metrics

| Metric | Value |
|--------|-------|
| **Test Loss** | 0.7637 |
| **Mean Distance** | 0.7637 |
| **Std Distance** | 0.0024 |
| **95th Percentile** | 0.7700 |
| **Radius Threshold** | 0.7747 |
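
One plausible reading of these numbers, assuming the usual Deep SVDD decision rule in which an image is flagged when its distance to the center exceeds the radius threshold. The helper `is_anomalous` is hypothetical; the model's own `predict_anomaly` may compute its scores differently.

```python
# Values copied from the metrics table above.
MEAN_DISTANCE = 0.7637
STD_DISTANCE = 0.0024
RADIUS_THRESHOLD = 0.7747

def is_anomalous(distance: float, radius: float = RADIUS_THRESHOLD) -> bool:
    """Flag a sample whose distance to the center exceeds the radius (assumed rule)."""
    return distance > radius

# How far the threshold sits above the mean test distance, in standard deviations.
margin_in_std = (RADIUS_THRESHOLD - MEAN_DISTANCE) / STD_DISTANCE

print(is_anomalous(MEAN_DISTANCE))  # a typical held-out real image
print(is_anomalous(0.80))           # a sample well outside the radius
print(round(margin_in_std, 2))
```

Under this reading the threshold sits roughly 4.6 standard deviations above the mean test distance, a gap of about 0.011 in absolute terms.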

## 🚀 Quick Start

### Installation

```bash
pip install torch torchvision pytorch-lightning huggingface-hub pillow
```

### Basic Usage

```python
import torch
from huggingface_hub import hf_hub_download
from PIL import Image
import torchvision.transforms as transforms

# Download model
model_path = hf_hub_download(
    repo_id="ash12321/ai-image-detector-deepsvdd",
    filename="model.ckpt"
)

# Load model (the AdvancedDeepSVDD class must be importable, here from a local model.py)
from model import AdvancedDeepSVDD

model = AdvancedDeepSVDD.load_from_checkpoint(model_path)
model.eval()

# Prepare image
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.4914, 0.4822, 0.4465],
        std=[0.2470, 0.2435, 0.2616]
    )
])

image = Image.open('test_image.jpg').convert('RGB')
image_tensor = transform(image).unsqueeze(0)

# Predict (gradients are not needed at inference time)
with torch.no_grad():
    is_fake, scores, distances = model.predict_anomaly(image_tensor)

print(f"AI-Generated: {is_fake[0].item()}")
print(f"Confidence: {scores[0].item()*100:.1f}%")
print(f"Anomaly Score: {scores[0].item():.4f}")
```

### Using with Gradio

```python
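import gradio as gr

# Reuses `model` and `transform` from the Basic Usage example
def predict(image):
    # Force 3 channels so grayscale or RGBA uploads match the normalization
    img_tensor = transform(image.convert("RGB")).unsqueeze(0)
    is_fake, scores, _ = model.predict_anomaly(img_tensor)

    result = "🚨 AI-Generated" if is_fake[0] else "✅ Real Image"
    confidence = f"{scores[0].item()*100:.1f}%"

    return f"**{result}** (Confidence: {confidence})"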

demo = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="pil"),
    outputs=gr.Markdown(),
    title="AI Image Detector"
)

demo.launch()
```

## 🏗️ Architecture Details

### Enhanced Deep SVDD Encoder

```
Input (3x32x32)
→ Stem Conv (64 channels)
→ Layer1 (64→128) + Channel Attention
→ Layer2 (128→256) + Channel Attention
→ Layer3 (256→512) + Channel Attention
→ Dual Pooling (Avg + Max)
→ Projection Head (1024→512→128)
→ Output (128-dim latent space)
```
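
The dual pooling step accounts for the 1024-dimensional input of the projection head: global average pooling and global max pooling each yield one value per channel, and concatenating them doubles the 512 channels. A minimal pure-Python sketch of that idea, using a hypothetical `dual_pool` helper and a toy feature map (the real model presumably uses PyTorch pooling layers):

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][column]

def dual_pool(feature_map: FeatureMap) -> List[float]:
    """Concatenate global average pooling and global max pooling per channel."""
    avg_pooled = []
    max_pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        avg_pooled.append(sum(values) / len(values))
        max_pooled.append(max(values))
    return avg_pooled + max_pooled

# Toy feature map: 2 channels of 2x2 activations (assumed values).
toy = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
pooled = dual_pool(toy)
print(pooled)  # per-channel averages first, then per-channel maxima

# 512 channels in, 1024 features out, matching the projection head input.
print(len(dual_pool([[[0.0]]] * 512)))
```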

### Training Optimizations

- **Optimizer**: AdamW (lr=1e-3, weight_decay=1e-3)
- **Scheduler**: OneCycleLR with cosine annealing
- **Batch Size**: 128 (L4 GPU optimized)
- **Augmentation**: Mixup (α=0.2), multi-scale, extensive transforms
- **Loss**: SVDD objective + contrastive diversity + L2 regularization
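
Mixup blends pairs of training samples with a coefficient drawn from a Beta(α, α) distribution. A minimal sketch using α = 0.2 from the list above; `mixup_pair` is a hypothetical helper operating on flat lists of pixel values, not the training code itself.

```python
import random
from typing import List, Optional, Tuple

def mixup_pair(
    x1: List[float],
    x2: List[float],
    alpha: float = 0.2,
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], float]:
    """Blend two samples with lam ~ Beta(alpha, alpha); returns (mixed, lam)."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    return mixed, lam

rng = random.Random(0)  # seeded only so the sketch is repeatable
mixed, lam = mixup_pair([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], rng=rng)
print(lam, mixed)
```

With α = 0.2 the Beta distribution is U-shaped, so most draws land near 0 or 1 and the mixed image stays close to one of its two sources.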

## 📊 Training Configuration

```
Model Parameters: 5.3M trainable
Epochs: 30
Training Samples: 35,000 (70%)
Validation Samples: 7,500 (15%)
Test Samples: 7,500 (15%)
Precision: 16-bit mixed precision
GPU: NVIDIA L4 with Tensor Cores
```
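
The sample counts are consistent with a 70/15/15 split of 50,000 images, which is the size of the CIFAR-10 training set. A minimal sketch of such a split, assuming a seeded shuffle of indices; `split_indices` and the seed value are hypothetical, not taken from the training script.

```python
import random
from typing import List, Tuple

def split_indices(
    n: int,
    fractions: Tuple[float, float, float] = (0.70, 0.15, 0.15),
    seed: int = 42,
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices 0..n-1 and cut them into train/val/test partitions."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * fractions[0])
    n_val = round(n * fractions[1])
    # The test partition takes whatever remains, so no index is dropped.
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )

train_idx, val_idx, test_idx = split_indices(50_000)
print(len(train_idx), len(val_idx), len(test_idx))
```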

## 🎨 Data Augmentation Pipeline

**Training Augmentations:**
- Multi-scale resizing (32, 64, 96 pixels)
- Random resized crop (scale: 0.5-1.0)
- Random horizontal/vertical flips
- Random rotation (±20°)
- Color jitter (brightness, contrast, saturation, hue)
- Gaussian blur
- Random erasing
- Mixup augmentation

**Validation/Test:**
- Simple resize to 32x32
- Normalize with CIFAR-10 statistics
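
Normalization subtracts the per-channel mean and divides by the per-channel standard deviation. A small sketch using the CIFAR-10 statistics from the Basic Usage snippet; `normalize_pixel` is a hypothetical helper that mirrors what `transforms.Normalize` does, applied to a single RGB pixel already scaled to [0, 1].

```python
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)

def normalize_pixel(rgb, mean=CIFAR10_MEAN, std=CIFAR10_STD):
    """Apply (value - mean) / std independently to each channel."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))

# A pixel equal to the dataset mean maps to zero in every channel.
print(normalize_pixel(CIFAR10_MEAN))

# A pure white pixel ends up about two standard deviations above zero.
white = normalize_pixel((1.0, 1.0, 1.0))
print(white)
```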

## 💡 Use Cases

- **Content Moderation**: Identify AI-generated images in uploads
- **Digital Forensics**: Verify authenticity of images
- **Research**: Study differences between real and synthetic images
- **Education**: Demonstrate one-class learning techniques

## ⚠️ Limitations

- **Training Domain**: Optimized for natural images similar to CIFAR-10
- **Image Size**: Trained on 32x32 images; larger images must be resized to 32x32 first, which discards fine detail
- **Generalization**: May require fine-tuning for specific domains
- **False Positives**: Unusual real images may be flagged as AI-generated
- **Not Foolproof**: Sophisticated AI images may evade detection

## 📚 Citation

If you use this model in your research, please cite:

```bibtex
@misc{ai-image-detector-deepsvdd-2024,
  author = {ash12321},
  title = {AI Image Detector using Deep SVDD},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ash12321/ai-image-detector-deepsvdd}},
}
```

## 📄 License

This model is released under the MIT License.

## 🀝 Contributing

Contributions, issues, and feature requests are welcome!

## 👤 Author

**ash12321**
- Hugging Face: [@ash12321](https://huggingface.co/ash12321)

## 🙏 Acknowledgments

- CIFAR-10 dataset creators
- PyTorch Lightning team
- Deep SVDD paper authors
- Hugging Face for hosting infrastructure

---

<div align="center">

**[Try it on Hugging Face Spaces](https://huggingface.co/spaces/ash12321/ai-image-detector-demo)** 

</div>