Spaces: Runtime error

Commit 3248dd8 · Initial deployment with trained models
Browse files:
- .gitattributes (+1 -0)
- README.md (+118 -0)
- app.py (+266 -0)
- app_gradio_ui.py (+528 -0)
- meta_model.pkl (+0 -0)
- requirements.txt (+20 -0)
- resnet1d_best.pth (+3 -0)
- save_models_for_deploy.py (+133 -0)
- tcn_best.pth (+3 -0)
.gitattributes
ADDED
@@ -0,0 +1 @@

*.pth filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,118 @@
---
title: AI Image Detector
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# 🔍 AI Image Detector

Detect whether an image is **AI-generated** or a **real photograph** using a stacking ensemble of deep learning models.

## 🏗️ Architecture

This detector uses a **two-stage pipeline**:

### Stage 1: Feature Extraction
- **Qwen2.5-VL-3B** vision-language model extracts spatial features from input images
- Features preserve spatial relationships and semantic information

### Stage 2: Classification (Stacking Ensemble)
- **TCN** (Temporal Convolutional Network) - captures sequential patterns in spatial features
- **ResNet-1D** (Deep Residual Network) - learns hierarchical representations
- **Meta-Learner** (Logistic Regression) - combines base model predictions for the final verdict

## 📊 Performance

| Model | Accuracy | F1 Score | AUC-ROC |
|-------|----------|----------|---------|
| TCN | 96.64% | 96.81% | 0.9851 |
| ResNet-1D | 96.76% | 96.90% | 0.9867 |
| **Stacking Ensemble** | **97.18%** | **97.25%** | **0.9892** |

## 🚀 Usage

### Web Interface
Simply upload an image and click "Detect" to get:
- Prediction (AI Generated / Real)
- Confidence score
- Individual model predictions

### API Usage
```python
from gradio_client import Client

client = Client("your-username/ai-image-detector")
result = client.predict(
    image="path/to/your/image.jpg",
    api_name="/detect"
)
print(result)
```

### Local Usage
```python
from app import AIImageDetector

detector = AIImageDetector(models_dir="models")
result = detector.predict("your_image.jpg")

print(f"Prediction: {result['prediction']}")
print(f"Confidence: {result['confidence']:.2%}")
```

## 📁 Model Files

The following files are required in the `models/` directory:

```
models/
├── tcn_best.pth          # Trained TCN model weights
├── resnet1d_best.pth     # Trained ResNet-1D model weights
├── meta_model.pkl        # Trained meta-learner (sklearn)
└── config.pkl            # Model configuration (max_patches, hidden_dim)
```

## 🔧 Training

This model was trained on a dataset of:
- ~10,000 AI-generated images (from various generators)
- ~10,000 real photographs

Training pipeline:
1. Extract spatial features using Qwen2.5-VL
2. Train individual models (TCN, ResNet-1D)
3. Train meta-learner on validation predictions
4. Evaluate on held-out test set

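Step 3, training the meta-learner on validation predictions, is plain stacking: each base model's held-out output becomes part of a small feature vector fed to a logistic regression. A minimal sketch on synthetic data (the toy probabilities and the gradient-descent fit below are illustrative stand-ins for the project's actual validation set and pickled sklearn model):

```python
import numpy as np

# Hypothetical held-out outputs from the two base models: P(real) per
# image, for 8 images with alternating true labels (1 = real, 0 = AI).
rng = np.random.default_rng(0)
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])
tcn_prob = np.clip(y * 0.8 + rng.normal(0.1, 0.05, 8), 0, 1)
resnet_prob = np.clip(y * 0.75 + rng.normal(0.12, 0.05, 8), 0, 1)

# Meta-features mirror app.py: [pred, prob] for each base model.
X = np.column_stack(
    [tcn_prob > 0.5, tcn_prob, resnet_prob > 0.5, resnet_prob]
).astype(float)

# Tiny logistic-regression meta-learner fit by gradient descent
# (a stand-in for sklearn's LogisticRegression).
w, b = np.zeros(4), 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y) / len(y))
    b -= 0.5 * np.mean(p - y)

meta_prob = 1 / (1 + np.exp(-(X @ w + b)))
print((meta_prob > 0.5).astype(int))  # recovers y on this separable toy data
```

At inference time the same four-feature vector is built for a single image and passed to `predict_proba`, exactly as `app.py` does below.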
## ⚠️ Limitations

- Best performance on images similar to the training distribution
- May struggle with:
  - Very low resolution images
  - Heavily compressed images
  - Screenshots or digitally altered photos
  - New AI generators not in the training data

## 📝 Citation

If you use this model, please cite:

```bibtex
@misc{ai-image-detector-2024,
  author = {Your Name},
  title = {AI Image Detector: Stacking Ensemble for Detecting AI-Generated Images},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/spaces/your-username/ai-image-detector}
}
```

## 📄 License

MIT License - See LICENSE file for details.
app.py
ADDED
@@ -0,0 +1,266 @@
```python
#!/usr/bin/env python3
"""
AI Image Detector - API Endpoint for Hugging Face Spaces
Returns JSON with AI probability percentage
"""

import gradio as gr
import torch
import torch.nn as nn
import numpy as np
from PIL import Image
import pickle
import os

# ==================== MODEL DEFINITIONS ====================

class TemporalBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, dilation, dropout=0.3):
        super(TemporalBlock, self).__init__()
        padding = (kernel_size - 1) * dilation
        self.conv1 = nn.Conv1d(in_channels, out_channels, kernel_size, stride=stride, padding=padding, dilation=dilation)
        self.bn1 = nn.BatchNorm1d(out_channels)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout(dropout)
        self.conv2 = nn.Conv1d(out_channels, out_channels, kernel_size, stride=stride, padding=padding, dilation=dilation)
        self.bn2 = nn.BatchNorm1d(out_channels)
        self.relu2 = nn.ReLU()
        self.dropout2 = nn.Dropout(dropout)
        self.downsample = nn.Conv1d(in_channels, out_channels, 1) if in_channels != out_channels else None
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv1(x)
        out = out[:, :, :-self.conv1.padding[0]] if self.conv1.padding[0] > 0 else out
        out = self.dropout1(self.relu1(self.bn1(out)))
        out = self.conv2(out)
        out = out[:, :, :-self.conv2.padding[0]] if self.conv2.padding[0] > 0 else out
        out = self.dropout2(self.relu2(self.bn2(out)))
        res = x if self.downsample is None else self.downsample(x)
        if res.size(2) != out.size(2):
            diff = res.size(2) - out.size(2)
            res = res[:, :, :-diff] if diff > 0 else nn.functional.pad(res, (0, -diff))
        return self.relu(out + res)


class TCN(nn.Module):
    def __init__(self, input_dim, num_channels=[128, 256, 512, 512], kernel_size=3, dropout=0.3):
        super(TCN, self).__init__()
        layers = []
        for i in range(len(num_channels)):
            dilation = 2 ** i
            in_ch = input_dim if i == 0 else num_channels[i-1]
            layers.append(TemporalBlock(in_ch, num_channels[i], kernel_size, stride=1, dilation=dilation, dropout=dropout))
        self.network = nn.Sequential(*layers)
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Dropout(0.5),
            nn.Linear(num_channels[-1], 256), nn.ReLU(), nn.Dropout(0.3), nn.Linear(256, 2)
        )

    def forward(self, x):
        return self.classifier(self.network(x.transpose(1, 2)))


class ResidualBlock1D(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super(ResidualBlock1D, self).__init__()
        mid = out_channels // 4
        self.conv1 = nn.Conv1d(in_channels, mid, 1, bias=False)
        self.bn1 = nn.BatchNorm1d(mid)
        self.conv2 = nn.Conv1d(mid, mid, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm1d(mid)
        self.conv3 = nn.Conv1d(mid, out_channels, 1, bias=False)
        self.bn3 = nn.BatchNorm1d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        self.dropout = nn.Dropout(0.3)
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(nn.Conv1d(in_channels, out_channels, 1, stride=stride, bias=False), nn.BatchNorm1d(out_channels))

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.dropout(self.relu(out + self.shortcut(x)))


class ResNet1D(nn.Module):
    def __init__(self, input_dim, num_classes=2):
        super(ResNet1D, self).__init__()
        self.conv1 = nn.Conv1d(input_dim, 64, 7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm1d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool1d(3, stride=2, padding=1)
        self.layer1 = self._make_layer(64, 256, 2, 1)
        self.layer2 = self._make_layer(256, 512, 2, 2)
        self.layer3 = self._make_layer(512, 1024, 2, 2)
        self.avgpool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Sequential(nn.Dropout(0.5), nn.Linear(1024, 512), nn.ReLU(), nn.Dropout(0.3), nn.Linear(512, num_classes))

    def _make_layer(self, in_ch, out_ch, blocks, stride):
        layers = [ResidualBlock1D(in_ch, out_ch, stride)]
        for _ in range(1, blocks):
            layers.append(ResidualBlock1D(out_ch, out_ch, 1))
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.maxpool(self.relu(self.bn1(self.conv1(x.transpose(1, 2)))))
        x = self.layer3(self.layer2(self.layer1(x)))
        return self.fc(self.avgpool(x).view(x.size(0), -1))


# ==================== DETECTOR CLASS ====================

class AIDetector:
    def __init__(self):
        self.device = "cpu"  # Force CPU for HF Spaces free tier
        self.hidden_dim = 2048
        self.max_patches = 103

        # Models (lazy loaded)
        self.qwen_model = None
        self.qwen_processor = None
        self.tcn = None
        self.resnet = None
        self.meta_model = None
        self.loaded = False

    def load_qwen(self):
        """Load Qwen2.5-VL for feature extraction"""
        if self.qwen_model is None:
            print("Loading Qwen2.5-VL-3B-Instruct (this takes ~2 minutes on CPU)...")
            from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

            model_id = "Qwen/Qwen2.5-VL-3B-Instruct"
            self.qwen_processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
            self.qwen_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
                model_id,
                torch_dtype=torch.float32,
                device_map="cpu",
                trust_remote_code=True,
                low_cpu_mem_usage=True
            )
            self.qwen_model.eval()
            print("Qwen loaded!")

    def load_classifiers(self):
        """Load trained classifiers"""
        if not self.loaded:
            print("Loading classifiers...")

            # TCN
            self.tcn = TCN(self.hidden_dim).to(self.device)
            self.tcn.load_state_dict(torch.load("tcn_best.pth", map_location=self.device))
            self.tcn.eval()

            # ResNet-1D
            self.resnet = ResNet1D(self.hidden_dim).to(self.device)
            self.resnet.load_state_dict(torch.load("resnet1d_best.pth", map_location=self.device))
            self.resnet.eval()

            # Meta-learner
            with open("meta_model.pkl", "rb") as f:
                self.meta_model = pickle.load(f)

            self.loaded = True
            print("Classifiers loaded!")

    def extract_features(self, image: Image.Image) -> np.ndarray:
        """Extract features from image using Qwen2.5-VL"""
        self.load_qwen()
        from qwen_vl_utils import process_vision_info

        # Resize image
        image = image.convert("RGB").resize((256, 256), Image.LANCZOS)

        messages = [{"role": "user", "content": [{"type": "image", "image": image}, {"type": "text", "text": "Describe"}]}]
        text = self.qwen_processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        image_inputs, _ = process_vision_info(messages)
        inputs = self.qwen_processor(text=[text], images=image_inputs, padding=True, return_tensors="pt")

        with torch.no_grad():
            outputs = self.qwen_model.model(
                input_ids=inputs["input_ids"],
                attention_mask=inputs["attention_mask"],
                pixel_values=inputs.get("pixel_values"),
                image_grid_thw=inputs.get("image_grid_thw"),
                output_hidden_states=True
            )
            features = outputs.last_hidden_state[0].cpu().numpy()

        return features

    def pad_features(self, features: np.ndarray) -> np.ndarray:
        """Pad/truncate features to expected size"""
        n = features.shape[0]
        if n < self.max_patches:
            padded = np.zeros((self.max_patches, self.hidden_dim), dtype=np.float32)
            padded[:n] = features
            return padded
        return features[:self.max_patches].astype(np.float32)

    def predict(self, image: Image.Image) -> dict:
        """Full prediction pipeline"""
        self.load_classifiers()

        # Extract features
        features = self.extract_features(image)
        features = self.pad_features(features)
        x = torch.FloatTensor(features).unsqueeze(0).to(self.device)

        # Get base model predictions
        with torch.no_grad():
            tcn_out = torch.softmax(self.tcn(x), dim=1)
            tcn_prob = tcn_out[0, 1].item()
            tcn_pred = 1 if tcn_prob > 0.5 else 0

            resnet_out = torch.softmax(self.resnet(x), dim=1)
            resnet_prob = resnet_out[0, 1].item()
            resnet_pred = 1 if resnet_prob > 0.5 else 0

        # Meta-learner stacking
        meta_features = np.array([[tcn_pred, tcn_prob, resnet_pred, resnet_prob]])
        final_prob = self.meta_model.predict_proba(meta_features)[0, 1]

        # AI probability is 1 - real probability
        ai_percentage = (1 - final_prob) * 100

        return {
            "ai_percentage": round(ai_percentage, 2),
            "real_percentage": round(final_prob * 100, 2),
            "verdict": "AI Generated" if ai_percentage > 50 else "Real",
            "confidence": round(max(ai_percentage, 100 - ai_percentage), 2),
            "tcn_ai_prob": round((1 - tcn_prob) * 100, 2),
            "resnet_ai_prob": round((1 - resnet_prob) * 100, 2)
        }


# ==================== GRADIO API ====================

detector = AIDetector()

def detect(image):
    """API endpoint function"""
    if image is None:
        return {"error": "No image provided"}

    try:
        result = detector.predict(image)
        return result
    except Exception as e:
        return {"error": str(e)}


# Simple UI for testing + API endpoint
demo = gr.Interface(
    fn=detect,
    inputs=gr.Image(type="pil", label="Upload Image"),
    outputs=gr.JSON(label="Result"),
    title="AI Image Detector API",
    description="Upload an image to detect if it's AI-generated. Returns JSON with ai_percentage.",
    allow_flagging="never"
)

# This exposes the API at /api/predict
if __name__ == "__main__":
    demo.launch()
```
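`TemporalBlock` makes its convolutions causal by over-padding on both sides and then trimming ("chomping") `(kernel_size - 1) * dilation` elements from the right edge, which is what the `out[:, :, :-self.conv1.padding[0]]` slices do. The length arithmetic behind that trick can be sketched in pure Python (an illustration of the formula, not the project's code):

```python
def conv1d_out_len(seq_len, kernel_size, dilation, padding):
    """Output length of a stride-1 1D convolution."""
    return seq_len + 2 * padding - dilation * (kernel_size - 1)

def chomped_len(seq_len, kernel_size, dilation):
    """Pad by (k-1)*d on both sides, then trim (k-1)*d from the right:
    the sequence length is preserved and the receptive field stays causal."""
    padding = (kernel_size - 1) * dilation
    return conv1d_out_len(seq_len, kernel_size, dilation, padding) - padding

# max_patches=103 sequences keep their length at every TCN dilation level.
for d in (1, 2, 4, 8):
    assert chomped_len(103, 3, d) == 103
print("length preserved at every dilation")
```

Because each level doubles the dilation, the stacked blocks cover a receptive field that grows exponentially with depth while every layer emits exactly `max_patches` timesteps.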
app_gradio_ui.py
ADDED
@@ -0,0 +1,528 @@
#!/usr/bin/env python3
|
| 2 |
+
"""
|
| 3 |
+
AI Image Detector - Hugging Face Spaces Deployment
|
| 4 |
+
Stacking Ensemble (TCN + ResNet-1D) with Qwen2.5-VL Feature Extraction
|
| 5 |
+
|
| 6 |
+
This app detects whether an image is AI-generated or real using:
|
| 7 |
+
1. Qwen2.5-VL for spatial feature extraction
|
| 8 |
+
2. TCN + ResNet-1D stacking ensemble for classification
|
| 9 |
+
"""
|
| 10 |
+
|
| 11 |
+
import gradio as gr
|
| 12 |
+
import torch
|
| 13 |
+
import torch.nn as nn
|
| 14 |
+
import numpy as np
|
| 15 |
+
from PIL import Image
|
| 16 |
+
import pickle
|
| 17 |
+
import os
|
| 18 |
+
|
| 19 |
+
# ==================== MODEL DEFINITIONS ====================
|
| 20 |
+
|
| 21 |
+
class TemporalBlock(nn.Module):
|
| 22 |
+
"""Temporal Block with Dilated Convolutions"""
|
| 23 |
+
|
| 24 |
+
def __init__(self, in_channels, out_channels, kernel_size, stride, dilation, dropout=0.3):
|
| 25 |
+
super(TemporalBlock, self).__init__()
|
| 26 |
+
|
| 27 |
+
padding = (kernel_size - 1) * dilation
|
| 28 |
+
|
| 29 |
+
self.conv1 = nn.Conv1d(in_channels, out_channels, kernel_size,
|
| 30 |
+
stride=stride, padding=padding, dilation=dilation)
|
| 31 |
+
self.bn1 = nn.BatchNorm1d(out_channels)
|
| 32 |
+
self.relu1 = nn.ReLU()
|
| 33 |
+
self.dropout1 = nn.Dropout(dropout)
|
| 34 |
+
|
| 35 |
+
self.conv2 = nn.Conv1d(out_channels, out_channels, kernel_size,
|
| 36 |
+
stride=stride, padding=padding, dilation=dilation)
|
| 37 |
+
self.bn2 = nn.BatchNorm1d(out_channels)
|
| 38 |
+
self.relu2 = nn.ReLU()
|
| 39 |
+
self.dropout2 = nn.Dropout(dropout)
|
| 40 |
+
|
| 41 |
+
self.downsample = nn.Conv1d(in_channels, out_channels, 1) if in_channels != out_channels else None
|
| 42 |
+
self.relu = nn.ReLU()
|
| 43 |
+
|
| 44 |
+
def forward(self, x):
|
| 45 |
+
out = self.conv1(x)
|
| 46 |
+
out = out[:, :, :-self.conv1.padding[0]] if self.conv1.padding[0] > 0 else out
|
| 47 |
+
out = self.bn1(out)
|
| 48 |
+
out = self.relu1(out)
|
| 49 |
+
out = self.dropout1(out)
|
| 50 |
+
|
| 51 |
+
out = self.conv2(out)
|
| 52 |
+
out = out[:, :, :-self.conv2.padding[0]] if self.conv2.padding[0] > 0 else out
|
| 53 |
+
out = self.bn2(out)
|
| 54 |
+
out = self.relu2(out)
|
| 55 |
+
out = self.dropout2(out)
|
| 56 |
+
|
| 57 |
+
res = x if self.downsample is None else self.downsample(x)
|
| 58 |
+
|
| 59 |
+
if res.size(2) != out.size(2):
|
| 60 |
+
diff = res.size(2) - out.size(2)
|
| 61 |
+
if diff > 0:
|
| 62 |
+
res = res[:, :, :-diff]
|
| 63 |
+
else:
|
| 64 |
+
res = nn.functional.pad(res, (0, -diff))
|
| 65 |
+
|
| 66 |
+
return self.relu(out + res)
|
| 67 |
+
|
| 68 |
+
|
| 69 |
+
class TCN(nn.Module):
|
| 70 |
+
"""Temporal Convolutional Network"""
|
| 71 |
+
|
| 72 |
+
def __init__(self, input_dim, num_channels=[128, 256, 512, 512], kernel_size=3, dropout=0.3):
|
| 73 |
+
super(TCN, self).__init__()
|
| 74 |
+
|
| 75 |
+
layers = []
|
| 76 |
+
num_levels = len(num_channels)
|
| 77 |
+
|
| 78 |
+
for i in range(num_levels):
|
| 79 |
+
dilation = 2 ** i
|
| 80 |
+
in_channels = input_dim if i == 0 else num_channels[i-1]
|
| 81 |
+
out_channels = num_channels[i]
|
| 82 |
+
|
| 83 |
+
layers.append(
|
| 84 |
+
TemporalBlock(in_channels, out_channels, kernel_size,
|
| 85 |
+
stride=1, dilation=dilation, dropout=dropout)
|
| 86 |
+
)
|
| 87 |
+
|
| 88 |
+
self.network = nn.Sequential(*layers)
|
| 89 |
+
|
| 90 |
+
self.classifier = nn.Sequential(
|
| 91 |
+
nn.AdaptiveAvgPool1d(1),
|
| 92 |
+
nn.Flatten(),
|
| 93 |
+
nn.Dropout(0.5),
|
| 94 |
+
nn.Linear(num_channels[-1], 256),
|
| 95 |
+
nn.ReLU(),
|
| 96 |
+
nn.Dropout(0.3),
|
| 97 |
+
nn.Linear(256, 2)
|
| 98 |
+
)
|
| 99 |
+
|
| 100 |
+
def forward(self, x):
|
| 101 |
+
x = x.transpose(1, 2)
|
| 102 |
+
x = self.network(x)
|
| 103 |
+
x = self.classifier(x)
|
| 104 |
+
return x
|
| 105 |
+
|
| 106 |
+
|
| 107 |
+
class ResidualBlock1D(nn.Module):
|
| 108 |
+
"""1D Residual Block"""
|
| 109 |
+
|
| 110 |
+
def __init__(self, in_channels, out_channels, stride=1):
|
| 111 |
+
super(ResidualBlock1D, self).__init__()
|
| 112 |
+
|
| 113 |
+
mid_channels = out_channels // 4
|
| 114 |
+
|
| 115 |
+
self.conv1 = nn.Conv1d(in_channels, mid_channels, kernel_size=1, bias=False)
|
| 116 |
+
self.bn1 = nn.BatchNorm1d(mid_channels)
|
| 117 |
+
|
| 118 |
+
self.conv2 = nn.Conv1d(mid_channels, mid_channels, kernel_size=3,
|
| 119 |
+
stride=stride, padding=1, bias=False)
|
| 120 |
+
self.bn2 = nn.BatchNorm1d(mid_channels)
|
| 121 |
+
|
| 122 |
+
self.conv3 = nn.Conv1d(mid_channels, out_channels, kernel_size=1, bias=False)
|
| 123 |
+
self.bn3 = nn.BatchNorm1d(out_channels)
|
| 124 |
+
|
| 125 |
+
self.relu = nn.ReLU(inplace=True)
|
| 126 |
+
self.dropout = nn.Dropout(0.3)
|
| 127 |
+
|
| 128 |
+
self.shortcut = nn.Sequential()
|
| 129 |
+
if stride != 1 or in_channels != out_channels:
|
| 130 |
+
self.shortcut = nn.Sequential(
|
| 131 |
+
nn.Conv1d(in_channels, out_channels, kernel_size=1,
|
| 132 |
+
stride=stride, bias=False),
|
| 133 |
+
nn.BatchNorm1d(out_channels)
|
| 134 |
+
)
|
| 135 |
+
|
| 136 |
+
def forward(self, x):
|
| 137 |
+
residual = x
|
| 138 |
+
|
| 139 |
+
out = self.relu(self.bn1(self.conv1(x)))
|
| 140 |
+
out = self.relu(self.bn2(self.conv2(out)))
|
| 141 |
+
out = self.bn3(self.conv3(out))
|
| 142 |
+
|
| 143 |
+
out += self.shortcut(residual)
|
| 144 |
+
out = self.relu(out)
|
| 145 |
+
out = self.dropout(out)
|
| 146 |
+
|
| 147 |
+
return out
|
| 148 |
+
|
| 149 |
+
|
| 150 |
+
class ResNet1D(nn.Module):
|
| 151 |
+
"""ResNet-1D for Sequential Classification"""
|
| 152 |
+
|
| 153 |
+
def __init__(self, input_dim, num_classes=2):
|
| 154 |
+
super(ResNet1D, self).__init__()
|
| 155 |
+
|
| 156 |
+
self.conv1 = nn.Conv1d(input_dim, 64, kernel_size=7, stride=2, padding=3, bias=False)
|
| 157 |
+
self.bn1 = nn.BatchNorm1d(64)
|
| 158 |
+
self.relu = nn.ReLU(inplace=True)
|
| 159 |
+
self.maxpool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)
|
| 160 |
+
|
| 161 |
+
self.layer1 = self._make_layer(64, 256, num_blocks=2, stride=1)
|
| 162 |
+
self.layer2 = self._make_layer(256, 512, num_blocks=2, stride=2)
|
| 163 |
+
self.layer3 = self._make_layer(512, 1024, num_blocks=2, stride=2)
|
| 164 |
+
|
| 165 |
+
self.avgpool = nn.AdaptiveAvgPool1d(1)
|
| 166 |
+
self.fc = nn.Sequential(
|
| 167 |
+
nn.Dropout(0.5),
|
| 168 |
+
nn.Linear(1024, 512),
|
| 169 |
+
nn.ReLU(),
|
| 170 |
+
nn.Dropout(0.3),
|
| 171 |
+
nn.Linear(512, num_classes)
|
| 172 |
+
)
|
| 173 |
+
|
| 174 |
+
def _make_layer(self, in_channels, out_channels, num_blocks, stride):
|
| 175 |
+
layers = []
|
| 176 |
+
layers.append(ResidualBlock1D(in_channels, out_channels, stride))
|
| 177 |
+
for _ in range(1, num_blocks):
|
| 178 |
+
layers.append(ResidualBlock1D(out_channels, out_channels, stride=1))
|
| 179 |
+
return nn.Sequential(*layers)
|
| 180 |
+
|
| 181 |
+
def forward(self, x):
|
| 182 |
+
x = x.transpose(1, 2)
|
| 183 |
+
|
| 184 |
+
x = self.relu(self.bn1(self.conv1(x)))
|
| 185 |
+
x = self.maxpool(x)
|
| 186 |
+
|
| 187 |
+
x = self.layer1(x)
|
| 188 |
+
x = self.layer2(x)
|
| 189 |
+
x = self.layer3(x)
|
| 190 |
+
|
| 191 |
+
x = self.avgpool(x)
|
| 192 |
+
x = x.view(x.size(0), -1)
|
| 193 |
+
x = self.fc(x)
|
| 194 |
+
|
| 195 |
+
return x
|
| 196 |
+
|
| 197 |
+
|
| 198 |
+
# ==================== FEATURE EXTRACTOR ====================
|
| 199 |
+
|
| 200 |
+
class FeatureExtractor:
|
| 201 |
+
"""Extract spatial features using Qwen2.5-VL"""
|
| 202 |
+
|
| 203 |
+
def __init__(self, model_id="Qwen/Qwen2.5-VL-3B-Instruct"):
|
| 204 |
+
self.device = "cuda" if torch.cuda.is_available() else "cpu"
|
| 205 |
+
self.model = None
|
| 206 |
+
self.processor = None
|
| 207 |
+
self.model_id = model_id
|
| 208 |
+
self.target_size = (256, 256)
|
| 209 |
+
|
| 210 |
+
def load_model(self):
|
| 211 |
+
"""Load the Qwen2.5-VL model (lazy loading)"""
|
| 212 |
+
if self.model is None:
|
| 213 |
+
        from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

        print(f"Loading {self.model_id}...")
        self.processor = AutoProcessor.from_pretrained(
            self.model_id,
            trust_remote_code=True
        )

        self.model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
            self.model_id,
            torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
            device_map="auto",
            trust_remote_code=True,
            low_cpu_mem_usage=True
        )
        self.model.eval()
        print("Model loaded!")

    def preprocess_image(self, image):
        """Preprocess image with aspect ratio preservation"""
        if isinstance(image, str):
            image = Image.open(image)

        image = image.convert('RGB')

        # Resize with padding to preserve aspect ratio
        width, height = image.size
        scale = min(self.target_size[0] / width, self.target_size[1] / height)

        new_width = int(width * scale)
        new_height = int(height * scale)
        image = image.resize((new_width, new_height), Image.LANCZOS)

        # Create black canvas and paste resized image in center
        canvas = Image.new('RGB', self.target_size, (0, 0, 0))
        paste_x = (self.target_size[0] - new_width) // 2
        paste_y = (self.target_size[1] - new_height) // 2
        canvas.paste(image, (paste_x, paste_y))

        return canvas

    def extract_features(self, image):
        """Extract spatial features from an image"""
        self.load_model()

        from qwen_vl_utils import process_vision_info

        image = self.preprocess_image(image)

        with torch.no_grad():
            messages = [{
                "role": "user",
                "content": [
                    {"type": "image", "image": image},
                    {"type": "text", "text": "Image"}
                ]
            }]

            text = self.processor.apply_chat_template(
                messages, tokenize=False, add_generation_prompt=True
            )

            image_inputs, _ = process_vision_info(messages)

            inputs = self.processor(
                text=[text],
                images=image_inputs,
                padding=True,
                return_tensors="pt"
            )

            inputs = {k: v.to(self.device) if isinstance(v, torch.Tensor) else v
                      for k, v in inputs.items()}

            outputs = self.model.model(
                input_ids=inputs['input_ids'],
                attention_mask=inputs['attention_mask'],
                pixel_values=inputs.get('pixel_values'),
                image_grid_thw=inputs.get('image_grid_thw'),
                output_hidden_states=True
            )

            spatial_features = outputs.last_hidden_state[0].cpu().numpy()

        return spatial_features
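The resize-with-padding geometry in `preprocess_image` can be checked in isolation. A minimal sketch, assuming a hypothetical 448x448 `target_size` (the real value is set elsewhere in the class):

```python
# Sketch of the letterbox math from preprocess_image.
# target=(448, 448) is an assumed stand-in for self.target_size.
def letterbox_geometry(width, height, target=(448, 448)):
    # Scale so the image fits entirely inside the target canvas
    scale = min(target[0] / width, target[1] / height)
    new_w, new_h = int(width * scale), int(height * scale)
    # Offsets that center the resized image on the black canvas
    paste_x = (target[0] - new_w) // 2
    paste_y = (target[1] - new_h) // 2
    return new_w, new_h, paste_x, paste_y

print(letterbox_geometry(1920, 1080))  # (448, 252, 0, 98)
```

A 16:9 photo keeps its proportions and is padded vertically, rather than being stretched to fill the square canvas.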

# ==================== DETECTOR ====================

class AIImageDetector:
    """AI Image Detector using Stacking Ensemble"""

    def __init__(self, models_dir="models"):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.models_dir = models_dir

        self.feature_extractor = FeatureExtractor()
        self.tcn_model = None
        self.resnet_model = None
        self.meta_model = None

        self.max_patches = None
        self.hidden_dim = None

    def load_models(self):
        """Load all trained models"""
        print("Loading models...")

        # Load config
        config_path = os.path.join(self.models_dir, "config.pkl")
        if os.path.exists(config_path):
            with open(config_path, 'rb') as f:
                config = pickle.load(f)
            self.max_patches = config['max_patches']
            self.hidden_dim = config['hidden_dim']
        else:
            # Default values from training
            self.max_patches = 256
            self.hidden_dim = 2048

        # Load TCN
        tcn_path = os.path.join(self.models_dir, "tcn_best.pth")
        if os.path.exists(tcn_path):
            self.tcn_model = TCN(self.hidden_dim).to(self.device)
            self.tcn_model.load_state_dict(torch.load(tcn_path, map_location=self.device))
            self.tcn_model.eval()
            print("TCN loaded!")

        # Load ResNet-1D
        resnet_path = os.path.join(self.models_dir, "resnet1d_best.pth")
        if os.path.exists(resnet_path):
            self.resnet_model = ResNet1D(self.hidden_dim).to(self.device)
            self.resnet_model.load_state_dict(torch.load(resnet_path, map_location=self.device))
            self.resnet_model.eval()
            print("ResNet-1D loaded!")

        # Load meta-learner
        meta_path = os.path.join(self.models_dir, "meta_model.pkl")
        if os.path.exists(meta_path):
            with open(meta_path, 'rb') as f:
                self.meta_model = pickle.load(f)
            print("Meta-learner loaded!")

        print("All models loaded!")

    def pad_features(self, features):
        """Pad or truncate features to exactly max_patches rows"""
        num_patches = features.shape[0]

        if num_patches < self.max_patches:
            padded = np.zeros((self.max_patches, self.hidden_dim), dtype=np.float32)
            padded[:num_patches, :] = features
            return padded
        else:
            return features[:self.max_patches, :]
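The pad-or-truncate contract of `pad_features` can be verified without torch or NumPy. A pure-Python sketch using nested lists in place of arrays:

```python
# Pure-Python analogue of pad_features: pad short sequences with
# zero rows, truncate long ones, so every output has max_patches rows.
def pad_rows(rows, max_patches, hidden_dim):
    if len(rows) < max_patches:
        return rows + [[0.0] * hidden_dim for _ in range(max_patches - len(rows))]
    return rows[:max_patches]

short = [[1.0, 2.0]] * 3     # 3 patches, hidden_dim = 2
long = [[1.0, 2.0]] * 10     # 10 patches

assert len(pad_rows(short, 5, 2)) == 5          # zero-padded up
assert pad_rows(short, 5, 2)[-1] == [0.0, 0.0]  # padding rows are zeros
assert len(pad_rows(long, 5, 2)) == 5           # truncated down
```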
    def predict(self, image):
        """
        Predict whether an image is AI-generated or real

        Returns:
            dict: {
                'prediction': 'AI Generated' or 'Real',
                'confidence': float,
                'tcn_prob': float,
                'resnet_prob': float,
                'details': str
            }
        """
        # Ensure models are loaded
        if self.tcn_model is None:
            self.load_models()

        # Extract features
        features = self.feature_extractor.extract_features(image)

        # Pad features
        padded_features = self.pad_features(features)

        # Convert to tensor
        x = torch.FloatTensor(padded_features).unsqueeze(0).to(self.device)

        # Get predictions from base models
        with torch.no_grad():
            tcn_output = self.tcn_model(x)
            tcn_probs = torch.softmax(tcn_output, dim=1)
            tcn_prob = tcn_probs[0, 1].cpu().item()  # Probability of Real
            tcn_pred = 1 if tcn_prob > 0.5 else 0

            resnet_output = self.resnet_model(x)
            resnet_probs = torch.softmax(resnet_output, dim=1)
            resnet_prob = resnet_probs[0, 1].cpu().item()  # Probability of Real
            resnet_pred = 1 if resnet_prob > 0.5 else 0

        # Stack for meta-learner
        if self.meta_model is not None:
            meta_features = np.array([[tcn_pred, tcn_prob, resnet_pred, resnet_prob]])
            final_pred = self.meta_model.predict(meta_features)[0]
            final_prob = self.meta_model.predict_proba(meta_features)[0, 1]
        else:
            # Simple averaging fallback
            final_prob = (tcn_prob + resnet_prob) / 2
            final_pred = 1 if final_prob > 0.5 else 0

        # Determine prediction
        prediction = "Real" if final_pred == 1 else "AI Generated"
        confidence = final_prob if final_pred == 1 else (1 - final_prob)

        return {
            'prediction': prediction,
            'confidence': confidence,
            'tcn_prob': tcn_prob,
            'resnet_prob': resnet_prob,
            'details': f"TCN: {tcn_prob:.2%} Real | ResNet-1D: {resnet_prob:.2%} Real | Ensemble: {final_prob:.2%} Real"
        }
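The averaging fallback in `predict` (used when no meta-learner is on disk) can be sketched on its own. The probabilities below are illustrative values, not real model outputs:

```python
# Fallback ensemble from predict(): average the two base models'
# P(Real), threshold at 0.5, and report confidence for the chosen class.
def ensemble_fallback(tcn_prob, resnet_prob, threshold=0.5):
    final_prob = (tcn_prob + resnet_prob) / 2
    final_pred = 1 if final_prob > threshold else 0
    prediction = "Real" if final_pred == 1 else "AI Generated"
    confidence = final_prob if final_pred == 1 else 1 - final_prob
    return prediction, confidence

print(ensemble_fallback(1.0, 0.5))    # ('Real', 0.75)
print(ensemble_fallback(0.25, 0.25))  # ('AI Generated', 0.75)
```

Note that `confidence` always refers to the class that was chosen, so a strong "AI Generated" verdict also reads as a high number.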

# ==================== GRADIO INTERFACE ====================

# Initialize detector (will load models on first prediction)
detector = AIImageDetector(models_dir="models")

def detect_image(image):
    """Gradio interface function"""
    if image is None:
        return "Please upload an image", "", ""

    try:
        result = detector.predict(image)

        # Format output
        if result['prediction'] == "AI Generated":
            label = f"🤖 AI Generated ({result['confidence']:.1%} confidence)"
            color = "red"
        else:
            label = f"📷 Real Image ({result['confidence']:.1%} confidence)"
            color = "green"

        details = result['details']

        # Create confidence display
        ai_conf = 1 - result['confidence'] if result['prediction'] == "Real" else result['confidence']
        real_conf = result['confidence'] if result['prediction'] == "Real" else 1 - result['confidence']

        confidence_display = f"""
### Model Predictions:
- **TCN Model**: {result['tcn_prob']:.1%} Real / {1 - result['tcn_prob']:.1%} AI
- **ResNet-1D Model**: {result['resnet_prob']:.1%} Real / {1 - result['resnet_prob']:.1%} AI

### Final Ensemble Verdict:
- **AI Generated**: {ai_conf:.1%}
- **Real Image**: {real_conf:.1%}
"""

        return label, confidence_display, details

    except Exception as e:
        return f"Error: {str(e)}", "", ""


# Create Gradio interface
with gr.Blocks(title="AI Image Detector", theme=gr.themes.Soft()) as demo:
    gr.Markdown("""
# 🔍 AI Image Detector

**Detect whether an image is AI-generated or a real photograph**

This detector uses a stacking ensemble of:
- **TCN** (Temporal Convolutional Network)
- **ResNet-1D** (Deep Residual Network)

Features are extracted using the **Qwen2.5-VL** vision-language model.

---
""")

    with gr.Row():
        with gr.Column(scale=1):
            image_input = gr.Image(type="pil", label="Upload Image")
            detect_btn = gr.Button("🔍 Detect", variant="primary")

        with gr.Column(scale=1):
            prediction_output = gr.Textbox(label="Prediction", lines=1)
            confidence_output = gr.Markdown(label="Confidence Details")
            details_output = gr.Textbox(label="Raw Details", lines=2)

    # Examples
    gr.Markdown("### Try these examples:")
    gr.Examples(
        examples=[
            # Add example images here
        ],
        inputs=image_input
    )

    # Connect button
    detect_btn.click(
        fn=detect_image,
        inputs=[image_input],
        outputs=[prediction_output, confidence_output, details_output]
    )

    gr.Markdown("""
---
### ℹ️ About

This model was trained on a dataset of AI-generated and real images.

**Accuracy**: ~97%+ on test set

**Note**: Results are probabilistic. Always verify important decisions.
""")


if __name__ == "__main__":
    demo.launch()
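`detect_image` derives both displayed percentages from the single ensemble confidence value. A small standalone sketch of that bookkeeping:

```python
# Same logic as detect_image: split one chosen-class confidence into
# complementary "AI Generated" / "Real" percentages for display.
def two_sided(prediction, confidence):
    ai_conf = 1 - confidence if prediction == "Real" else confidence
    real_conf = confidence if prediction == "Real" else 1 - confidence
    return ai_conf, real_conf

assert two_sided("Real", 0.75) == (0.25, 0.75)
assert two_sided("AI Generated", 0.75) == (0.75, 0.25)
```

The two numbers always sum to 1, so the markdown panel never shows contradictory percentages.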
meta_model.pkl
ADDED
Binary file (742 Bytes).
requirements.txt
ADDED
# AI Image Detector - Hugging Face Spaces Requirements

# Core ML frameworks
torch>=2.0.0
transformers>=4.37.0
accelerate>=0.25.0

# Qwen2.5-VL specific
qwen-vl-utils==0.0.8

# ML utilities
numpy>=1.24.0
scikit-learn>=1.3.0
pillow>=10.0.0

# Web interface
gradio>=4.0.0

# Optional: for faster inference
# bitsandbytes>=0.41.0  # 8-bit quantization
resnet1d_best.pth
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:bc6e47b7f2c97b8ec1a10c9c0dbb6e19afe59200000512e39fd9d51bd11470d8
size 15486826
save_models_for_deploy.py
ADDED
#!/usr/bin/env python3
"""
Save Models for Hugging Face Deployment

Run this cell AFTER training TCN, ResNet-1D, and the stacking ensemble.
This will save all necessary files for deployment.

Required variables in memory:
- tcn_results (with 'model' key or separate tcn model)
- resnet1d_results (with 'model' key or separate resnet model)
- meta_model (trained sklearn meta-learner)
- data (with 'sequential' containing max_patches and hidden_dim)
"""

import torch
import pickle
import os
import shutil

# ==================== CONFIGURATION ====================

OUTPUT_DIR = '/kaggle/working/deploy_models'
os.makedirs(OUTPUT_DIR, exist_ok=True)

print("="*80)
print("SAVING MODELS FOR HUGGING FACE DEPLOYMENT")
print("="*80)

# ==================== SAVE CONFIG ====================

print("\nSaving configuration...")

config = {
    'max_patches': data['sequential']['max_patches'] if 'max_patches' in data['sequential'] else 256,
    'hidden_dim': data['sequential']['hidden_dim'] if 'hidden_dim' in data['sequential'] else 2048,
}

config_path = os.path.join(OUTPUT_DIR, 'config.pkl')
with open(config_path, 'wb') as f:
    pickle.dump(config, f)
print(f"✅ Config saved to {config_path}")
print(f"   • max_patches: {config['max_patches']}")
print(f"   • hidden_dim: {config['hidden_dim']}")

# ==================== SAVE TCN MODEL ====================

print("\nSaving TCN model...")

# Check if TCN model state dict already exists
if os.path.exists('tcn_best.pth'):
    shutil.copy('tcn_best.pth', os.path.join(OUTPUT_DIR, 'tcn_best.pth'))
    print("✅ TCN model copied from tcn_best.pth")
else:
    print("⚠️ tcn_best.pth not found. Please save your TCN model:")
    print("   torch.save(tcn_model.state_dict(), 'tcn_best.pth')")

# ==================== SAVE RESNET-1D MODEL ====================

print("\nSaving ResNet-1D model...")

# Check if ResNet model state dict already exists
if os.path.exists('resnet1d_best.pth'):
    shutil.copy('resnet1d_best.pth', os.path.join(OUTPUT_DIR, 'resnet1d_best.pth'))
    print("✅ ResNet-1D model copied from resnet1d_best.pth")
else:
    print("⚠️ resnet1d_best.pth not found. Please save your ResNet model:")
    print("   torch.save(resnet_model.state_dict(), 'resnet1d_best.pth')")

# ==================== SAVE META-LEARNER ====================

print("\nSaving meta-learner...")

try:
    meta_path = os.path.join(OUTPUT_DIR, 'meta_model.pkl')
    with open(meta_path, 'wb') as f:
        pickle.dump(meta_model, f)
    print(f"✅ Meta-learner saved to {meta_path}")
except NameError:
    print("⚠️ meta_model not found. Please ensure the stacking ensemble has been trained.")

# ==================== CREATE ZIP ====================

print("\nCreating deployment package...")

import zipfile

zip_path = '/kaggle/working/huggingface_deploy.zip'
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
    for root, dirs, files in os.walk(OUTPUT_DIR):
        for file in files:
            file_path = os.path.join(root, file)
            arcname = os.path.relpath(file_path, OUTPUT_DIR)
            zipf.write(file_path, arcname)

zip_size_mb = os.path.getsize(zip_path) / (1024**2)
print(f"✅ Deployment package created: {zip_path}")
print(f"   Size: {zip_size_mb:.2f} MB")

# ==================== SUMMARY ====================

print("\n" + "="*80)
print("DEPLOYMENT CHECKLIST")
print("="*80)

print("\n✅ Files saved:")
for f in os.listdir(OUTPUT_DIR):
    size_kb = os.path.getsize(os.path.join(OUTPUT_DIR, f)) / 1024
    print(f"   • {f} ({size_kb:.1f} KB)")

print("\nNext Steps:")
print("1. Download huggingface_deploy.zip from Kaggle")
print("2. Create a new Hugging Face Space:")
print("   huggingface-cli repo create your-username/ai-image-detector --type space --space_sdk gradio")
print("3. Clone and add files:")
print("   git clone https://huggingface.co/spaces/your-username/ai-image-detector")
print("   cd ai-image-detector")
print("   # Extract model files to models/ folder")
print("   # Copy app.py, requirements.txt, README.md")
print("4. Push to Hugging Face:")
print("   git add .")
print("   git commit -m 'Add AI image detector'")
print("   git push")

print("\nAlternative: upload directly via the Hugging Face web interface")
print("   Go to huggingface.co/new-space and upload files")

print("="*80)
print("SAVE COMPLETE!")
print("="*80)

# Display download link
from IPython.display import FileLink
FileLink(zip_path)
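The config.pkl written above is just a small dict. A round-trip sketch of that format, with stand-in values (the real `max_patches`/`hidden_dim` come from the training data):

```python
import io
import pickle

# Round-trip the config.pkl format used by the deploy script,
# serializing to an in-memory buffer instead of a file.
config = {"max_patches": 256, "hidden_dim": 2048}

buf = io.BytesIO()
pickle.dump(config, buf)
buf.seek(0)
restored = pickle.load(buf)

assert restored == config
print(restored["max_patches"], restored["hidden_dim"])  # 256 2048
```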
tcn_best.pth
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:5f99f0185ffefb8de5eda586f3ceb99f4fb5d46c13ab74a61862e3dc92f2582a
size 17845974
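The `.pth` entries above are Git LFS pointer files, not the weights themselves; the actual tensors are fetched on checkout. A minimal parser for the three-line pointer format shown here:

```python
# Parse a git-lfs pointer file (version / oid / size key-value lines).
def parse_lfs_pointer(text):
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "oid": fields["oid"].split(":", 1)[1],  # strip the "sha256:" prefix
        "size_bytes": int(fields["size"]),
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5f99f0185ffefb8de5eda586f3ceb99f4fb5d46c13ab74a61862e3dc92f2582a
size 17845974
"""
info = parse_lfs_pointer(pointer)
print(f"{info['size_bytes'] / 1e6:.1f} MB")  # 17.8 MB
```

This is why the repo's .gitattributes routes `*.pth` through the LFS filter: only these small pointers live in git history.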