---
title: Image-Forensic
emoji: 🔍
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
---
# ImageForensics-Detect

> **Research-grade multi-branch image forensics platform for detecting Real vs. AI-Generated images.**  
> B.Tech Final Year Project · IEEE-Style Research System

---

## 🧠 What This Does

ImageForensics-Detect analyzes uploaded images through **5 independent forensic detection branches** and fuses their outputs using **certainty-weighted probabilistic fusion** to decide whether an image is:

- **Real** — captured by a physical camera
- **AI-Generated** — created by GANs or diffusion models (Stable Diffusion, DALL-E, Midjourney, etc.)

---

## 🏗️ Architecture

```
Input Image
    │
    ├──► Spectral Branch   (FFT/DCT analysis)         [no training needed]
    ├──► Edge Branch       (Sobel/Laplacian forensics) [no training needed]
    ├──► CNN Branch        (EfficientNet-B0 / TF)      [train_cnn.py]
    ├──► ViT Branch        (ViT-B/16 / PyTorch+timm)   [train_vit.py]
    └──► Diffusion Branch  (residual noise analysis)   [no training needed]
             │
    Certainty-Weighted Probabilistic Fusion
             │
    ┌────────────────────────────────┐
    │  Prediction: Real / AI-Gen     │
    │  Confidence: 97.1%             │
    │  Grad-CAM heatmap              │
    │  Spectral anomaly map          │
    │  Noise residual map            │
    └────────────────────────────────┘
```
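
The fusion step in the diagram can be sketched as below. This is a minimal illustration, not the code in `fusion/fusion.py`: it assumes each branch's certainty is `|2p - 1|` (the distance of `prob_fake` from the neutral 0.5) and that the fused probability is the certainty-weighted mean. The names `certainty` and `fuse` are hypothetical.

```python
from typing import Dict, Tuple


def certainty(prob_fake: float) -> float:
    """Assumed certainty measure: 0.0 at the neutral 0.5, 1.0 at 0 or 1."""
    return abs(2.0 * prob_fake - 1.0)


def fuse(branch_probs: Dict[str, float], eps: float = 1e-8) -> Tuple[float, float]:
    """Return (fused prob_fake, fused certainty) as a certainty-weighted mean."""
    weights = {name: certainty(p) for name, p in branch_probs.items()}
    total = sum(weights.values())
    if total < eps:
        # Every branch is neutral, so there is no evidence either way.
        return 0.5, 0.0
    fused = sum(weights[name] * p for name, p in branch_probs.items()) / total
    return fused, certainty(fused)


if __name__ == "__main__":
    # Illustrative values: the CNN and ViT branches are neutral here.
    probs = {"spectral": 0.94, "edge": 0.81, "cnn": 0.5, "vit": 0.5, "diffusion": 0.89}
    prob_fake, conf = fuse(probs)
    print(f"prob_fake={prob_fake:.3f} certainty={conf:.3f}")
```

Under this weighting a branch that outputs exactly 0.5 gets zero weight, so it cannot pull the fused score toward neutral.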

---

## 📁 Folder Structure

```
ImageForensics-Detect/
├── data/raw/{real,fake}/          ← Your dataset goes here
├── models/                        ← Saved .h5 / .pth weights
├── branches/
│   ├── spectral_branch.py         ✅ COMPLETE (signal processing)
│   ├── edge_branch.py             ✅ COMPLETE (signal processing)
│   ├── cnn_branch.py              🔵 BASELINE (needs training)
│   ├── vit_branch.py              🔵 BASELINE (needs training)
│   └── diffusion_branch.py        ✅ COMPLETE (signal processing)
├── fusion/fusion.py               ✅ COMPLETE
├── explainability/
│   ├── gradcam.py                 ✅ COMPLETE
│   └── spectral_heatmap.py        ✅ COMPLETE
├── training/
│   ├── dataset_loader.py          ✅ COMPLETE
│   ├── train_cnn.py               ✅ COMPLETE
│   ├── train_vit.py               ✅ COMPLETE
│   └── evaluate.py                ✅ COMPLETE
├── backend/app.py                 ✅ COMPLETE (FastAPI)
├── frontend/{index.html,style.css,app.js}  ✅ COMPLETE
├── utils/{image_utils.py,logger.py}        ✅ COMPLETE
└── outputs/                       ← Logs, heatmaps, eval results
```

---

## ⚙️ Installation

### Prerequisites
- Python 3.9+
- pip / conda

### 1. Create Virtual Environment
```bash
cd "ImageForensics-Detect"
python -m venv venv
source venv/bin/activate          # macOS/Linux
# venv\Scripts\activate           # Windows
```

### 2. Install Dependencies
```bash
pip install -r requirements.txt
```

> **Note**: On Apple Silicon (M1/M2/M3), TensorFlow releases before 2.13 need `pip install tensorflow-macos` in place of `tensorflow`. Add `tensorflow-metal` for GPU acceleration.

---

## 🗂️ Dataset Setup

Populate the dataset folders before training:
```
data/raw/
├── real/    ← Real camera photos (.jpg, .png)
└── fake/    ← AI-generated images (.jpg, .png)
```

**Recommended datasets:**
| Type | Dataset | Source |
|---|---|---|
| Real | RAISE-1K / VISION / MIT-5k | Kaggle / research groups |
| AI-Gen | ThisPersonDoesNotExist / SDXL outputs | Collected/scraped |
| Mixed | ArtiFact / CNNDetection | GitHub papers |

The loader auto-splits: **70% train / 15% val / 15% test** (stratified).
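
A stratified 70/15/15 split can be sketched with the standard library alone. This is an illustration under assumptions, not the logic in `training/dataset_loader.py`: the function name, the `(path, label)` tuple format, and the fixed seed are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, int]  # (file path, label) with 0 = real, 1 = fake


def stratified_split(
    samples: Sequence[Sample],
    ratios: Tuple[float, float, float] = (0.70, 0.15, 0.15),
    seed: int = 42,
) -> Dict[str, List[Sample]]:
    """Split each class separately so every subset keeps the class balance."""
    by_label: Dict[int, List[Sample]] = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    rng = random.Random(seed)
    splits: Dict[str, List[Sample]] = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = round(len(items) * ratios[0])
        n_val = round(len(items) * ratios[1])
        splits["train"] += items[:n_train]
        splits["val"] += items[n_train:n_train + n_val]
        # Whatever remains goes to the test set.
        splits["test"] += items[n_train + n_val:]
    return splits
```

Splitting per class keeps the real/fake ratio the same in train, validation, and test, which matters when the two folders differ in size.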

---

## 🏋️ Training

### Train CNN Branch (EfficientNet-B0 / TensorFlow)
```bash
python training/train_cnn.py --epochs 30 --batch_size 32 --lr 1e-4
# Saves: models/cnn_branch.h5
```

### Train ViT Branch (ViT-B/16 / PyTorch)
```bash
python training/train_vit.py --epochs 20 --batch_size 16 --lr 1e-4
# Saves: models/vit_branch.pth
```

> **Without trained weights:** The system still works. The three handcrafted branches (Spectral, Edge, Diffusion) produce forensic outputs immediately, while the CNN and ViT branches return a neutral `0.5` confidence and are flagged as "untrained" in the API response.
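
The fallback described above might look roughly like this. It is a sketch only: the function name, the `status` field, and the reading of "neutral" as `prob_fake = 0.5` are assumptions, and the real branches may report the untrained state differently.

```python
from pathlib import Path
from typing import Callable, Dict, Union

BranchResult = Dict[str, Union[float, str]]


def run_learned_branch(weights_path: str, predict_fn: Callable[[], float]) -> BranchResult:
    """Run a learned branch only if its weight file exists.

    `predict_fn` stands in for the actual model inference call.
    """
    if not Path(weights_path).is_file():
        # No weights on disk: report a neutral score and flag the branch.
        return {"prob_fake": 0.5, "status": "untrained"}
    return {"prob_fake": float(predict_fn()), "status": "ok"}
```

For example, `run_learned_branch("models/cnn_branch.h5", lambda: 0.98)` returns the neutral result until `train_cnn.py` has written the weight file.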

---

## 📊 Evaluation

```bash
# Evaluate entire fusion system
python training/evaluate.py

# Evaluate individual branches
python training/evaluate.py --branch spectral
python training/evaluate.py --branch edge
python training/evaluate.py --branch cnn
python training/evaluate.py --branch vit
python training/evaluate.py --branch diffusion
```

Reports are saved to `outputs/`:
- `confusion_matrix_<branch>.png`
- `roc_curve_<branch>.png`
- `evaluation_<branch>.csv`
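
The numbers behind a confusion matrix can be reproduced from labels and branch probabilities. The sketch below is a standard-library illustration, not an excerpt from `training/evaluate.py`: the function name and the 0.5 decision threshold are assumptions.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int],
    prob_fake: Sequence[float],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Confusion-matrix counts and derived metrics, with 1 = AI-generated."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, prob_fake):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```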

---

## 🚀 Running the System

### Step 1: Start Backend API
```bash
uvicorn backend.app:app --reload --host 0.0.0.0 --port 8000
```

### Step 2: Open Frontend
Open `frontend/index.html` in your browser (double-click, or use Live Server in VS Code).

### Step 3: Upload and Analyze
Drag-and-drop any image → Click **Analyze Image** → View results.

---

## 🌐 API Reference

### `POST /predict`
Upload an image and receive full forensic analysis.

**Request:**
```bash
curl -X POST "http://localhost:8000/predict" \
     -F "file=@your_image.jpg"
```

**Response:**
```json
{
  "prediction": "AI-Generated",
  "confidence": 97.1,
  "prob_fake": 0.9855,
  "branches": {
    "spectral":  { "prob_fake": 0.9420, "confidence": 0.8800, "label": "AI-Generated" },
    "edge":      { "prob_fake": 0.8100, "confidence": 0.7200, "label": "AI-Generated" },
    "cnn":       { "prob_fake": 0.9820, "confidence": 0.9640, "label": "AI-Generated" },
    "vit":       { "prob_fake": 0.9600, "confidence": 0.9200, "label": "AI-Generated" },
    "diffusion": { "prob_fake": 0.8900, "confidence": 0.8300, "label": "AI-Generated" }
  },
  "gradcam_b64":   "<base64-encoded JPEG>",
  "spectrum_b64":  "<base64-encoded JPEG>",
  "noise_map_b64": "<base64-encoded JPEG>",
  "edge_map_b64":  "<base64-encoded JPEG>",
  "low_certainty": false
}
```
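
A client can unpack this response with the standard library. The sketch below assumes the `*_b64` fields hold plain base64 (no `data:` URI prefix) and that the payload has the shape shown above. The helper names are hypothetical.

```python
import base64
from typing import Any, Dict


def summarize_response(payload: Dict[str, Any]) -> str:
    """Build a short text report from a /predict response dictionary."""
    lines = [f'{payload["prediction"]} ({payload["confidence"]:.1f}% confidence)']
    for name, branch in payload["branches"].items():
        lines.append(f'  {name:<10} prob_fake={branch["prob_fake"]:.2f} ({branch["label"]})')
    if payload.get("low_certainty"):
        lines.append("  warning: low certainty, treat the result with care")
    return "\n".join(lines)


def decode_image(payload: Dict[str, Any], key: str) -> bytes:
    """Decode one of the base64 image fields (for example "gradcam_b64")."""
    return base64.b64decode(payload[key])
```

The bytes returned by `decode_image` can be written directly to a `.jpg` file to view the heatmap.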

### `GET /health`
```json
{ "status": "ok", "service": "ImageForensics-Detect", "version": "1.0.0" }
```

### `GET /logs`
```json
{ "total": 42, "real": 18, "ai_generated": 24 }
```

---

## 📝 Research Methodology

### Title (Suggested)
> **"Multi-Branch Certainty-Weighted Forensic Detection of AI-Generated Images Using Spectral Analysis, Edge Statistics, CNN, and Vision Transformers"**

### Abstract
This work presents ImageForensics-Detect, a multi-branch forensic analysis framework for distinguishing real camera photographs from AI-generated images produced by GANs and diffusion models. The system integrates five complementary detection branches: (1) spectral analysis using FFT and DCT to capture frequency-domain artifacts; (2) edge analysis using Sobel/Laplacian operators and gradient distribution statistics; (3) a CNN branch (EfficientNet-B0) for local texture and patch-level artifact detection; (4) a ViT branch (ViT-B/16) for global semantic inconsistency detection; and (5) a diffusion residual branch analyzing noise kurtosis and spatial uniformity. Branch predictions are combined using certainty-weighted probabilistic fusion, ensuring that uncertain or untrained branches contribute proportionally less to the final decision.
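
Item (5) of the abstract, noise kurtosis and spatial uniformity, can be illustrated with NumPy. This is a sketch under assumptions rather than the contents of `branches/diffusion_branch.py`: the 3x3 box-blur residual, the 8-pixel block size, and all function names are hypothetical choices.

```python
import numpy as np


def noise_residual(gray: np.ndarray) -> np.ndarray:
    """Residual = image minus a 3x3 box blur (a simple denoiser stand-in)."""
    g = gray.astype(np.float64)
    h, w = g.shape
    padded = np.pad(g, 1, mode="reflect")
    blurred = sum(padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)) / 9.0
    return g - blurred


def excess_kurtosis(values: np.ndarray) -> float:
    """Fourth standardized moment minus 3 (about 0 for Gaussian noise)."""
    x = values.ravel() - values.mean()
    var = np.mean(x ** 2)
    if var == 0:
        return 0.0
    return float(np.mean(x ** 4) / var ** 2 - 3.0)


def block_variance_spread(residual: np.ndarray, block: int = 8) -> float:
    """Coefficient of variation of per-block variance; lower means more uniform."""
    h, w = residual.shape
    h, w = h - h % block, w - w % block
    if h == 0 or w == 0:
        return 0.0
    blocks = residual[:h, :w].reshape(h // block, block, w // block, block)
    variances = blocks.var(axis=(1, 3))
    mean_var = variances.mean()
    return float(variances.std() / mean_var) if mean_var > 0 else 0.0
```

How these two statistics map to a `prob_fake` score is left open here, since the thresholds would have to be tuned on real data.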

### Key Design Decisions

| Decision | Rationale |
|---|---|
| Multi-branch ensemble | No single signal catches all generator types |
| Certainty-weighted fusion | Prevents weak/untrained branches from degrading accuracy |
| FFT + DCT (spectral) | Upsampling (checkerboard) artifacts from GANs appear as distinct peaks in the frequency domain |
| EfficientNet-B0 | Good accuracy for its size (about 5.3M parameters), which keeps CNN inference light |
| ViT-B/16 (timm) | Global receptive field catches semantic incoherence CNNs miss |
| Noise uniformity (diffusion) | Diffusion outputs tend to show more spatially uniform noise residuals than camera sensor noise |
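
To make the spectral rationale concrete, the sketch below measures how much of an image's power sits at high spatial frequencies. It assumes a grayscale NumPy array as input. The function name and the 0.25 cutoff are hypothetical, and `branches/spectral_branch.py` may compute its features differently.

```python
import numpy as np


def high_freq_energy_ratio(gray: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral power beyond `cutoff` (normalized radial frequency)."""
    # Centered 2D power spectrum: fftshift moves the DC term to (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    power = np.abs(spectrum) ** 2

    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    # Radius is 0 at DC and 1 at the Nyquist frequency along either axis.
    radius = np.sqrt(((yy - h // 2) / (h / 2)) ** 2 + ((xx - w // 2) / (w / 2)) ** 2)

    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[radius > cutoff].sum() / total)
```

A flat image scores close to 0, while a pixel-level checkerboard, the extreme case of an upsampling artifact, scores close to 1.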

### References (Key Papers)
1. Wang et al. (2020). *CNN-generated images are surprisingly easy to spot... for now.* CVPR.
2. Frank et al. (2020). *Leveraging Frequency Analysis for Deep Fake Image Recognition.* ICML.
3. Corvi et al. (2023). *On the detection of synthetic images generated by diffusion models.* ICASSP.
4. Ojha et al. (2023). *Towards Universal Fake Image Detectors that Generalize Across Generative Models.* CVPR.
5. Selvaraju et al. (2017). *Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization.* ICCV.

---

## 🛠️ Tech Stack

| Layer | Technology |
|---|---|
| CNN Branch | TensorFlow 2.13+, EfficientNet-B0 |
| ViT Branch | PyTorch 2.0+, timm, ViT-B/16 |
| Signal Branches | OpenCV, NumPy, SciPy |
| Backend | FastAPI, Uvicorn |
| Frontend | HTML5, CSS3, Vanilla JavaScript |
| Evaluation | Scikit-learn, Matplotlib, Seaborn |

---

## 📌 Status Summary

| Module | Status | Notes |
|---|---|---|
| Spectral Branch | ✅ Complete | Signal Forensics |
| Edge Branch | ✅ Complete | Signal Forensics |
| Diffusion Branch | ✅ Complete | Signal Forensics |
| CNN Branch | ✅ Trained | EfficientNet-B0 |
| ViT Branch | ✅ Fully Trained | **99.30% Accuracy** |
| Fusion Module | ✅ Complete | Certainty-Weighted |
| Grad-CAM | ✅ Complete | (with Saliency Fallback) |
| Frontend UI | ✅ Enhanced | Stats Hero + Prob Bar |
| FastAPI Backend | ✅ Complete | Port 8000 |

---

*Built for IEEE-style research deployment. B.Tech Final Year Project.*