title: Image-Forensic
emoji: π
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
ImageForensics-Detect
Research-grade multi-branch image forensics platform for detecting Real vs. AI-Generated images.
B.Tech Final Year Project · IEEE-Style Research System
What This Does
ImageForensics-Detect analyzes uploaded images through 5 independent forensic detection branches and fuses their outputs using certainty-weighted probabilistic fusion to decide whether an image is:
- Real: captured by a physical camera
- AI-Generated: created by GANs or diffusion models (Stable Diffusion, DALL-E, Midjourney, etc.)
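One plausible reading of "certainty-weighted probabilistic fusion" is a weighted average of the per-branch fake probabilities, with each branch's certainty as its weight. The sketch below illustrates that reading only; it is not the project's fusion/fusion.py. The function name `fuse`, the `(prob_fake, certainty)` input shape, and the `low_certainty_threshold` value are all assumptions made for the example.

```python
from typing import Dict, Tuple


def fuse(branches: Dict[str, Tuple[float, float]],
         low_certainty_threshold: float = 0.2) -> dict:
    """Combine per-branch (prob_fake, certainty) pairs into one decision.

    Assumption: certainty lies in [0, 1] and is used directly as the weight,
    so a branch with certainty 0 (e.g. an untrained model) has no influence.
    """
    total_weight = sum(certainty for _, certainty in branches.values())
    if total_weight == 0:
        # No branch is certain about anything: fall back to a neutral score.
        prob_fake = 0.5
    else:
        prob_fake = sum(p * c for p, c in branches.values()) / total_weight

    mean_certainty = total_weight / len(branches) if branches else 0.0
    return {
        "prob_fake": prob_fake,
        "prediction": "AI-Generated" if prob_fake >= 0.5 else "Real",
        # Hypothetical flag: set when the branches are, on average, unsure.
        "low_certainty": mean_certainty < low_certainty_threshold,
    }
```

Under this scheme, a call such as `fuse({"spectral": (0.9, 0.8), "cnn": (0.5, 0.0)})` is decided entirely by the spectral branch, because the zero-certainty CNN entry drops out of the average.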
Architecture
Input Image
    │
    ├──► Spectral Branch   (FFT/DCT analysis)           [no training needed]
    ├──► Edge Branch       (Sobel/Laplacian forensics)  [no training needed]
    ├──► CNN Branch        (EfficientNet-B0 / TF)       [train_cnn.py]
    ├──► ViT Branch        (ViT-B/16 / PyTorch+timm)    [train_vit.py]
    └──► Diffusion Branch  (residual noise analysis)    [no training needed]
    │
Certainty-Weighted Probabilistic Fusion
    │
┌────────────────────────────────┐
│ Prediction: Real / AI-Gen      │
│ Confidence: 97.1%              │
│ Grad-CAM heatmap               │
│ Spectral anomaly map           │
│ Noise residual map             │
└────────────────────────────────┘
Folder Structure
ImageForensics-Detect/
├── data/raw/{real,fake}/                     ← Your dataset goes here
├── models/                                   ← Saved .h5 / .pth weights
├── branches/
│   ├── spectral_branch.py                    COMPLETE (signal processing)
│   ├── edge_branch.py                        COMPLETE (signal processing)
│   ├── cnn_branch.py                         BASELINE (needs training)
│   ├── vit_branch.py                         BASELINE (needs training)
│   └── diffusion_branch.py                   COMPLETE (signal processing)
├── fusion/fusion.py                          COMPLETE
├── explainability/
│   ├── gradcam.py                            COMPLETE
│   └── spectral_heatmap.py                   COMPLETE
├── training/
│   ├── dataset_loader.py                     COMPLETE
│   ├── train_cnn.py                          COMPLETE
│   ├── train_vit.py                          COMPLETE
│   └── evaluate.py                           COMPLETE
├── backend/app.py                            COMPLETE (FastAPI)
├── frontend/{index.html,style.css,app.js}    COMPLETE
├── utils/{image_utils.py,logger.py}          COMPLETE
└── outputs/                                  ← Logs, heatmaps, eval results
Installation
Prerequisites
- Python 3.9+
- pip / conda
1. Create Virtual Environment
cd "ImageForensics-Detect"
python -m venv venv
source venv/bin/activate # macOS/Linux
# venv\Scripts\activate # Windows
2. Install Dependencies
pip install -r requirements.txt
Note: On Apple Silicon (M1/M2/M3), install tensorflow-macos and tensorflow-metal instead of tensorflow:
pip install tensorflow-macos tensorflow-metal
Dataset Setup
Populate the dataset folders before training:
data/raw/
├── real/    ← Real camera photos (.jpg, .png)
└── fake/    ← AI-generated images (.jpg, .png)
Recommended datasets:
| Type | Dataset | Source |
|---|---|---|
| Real | RAISE-1K / VISION / MIT-5k | Kaggle / research groups |
| AI-Gen | ThisPersonDoesNotExist / SDXL outputs | Collected/scraped |
| Mixed | ArtiFact / CNNDetection | GitHub papers |
The loader auto-splits: 70% train / 15% val / 15% test (stratified).
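As an illustration of what a stratified 70/15/15 split involves, the sketch below splits each class separately so that every subset keeps the same real/fake ratio. It is not the code in training/dataset_loader.py; the function name `stratified_split`, the `(path, label)` tuple format, and the fixed seed are assumptions made for this example.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, str]  # (file path, label), a hypothetical record format


def stratified_split(samples: Sequence[Sample],
                     ratios: Tuple[float, float, float] = (0.70, 0.15, 0.15),
                     seed: int = 42) -> Dict[str, List[Sample]]:
    """Split samples into train/val/test while preserving class proportions."""
    by_label: Dict[str, List[Sample]] = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    rng = random.Random(seed)  # seeded so the split is reproducible
    splits: Dict[str, List[Sample]] = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = round(len(items) * ratios[0])
        n_val = round(len(items) * ratios[1])
        splits["train"] += items[:n_train]
        splits["val"] += items[n_train:n_train + n_val]
        # Whatever remains goes to test, so no sample is dropped by rounding.
        splits["test"] += items[n_train + n_val:]
    return splits
```

Splitting per class rather than over the pooled list matters most when the real and fake folders differ in size, since a pooled random split could leave the validation or test set with a skewed class balance.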
Training
Train CNN Branch (EfficientNet-B0 / TensorFlow)
python training/train_cnn.py --epochs 30 --batch_size 32 --lr 1e-4
# Saves: models/cnn_branch.h5
Train ViT Branch (ViT-B/16 / PyTorch)
python training/train_vit.py --epochs 20 --batch_size 16 --lr 1e-4
# Saves: models/vit_branch.pth
Without training: The system is still functional. The three handcrafted branches (Spectral, Edge, Diffusion) produce real forensic outputs immediately, while the CNN and ViT branches return a neutral 0.5 confidence and are flagged as "untrained" in the API response.
Evaluation
# Evaluate entire fusion system
python training/evaluate.py
# Evaluate individual branches
python training/evaluate.py --branch spectral
python training/evaluate.py --branch edge
python training/evaluate.py --branch cnn
python training/evaluate.py --branch vit
python training/evaluate.py --branch diffusion
Reports are saved to outputs/:
- confusion_matrix_<branch>.png
- roc_curve_<branch>.png
- evaluation_<branch>.csv
Running the System
Step 1: Start Backend API
uvicorn backend.app:app --reload --host 0.0.0.0 --port 8000
Step 2: Open Frontend
Open frontend/index.html in your browser (double-click, or use Live Server in VS Code).
Step 3: Upload and Analyze
Drag and drop any image → click Analyze Image → view results.
API Reference
POST /predict
Upload an image and receive full forensic analysis.
Request:
curl -X POST "http://localhost:8000/predict" \
-F "file=@your_image.jpg"
Response:
{
"prediction": "AI-Generated",
"confidence": 97.1,
"prob_fake": 0.9855,
"branches": {
"spectral": { "prob_fake": 0.9420, "confidence": 0.8800, "label": "AI-Generated" },
"edge": { "prob_fake": 0.8100, "confidence": 0.7200, "label": "AI-Generated" },
"cnn": { "prob_fake": 0.9820, "confidence": 0.9640, "label": "AI-Generated" },
"vit": { "prob_fake": 0.9600, "confidence": 0.9200, "label": "AI-Generated" },
"diffusion": { "prob_fake": 0.8900, "confidence": 0.8300, "label": "AI-Generated" }
},
"gradcam_b64": "<base64-encoded JPEG>",
"spectrum_b64": "<base64-encoded JPEG>",
"noise_map_b64": "<base64-encoded JPEG>",
"edge_map_b64": "<base64-encoded JPEG>",
"low_certainty": false
}
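A client might condense this response before displaying it. The sketch below parses the JSON body, ranks the branches by confidence, and lists any branch whose label disagrees with the fused prediction. It relies only on the field names shown in the sample response above; the helper name `summarize_response` and the `needs_review` rule are hypothetical and not part of the API.

```python
import json


def summarize_response(body: str) -> dict:
    """Condense a /predict JSON body into a short, display-ready summary."""
    data = json.loads(body)
    branches = data["branches"]

    # Most confident branch first.
    ranked = sorted(branches,
                    key=lambda name: branches[name]["confidence"],
                    reverse=True)
    # Branches whose individual label differs from the fused prediction.
    dissenting = [name for name in ranked
                  if branches[name]["label"] != data["prediction"]]

    return {
        "prediction": data["prediction"],
        "confidence_pct": data["confidence"],
        "ranked_branches": ranked,
        "dissenting_branches": dissenting,
        # Hypothetical rule: flag for manual review when the server reports
        # low certainty or any branch disagrees with the final label.
        "needs_review": bool(data.get("low_certainty")) or bool(dissenting),
    }
```

The `*_b64` fields are left untouched here; they can be decoded with `base64.b64decode` from the standard library when the heatmaps need to be written to disk.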
GET /health
{ "status": "ok", "service": "ImageForensics-Detect", "version": "1.0.0" }
GET /logs
{ "total": 42, "real": 18, "ai_generated": 24 }
Research Methodology
Title (Suggested)
"Multi-Branch Certainty-Weighted Forensic Detection of AI-Generated Images Using Spectral Analysis, Edge Statistics, CNN, and Vision Transformers"
Abstract
This work presents ImageForensics-Detect, a multi-branch forensic analysis framework for distinguishing real camera photographs from AI-generated images produced by GANs and diffusion models. The system integrates five complementary detection branches: (1) spectral analysis using FFT and DCT to capture frequency-domain artifacts; (2) edge analysis using Sobel/Laplacian operators and gradient distribution statistics; (3) a CNN branch (EfficientNet-B0) for local texture and patch-level artifact detection; (4) a ViT branch (ViT-B/16) for global semantic inconsistency detection; and (5) a diffusion residual branch analyzing noise kurtosis and spatial uniformity. Branch predictions are combined using certainty-weighted probabilistic fusion, ensuring that uncertain or untrained branches contribute proportionally less to the final decision.
Key Design Decisions
| Decision | Rationale |
|---|---|
| Multi-branch ensemble | No single signal catches all generator types |
| Certainty-weighted fusion | Prevents weak/untrained branches from degrading accuracy |
| FFT + DCT (spectral) | GAN checkerboard artifacts are frequency-domain detectable |
| EfficientNet-B0 | Best accuracy-efficiency trade-off for the CNN branch |
| ViT-B/16 (timm) | Global receptive field catches semantic incoherence CNNs miss |
| Noise uniformity (diffusion) | Diffusion models produce spatially uniform noise fields |
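To make the spectral rationale concrete, the sketch below computes one simple frequency-domain feature with NumPy: the fraction of an image's spectral power that lies above a radial frequency cutoff. A checkerboard-like pattern pushes this fraction up sharply. This is an illustrative feature only, not the contents of branches/spectral_branch.py; the function name `high_freq_energy_ratio` and the default cutoff of 0.25 cycles per pixel are assumptions.

```python
import numpy as np


def high_freq_energy_ratio(gray: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of 2-D FFT power at radial frequency above `cutoff`.

    `gray` is a 2-D grayscale image; frequencies are in cycles per pixel,
    so the radial frequency ranges from 0 up to about 0.707.
    """
    gray = gray.astype(np.float64)
    gray = gray - gray.mean()  # remove the DC term so brightness is ignored

    power = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = gray.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    total = power.sum()
    if total == 0:
        return 0.0  # constant image: no spectral content at all
    return float(power[radius > cutoff].sum() / total)
```

A single scalar like this would be far too coarse for a real detector, but it shows why periodic upsampling artifacts are easier to isolate in the frequency domain than in pixel space.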
References (Key Papers)
- Wang et al. (2020). CNN-generated images are surprisingly easy to spot... for now. CVPR.
- Frank et al. (2020). Leveraging Frequency Analysis for Deep Fake Image Recognition. ICML.
- Corvi et al. (2023). On the detection of synthetic images generated by diffusion models. ICASSP.
- Ojha et al. (2023). Towards Universal Fake Image Detection. CVPR.
- Selvaraju et al. (2017). Grad-CAM: Visual Explanations from Deep Networks. ICCV.
Tech Stack
| Layer | Technology |
|---|---|
| CNN Branch | TensorFlow 2.13+, EfficientNet-B0 |
| ViT Branch | PyTorch 2.0+, timm, ViT-B/16 |
| Signal Branches | OpenCV, NumPy, SciPy |
| Backend | FastAPI, Uvicorn |
| Frontend | HTML5, CSS3, Vanilla JavaScript |
| Evaluation | Scikit-learn, Matplotlib, Seaborn |
Status Summary
| Module | Status | Result |
|---|---|---|
| Spectral Branch | Complete | Signal Forensics |
| Edge Branch | Complete | Signal Forensics |
| Diffusion Branch | Complete | Signal Forensics |
| CNN Branch | Trained | EfficientNet-B0 |
| ViT Branch | Fully Trained | 99.30% Accuracy |
| Fusion Module | Complete | Certainty-Weighted |
| Grad-CAM | Complete | (with Saliency Fallback) |
| Frontend UI | Enhanced | Stats Hero + Prob Bar |
| FastAPI Backend | Complete | Port 8000 |
Built for IEEE-style research deployment. B.Tech Final Year Project.