| --- |
| title: Image-Forensic |
| emoji: π |
| colorFrom: indigo |
| colorTo: purple |
| sdk: docker |
| pinned: false |
| --- |
| # ImageForensics-Detect |
|
|
| > **Research-grade multi-branch image forensics platform for detecting Real vs. AI-Generated images.** |
> B.Tech Final Year Project · IEEE-Style Research System
|
|
| --- |
|
|
## 🧠 What This Does
|
|
| ImageForensics-Detect analyzes uploaded images through **5 independent forensic detection branches** and fuses their outputs using **certainty-weighted probabilistic fusion** to decide whether an image is: |
|
|
- **Real** – captured by a physical camera
- **AI-Generated** – created by GANs or diffusion models (Stable Diffusion, DALL-E, Midjourney, etc.)
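The certainty-weighted fusion rule can be sketched as a confidence-weighted average of the branches' fake-probabilities. The branch names and fields below mirror the API response, but the function itself is an illustrative sketch, not the project's actual `fusion.py`:

```python
import numpy as np

def certainty_weighted_fusion(branches):
    """Fuse per-branch fake-probabilities, weighting each branch by its
    self-reported confidence so uncertain branches contribute less.

    `branches` maps branch name -> (prob_fake, confidence), both in [0, 1].
    Illustrative sketch only.
    """
    probs = np.array([p for p, _ in branches.values()])
    weights = np.array([c for _, c in branches.values()])
    if weights.sum() == 0:  # every branch abstained -> neutral verdict
        return 0.5
    return float(np.dot(probs, weights) / weights.sum())

fused = certainty_weighted_fusion({
    "spectral": (0.94, 0.88),
    "edge":     (0.81, 0.72),
    "cnn":      (0.50, 0.00),  # untrained branch: neutral prob, zero weight
})
```

Because an untrained branch reports zero confidence, it drops out of the weighted sum entirely rather than dragging the fused probability toward 0.5.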
|
|
| --- |
|
|
## 🏗️ Architecture
|
|
| ``` |
Input Image
     │
     ├──► Spectral Branch   (FFT/DCT analysis)          [no training needed]
     ├──► Edge Branch       (Sobel/Laplacian forensics) [no training needed]
     ├──► CNN Branch        (EfficientNet-B0 / TF)      [train_cnn.py]
     ├──► ViT Branch        (ViT-B/16 / PyTorch+timm)   [train_vit.py]
     └──► Diffusion Branch  (residual noise analysis)   [no training needed]
     │
Certainty-Weighted Probabilistic Fusion
     │
┌──────────────────────────────────┐
│ Prediction: Real / AI-Gen        │
│ Confidence: 97.1%                │
│ Grad-CAM heatmap                 │
│ Spectral anomaly map             │
│ Noise residual map               │
└──────────────────────────────────┘
| ``` |
|
|
| --- |
|
|
## 📁 Folder Structure
|
|
| ``` |
ImageForensics-Detect/
├── data/raw/{real,fake}/        ← Your dataset goes here
├── models/                      ← Saved .h5 / .pth weights
├── branches/
│   ├── spectral_branch.py       ✅ COMPLETE (signal processing)
│   ├── edge_branch.py           ✅ COMPLETE (signal processing)
│   ├── cnn_branch.py            🔵 BASELINE (needs training)
│   ├── vit_branch.py            🔵 BASELINE (needs training)
│   └── diffusion_branch.py      ✅ COMPLETE (signal processing)
├── fusion/fusion.py             ✅ COMPLETE
├── explainability/
│   ├── gradcam.py               ✅ COMPLETE
│   └── spectral_heatmap.py      ✅ COMPLETE
├── training/
│   ├── dataset_loader.py        ✅ COMPLETE
│   ├── train_cnn.py             ✅ COMPLETE
│   ├── train_vit.py             ✅ COMPLETE
│   └── evaluate.py              ✅ COMPLETE
├── backend/app.py               ✅ COMPLETE (FastAPI)
├── frontend/{index.html,style.css,app.js}  ✅ COMPLETE
├── utils/{image_utils.py,logger.py}        ✅ COMPLETE
└── outputs/                     ← Logs, heatmaps, eval results
| ``` |
|
|
| --- |
|
|
## ⚙️ Installation
|
|
| ### Prerequisites |
| - Python 3.9+ |
| - pip / conda |
|
|
| ### 1. Create Virtual Environment |
| ```bash |
| cd "ImageForensics-Detect" |
| python -m venv venv |
| source venv/bin/activate # macOS/Linux |
| # venv\Scripts\activate # Windows |
| ``` |
|
|
| ### 2. Install Dependencies |
| ```bash |
| pip install -r requirements.txt |
| ``` |
|
|
| > **Note**: On Apple Silicon (M1/M2/M3), use `pip install tensorflow-macos tensorflow-metal` instead of `tensorflow`. |
|
|
| --- |
|
|
## 🗂️ Dataset Setup
|
|
| Populate the dataset folders before training: |
| ``` |
data/raw/
├── real/   ← Real camera photos (.jpg, .png)
└── fake/   ← AI-generated images (.jpg, .png)
| ``` |
|
|
| **Recommended datasets:** |
| | Type | Dataset | Source | |
| |---|---|---| |
| | Real | RAISE-1K / VISION / MIT-5k | Kaggle / research groups | |
| | AI-Gen | ThisPersonDoesNotExist / SDXL outputs | Collected/scraped | |
| | Mixed | ArtiFact / CNNDetection | GitHub papers | |
|
|
| The loader auto-splits: **70% train / 15% val / 15% test** (stratified). |
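A stratified 70/15/15 split like the loader's can be sketched with scikit-learn (already part of the tech stack). `stratified_split` is an illustrative helper, not necessarily how `dataset_loader.py` is written:

```python
from sklearn.model_selection import train_test_split

def stratified_split(paths, labels, seed=42):
    """Split file paths into 70% train / 15% val / 15% test,
    preserving the real/fake class ratio in every subset.
    Illustrative sketch only.
    """
    # First carve off 30% for val+test, stratified by label.
    X_tr, X_tmp, y_tr, y_tmp = train_test_split(
        paths, labels, test_size=0.30, stratify=labels, random_state=seed)
    # Then halve that 30% into val and test, again stratified.
    X_val, X_te, y_val, y_te = train_test_split(
        X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=seed)
    return (X_tr, y_tr), (X_val, y_val), (X_te, y_te)
```

Fixing `random_state` keeps the split reproducible across training runs, so CNN and ViT branches are evaluated on the same held-out images.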
|
|
| --- |
|
|
## 🏋️ Training
|
|
| ### Train CNN Branch (EfficientNet-B0 / TensorFlow) |
| ```bash |
| python training/train_cnn.py --epochs 30 --batch_size 32 --lr 1e-4 |
| # Saves: models/cnn_branch.h5 |
| ``` |
|
|
| ### Train ViT Branch (ViT-B/16 / PyTorch) |
| ```bash |
| python training/train_vit.py --epochs 20 --batch_size 16 --lr 1e-4 |
| # Saves: models/vit_branch.pth |
| ``` |
|
|
> **Without training:** The system is still functional – the three handcrafted branches (Spectral, Edge, Diffusion) produce real forensic outputs immediately. The CNN/ViT branches return a neutral `0.5` probability and are flagged as "untrained" in the API response.
|
|
| --- |
|
|
## 📊 Evaluation
|
|
| ```bash |
| # Evaluate entire fusion system |
| python training/evaluate.py |
| |
| # Evaluate individual branches |
| python training/evaluate.py --branch spectral |
| python training/evaluate.py --branch edge |
| python training/evaluate.py --branch cnn |
| python training/evaluate.py --branch vit |
| python training/evaluate.py --branch diffusion |
| ``` |
|
|
| Reports saved to `outputs/`: |
| - `confusion_matrix_<branch>.png` |
| - `roc_curve_<branch>.png` |
| - `evaluation_<branch>.csv` |
|
|
| --- |
|
|
## 🚀 Running the System
|
|
| ### Step 1: Start Backend API |
| ```bash |
| uvicorn backend.app:app --reload --host 0.0.0.0 --port 8000 |
| ``` |
|
|
| ### Step 2: Open Frontend |
| Open `frontend/index.html` in your browser (double-click, or use Live Server in VS Code). |
|
|
| ### Step 3: Upload and Analyze |
Drag-and-drop any image → click **Analyze Image** → view results.
|
|
| --- |
|
|
## 📡 API Reference
|
|
| ### `POST /predict` |
| Upload an image and receive full forensic analysis. |
|
|
| **Request:** |
| ```bash |
| curl -X POST "http://localhost:8000/predict" \ |
| -F "file=@your_image.jpg" |
| ``` |
|
|
| **Response:** |
| ```json |
| { |
| "prediction": "AI-Generated", |
| "confidence": 97.1, |
| "prob_fake": 0.9855, |
| "branches": { |
| "spectral": { "prob_fake": 0.9420, "confidence": 0.8800, "label": "AI-Generated" }, |
| "edge": { "prob_fake": 0.8100, "confidence": 0.7200, "label": "AI-Generated" }, |
| "cnn": { "prob_fake": 0.9820, "confidence": 0.9640, "label": "AI-Generated" }, |
| "vit": { "prob_fake": 0.9600, "confidence": 0.9200, "label": "AI-Generated" }, |
| "diffusion": { "prob_fake": 0.8900, "confidence": 0.8300, "label": "AI-Generated" } |
| }, |
| "gradcam_b64": "<base64-encoded JPEG>", |
| "spectrum_b64": "<base64-encoded JPEG>", |
| "noise_map_b64": "<base64-encoded JPEG>", |
| "edge_map_b64": "<base64-encoded JPEG>", |
| "low_certainty": false |
| } |
| ``` |
|
|
| ### `GET /health` |
| ```json |
| { "status": "ok", "service": "ImageForensics-Detect", "version": "1.0.0" } |
| ``` |
|
|
| ### `GET /logs` |
| ```json |
| { "total": 42, "real": 18, "ai_generated": 24 } |
| ``` |
|
|
| --- |
|
|
## 📚 Research Methodology
|
|
| ### Title (Suggested) |
| > **"Multi-Branch Certainty-Weighted Forensic Detection of AI-Generated Images Using Spectral Analysis, Edge Statistics, CNN, and Vision Transformers"** |
|
|
| ### Abstract |
| This work presents ImageForensics-Detect, a multi-branch forensic analysis framework for distinguishing real camera photographs from AI-generated images produced by GANs and diffusion models. The system integrates five complementary detection branches: (1) spectral analysis using FFT and DCT to capture frequency-domain artifacts; (2) edge analysis using Sobel/Laplacian operators and gradient distribution statistics; (3) a CNN branch (EfficientNet-B0) for local texture and patch-level artifact detection; (4) a ViT branch (ViT-B/16) for global semantic inconsistency detection; and (5) a diffusion residual branch analyzing noise kurtosis and spatial uniformity. Branch predictions are combined using certainty-weighted probabilistic fusion, ensuring that uncertain or untrained branches contribute proportionally less to the final decision. |
|
|
| ### Key Design Decisions |
|
|
| | Decision | Rationale | |
| |---|---| |
| | Multi-branch ensemble | No single signal catches all generator types | |
| | Certainty-weighted fusion | Prevents weak/untrained branches from degrading accuracy | |
| | FFT + DCT (spectral) | GAN checkerboard artifacts are frequency-domain detectable | |
| | EfficientNet-B0 | Best accuracy-efficiency trade-off for the CNN branch | |
| | ViT-B/16 (timm) | Global receptive field catches semantic incoherence CNNs miss | |
| | Noise uniformity (diffusion) | Diffusion models produce spatially uniform noise fields | |
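To illustrate the frequency-domain intuition behind the spectral branch, one simple handcrafted feature is the fraction of FFT energy that falls outside a low-frequency disc; GAN upsampling tends to leave periodic high-frequency artifacts that inflate it. The function and its `cutoff` parameter are hypothetical names for illustration, not code from `spectral_branch.py`:

```python
import numpy as np

def high_freq_energy_ratio(gray, cutoff=0.25):
    """Fraction of 2-D FFT power outside a centered low-frequency disc.

    `gray` is a 2-D grayscale array; `cutoff` scales the disc radius
    relative to the image half-size. Illustrative sketch only.
    """
    f = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    power = np.abs(f) ** 2
    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2)          # distance from DC bin
    low = r <= cutoff * min(h, w) / 2             # low-frequency disc mask
    total = power.sum()
    return float(power[~low].sum() / total) if total > 0 else 0.0
```

A flat image puts all its energy at DC (ratio near 0), while a pixel-level checkerboard splits energy between DC and the Nyquist frequency (ratio near 0.5), which is the kind of separation a spectral detector thresholds on.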
|
|
| ### References (Key Papers) |
| 1. Wang et al. (2020). *CNN-generated images are surprisingly easy to spot... for now.* CVPR. |
| 2. Frank et al. (2020). *Leveraging Frequency Analysis for Deep Fake Image Recognition.* ICML. |
| 3. Corvi et al. (2023). *On the detection of synthetic images generated by diffusion models.* ICASSP. |
| 4. Ojha et al. (2023). *Towards Universal Fake Image Detection.* CVPR. |
| 5. Selvaraju et al. (2017). *Grad-CAM: Visual Explanations from Deep Networks.* ICCV. |
|
|
| --- |
|
|
## 🛠️ Tech Stack
|
|
| | Layer | Technology | |
| |---|---| |
| | CNN Branch | TensorFlow 2.13+, EfficientNet-B0 | |
| | ViT Branch | PyTorch 2.0+, timm, ViT-B/16 | |
| | Signal Branches | OpenCV, NumPy, SciPy | |
| | Backend | FastAPI, Uvicorn | |
| | Frontend | HTML5, CSS3, Vanilla JavaScript | |
| | Evaluation | Scikit-learn, Matplotlib, Seaborn | |
|
|
| --- |
|
|
## 📋 Status Summary
|
|
| | Module | Status | Result | |
| |---|---|---| |
| Spectral Branch | ✅ Complete | Signal Forensics |
| Edge Branch | ✅ Complete | Signal Forensics |
| Diffusion Branch | ✅ Complete | Signal Forensics |
| CNN Branch | ✅ Trained | EfficientNet-B0 |
| ViT Branch | ✅ Fully Trained | **99.30% Accuracy** |
| Fusion Module | ✅ Complete | Certainty-Weighted |
| Grad-CAM | ✅ Complete | (with Saliency Fallback) |
| Frontend UI | ✅ Enhanced | Stats Hero + Prob Bar |
| FastAPI Backend | ✅ Complete | Port 8000 |
|
|
| --- |
|
|
| *Built for IEEE-style research deployment. B.Tech Final Year Project.* |
|
|