Harshasnade/Deepfake_Detection_System_V1

@@ -1,18 +1,148 @@
 ---
 tags:
 - deepfake-detection
-- pytorch
 - image-classification
 library_name: pytorch
 license: mit
 ---
-# DeepGuard - Deepfake Detection Model
-This repository contains the trained weights for the DeepGuard Deepfake Detection System.
 ## Model Details
-- **Architecture**: Ensembled EfficientNetV2-S + Swin-V2-T + Custom CNN
-- **Input Size**: 224x224
-- **Format**: SafeTensors (PyTorch)

 ---
 tags:
 - deepfake-detection
+- computer-vision
 - image-classification
+- pytorch
+- efficientnet
+- swin-transformer
+- security
 library_name: pytorch
 license: mit
+metrics:
+- accuracy
+- f1
+pipeline_tag: image-classification
 ---
+# DeepGuard - Deepfake Detection System
 ## Model Details
+### Model Description
+DeepGuard is a Deepfake Detection System designed to identify AI-generated images. It uses an ensemble architecture that concatenates features from **EfficientNetV2-S** and **Swin Transformer V2-T** backbones and passes them to a custom classification head. This hybrid approach combines local feature extraction (CNN) with global context modeling (Transformer) to spot manipulation artifacts that are often invisible to the human eye.
+- **Developed by:** Harshvardhan Asnade
+- **Model type:** Ensemble (EfficientNetV2 + SwinV2 + Custom CNN)
+- **Language(s):** Python, PyTorch
+- **License:** MIT
+- **Finetuned from model:** Torchvision pre-trained weights (ImageNet)
+### Model Sources
+- **Repository:** https://github.com/Harshvardhan-Asnade/Deepfake-Model
+- **Demo:** https://deepfakescan.vercel.app/ (Live Web App)
+## Uses
+### Direct Use
+The model is designed to classify single images as either **REAL** or **FAKE**. It outputs a probability score (0.0 to 1.0) and a confidence metric; the network itself emits a single logit, so the probability is obtained by applying a sigmoid. It is suitable for:
+- Content moderation
+- Social media verification
+- Digital forensics (preliminary analysis)
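As a rough illustration of how a raw output becomes a REAL/FAKE decision, the sketch below converts a single logit into a probability, a label, and a confidence value. The function name `logit_to_prediction`, the 0.5 threshold, the choice of FAKE as the positive class, and the definition of confidence as the probability of the predicted label are all assumptions made for this example; the card does not specify them.

```python
import math


def logit_to_prediction(logit: float, threshold: float = 0.5) -> dict:
    """Turn one raw logit into a probability, a label, and a confidence."""
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if logit >= 0:
        prob_fake = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        prob_fake = e / (1.0 + e)

    # Assumption: a probability near 1.0 means FAKE (positive class).
    label = "FAKE" if prob_fake >= threshold else "REAL"

    # Assumption: confidence is the probability assigned to the chosen label.
    confidence = prob_fake if label == "FAKE" else 1.0 - prob_fake
    return {"label": label, "probability": prob_fake, "confidence": confidence}


if __name__ == "__main__":
    for raw in (-2.0, 0.0, 3.0):
        print(raw, logit_to_prediction(raw))
```

For moderation workflows, the threshold would normally be tuned on validation data rather than left at 0.5.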
+### Out-of-Scope Use
+- **Video Analysis:** The model can score individual frames, but it does not use temporal coherence across frames, so video analysis is limited to frame-by-frame classification.
+- **Audio Deepfakes:** This model is strictly for visual content.
+- **Legal Proof:** The model provides a probabilistic assessment and should not be the sole basis for legal judgments.
+## How to Get Started with the Model
+```python
+import torch
+import torch.nn as nn
+from torchvision import models
+from safetensors.torch import load_file
+# Imported for image loading and preprocessing, which this snippet does not show
+import albumentations as A
+from albumentations.pytorch import ToTensorV2
+import cv2
+# Define the model architecture: two backbones feeding one shared classifier
+class DeepfakeDetector(nn.Module):
+    def __init__(self, pretrained=False):
+        super().__init__()
+        self.efficientnet = models.efficientnet_v2_s(weights='DEFAULT' if pretrained else None)
+        self.swin = models.swin_v2_t(weights='DEFAULT' if pretrained else None)
+        # Replace the ImageNet heads so each backbone returns pooled features
+        self.efficientnet.classifier = nn.Identity()  # 1280-dim features
+        self.swin.head = nn.Identity()  # 768-dim features
+        self.classifier = nn.Sequential(
+            nn.Linear(1280 + 768, 512),
+            nn.BatchNorm1d(512),
+            nn.ReLU(),
+            nn.Dropout(0.4),
+            nn.Linear(512, 1)  # single logit for binary classification
+        )
+    def forward(self, x):
+        f1 = self.efficientnet(x)
+        f2 = self.swin(x)
+        combined = torch.cat((f1, f2), dim=1)  # shape: (batch, 2048)
+        return self.classifier(combined)  # raw logit; apply torch.sigmoid for a probability
+# Load the model
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+model = DeepfakeDetector(pretrained=False).to(device)
+# Path to the weights file; adjust it to wherever the file was downloaded
+state_dict = load_file("best_model.safetensors")
+model.load_state_dict(state_dict)
+model.eval()  # disable dropout and use BatchNorm running statistics
+```
+## Training Details
+### Training Data
+The model was trained on a diverse dataset comprising:
+- **Real Images:** FFHQ, CelebA-HQ
+- **Deepfake Images:** Generated using StyleGAN2, Diffusion Models, and FaceSwap techniques.
+- **Data Augmentation:** Extensive augmentation (compression, noise, blur) was applied to make the model more robust to social media re-compression artifacts.
+### Training Procedure
+- **Optimizer:** AdamW
+- **Loss Function:** BCEWithLogitsLoss
+- **Scheduler:** OneCycleLR
+- **Epochs:** 10+ with Early Stopping
+- **Input Resolution:** 224x224
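To make the loss and scheduler named above more concrete, here is a dependency-free sketch of what they compute. `bce_with_logits` follows the standard numerically stable form of binary cross-entropy on a raw logit, and `one_cycle_lr` is a simplified two-phase cosine one-cycle schedule whose default arguments mirror PyTorch's `OneCycleLR` defaults. The peak learning rate of `1e-3` and the step count in the usage lines are hypothetical, since the card does not state the values used in training.

```python
import math


def bce_with_logits(logit: float, target: float) -> float:
    """Binary cross-entropy on a raw logit, in the numerically stable form
    max(x, 0) - x * y + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def one_cycle_lr(step: int, total_steps: int, max_lr: float,
                 pct_start: float = 0.3, div_factor: float = 25.0,
                 final_div_factor: float = 1e4) -> float:
    """Learning rate at `step` for a two-phase cosine one-cycle schedule.

    Requires pct_start * total_steps > 1 so that the warm-up phase is non-empty.
    """
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1

    if step <= warmup_end:
        # Phase 1: anneal up from initial_lr to max_lr.
        start, end = initial_lr, max_lr
        pct = step / warmup_end
    else:
        # Phase 2: anneal down from max_lr to min_lr.
        start, end = max_lr, min_lr
        pct = (step - warmup_end) / (total_steps - 1 - warmup_end)

    # Cosine interpolation between start (pct = 0) and end (pct = 1).
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    print(bce_with_logits(2.0, 1.0), bce_with_logits(2.0, 0.0))
    for s in (0, 29, 60, 99):
        print(s, one_cycle_lr(s, total_steps=100, max_lr=1e-3))
```

The loss is small when the logit agrees with the target and grows roughly linearly when it confidently disagrees, while the schedule warms up to the peak rate and then decays to a value far below the starting rate.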
+#### Training Hyperparameters
+- **Batch Size:** 32
+- **Precision:** Mixed Precision (FP16)
+## Evaluation
+### Results
+Reported results on a held-out test split:
+- **Test Accuracy:** ~92-95% (on unseen test split)
+- **Generalization:** Shows strong resilience to JPEG compression compared to standard CNNs.
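The card lists accuracy and F1 as its metrics. The sketch below shows how both can be computed from binary predictions, assuming FAKE is encoded as `1` and REAL as `0`. The function name `accuracy_and_f1` and the toy labels are invented purely for illustration and are not results from this model.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and F1 for binary labels.

    Assumption: `positive` marks the FAKE class.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


if __name__ == "__main__":
    # Toy labels only; not taken from the model's evaluation.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(accuracy_and_f1(y_true, y_pred))
```

Reporting F1 alongside accuracy matters when the real and fake classes are imbalanced, because accuracy alone can look high while many fakes are missed.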
+## Technical Specifications
+### Model Architecture
+The ensemble combines two backbones, whose pooled features are concatenated and passed to the classification head:
+1.  **EfficientNetV2-S:** Excellent at capturing sharp, high-frequency details (e.g., hair textures, eye reflections).
+2.  **Swin Transformer (V2-T):** Captures global semantic inconsistencies (e.g., facial structural alignment).
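As a quick sanity check on how the two feature streams are fused, the arithmetic below works out the width of the concatenated feature vector and the number of learnable parameters in the classification head, using the layer sizes from the loading code in this card (`1280 + 768 -> 512 -> 1`). The helper names are illustrative and do not come from the repository.

```python
EFFICIENTNET_FEATURES = 1280  # EfficientNetV2-S output with the classifier removed
SWIN_FEATURES = 768           # Swin-V2-T output with the head removed
HIDDEN = 512


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def batchnorm1d_params(num_features: int) -> int:
    """Learnable scale and shift; running statistics are buffers, not parameters."""
    return 2 * num_features


fused_width = EFFICIENTNET_FEATURES + SWIN_FEATURES
head_params = (
    linear_params(fused_width, HIDDEN)
    + batchnorm1d_params(HIDDEN)
    + linear_params(HIDDEN, 1)
)

if __name__ == "__main__":
    print(fused_width)   # 2048
    print(head_params)   # 1050625
```

The head therefore adds roughly one million parameters on top of the two backbones, which hold the large majority of the model's weights.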
+### Compute Infrastructure
+- **Hardware:** Trained on Apple M-series Macs (MPS backend) and NVIDIA GPUs (CUDA).
+- **Framework:** PyTorch 2.6+
+## Citation
+```bibtex
+@misc{deepguard2024,
+  author = {Asnade, Harshvardhan},
+  title = {DeepGuard: Ensemble Deepfake Detection System},
+  year = {2024},
+  publisher = {Hugging Face},
+  howpublished = {\url{https://huggingface.co/Harshasnade/Deepfake_Detection_System_V1}}
+}
+```