---
license: mit
tags:
- deepfake-detection
- computer-vision
- image-classification
- xceptionnet
- face-forensics
- pytorch
- synthetic-media
- media-forensics
metrics:
- accuracy
- auc
- f1
language:
- en
library_name: pytorch
pipeline_tag: image-classification
---

# FaceForge Detector: State-of-the-Art Deepfake Detection

[DOI: 10.5281/zenodo.18530439](https://doi.org/10.5281/zenodo.18530439)
[GitHub: Huzaifanasir95/FaceForge](https://github.com/Huzaifanasir95/FaceForge)
[License: MIT](https://opensource.org/licenses/MIT)

🎯 **99.33% Accuracy | 0.9995 AUC-ROC | 0.67% Error Rate**

## Model Description

FaceForge Detector is a deepfake detection model built on the XceptionNet architecture. On a 1,500-image FaceForensics++ test set it reaches 99.33% accuracy and 0.9995 AUC-ROC when distinguishing authentic faces from AI-generated deepfakes.

**Key Features:**
- 99.33% accuracy on test set (1,500 samples)
- 0.9995 AUC-ROC score
- <200ms inference time per image
- Only 2 false negatives out of 750 deepfakes (0.27% miss rate)
- 22M trainable parameters
- Trained efficiently on CPU (7.5 hours)

## Model Architecture

```
XceptionNet Backbone (20.8M params)
├── Entry Flow
├── Middle Flow (8 blocks)
├── Exit Flow
├── Global Average Pooling
└── Custom Classification Head (1.1M params)
    ├── Dropout (p=0.5)
    ├── FC(2048 → 512) + ReLU
    ├── Dropout (p=0.3)
    └── FC(512 → 2) [Real/Fake]
```

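As a rough sanity check on the 1.1M figure for the classification head, the parameter count of its two fully connected layers can be worked out by hand. This is a sketch that assumes both FC layers are standard dense layers with bias terms, which the diagram does not state explicitly. The helper name `linear_params` is illustrative only.

```python
def linear_params(in_features, out_features, bias=True):
    """Parameter count of a dense layer: weights plus optional biases."""
    return in_features * out_features + (out_features if bias else 0)

# Layer sizes are taken from the architecture diagram; bias terms are an assumption.
fc1 = linear_params(2048, 512)
fc2 = linear_params(512, 2)
head_total = fc1 + fc2

print(f"FC(2048 -> 512): {fc1:,}")
print(f"FC(512 -> 2):    {fc2:,}")
print(f"Head total:      {head_total:,} (~{head_total / 1e6:.2f}M)")
```

Dropout and ReLU add no parameters, so under these assumptions the head comes to roughly 1.05M, which is consistent with the rounded 1.1M shown in the diagram.
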
## Performance Metrics

### Test Set Results (1,500 samples)

| Metric | Value |
|--------|-------|
| **Accuracy** | 99.33% |
| **AUC-ROC** | 0.9995 |
| **Precision** | 98.94% |
| **Recall** | 99.73% |
| **F1 Score** | 0.9934 |
| **Specificity** | 98.93% |

### Confusion Matrix

| | Predicted Real | Predicted Fake |
|--|----------------|----------------|
| **Actual Real** | 742 (TN) | 8 (FP) |
| **Actual Fake** | 2 (FN) | 748 (TP) |

**Total Errors:** 10 out of 1,500 (0.67%)

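The values in the metrics table follow directly from the confusion matrix. The sketch below recomputes them, treating "fake" as the positive class, as the TP/FN labels imply. The helper name `binary_metrics` is hypothetical and not part of the FaceForge code. AUC-ROC is omitted because it needs per-sample scores, not just the four counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Derive standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

# Counts copied from the confusion matrix (positive class = fake).
metrics = binary_metrics(tp=748, fp=8, tn=742, fn=2)
for name, value in metrics.items():
    print(f"{name:<12} {value:.4f}")
```
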
## Comparison with State-of-the-Art

| Method | Accuracy | AUC-ROC |
|--------|----------|---------|
| MesoNet | 83.1% | 0.847 |
| XceptionNet (base) | 95.3% | 0.969 |
| EfficientNet-B4 | 96.8% | 0.981 |
| Capsule Network | 97.4% | 0.986 |
| Face X-ray | 98.1% | 0.991 |
| **FaceForge (Ours)** | **99.33%** | **0.9995** |

## Usage

### Installation

```bash
pip install torch torchvision timm pillow numpy
```

### Inference

```python
import torch
import timm
from PIL import Image
from torchvision import transforms

# Load model: Xception with a 2-class output (index 0 = real, index 1 = fake)
model = timm.create_model('xception', pretrained=False, num_classes=2)
checkpoint = torch.load('detector_best.pth', map_location='cpu')
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()  # inference mode: disables dropout, uses running BatchNorm statistics

# Preprocessing: resize to 224x224 and scale pixel values to [-1, 1]
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

# Inference
def detect_deepfake(image_path):
    """Return (label, confidence) for a single face image."""
    image = Image.open(image_path).convert('RGB')
    input_tensor = transform(image).unsqueeze(0)  # add batch dimension

    with torch.no_grad():
        logits = model(input_tensor)
        probs = torch.softmax(logits, dim=1)
        prediction = torch.argmax(probs, dim=1).item()
        confidence = probs[0][prediction].item()

    label = "REAL" if prediction == 0 else "FAKE"
    return label, confidence

# Example
label, confidence = detect_deepfake("face.jpg")
print(f"Prediction: {label} (Confidence: {confidence:.2%})")
```

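The decision step inside `detect_deepfake` is a softmax over two logits followed by an argmax. The pure-Python sketch below mirrors that step without PyTorch so the arithmetic is easy to inspect. The logit values are made up for illustration, and `softmax` and `decide` are hypothetical helpers, not part of the released code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decide(logits, labels=("REAL", "FAKE")):
    """Pick the most probable class and report its probability as confidence."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Illustrative logits only: index 0 = real, index 1 = fake.
label, confidence = decide([-1.2, 2.3])
print(f"Prediction: {label} (Confidence: {confidence:.2%})")
```

With two classes, the reported confidence is always at least 50%, since it is the larger of the two softmax probabilities.
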
### Gradio Demo

```python
import gradio as gr

def predict(image):
    label, confidence = detect_deepfake(image)
    return f"{label} ({confidence:.2%})"

demo = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="filepath"),
    outputs="text",
    title="FaceForge Deepfake Detector",
    description="Upload a face image to detect if it's real or AI-generated"
)

demo.launch()
```

## Training Details

### Dataset
- **Source:** FaceForensics++ (c40 compression)
- **Training:** 7,000 images (3,500 real + 3,500 fake)
- **Validation:** 1,500 images (750 real + 750 fake)
- **Test:** 1,500 images (750 real + 750 fake)

### Hyperparameters
```yaml
optimizer: AdamW
learning_rate: 1e-4
weight_decay: 1e-4
batch_size: 32
epochs: 10
lr_schedule: Cosine Annealing (1e-4 → 1e-6)
augmentation:
  - Random Horizontal Flip (p=0.5)
  - Color Jitter (brightness=0.2, contrast=0.2, saturation=0.2)
```

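The cosine annealing entry can be read as the standard closed form `lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T))`. The sketch below evaluates it under the assumption that the rate is updated once per epoch with `T = 10`. The source does not say whether stepping was per epoch or per batch, and `cosine_annealing_lr` is an illustrative name.

```python
import math

def cosine_annealing_lr(epoch, total_epochs=10, lr_max=1e-4, lr_min=1e-6):
    """Closed-form cosine annealing from lr_max (epoch 0) to lr_min (final epoch)."""
    cos_term = 0.5 * (1 + math.cos(math.pi * epoch / total_epochs))
    return lr_min + (lr_max - lr_min) * cos_term

# Assumed per-epoch stepping: evaluate at epochs 0..10 inclusive.
schedule = [cosine_annealing_lr(epoch) for epoch in range(11)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.3e}")
```

Under these assumptions the rate starts at 1e-4, passes the midpoint value of 5.05e-5 at epoch 5, and reaches 1e-6 at epoch 10.
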
### Training Time
- **Total:** 7.52 hours (451.5 minutes)
- **Per Epoch:** ~45 minutes
- **Hardware:** CPU (no GPU required)
- **Best Checkpoint:** Epoch 8 (0.9998 AUC-ROC)

## Limitations

1. **Dataset Scope:** Trained on FaceForensics++ deepfakes; may need fine-tuning for other manipulation methods
2. **Single Frame:** Processes individual images; doesn't leverage temporal information from videos
3. **Compression:** Trained on c40 compression; performance may vary with different quality levels
4. **Domain:** Optimized for face-centric images; may struggle with partial faces or unusual angles

## Ethical Considerations

This model is intended for:
- ✅ Research and education
- ✅ Content moderation
- ✅ Forensic analysis
- ✅ Fact-checking

**Not intended for:**
- ❌ Malicious surveillance
- ❌ Discriminatory profiling
- ❌ Invasion of privacy

## Citation

```bibtex
@techreport{nasir2026faceforge,
  title={FaceForge: A Deep Learning Framework for Facial Manipulation Generation and Detection},
  author={Nasir, Huzaifa},
  institution={National University of Computer and Emerging Sciences},
  year={2026},
  doi={10.5281/zenodo.18530439}
}
```

## Links

- **Paper:** https://doi.org/10.5281/zenodo.18530439
- **Code:** https://github.com/Huzaifanasir95/FaceForge
- **Generator Model:** https://huggingface.co/Huzaifanasir95/faceforge-generator
- **Interactive Demo:** See notebook `03_detector_and_adversarial.ipynb`

## License

This model is released under the CC BY 4.0 license. See the LICENSE file for details.

## Author

**Huzaifa Nasir**
National University of Computer and Emerging Sciences (NUCES)
Islamabad, Pakistan
📧 nasirhuzaifa95@gmail.com

## Acknowledgments

- FaceForensics++ dataset creators
- PyTorch and timm library developers
- XceptionNet architecture (François Chollet)