---
license: apache-2.0
base_model:
- keras/xception_41_imagenet
---
# Model Summary
This model is designed for detecting deepfake content in images and video frames. It uses a lightweight Convolutional Neural Network (CNN) trained on the **FaceForensics++ dataset**, focusing on high-resolution face manipulations (c23 compression). The model classifies whether a face in an input image is **real or fake**.
* Architecture: CNN-based binary classifier
* Input: Aligned and cropped face images (224x224 RGB)
* Output: Real or Fake label with confidence
* Accuracy: \~92% on unseen FaceForensics++ test set
## Usage
```python
from keras.models import load_model
import cv2
import numpy as np

model = load_model('deepfake_cnn_model.h5')

def preprocess(img_path):
    """Load an aligned face crop and prepare it for the model."""
    img = cv2.imread(img_path)
    # Note: cv2.imread returns BGR; convert with cv2.cvtColor(img,
    # cv2.COLOR_BGR2RGB) if your pipeline expects RGB input.
    img = cv2.resize(img, (224, 224))
    img = img.astype(np.float32) / 255.0   # scale pixels to [0, 1]
    return np.expand_dims(img, axis=0)     # add batch dimension

input_img = preprocess('test_face.jpg')
pred = model.predict(input_img)
print("Fake" if pred[0][0] > 0.5 else "Real")
```
**Input shape**: `(1, 224, 224, 3)`
**Output**: Probability of being fake
⚠️ *Fails with very low-resolution images or occluded faces.*
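The scalar output can also be mapped to a label with an explicit confidence value. A minimal helper (hypothetical name; the 0.5 threshold matches the snippet above):

```python
def to_label(p_fake, threshold=0.5):
    """Map the model's fake-probability to a (label, confidence) pair."""
    label = "Fake" if p_fake > threshold else "Real"
    confidence = p_fake if label == "Fake" else 1.0 - p_fake
    return label, confidence
```

Reporting the confidence alongside the label makes it easier for downstream moderation systems to apply their own thresholds.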
## System
This model is **standalone** and can be used in any face verification system or deepfake detection pipeline. Inputs should be properly aligned face crops; the output can be fed into moderation systems or alerting.
**Dependencies**: Keras/TensorFlow, OpenCV for preprocessing
## Implementation requirements
* Trained on Google Colab with a single NVIDIA T4 GPU
* Training time: \~6 hours over 30 epochs
* Model inference: <50ms per image
* Memory requirement: \~150MB RAM at inference
# Model Characteristics
## Model initialization
The model was **trained from scratch** using CNN layers, ReLU activations, dropout, and batch normalization.
## Model stats
* Size: \~10MB
* Layers: \~8 convolutional layers + dense head
* Inference latency: \~40ms on GPU, \~200ms on CPU
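Latency figures like these can be reproduced with a simple timing harness (the helper below is illustrative; pass the model's `predict` and a fixed input batch):

```python
import time

def mean_latency_ms(predict_fn, batch, warmup=3, runs=20):
    """Rough per-call latency of predict_fn on a fixed batch, in ms."""
    for _ in range(warmup):      # warm-up calls exclude one-time setup cost
        predict_fn(batch)
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(batch)
    return (time.perf_counter() - start) / runs * 1000.0
```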
## Other details
* Not pruned or quantized
* No use of differential privacy during training
# Data Overview
## Training data
* Dataset: FaceForensics++ (c23 compression level)
* Preprocessing: face alignment (using Dlib), resize to 224x224, normalization
* Augmentations: horizontal flip, brightness variation
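The two augmentations listed above can be expressed directly on normalized arrays; a sketch (assuming float images in `[0, 1]`, not the exact training pipeline):

```python
import numpy as np

def augment(img, flip=True, brightness=1.2):
    """Apply horizontal flip and brightness scaling to an HxWx3 image."""
    out = img[:, ::-1, :] if flip else img          # horizontal flip
    out = np.clip(out * brightness, 0.0, 1.0)       # brightness variation
    return out
```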
## Demographic groups
The dataset contains celebrity faces scraped from YouTube. It includes a mix of ethnicities and genders, but is **neither balanced nor explicitly labeled** by demographic group.
## Evaluation data
* Train/Val/Test: 70% / 15% / 15%
* The test set includes unseen identities and manipulations (Deepfakes, FaceSwap, NeuralTextures)
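A 70/15/15 split over a shuffled list (e.g. of identities, so that test identities stay unseen) might look like the following hypothetical helper; the original split procedure may differ:

```python
import random

def split_70_15_15(items, seed=42):
    """Shuffle items deterministically and split 70% / 15% / 15%."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = round(0.70 * n)
    n_val = round(0.15 * n)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

Splitting by identity rather than by frame avoids leaking the same face into both train and test sets.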
# Evaluation Results
## Summary
* Accuracy: \~92%
* F1 Score: 0.91
* ROC-AUC: 0.95
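Accuracy and F1 can be recomputed from raw predictions in pure Python (ROC-AUC typically needs a library such as scikit-learn's `roc_auc_score`); a sketch, assuming `1 = fake`, `0 = real`:

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, f1) for binary labels, 1 = fake, 0 = real."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1
```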
## Subgroup evaluation results
No explicit subgroup evaluation was conducted, but performance dropped slightly on:
* Low-light images
* Images with occlusions (masks, hands)
## Fairness
No explicit fairness metrics were applied due to lack of demographic labels. However, output bias may exist due to uneven representation in training data.
## Usage limitations
* Struggles on low-res or occluded faces
* Doesn’t work on audio-based or voice deepfakes
* Requires good lighting and clear facial visibility
* Not suitable for legal or forensics-grade use cases without further testing
## Ethics
This model is intended for **educational and research purposes only**. It should not be used to make real-world judgments (legal, political, etc.) without human oversight. Deepfake detection systems must be transparent about their limitations and avoid misuse in surveillance or personal targeting.