---
license: apache-2.0
base_model:
- keras/xception_41_imagenet
---
# Model Summary

This model is designed for detecting deepfake content in images and video frames. It uses a lightweight Convolutional Neural Network (CNN) trained on the **FaceForensics++ dataset**, focusing on high-resolution face manipulations (c23 compression). The model classifies whether a face in an input image is **real or fake**.

* Architecture: CNN-based binary classifier
* Input: Aligned and cropped face images (224x224 RGB)
* Output: Real or Fake label with confidence
* Accuracy: ~92% on unseen FaceForensics++ test set

## Usage

```python
from keras.models import load_model
import cv2
import numpy as np

model = load_model('deepfake_cnn_model.h5')

def preprocess(img_path):
    img = cv2.imread(img_path)  # note: OpenCV loads images as BGR
    if img is None:
        raise FileNotFoundError(f"Could not read image: {img_path}")
    img = cv2.resize(img, (224, 224))
    img = img.astype(np.float32) / 255.0   # normalize to [0, 1]
    return np.expand_dims(img, axis=0)     # add batch dimension -> (1, 224, 224, 3)

input_img = preprocess('test_face.jpg')
pred = model.predict(input_img)
print("Fake" if pred[0][0] > 0.5 else "Real")
```

**Input shape**: `(1, 224, 224, 3)`
**Output**: Probability of being fake

⚠️ *Fails with very low-resolution images or occluded faces.*

## System

This model is **standalone** and can be used in any face verification system or deepfake detection pipeline. Inputs should be properly aligned face crops; the output can be integrated into moderation systems or alerts.
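As a sketch of how the output probability could drive a moderation pipeline, the helper below maps a fake score to an action. The thresholds and action names are hypothetical, not part of the model; they should be tuned on a validation set.

```python
# Hypothetical moderation thresholds -- tune on a validation set.
REVIEW_THRESHOLD = 0.5   # flag for human review
BLOCK_THRESHOLD = 0.9    # auto-action only at high confidence

def moderation_action(fake_prob: float) -> str:
    """Map the model's fake probability to a moderation action."""
    if fake_prob >= BLOCK_THRESHOLD:
        return "block"
    if fake_prob >= REVIEW_THRESHOLD:
        return "review"
    return "allow"

print(moderation_action(0.95))  # block
print(moderation_action(0.60))  # review
print(moderation_action(0.10))  # allow
```

Keeping a human-review band between the two thresholds reflects the card's guidance that outputs should not trigger real-world judgments without oversight.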

**Dependencies**: Keras/TensorFlow, OpenCV for preprocessing

## Implementation requirements

* Trained on Google Colab with a single NVIDIA T4 GPU
* Training time: ~6 hours over 30 epochs
* Model inference: <50ms per image
* Memory requirement: ~150MB RAM at inference

# Model Characteristics

## Model initialization

The model was **trained from scratch** using CNN layers, ReLU activations, dropout, and batch normalization.

## Model stats

* Size: ~10MB
* Layers: ~8 convolutional layers + dense head
* Inference latency: ~40ms on GPU, ~200ms on CPU

## Other details

* Not pruned or quantized
* No use of differential privacy during training

# Data Overview

## Training data

* Dataset: FaceForensics++ (c23 compression level)
* Preprocessing: face alignment (using Dlib), resize to 224x224, normalization
* Augmentations: horizontal flip, brightness variation
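The listed augmentations can be sketched with plain NumPy. The 50% flip probability and the brightness range below are assumptions for illustration, not the exact training settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img: np.ndarray) -> np.ndarray:
    """Random horizontal flip and brightness variation on a float image in [0, 1]."""
    if rng.random() < 0.5:          # assumed flip probability
        img = img[:, ::-1, :]       # horizontal flip
    factor = rng.uniform(0.8, 1.2)  # assumed brightness range
    return np.clip(img * factor, 0.0, 1.0)

face = rng.random((224, 224, 3)).astype(np.float32)
aug = augment(face)
print(aug.shape)  # (224, 224, 3)
```

In practice these transforms would run inside the training data loader, after face alignment and resizing.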

## Demographic groups

The dataset contains celebrity faces scraped from YouTube. It includes a mix of ethnicities and genders but is **not balanced or explicitly labeled** by demographic group.

## Evaluation data

* Train/Val/Test: 70% / 15% / 15%
* The test set includes unseen identities and manipulations (Deepfakes, FaceSwap, NeuralTextures)
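A sketch of the 70% / 15% / 15% split; splitting by identity rather than by frame (as implied by "unseen identities" in the test set) avoids leaking the same face across splits. The identity IDs here are stand-ins.

```python
import numpy as np

rng = np.random.default_rng(42)
identities = np.arange(100)   # stand-in for FaceForensics++ identity IDs
rng.shuffle(identities)

n = len(identities)
train_ids = identities[: int(0.70 * n)]
val_ids = identities[int(0.70 * n): int(0.85 * n)]
test_ids = identities[int(0.85 * n):]

print(len(train_ids), len(val_ids), len(test_ids))  # 70 15 15
```

All frames belonging to a given identity then go to that identity's split.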

# Evaluation Results

## Summary

* Accuracy: ~92%
* F1 Score: 0.91
* ROC-AUC: 0.95
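These metrics can be recomputed from raw predictions; a minimal NumPy implementation is sketched below (the label and score arrays are illustrative, not the actual test set).

```python
import numpy as np

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])          # 1 = fake
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.7, 0.6])
y_pred = (y_prob > 0.5).astype(int)

accuracy = (y_pred == y_true).mean()

tp = ((y_pred == 1) & (y_true == 1)).sum()
fp = ((y_pred == 1) & (y_true == 0)).sum()
fn = ((y_pred == 0) & (y_true == 1)).sum()
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# ROC-AUC via the Mann-Whitney statistic: the probability that a random
# fake sample scores higher than a random real one (ties count half).
pos = y_prob[y_true == 1]
neg = y_prob[y_true == 0]
auc = (pos[:, None] > neg[None, :]).mean() + 0.5 * (pos[:, None] == neg[None, :]).mean()

print(accuracy, f1, auc)  # 0.75 0.75 0.875
```

Libraries such as scikit-learn provide the same metrics, but the pairwise formulation above makes the ROC-AUC definition explicit.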

## Subgroup evaluation results

No explicit subgroup evaluation was conducted, but performance dropped slightly on:

* Low-light images
* Images with occlusions (masks, hands)

## Fairness

No explicit fairness metrics were applied due to lack of demographic labels. However, output bias may exist due to uneven representation in training data.

## Usage limitations

* Struggles on low-res or occluded faces
* Doesn’t work on audio-based or voice deepfakes
* Requires good lighting and clear facial visibility
* Not suitable for legal or forensics-grade use cases without further testing

## Ethics

This model is intended for **educational and research purposes only**. It should not be used to make real-world judgments (legal, political, etc.) without human oversight. Deepfake detection systems must be transparent about their limitations and avoid misuse in surveillance or personal targeting.