	---
	license: apache-2.0
	base_model:
	- keras/xception_41_imagenet
	---
	# Model Summary

	This model detects deepfake content in images and video frames. It uses a lightweight Convolutional Neural Network (CNN) trained on the FaceForensics++ dataset at the c23 (high-quality, lightly compressed) compression level, and classifies the face in an input image as real or fake.

	* Architecture: CNN-based binary classifier
	* Input: Aligned and cropped face images (224x224 RGB)
	* Output: Real or Fake label with confidence
	* Accuracy: \~92% on unseen FaceForensics++ test set

	## Usage

	```python
	from keras.models import load_model
	import cv2
	import numpy as np

	model = load_model('deepfake_cnn_model.h5')

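	def preprocess(img_path):
	    """Load an aligned face crop and shape it as a (1, 224, 224, 3) batch."""
	    img = cv2.imread(img_path)                  # OpenCV loads images as BGR
	    if img is None:
	        raise FileNotFoundError(f"Could not read image: {img_path}")
	    # The card specifies RGB input; skip this step if training used BGR.
	    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
	    img = cv2.resize(img, (224, 224))
	    img = img.astype(np.float32) / 255.0        # scale pixels to [0, 1]
	    return np.expand_dims(img, axis=0)          # add the batch dimension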

	input_img = preprocess('test_face.jpg')
	pred = model.predict(input_img)
	print("Fake" if pred[0][0] > 0.5 else "Real")
	```

	* Input shape: `(1, 224, 224, 3)`
	* Output: Probability of being fake
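
The summary lists a "Real or Fake label with confidence" as the output, while `model.predict` returns a single fake probability. A minimal sketch of one way to turn that probability into a label plus confidence follows. The helper name `label_with_confidence` and the default threshold of 0.5 (taken from the usage snippet) are assumptions for illustration, not part of the released model.

```python
def label_with_confidence(p_fake, threshold=0.5):
    """Map a fake probability in [0, 1] to a (label, confidence) pair.

    Hypothetical helper, not part of the model's API.
    """
    if not 0.0 <= p_fake <= 1.0:
        raise ValueError("p_fake must be a probability in [0, 1]")
    if p_fake > threshold:
        return "Fake", p_fake          # confidence that the face is fake
    return "Real", 1.0 - p_fake        # confidence that the face is real


# Example with a hand-picked probability instead of a real prediction.
label, confidence = label_with_confidence(0.75)
print(f"{label} ({confidence:.0%})")
```

With a real prediction, the call would be `label_with_confidence(float(pred[0][0]))`.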

	⚠️ Predictions are unreliable on very low-resolution images and occluded faces.

	## System

	This model is standalone and can be used in any face verification system or deepfake detection pipeline. Inputs should be properly aligned face crops. The output can feed moderation systems or alerts.

	Dependencies: Keras/TensorFlow, OpenCV for preprocessing
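
For video, the per-frame outputs have to be combined into one decision before they reach a moderation queue. The sketch below averages frame-level fake probabilities and flags the video for human review. The function name `aggregate_frames`, the mean rule, and the 0.5 threshold are illustrative assumptions; this card does not specify an aggregation method.

```python
from statistics import mean


def aggregate_frames(frame_probs, threshold=0.5):
    """Combine per-frame fake probabilities into one video-level result.

    Assumption: a plain mean over frames; other rules (median, top-k mean)
    may suit a given pipeline better.
    """
    if not frame_probs:
        raise ValueError("need at least one frame probability")
    score = mean(frame_probs)
    return {
        "score": score,
        "flag_for_review": score > threshold,  # route to a human, not auto-action
        "frames_scored": len(frame_probs),
    }


# Made-up frame scores, standing in for model.predict outputs.
result = aggregate_frames([0.5, 0.75, 1.0])
print(result)
```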

	## Implementation requirements

	* Trained on Google Colab with a single NVIDIA T4 GPU
	* Training time: \~6 hours over 30 epochs
	* Model inference: <50ms per image on GPU
	* Memory requirement: \~150MB RAM at inference

	# Model Characteristics

	## Model initialization

	The model was trained from scratch using CNN layers, ReLU activations, dropout, and batch normalization.

	## Model stats

	* Size: \~10MB
	* Layers: \~8 convolutional layers + dense head
	* Inference latency: \~40ms on GPU, \~200ms on CPU
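
Latency figures like these depend heavily on hardware and batch size, so it may be worth re-measuring them locally. A small timing sketch is shown here; `measure_latency_ms` is a hypothetical helper, and `predict_fn` can be any zero-argument callable that wraps the model call.

```python
import statistics
import time


def measure_latency_ms(predict_fn, warmup=3, runs=20):
    """Return the median wall-clock latency of predict_fn in milliseconds."""
    for _ in range(warmup):          # warm-up calls are excluded from timing
        predict_fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict_fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Stand-in workload; replace with e.g. lambda: model.predict(input_img).
latency = measure_latency_ms(lambda: sum(range(10_000)))
print(f"median latency: {latency:.3f} ms")
```

The median is used rather than the mean so that a single slow call (for example, a first-run graph compilation that outlasts the warm-up) does not dominate the result.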

	## Other details

	* Not pruned or quantized
	* No use of differential privacy during training

	# Data Overview

	## Training data

	* Dataset: FaceForensics++ (c23 compression level)
	* Preprocessing: face alignment (using Dlib), resize to 224x224, normalization
	* Augmentations: horizontal flip, brightness variation
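
The two augmentations listed above can be sketched with NumPy on an already normalized image. The flip probability of 0.5 and the brightness range of ±20% are assumptions chosen for illustration; the card does not state the values used in training.

```python
import numpy as np


def augment(img, rng, max_brightness_delta=0.2):
    """Randomly flip and brighten/darken one image of shape (H, W, 3) in [0, 1]."""
    if rng.random() < 0.5:
        img = img[:, ::-1, :]                      # horizontal flip (reverse width axis)
    factor = 1.0 + rng.uniform(-max_brightness_delta, max_brightness_delta)
    return np.clip(img * factor, 0.0, 1.0)         # keep values in the valid range


rng = np.random.default_rng(0)
face = rng.random((224, 224, 3))                   # random stand-in for a face crop
augmented = augment(face, rng)
print(augmented.shape, float(augmented.min()), float(augmented.max()))
```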

	## Demographic groups

	The dataset contains celebrity faces scraped from YouTube. It includes a mix of ethnicities and genders, but it is neither balanced nor explicitly labeled by demographic.

	## Evaluation data

	* Train/Val/Test: 70% / 15% / 15%
	* The test set includes unseen identities and manipulations (Deepfakes, FaceSwap, NeuralTextures)
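
Keeping test identities unseen implies splitting by identity rather than by individual frame, so that no person appears on both sides of the split. One way this could be done is sketched below. The helper `split_by_identity` and the integer identity IDs are hypothetical, and the actual split procedure used for this model is not documented here.

```python
import random


def split_by_identity(identity_ids, train=0.70, val=0.15, seed=0):
    """Split identity IDs into disjoint train/val/test groups (test gets the rest)."""
    ids = sorted(set(identity_ids))
    random.Random(seed).shuffle(ids)               # seeded for reproducibility
    n_train = round(len(ids) * train)
    n_val = round(len(ids) * val)
    return {
        "train": set(ids[:n_train]),
        "val": set(ids[n_train:n_train + n_val]),
        "test": set(ids[n_train + n_val:]),
    }


# 100 made-up identity IDs; frames would then follow their identity's split.
splits = split_by_identity(range(100))
print({name: len(members) for name, members in splits.items()})
```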

	# Evaluation Results

	## Summary

	* Accuracy: \~92%
	* F1 Score: 0.91
	* ROC-AUC: 0.95
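
For readers who want to reproduce these three metrics on their own held-out data, a dependency-free sketch follows. It treats "fake" as the positive class (label 1) and computes ROC-AUC as the fraction of fake/real pairs in which the fake scores higher. The toy labels and scores are invented and unrelated to the results reported above.

```python
def accuracy_f1(y_true, y_pred):
    """Accuracy and F1 for binary labels, with 1 = fake as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return correct / len(y_true), f1


def roc_auc(y_true, scores):
    """ROC-AUC as the chance that a fake outscores a real (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, not the model's actual evaluation data.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [int(s > 0.5) for s in scores]
print(accuracy_f1(y_true, y_pred), roc_auc(y_true, scores))
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sanity check; for large test sets a library implementation such as scikit-learn's `roc_auc_score` is the more practical choice.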

	## Subgroup evaluation results

	No explicit subgroup evaluation was conducted, but performance dropped slightly on:

	* Low-light images
	* Images with occlusions (masks, hands)
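
If test images were tagged by capture condition, the drop on these cases could be quantified instead of observed informally. The sketch below computes accuracy per tag. The tag names and records are made up, and such annotations are an assumption: the card does not say they exist.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy per condition tag.

    Each record is (tag, true_label, predicted_label); the tags themselves
    (e.g. "low_light", "occluded") are hypothetical and would need annotating.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for tag, y_true, y_pred in records:
        totals[tag] += 1
        hits[tag] += int(y_true == y_pred)
    return {tag: hits[tag] / totals[tag] for tag in totals}


# Made-up records for illustration only.
records = [
    ("normal", 1, 1), ("normal", 0, 0),
    ("low_light", 1, 0), ("low_light", 0, 0),
    ("occluded", 1, 0), ("occluded", 1, 1),
]
print(accuracy_by_group(records))
```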

	## Fairness

	No explicit fairness metrics were computed because the dataset lacks demographic labels. Output bias may still exist due to uneven representation in the training data.

	## Usage limitations

	* Struggles on low-resolution or occluded faces
	* Does not detect audio or voice deepfakes (image input only)
	* Requires good lighting and clear facial visibility
	* Not suitable for legal or forensic-grade use cases without further testing
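
Since small crops are a known weak spot, a pipeline can refuse to score them rather than return a misleading label. A minimal pre-check is sketched here; `check_face_crop` is a hypothetical helper and `min_side=64` is an assumed cutoff that has not been validated for this model.

```python
def check_face_crop(height, width, min_side=64):
    """Return (ok, reason) for a face crop before it is resized and scored.

    min_side=64 is an assumed cutoff, not a value validated for this model.
    """
    if height <= 0 or width <= 0:
        return False, "empty crop"
    if min(height, width) < min_side:
        return False, f"crop too small ({width}x{height}); prediction likely unreliable"
    return True, "ok"


print(check_face_crop(300, 280))
print(check_face_crop(40, 36))
```

With OpenCV, the dimensions of a crop can be read from `img.shape[:2]` before the resize to 224x224.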

	## Ethics

	This model is intended for educational and research purposes only. It should not be used to make real-world judgments (legal, political, etc.) without human oversight. Deepfake detection systems must be transparent about their limitations and avoid misuse in surveillance or personal targeting.