Detect deepfakes with state-of-the-art accuracy
A ResNeXt-101 32×8d backbone, initialised from Instagram's weakly supervised pretrained weights and fine-tuned to expose AI-generated and manipulated faces with high confidence.
| ✅ Real Face | 🚨 Deepfake |
|---|---|
| Confidence: 97.3% | Confidence: 99.1% |
Navigation
Overview · Architecture · Quick Start · Examples · Training · Limitations · Cite
Model Overview
Architecture
The backbone uses grouped convolutions with cardinality 32: the 3×3 convolution in each bottleneck block splits into 32 parallel transformation paths whose outputs are then aggregated. This lets the network learn diverse artefact patterns (blending seams, frequency inconsistencies, unnatural textures) simultaneously.
┌───────────────────────────────────────────────────────────────┐
│                       ResNeXt-101 32×8d                       │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│   Input ──▶ STEM                                              │
│             Conv 7×7 → BN → ReLU → MaxPool                    │
│             3 → 64 channels                                   │
│                  │                                            │
│         ┌────────▼────────┐                                   │
│         │     LAYER 1     │  ×3 blocks · ch 256               │
│         └────────┬────────┘                                   │
│         ┌────────▼────────┐                                   │
│         │     LAYER 2     │  ×4 blocks · ch 512               │
│         └────────┬────────┘                                   │
│         ┌────────▼────────┐                                   │
│         │     LAYER 3     │  ×23 blocks · ch 1024 ◀── deepest │
│         └────────┬────────┘                                   │
│         ┌────────▼────────┐                                   │
│         │     LAYER 4     │  ×3 blocks · ch 2048              │
│         └────────┬────────┘                                   │
│           Global Avg Pool                                     │
│                  │                                            │
│        ┌─────────▼─────────┐                                  │
│        │      FC HEAD      │  2048 → num_classes              │
│        └───────────────────┘                                  │
└───────────────────────────────────────────────────────────────┘
Each bottleneck block:
┌──────────────────────────────────────────────────┐
│ 1×1 Conv → 3×3 GroupConv (groups=32)             │
│ → 1×1 Conv + Skip Connection                     │
└──────────────────────────────────────────────────┘
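To make the effect of cardinality concrete, the sketch below counts convolution weights with and without grouping, and tallies the weighted layers shown in the diagram. The helper `conv_weight_count` is a hypothetical name introduced only for illustration, and the Layer 1 width of 256 (32 groups × 8 channels) is inferred from the `32×8d` naming rather than read from the checkpoint.

```python
def conv_weight_count(in_ch: int, out_ch: int, kernel: int, groups: int = 1) -> int:
    """Number of weights in a 2-D convolution (bias excluded).

    With grouping, each output channel only sees in_ch / groups input channels.
    """
    if in_ch % groups or out_ch % groups:
        raise ValueError("channel counts must be divisible by groups")
    return out_ch * (in_ch // groups) * kernel * kernel


# 3x3 convolution inside a Layer 1 block: 32 groups x 8 channels = 256 wide (assumed).
dense = conv_weight_count(256, 256, kernel=3)               # ordinary convolution
grouped = conv_weight_count(256, 256, kernel=3, groups=32)  # cardinality 32

print(f"dense 3x3   : {dense:,} weights")
print(f"grouped 3x3 : {grouped:,} weights ({dense // grouped}x fewer)")

# Where the "101" comes from: 3 convolutions per block, plus the stem conv and the FC head.
blocks_per_layer = [3, 4, 23, 3]
weighted_layers = 3 * sum(blocks_per_layer) + 2
print(f"weighted layers: {weighted_layers}")
```

At equal width, the grouped 3×3 convolution needs 32 times fewer weights than a dense one, which is the budget that grouped designs spend on wider blocks.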
Quick Start
1. Install dependencies
# Clone the repo
git clone https://github.com/accel-reg/deepfake-detection.git
cd deepfake-detection
# Install requirements
pip install -r requirements.txt
What's in requirements.txt?
torch>=1.13
torchvision>=0.14
Pillow
opencv-python
huggingface_hub
2. Load the model
import torch
from model import DeepfakeDetector  # defined in the cloned repo

# ─────────────────────────────────────
# Option A: load a local checkpoint file
# ─────────────────────────────────────
model = DeepfakeDetector()
model.load_state_dict(torch.load("ig.bin", map_location="cpu"))
model.eval()

# ─────────────────────────────────────
# Option B: pull the checkpoint from the Hugging Face Hub
# ─────────────────────────────────────
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="accel69/depfake-detection", filename="ig.bin")
model = DeepfakeDetector()
model.load_state_dict(torch.load(path, map_location="cpu"))
model.eval()
3. Preprocess & predict
from torchvision import transforms
from PIL import Image

# ── Standard ImageNet preprocessing ───────────────────────────
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

# ── Predict ───────────────────────────────────────────────────
img = Image.open("face.jpg").convert("RGB")
x = transform(img).unsqueeze(0)  # → (1, 3, 224, 224)

with torch.no_grad():
    probs = torch.softmax(model(x), dim=1)

pred = probs.argmax(dim=1).item()
label = "🚨 FAKE" if pred == 1 else "✅ REAL"  # class index 1 = fake
confidence = probs[0, pred].item()

print(f" Result : {label}")
print(f" Confidence : {confidence:.2%}")
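Taking the argmax over two classes is the same as flagging an image when its fake probability exceeds 0.5. Where false alarms and misses carry different costs, that cut-off can be made explicit. The sketch below uses plain Python so it runs without the model; `classify` and `fake_threshold` are hypothetical names, and class index 1 is assumed to mean "fake", as in the snippet above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, fake_threshold=0.5):
    """Return (label, fake probability) for one pair of [real, fake] logits."""
    p_fake = softmax(logits)[1]  # index 1 = fake (assumed convention)
    label = "FAKE" if p_fake > fake_threshold else "REAL"
    return label, p_fake


# Illustrative logits only, not produced by the model
logits = [0.0, 2.0]
print(classify(logits))                      # default cut-off of 0.5
print(classify(logits, fake_threshold=0.9))  # stricter cut-off
```

Raising the threshold trades missed fakes for fewer false alarms; a suitable value would have to be chosen on held-out data.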
Inference Examples
Batch inference on a folder
from pathlib import Path

image_dir = Path("frames/")
results = {"real": 0, "fake": 0}

for img_path in sorted(image_dir.glob("*.jpg")):
    img = Image.open(img_path).convert("RGB")
    x = transform(img).unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)
    is_fake = probs.argmax().item() == 1
    confidence = probs.max().item()
    label = "🚨 FAKE" if is_fake else "✅ REAL"
    results["fake" if is_fake else "real"] += 1
    print(f" {img_path.name:<35} {label} ({confidence:.2%})")

print(f"\n Summary: ✅ Real: {results['real']} | 🚨 Fake: {results['fake']}")
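The loop above runs one forward pass per image. On larger folders it is usually faster to group images into mini-batches first. The following is a minimal sketch of the grouping step only; `chunked` is a hypothetical helper and the file names are made up for illustration.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical file names, for illustration only
paths = [f"frame_{i:03d}.jpg" for i in range(5)]
for batch in chunked(paths, size=2):
    print(batch)
```

Each batch of paths can then be loaded, passed through `transform`, and combined with `torch.stack` into a single `(N, 3, 224, 224)` tensor, after which `probs.argmax(dim=1)` yields one prediction per image.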
GPU acceleration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
print(f" Running on : {device}")
print(f" GPU        : {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'N/A'}")

# Move the input to the same device as the model
x = x.to(device)
with torch.no_grad():
    probs = torch.softmax(model(x), dim=1)
Video frame-by-frame analysis
import cv2

cap = cv2.VideoCapture("video.mp4")
fake_frames = 0
total_frames = 0
print(" Analysing video...")

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # OpenCV decodes frames as BGR; convert to RGB before handing them to PIL
    img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    x = transform(img).unsqueeze(0)
    with torch.no_grad():
        pred = model(x).argmax(dim=1).item()
    fake_frames += pred  # pred is 1 for fake, 0 for real
    total_frames += 1
cap.release()

# Guard against unreadable or empty videos before dividing
if total_frames == 0:
    raise SystemExit("No frames could be read from video.mp4")

fake_pct = fake_frames / total_frames
verdict = "LIKELY DEEPFAKE" if fake_pct > 0.5 else "LIKELY REAL"

print("\n ┌" + "─" * 34 + "┐")
print(f" │ Total frames : {total_frames:<18}│")
print(f" │ Fake frames  : {fake_frames:<18}│")
print(f" │ Real frames  : {total_frames - fake_frames:<18}│")
print(f" │ Fake ratio   : {fake_pct:<18.1%}│")
print(f" │ Verdict      : {verdict:<18}│")
print(" └" + "─" * 34 + "┘")
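The majority vote above counts every frame equally, however confident the model was. One alternative, sketched here as an assumption rather than as the repository's method, is to average the per-frame fake probabilities and threshold the mean. `aggregate_video` is a hypothetical helper and the scores are invented for illustration.

```python
from statistics import mean


def aggregate_video(fake_probs, threshold=0.5):
    """Average per-frame fake probabilities into a video-level verdict."""
    if not fake_probs:
        raise ValueError("no frames were scored")
    score = mean(fake_probs)
    verdict = "LIKELY DEEPFAKE" if score > threshold else "LIKELY REAL"
    return verdict, score


# Hypothetical per-frame P(fake) values, for illustration only
frame_scores = [0.91, 0.88, 0.35, 0.97, 0.62]
verdict, score = aggregate_video(frame_scores)
print(f"{verdict} (mean fake probability {score:.1%})")
```

In the video loop, the per-frame value would be `torch.softmax(model(x), dim=1)[0, 1].item()`, again assuming class index 1 means "fake".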
Training Details
Training Pipeline
──────────────────────────────────────────────────────
Backbone        Instagram WSL ResNeXt-101 32×8d
Resolution      224 × 224 RGB
Normalisation   ImageNet mean [0.485 0.456 0.406]
                         std  [0.229 0.224 0.225]
Loss function   Cross-Entropy
Augmentation    Horizontal flip · Colour jitter
                Random crop · Rotation
| Hyperparameter | Value |
|---|---|
| Backbone init | Instagram WSL pretrained (WSL-Images) |
| Input resolution | 224 × 224 |
| Normalisation | ImageNet mean / std |
| Loss | Cross-Entropy |
| Augmentations | Flip, colour jitter, random crop |
Full configs, dataset prep scripts and training logs → GitHub Repository
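For readers unfamiliar with the loss listed above, cross-entropy for a single example is the negative log of the softmax probability assigned to the true class. The sketch below computes it in plain Python for a two-class case; the logits are illustrative, and index 1 is assumed to mean "fake", following the inference snippets.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one example: -log(softmax(logits)[target])."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[target]


print(f"{cross_entropy([0.0, 0.0], target=1):.4f}")   # uninformative logits
print(f"{cross_entropy([-2.0, 3.0], target=1):.4f}")  # confident and correct
print(f"{cross_entropy([-2.0, 3.0], target=0):.4f}")  # confident and wrong
```

The loss is small when the model is confident and right, and grows roughly linearly with the logit gap when it is confident and wrong.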
Limitations
Read before deploying in any production or real-world system.
| Risk | Details |
|---|---|
| Novel forgery methods | May not detect unseen GAN/diffusion techniques |
| Alignment sensitivity | A poor face crop lowers accuracy; run a dedicated face detector first |
| Distribution shift | Different cameras, compression, or lighting may degrade results |
| Demographic bias | Not audited across demographic groups; evaluate independently |
| No temporal context | Frame-level only; no multi-frame consistency modelling |
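On alignment sensitivity: whichever face detector is used, its bounding box is typically padded and clamped to the image before cropping, so the crop keeps some context around the face. The sketch below shows only that geometry step. `expand_box` is a hypothetical helper, the 30% margin is an arbitrary assumption, and the box coordinates are invented; the margin used during training is not stated in this card.

```python
def expand_box(x, y, w, h, img_w, img_h, margin=0.3):
    """Grow a face box by `margin` on each side and clamp it to the image.

    Returns (left, top, right, bottom), the order PIL's Image.crop expects.
    """
    dx, dy = int(w * margin), int(h * margin)
    left = max(0, x - dx)
    top = max(0, y - dy)
    right = min(img_w, x + w + dx)
    bottom = min(img_h, y + h + dy)
    return left, top, right, bottom


# Hypothetical detector output: a 100x120 face at (50, 40) in a 640x480 frame
box = expand_box(50, 40, 100, 120, img_w=640, img_h=480)
print(box)
```

The resulting tuple can be passed to `img.crop(box)` before applying `transform`.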
Intended Use
License & Citation
Released under the MIT License: free to use, modify, and distribute with attribution.
@misc{ig-deepfake-detection-2025,
author = {accel69},
title = {ig.bin: Deepfake Face Detection with ResNeXt-101 32x8d},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/accel69/depfake-detection}
}