DFG - Deepfake Genome Codebase

1. Environment Setup

Create and activate the conda environment:

# Create a new conda environment (Python 3.10 recommended)
conda create -n dfg python=3.10 -y

# Activate the environment
conda activate dfg

# Install dependencies
pip install -r requirements.txt

2. Dataset Configuration

Before training or testing, set the dataset global path to the location of your data.

Open training/dataset/abstract_dataset.py and modify the DATASET_GLOBAL_PATH variable:

# Change this to your actual dataset root path
DATASET_GLOBAL_PATH = "/your/actual/dataset/path/"

This path should point to the root directory containing your deepfake detection datasets (e.g., DeepFakeGenome, deepfake_detecton_dataset, etc.).
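
Before launching a run, it can help to confirm that the configured path resolves. The following is a minimal sketch and not part of the codebase: `check_dataset_root` is a hypothetical helper, and it assumes only that the root is a directory holding one subdirectory per dataset.

```python
from pathlib import Path


def check_dataset_root(root: str) -> list[str]:
    """Return the names of the subdirectories found under the dataset root.

    Hypothetical helper (not part of DFG). Raises FileNotFoundError if the
    root does not exist or is not a directory.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"Dataset root does not exist: {root_path}")
    # Each subdirectory is assumed to hold one dataset.
    return sorted(p.name for p in root_path.iterdir() if p.is_dir())
```

Calling it with the value you assigned to DATASET_GLOBAL_PATH should list the dataset folders you expect to see.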

3. Project and Dataset Structure

DFG/
├── preprocessing/
│   └── dataset_json/          # Dataset index JSON files
│       ├── protocol_2_train.json
│       ├── protocol_2_test.json
│       ├── protocol_3_test.json
│       ├── protocol_4_test.json
│       └── ...
├── training/
│   ├── config/
│   │   └── detector/          # Detector config YAML files
│   ├── detectors/             # Detector implementations
│   │   ├── __init__.py        # Register all detectors here
│   │   ├── base_detector.py
│   │   └── ...
│   ├── networks/              # Backbone network implementations
│   ├── loss/                  # Loss function definitions
│   ├── metrics/               # Evaluation metrics
│   ├── train.py               # Training entry point
│   └── test_pall.py           # Testing entry point
├── train.sh                   # Training script examples
├── test.sh                    # Testing script examples
├── requirements.txt           # Python dependencies
└── README.md

4. Training

Refer to train.sh for all training commands. Example:

python -m torch.distributed.launch --master_port=29503 --nproc_per_node=8 training/train.py \
    --detector_path ./training/config/detector/clip_large_fft.yaml \
    --no-save_feat --ddp

Key arguments:

--master_port: port for distributed training (change if port conflicts occur)
--nproc_per_node: number of processes to launch on this machine (one per GPU)
--detector_path: path to the detector config YAML
--no-save_feat: disable feature saving during training
--ddp: enable DistributedDataParallel
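
If the default port is already taken, one way to pick a replacement is to ask the operating system for a free one. This is a minimal sketch using only the Python standard library; `find_free_port` is a hypothetical helper, not something the codebase provides.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS choose any free port.
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

Pass the printed value to --master_port. The port is released as soon as the function returns, so another process could in principle claim it before training starts; rerun the helper if that happens.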

5. Testing

Refer to test.sh for all testing commands. Example:

# Test on protocol 2 & 3
python -m torch.distributed.launch --master_port=29510 --nproc_per_node=8 training/test_pall.py --ddp \
    --test_dataset "protocol_2_test" "protocol_3_test" \
    --detector_path ./training/config/detector/clip_large_fft.yaml \
    --weights_path logs/clip_models/clip_large_fft_2025-11-08-13-56-51

# Test on protocol 4
python -m torch.distributed.launch --master_port=29512 --nproc_per_node=8 training/test_pall.py --ddp \
    --test_dataset "protocol_4_test" \
    --detector_path ./training/config/detector/clip_large_fft.yaml \
    --weights_path logs/clip_models/clip_large_fft_2025-11-08-13-56-51 \
    --test_config test_config_p4.yaml

Key arguments:

--test_dataset: one or more dataset names (must match JSON filenames under preprocessing/dataset_json/)
--weights_path: path to trained model checkpoint directory
--test_config: additional test configuration (required for protocol 4)
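
Because --test_dataset names must match the JSON filenames, a quick check before a long test run can catch typos. The sketch below is illustrative only: both functions are hypothetical helpers, and they assume a dataset name is the JSON filename without its .json extension.

```python
from pathlib import Path

DEFAULT_JSON_DIR = "preprocessing/dataset_json"


def available_datasets(json_dir: str = DEFAULT_JSON_DIR) -> list[str]:
    """List the names that have an index file under json_dir."""
    return sorted(p.stem for p in Path(json_dir).glob("*.json"))


def missing_datasets(requested: list[str], json_dir: str = DEFAULT_JSON_DIR) -> list[str]:
    """Return the requested names that have no matching JSON file."""
    known = set(available_datasets(json_dir))
    return [name for name in requested if name not in known]
```

Run from the repository root, `missing_datasets(["protocol_2_test", "protocol_3_test"])` should return an empty list when both index files are present.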

6. Adding a Custom Detector

To integrate your own detector into the framework, follow these three steps:

Step 1: Create the detector config YAML

Create a new file under training/config/detector/, e.g., my_detector.yaml:

# log dir
log_dir: logs/my_detector

# model setting
pretrained: null
model_name: my_detector
backbone_name: resnet34

# backbone setting
backbone_config:
  mode: original
  num_classes: 2
  inc: 3
  dropout: false

# dataset
all_dataset: [FaceForensics++, FF-F2F, FF-DF, FF-FS, FF-NT, FaceShifter, DeepFakeDetection, Celeb-DF-v1, Celeb-DF-v2, DFDCP, DFDC, DeeperForensics-1.0, UADFV]
train_dataset: [protocol_2_train]
test_dataset: [protocol_2_test]

compression: c23
train_batchSize: 64
test_batchSize: 64
workers: 8
frame_num: {'train': 16, 'test': 16}
resolution: 224
with_mask: false
with_landmark: false

# data augmentation
use_data_augmentation: false
data_aug:
  flip_prob: 0.5
  rotate_prob: 0.5
  rotate_limit: [-10, 10]
  blur_prob: 0.5
  blur_limit: [3, 7]
  brightness_prob: 0.5
  brightness_limit: [-0.1, 0.1]
  contrast_limit: [-0.1, 0.1]
  quality_lower: 40
  quality_upper: 100

# mean and std for normalization
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]

# optimizer config
optimizer:
  type: adam
  adam:
    lr: 0.0002
    beta1: 0.9
    beta2: 0.999
    eps: 0.00000001
    weight_decay: 0.0005
    amsgrad: false

# training config
lr_scheduler: null
nEpochs: 20
start_epoch: 0
save_epoch: 1
rec_iter: 100
logdir: ./logs
manualSeed: 1024
save_ckpt: true
save_feat: true

# loss function
loss_func: cross_entropy
losstype: null

# metric
metric_scoring: auc

# cuda
ngpu: 1
cuda: true
cudnn: true

save_avg: true
save_latest_ckpt: true
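
A config with a missing key tends to fail only once training has started, so a small pre-flight check can save time. The sketch below is an assumption-laden illustration, not the codebase's own validation: `check_detector_config` is a hypothetical helper, and the keys in `REQUIRED_KEYS` are chosen here because the example detector in Step 2 or the commands above appear to rely on them.

```python
# Keys assumed to be needed; adjust this tuple to match your detector.
REQUIRED_KEYS = ("model_name", "backbone_name", "train_dataset", "test_dataset", "loss_func")


def check_detector_config(config: dict) -> list[str]:
    """Return a list of problems found in an already-loaded detector config."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    optimizer = config.get("optimizer") or {}
    opt_type = optimizer.get("type")
    # In the example config, hyperparameters sit under a key named after the type.
    if opt_type is not None and opt_type not in optimizer:
        problems.append(f"optimizer.type is '{opt_type}' but no '{opt_type}' block exists")
    return problems


if __name__ == "__main__":
    example = {
        "model_name": "my_detector",
        "backbone_name": "resnet34",
        "train_dataset": ["protocol_2_train"],
        "test_dataset": ["protocol_2_test"],
        "loss_func": "cross_entropy",
        "optimizer": {"type": "adam", "adam": {"lr": 0.0002}},
    }
    print(check_detector_config(example))
```

To check a real file, load it first with a YAML parser (for example PyYAML's `yaml.safe_load`) and pass the resulting dictionary to the function.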

Step 2: Create the detector Python file

Create training/detectors/my_detector.py:

import torch
import torch.nn as nn

from metrics.base_metrics_class import calculate_metrics_for_train
from .base_detector import AbstractDetector
from detectors import DETECTOR
from networks import BACKBONE
from loss import LOSSFUNC


@DETECTOR.register_module(module_name='my_detector')
class MyDetector(AbstractDetector):
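    def __init__(self, config):
        super().__init__()
        self.config = config
        self.backbone = self.build_backbone(config)
        # NOTE: classifier() below calls self.fc, which this template never defines.
        # Define it here to match your backbone's output feature size, e.g.
        # self.fc = nn.Linear(<feature_dim>, config['backbone_config']['num_classes'])
        self.loss_func = LOSSFUNC[config['loss_func']]()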

    def build_backbone(self, config):
        backbone = BACKBONE[config['backbone_name']](config['backbone_config'])
        return backbone

    def features(self, data_dict: dict) -> torch.Tensor:
        return self.backbone(data_dict['image'])

    def classifier(self, features: torch.Tensor) -> torch.Tensor:
        return self.fc(features)

    def get_losses(self, data_dict: dict, pred_dict: dict) -> dict:
        label = data_dict['label']
        pred = pred_dict['cls']
        loss = self.loss_func(pred, label)
        return {'overall': loss}

    def get_train_metrics(self, data_dict: dict, pred_dict: dict) -> dict:
        label = data_dict['label']
        pred = pred_dict['cls']
        auc, eer, acc, ap = calculate_metrics_for_train(label.detach(), pred.detach())
        return {'acc': acc, 'auc': auc, 'eer': eer, 'ap': ap}

    def forward(self, data_dict: dict, inference=False) -> dict:
        features = self.features(data_dict)
        pred = self.classifier(features)
        prob = torch.softmax(pred, dim=1)[:, 1]
        pred_dict = {'cls': pred, 'prob': prob, 'feat': features}
        return pred_dict

Step 3: Register the detector in __init__.py

Add the following import line to training/detectors/__init__.py:

from .my_detector import MyDetector
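
The import matters because a decorator such as @DETECTOR.register_module only runs when the module that contains it is imported. The sketch below illustrates the general registry pattern; the `Registry` class is a simplified stand-in written for this example, and the codebase's actual DETECTOR implementation may differ.

```python
class Registry:
    """Maps string names to classes; a stand-in for objects like DETECTOR."""

    def __init__(self):
        self._classes = {}

    def register_module(self, module_name=None):
        def decorator(cls):
            # Runs once, when the module defining cls is imported.
            self._classes[module_name or cls.__name__] = cls
            return cls
        return decorator

    def __getitem__(self, name):
        return self._classes[name]


DETECTOR = Registry()


@DETECTOR.register_module(module_name="my_detector")
class MyDetector:
    pass


# A lookup by name now succeeds because the decorator has already run.
assert DETECTOR["my_detector"] is MyDetector
```

Under this pattern, a detector file that is never imported is never registered, and a lookup by its name fails. The example config sets model_name to the same string passed as module_name, which suggests the framework looks detectors up this way.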

That's it! Now you can train and test with your custom detector:

# Train
python -m torch.distributed.launch --master_port=29503 --nproc_per_node=8 training/train.py \
    --detector_path ./training/config/detector/my_detector.yaml \
    --no-save_feat --ddp

# Test
python -m torch.distributed.launch --master_port=29510 --nproc_per_node=8 training/test_pall.py --ddp \
    --test_dataset "protocol_2_test" "protocol_3_test" \
    --detector_path ./training/config/detector/my_detector.yaml \
    --weights_path logs/my_detector/<your_checkpoint_folder>
