DFG - Deepfake Genome Codebase

1. Environment Setup

Create and activate the conda environment:

# Create a new conda environment (Python 3.10 recommended)
conda create -n dfg python=3.10 -y

# Activate the environment
conda activate dfg

# Install dependencies
pip install -r requirements.txt

2. Dataset Configuration

Before training or testing, set the dataset root path to the location of your data.

Open training/dataset/abstract_dataset.py and modify the DATASET_GLOBAL_PATH variable:

# Change this to your actual dataset root path
DATASET_GLOBAL_PATH = "/your/actual/dataset/path/"

This path should point to the root directory containing your deepfake detection datasets (e.g., DeepFakeGenome, deepfake_detecton_dataset).
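A quick check before launching a run can catch a wrong path early. The sketch below is a hypothetical helper (`check_dataset_root` is not part of the codebase) and assumes only that the root is a directory with one subdirectory per dataset:

```python
from pathlib import Path


def check_dataset_root(root: str) -> list[str]:
    """Return the sorted names of the subdirectories under the dataset root.

    Hypothetical helper, not part of the codebase. Raises FileNotFoundError
    if the root does not exist or is not a directory.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"Dataset root not found: {root_path}")
    # Keep directories only; loose files in the root are ignored.
    return sorted(p.name for p in root_path.iterdir() if p.is_dir())
```

Calling `check_dataset_root(DATASET_GLOBAL_PATH)` should list the dataset folders you expect; an exception or an empty list suggests the path needs another look.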

3. Project and Dataset Structure

DFG/
├── preprocessing/
│   └── dataset_json/          # Dataset index JSON files
│       ├── protocol_2_train.json
│       ├── protocol_2_test.json
│       ├── protocol_3_test.json
│       ├── protocol_4_test.json
│       └── ...
├── training/
│   ├── config/
│   │   └── detector/          # Detector config YAML files
│   ├── detectors/             # Detector implementations
│   │   ├── __init__.py        # Register all detectors here
│   │   ├── base_detector.py
│   │   └── ...
│   ├── networks/              # Backbone network implementations
│   ├── loss/                  # Loss function definitions
│   ├── metrics/               # Evaluation metrics
│   ├── train.py               # Training entry point
│   └── test_pall.py           # Testing entry point
├── train.sh                   # Training script examples
├── test.sh                    # Testing script examples
├── requirements.txt           # Python dependencies
└── README.md

4. Training

Refer to train.sh for all training commands. Example:

python -m torch.distributed.launch --master_port=29503 --nproc_per_node=8 training/train.py \
    --detector_path ./training/config/detector/clip_large_fft.yaml \
    --no-save_feat --ddp

Key arguments:

  • --master_port: port for distributed training (change if port conflicts occur)
  • --nproc_per_node: number of GPUs
  • --detector_path: path to the detector config YAML
  • --no-save_feat: disable feature saving during training
  • --ddp: enable DistributedDataParallel
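If the default port is already taken, one way to pick a value for `--master_port` is to ask the operating system for a free one. This is a minimal sketch using only the standard library; `find_free_port` is a hypothetical helper, not something the codebase provides:

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS choose any free port.
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]
```

The port is released as soon as the function returns, so another process could still claim it before the launcher starts; treat the result as a good candidate rather than a guarantee.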

5. Testing

Refer to test.sh for all testing commands. Example:

# Test on protocol 2 & 3
python -m torch.distributed.launch --master_port=29510 --nproc_per_node=8 training/test_pall.py --ddp \
    --test_dataset "protocol_2_test" "protocol_3_test" \
    --detector_path ./training/config/detector/clip_large_fft.yaml \
    --weights_path logs/clip_models/clip_large_fft_2025-11-08-13-56-51

# Test on protocol 4
python -m torch.distributed.launch --master_port=29512 --nproc_per_node=8 training/test_pall.py --ddp \
    --test_dataset "protocol_4_test" \
    --detector_path ./training/config/detector/clip_large_fft.yaml \
    --weights_path logs/clip_models/clip_large_fft_2025-11-08-13-56-51 \
    --test_config test_config_p4.yaml

Key arguments:

  • --test_dataset: one or more dataset names (must match JSON filenames under preprocessing/dataset_json/)
  • --weights_path: path to trained model checkpoint directory
  • --test_config: additional test configuration (required for protocol 4)
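Because each `--test_dataset` name must match a JSON file under preprocessing/dataset_json/, a mistyped name can be caught before the GPUs are allocated. The sketch below assumes each dataset is stored as `<name>.json` in that directory; `missing_dataset_jsons` is a hypothetical helper:

```python
from pathlib import Path


def missing_dataset_jsons(
    names: list[str], json_dir: str = "preprocessing/dataset_json"
) -> list[str]:
    """Return the dataset names that have no matching <name>.json file."""
    directory = Path(json_dir)
    return [name for name in names if not (directory / f"{name}.json").is_file()]
```

Run from the repository root, `missing_dataset_jsons(["protocol_2_test", "protocol_3_test"])` should return an empty list when both index files are present.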

6. Adding a Custom Detector

To integrate your own detector into the framework, follow these three steps:

Step 1: Create the detector config YAML

Create a new file under training/config/detector/, e.g., my_detector.yaml:

# log dir
log_dir: logs/my_detector

# model setting
pretrained: null
model_name: my_detector
backbone_name: resnet34

# backbone setting
backbone_config:
  mode: original
  num_classes: 2
  inc: 3
  dropout: false

# dataset
all_dataset: [FaceForensics++, FF-F2F, FF-DF, FF-FS, FF-NT, FaceShifter, DeepFakeDetection, Celeb-DF-v1, Celeb-DF-v2, DFDCP, DFDC, DeeperForensics-1.0, UADFV]
train_dataset: [protocol_2_train]
test_dataset: [protocol_2_test]

compression: c23
train_batchSize: 64
test_batchSize: 64
workers: 8
frame_num: {'train': 16, 'test': 16}
resolution: 224
with_mask: false
with_landmark: false

# data augmentation
use_data_augmentation: false
data_aug:
  flip_prob: 0.5
  rotate_prob: 0.5
  rotate_limit: [-10, 10]
  blur_prob: 0.5
  blur_limit: [3, 7]
  brightness_prob: 0.5
  brightness_limit: [-0.1, 0.1]
  contrast_limit: [-0.1, 0.1]
  quality_lower: 40
  quality_upper: 100

# mean and std for normalization
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]

# optimizer config
optimizer:
  type: adam
  adam:
    lr: 0.0002
    beta1: 0.9
    beta2: 0.999
    eps: 0.00000001
    weight_decay: 0.0005
    amsgrad: false

# training config
lr_scheduler: null
nEpochs: 20
start_epoch: 0
save_epoch: 1
rec_iter: 100
logdir: ./logs
manualSeed: 1024
save_ckpt: true
save_feat: true

# loss function
loss_func: cross_entropy
losstype: null

# metric
metric_scoring: auc

# cuda
ngpu: 1
cuda: true
cudnn: true

save_avg: true
save_latest_ckpt: true
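Before training, it can help to confirm that a new config parses and contains the keys the detector in Step 2 reads. The sketch below is an illustration only: `REQUIRED_KEYS` is an assumed list based on the example above, not the full set the training script may need, and loading relies on PyYAML being installed.

```python
# Assumed minimal key set, taken from the example config above.
REQUIRED_KEYS = (
    "model_name",
    "backbone_name",
    "backbone_config",
    "loss_func",
    "train_dataset",
    "test_dataset",
)


def load_config(path: str) -> dict:
    """Parse a detector config YAML file into a dict."""
    import yaml  # PyYAML; imported here so the check below works without it

    with open(path, "r", encoding="utf-8") as handle:
        return yaml.safe_load(handle)


def missing_config_keys(config: dict) -> list[str]:
    """Return the assumed-required keys that are absent from the config."""
    return [key for key in REQUIRED_KEYS if key not in config]
```

For example, `missing_config_keys(load_config("training/config/detector/my_detector.yaml"))` should return an empty list for the config shown above.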

Step 2: Create the detector Python file

Create training/detectors/my_detector.py:

import torch
import torch.nn as nn

from metrics.base_metrics_class import calculate_metrics_for_train
from .base_detector import AbstractDetector
from detectors import DETECTOR
from networks import BACKBONE
from loss import LOSSFUNC


@DETECTOR.register_module(module_name='my_detector')
class MyDetector(AbstractDetector):
    def __init__(self, config):
        super().__init__()
        self.config = config
        self.backbone = self.build_backbone(config)
        self.loss_func = LOSSFUNC[config['loss_func']]()

    def build_backbone(self, config):
        backbone = BACKBONE[config['backbone_name']](config['backbone_config'])
        return backbone

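    def features(self, data_dict: dict) -> torch.Tensor:
        # Assumption: the registered backbone exposes features() and
        # classifier(); adapt these two calls if yours does not.
        return self.backbone.features(data_dict['image'])

    def classifier(self, features: torch.Tensor) -> torch.Tensor:
        return self.backbone.classifier(features)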

    def get_losses(self, data_dict: dict, pred_dict: dict) -> dict:
        label = data_dict['label']
        pred = pred_dict['cls']
        loss = self.loss_func(pred, label)
        return {'overall': loss}

    def get_train_metrics(self, data_dict: dict, pred_dict: dict) -> dict:
        label = data_dict['label']
        pred = pred_dict['cls']
        auc, eer, acc, ap = calculate_metrics_for_train(label.detach(), pred.detach())
        return {'acc': acc, 'auc': auc, 'eer': eer, 'ap': ap}

    def forward(self, data_dict: dict, inference=False) -> dict:
        features = self.features(data_dict)
        pred = self.classifier(features)
        prob = torch.softmax(pred, dim=1)[:, 1]
        pred_dict = {'cls': pred, 'prob': prob, 'feat': features}
        return pred_dict

Step 3: Register the detector in __init__.py

Add the following import line to training/detectors/__init__.py:

from .my_detector import MyDetector
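The import matters because, judging from the `@DETECTOR.register_module` decorator in Step 2, a detector is only registered when its module is executed, which happens on import. The toy `Registry` below is a hypothetical stand-in written to illustrate that pattern; the codebase's own registry may differ in its details.

```python
class Registry:
    """Toy name-to-class registry; an illustration, not the codebase's code."""

    def __init__(self):
        self._classes = {}

    def register_module(self, module_name: str):
        """Return a decorator that stores the class under module_name."""
        def decorator(cls):
            self._classes[module_name] = cls
            return cls  # hand the class back unchanged
        return decorator

    def __getitem__(self, name: str):
        return self._classes[name]


DETECTOR = Registry()


# Executing this class definition (for example via an import) runs the
# decorator, which adds the class to the registry.
@DETECTOR.register_module(module_name="my_detector")
class MyDetector:
    def __init__(self, config: dict):
        self.config = config


# A lookup by name, as a training script might do with the model_name value.
detector = DETECTOR["my_detector"]({"model_name": "my_detector"})
```

Without the import in `__init__.py`, the decorator would never run and the lookup by name would fail with a `KeyError` in this toy version.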

That's it! Now you can train and test with your custom detector:

# Train
python -m torch.distributed.launch --master_port=29503 --nproc_per_node=8 training/train.py \
    --detector_path ./training/config/detector/my_detector.yaml \
    --no-save_feat --ddp

# Test
python -m torch.distributed.launch --master_port=29510 --nproc_per_node=8 training/test_pall.py --ddp \
    --test_dataset "protocol_2_test" "protocol_3_test" \
    --detector_path ./training/config/detector/my_detector.yaml \
    --weights_path logs/my_detector/<your_checkpoint_folder>