Feature Fusion Network

Model Architecture

  • Type: Multi-Modal Hybrid (CNN + Transformer)
  • Pathway 1 (Spatial): ResNet3D (r3d_18) for robust local feature extraction.
  • Pathway 2 (Spatiotemporal): TimeSformer-style Transformer block that attends over patches and frames to capture long-range dependencies.
  • Fusion: Late fusion via concatenation of the flattened feature vectors (512 features from the CNN + 256 features from the Transformer, 768 in total).
  • Classification Head: MLP that maps the fused features to two classes (violence / no-violence).
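
The late-fusion step above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the class name FusionHead, the hidden size of 128, and the dropout rate are assumptions, and random tensors stand in for the outputs of the two pathways. Only the 512 and 256 feature sizes and the two output classes come from the description above.

```python
import torch
import torch.nn as nn


class FusionHead(nn.Module):
    """Late fusion: concatenate two feature vectors, then classify with an MLP."""

    def __init__(self, cnn_dim=512, transformer_dim=256, hidden_dim=128, num_classes=2):
        super().__init__()
        # hidden_dim and the dropout rate are assumptions, not taken from the model card.
        self.mlp = nn.Sequential(
            nn.Linear(cnn_dim + transformer_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, cnn_feat, transformer_feat):
        # Flatten each pathway's output to (batch, features) before concatenating.
        fused = torch.cat([cnn_feat.flatten(1), transformer_feat.flatten(1)], dim=1)
        return self.mlp(fused)


head = FusionHead()
cnn_feat = torch.randn(4, 512)          # stand-in for pooled r3d_18 features
transformer_feat = torch.randn(4, 256)  # stand-in for the Transformer pathway output
logits = head(cnn_feat, transformer_feat)  # one row of two class logits per clip
```

Concatenation keeps the two pathways independent until the final MLP, so either backbone could be swapped out as long as its output width is passed to the head.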

Dataset Structure

Expects a Dataset folder in the parent directory, with one subfolder per class.

Dataset/
├── violence/
└── no-violence/
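
One way to read this layout into (path, label) pairs is sketched below. The helper name list_videos, the label mapping (no-violence = 0, violence = 1), and the set of accepted video extensions are assumptions made for illustration; the source only specifies the folder names.

```python
from pathlib import Path

# Assumed label mapping and file extensions; neither is stated in the source.
CLASS_TO_LABEL = {"no-violence": 0, "violence": 1}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def list_videos(dataset_root):
    """Return (path, label) pairs for every video file under the class folders."""
    root = Path(dataset_root)
    samples = []
    for class_name, label in CLASS_TO_LABEL.items():
        class_dir = root / class_name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"Missing class folder: {class_dir}")
        # Sort for a deterministic order within each class.
        for path in sorted(class_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
                samples.append((path, label))
    return samples
```

With the layout above, a call such as list_videos("../Dataset") would collect the clips from both class folders and fail early if either folder is missing.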

How to Run

  1. Install dependencies: torch, opencv-python, scikit-learn, numpy, torchvision.
  2. Run python train.py.
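
Before step 2, it can help to confirm that the dependencies from step 1 are importable. The sketch below is an optional check, not part of the repository; the helper name missing_packages is hypothetical. Note that opencv-python is imported as cv2 and scikit-learn as sklearn.

```python
from importlib.util import find_spec

# pip package name -> importable module name
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "opencv-python": "cv2",
    "scikit-learn": "sklearn",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [pip_name for pip_name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages; try: pip install " + " ".join(missing))
else:
    print("All dependencies found.")
```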