# Feature Fusion Network
## Model Architecture
- **Type**: Multi-Modal Hybrid (CNN + Transformer)
- **Pathway 1 (Spatial)**: ResNet3D (`r3d_18`), a 3D-convolutional ResNet-18, for robust localized feature extraction.
- **Pathway 2 (Spatiotemporal)**: A TimeSformer (Transformer) block that attends over spatial patches and frames to capture long-range dependencies.
- **Fusion**: Late fusion via concatenation of flattened feature vectors (512 features from CNN + 256 features from Transformer).
- **Classification Head**: MLP mapping fused features to binary classes.
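
The fusion and classification steps listed above can be sketched as follows. This is a minimal illustration, not the code in `train.py`: the class name `FusionHead`, the hidden width of 128, and the dropout rate are assumptions, and random tensors stand in for the outputs of the two pathways. Only the 512 and 256 feature sizes and the two output classes come from the description above.

```python
import torch
import torch.nn as nn


class FusionHead(nn.Module):
    """Late fusion: concatenate both feature vectors, then classify with an MLP."""

    def __init__(self, cnn_dim=512, transformer_dim=256, hidden_dim=128, num_classes=2):
        super().__init__()
        # hidden_dim and the dropout rate are illustrative assumptions.
        self.classifier = nn.Sequential(
            nn.Linear(cnn_dim + transformer_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, cnn_feats, transformer_feats):
        # Flatten each pathway's output to (batch, features) before concatenating.
        fused = torch.cat(
            [cnn_feats.flatten(1), transformer_feats.flatten(1)], dim=1
        )
        return self.classifier(fused)


# Stand-in features for a batch of 4 clips (random values, not real model outputs).
cnn_feats = torch.randn(4, 512)
transformer_feats = torch.randn(4, 256)

head = FusionHead().eval()
with torch.no_grad():
    logits = head(cnn_feats, transformer_feats)

print(tuple(logits.shape))  # (4, 2): one logit per class for each clip
```

Because the two pathways are only joined at the feature-vector level, either backbone could be swapped out without changing the head, as long as the feature sizes passed to `FusionHead` match.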
## Dataset Structure
The training script expects a `Dataset` folder in the parent directory, with one subfolder per class:
```
Dataset/
├── violence/
└── no-violence/
```
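
A folder layout like this can be turned into `(path, label)` pairs with a short helper. The sketch below is an assumption about how loading might work, not the loader used by `train.py`: the function name `list_clips`, the label ids, and the accepted file extensions are all hypothetical.

```python
from pathlib import Path

# Label ids and accepted extensions are assumptions for illustration.
CLASS_TO_LABEL = {"no-violence": 0, "violence": 1}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def list_clips(dataset_root):
    """Return a sorted list of (video_path, label) pairs, one per clip."""
    samples = []
    for class_name, label in CLASS_TO_LABEL.items():
        class_dir = Path(dataset_root) / class_name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"Missing class folder: {class_dir}")
        for path in sorted(class_dir.iterdir()):
            # Skip anything that does not look like a video file.
            if path.suffix.lower() in VIDEO_EXTENSIONS:
                samples.append((path, label))
    return samples


# Example call, assuming the script sits one level below the Dataset folder:
# samples = list_clips("../Dataset")
```

Raising on a missing class folder makes a misplaced `Dataset` directory fail early instead of silently training on one class.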
## How to Run
1. Install dependencies: `torch`, `opencv-python`, `scikit-learn`, `numpy`, `torchvision`.
2. Run `python train.py`.
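
Before training, it can help to confirm that every dependency is importable. The check below is a small sketch; the helper name `missing_packages` is hypothetical. Note that two packages are imported under a different name than the one used to install them: `opencv-python` as `cv2` and `scikit-learn` as `sklearn`.

```python
from importlib.util import find_spec

# Maps the install name to the module name used in `import` statements.
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "opencv-python": "cv2",
    "scikit-learn": "sklearn",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the install names of packages whose module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


print("Missing packages:", missing_packages() or "none")
```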