# Feature Fusion Network
## Model Architecture
- **Type**: Multi-Modal Hybrid (CNN + Transformer)
- **Pathway 1 (Spatial)**: ResNet3D (r3d_18) for robust localized feature extraction.
- **Pathway 2 (Spatiotemporal)**: TimeSformer (Transformer) block that operates on patches across frames to capture long-range dependencies.
- **Fusion**: Late fusion via concatenation of flattened feature vectors (512 features from the CNN + 256 features from the Transformer, 768 in total).
- **Classification Head**: MLP mapping the fused features to two classes (violence / no-violence).
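
The fusion and classification steps above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the class name `LateFusionHead`, the hidden size of 128, and the dropout rate are assumptions, and random tensors stand in for the outputs of the two backbones. Only the 512 and 256 feature sizes and the two output classes come from the description above.

```python
import torch
import torch.nn as nn


class LateFusionHead(nn.Module):
    """Concatenate CNN and Transformer features, then classify with an MLP."""

    def __init__(self, cnn_dim=512, transformer_dim=256, hidden_dim=128, num_classes=2):
        super().__init__()
        # hidden_dim and the dropout rate are illustrative assumptions.
        self.mlp = nn.Sequential(
            nn.Linear(cnn_dim + transformer_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, cnn_feat, transformer_feat):
        # Flatten each pathway's output to (batch, features) before concatenating.
        fused = torch.cat([cnn_feat.flatten(1), transformer_feat.flatten(1)], dim=1)
        return self.mlp(fused)


# Random tensors stand in for the two backbone outputs (batch of 4 clips).
head = LateFusionHead()
cnn_feat = torch.randn(4, 512)
transformer_feat = torch.randn(4, 256)
logits = head(cnn_feat, transformer_feat)
print(logits.shape)  # one pair of class logits per clip
```

Late fusion keeps the two pathways independent until the final step, so either backbone could be swapped or frozen without changing the head, as long as the feature sizes match.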
## Dataset Structure
The code expects a `Dataset` folder in the parent directory, with one subfolder per class:
```
Dataset/
├── violence/
└── no-violence/
```
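
A layout like this can be turned into a list of labeled samples with a few lines of standard-library Python. The sketch below is an assumption about how such a loader might look, not the repository's loader: the helper name `list_samples` and the label assignment (`no-violence` as 0, `violence` as 1) are hypothetical, and it simply collects every file in each class folder without checking the file type.

```python
from pathlib import Path

# Assumed label assignment; the README does not state which class is 0 or 1.
CLASS_TO_LABEL = {"no-violence": 0, "violence": 1}


def list_samples(root):
    """Return (path, label) pairs for every file under the two class folders."""
    root = Path(root)
    samples = []
    for class_name, label in CLASS_TO_LABEL.items():
        class_dir = root / class_name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"Missing class folder: {class_dir}")
        # Sort for a deterministic order across runs and platforms.
        for path in sorted(class_dir.iterdir()):
            if path.is_file():
                samples.append((path, label))
    return samples


if __name__ == "__main__":
    # Assumes the script sits one level below the directory containing Dataset/.
    for path, label in list_samples(Path("..") / "Dataset"):
        print(label, path.name)
```

Failing early on a missing class folder makes a misplaced `Dataset` directory obvious before training starts.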
## How to Run
1. Install dependencies: `torch`, `opencv-python`, `scikit-learn`, `numpy`, `torchvision`.
2. Run `python train.py`.