# Driver Behavior Detection Model (Epoch 2)
A Video Swin Transformer-based model for detecting abnormal driver behavior.
## Model Description
- **Architecture**: Video Swin Transformer Tiny (swin3d_t)
- **Backbone Pretrained**: Kinetics-400
- **Parameters**: 27.85M
- **Input**: [B, 3, 30, 224, 224] (batch, channels, frames, height, width)
## Classes (5)
| Label | Class | F1-Score |
|:-----:|-------|:--------:|
| 0 | Normal | 0.93 |
| 1 | Drowsy Driving | 0.98 |
| 2 | Reaching/Searching | 0.90 |
| 3 | Phone Usage | 0.88 |
| 4 | Driver Assault | 1.00 |
## Performance (Epoch 2)
| Metric | Value |
|--------|-------|
| **Accuracy** | 95.15% |
| **Macro F1** | 0.9392 |
| **Validation Samples** | 1,371,062 |
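
Macro F1 is the unweighted mean of the per-class F1 scores. As a rough consistency check, the sketch below averages the rounded values from the Classes table. The variable and function names are illustrative only, and because the table values are rounded to two decimals, the result can differ from the reported 0.9392 in the third decimal place.

```python
# Per-class F1 scores as listed in the Classes table (rounded to two decimals).
per_class_f1 = {
    "Normal": 0.93,
    "Drowsy Driving": 0.98,
    "Reaching/Searching": 0.90,
    "Phone Usage": 0.88,
    "Driver Assault": 1.00,
}


def macro_f1(scores):
    """Return the unweighted mean of the per-class F1 scores."""
    return sum(scores.values()) / len(scores)


approx_macro_f1 = macro_f1(per_class_f1)
print(f"Macro F1 from rounded per-class scores: {approx_macro_f1:.3f}")
```

Because every class counts equally, the weakest class (Phone Usage, 0.88) lowers Macro F1 as much as any other class would, regardless of how many samples it has.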
## Training Configuration
| Parameter | Value |
|-----------|-------|
| Hardware | 2x NVIDIA RTX A6000 (48GB) |
| Distributed | DDP (DistributedDataParallel) |
| Batch Size | 32 (16 × 2 GPU) |
| Gradient Accumulation | 4 |
| Effective Batch | 128 |
| Optimizer | AdamW (lr=1e-3, wd=0.05) |
| Scheduler | OneCycleLR |
| Mixed Precision | FP16 |
| Loss | CrossEntropy + Label Smoothing (0.1) |
| Regularization | Mixup (α=0.4), Dropout (0.3) |
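
The batch-size rows in the table follow from simple arithmetic, and label smoothing of 0.1 changes the training targets in a predictable way. The sketch below works through both. It assumes the common convention (used by PyTorch's `CrossEntropyLoss(label_smoothing=...)`) of mixing the one-hot target with a uniform distribution over all classes; whether this model was trained with exactly that formulation is not stated here. All names are illustrative.

```python
# Values taken from the Training Configuration table.
per_gpu_batch = 16
num_gpus = 2
accumulation_steps = 4

# Samples processed per forward/backward pass across both GPUs.
batch_per_step = per_gpu_batch * num_gpus
# Samples contributing to each optimizer update.
effective_batch = batch_per_step * accumulation_steps
print(batch_per_step, effective_batch)


def smoothed_target(true_class, num_classes=5, smoothing=0.1):
    """Mix a one-hot target with a uniform distribution over all classes."""
    off_value = smoothing / num_classes
    target = [off_value] * num_classes
    target[true_class] = 1.0 - smoothing + off_value
    return target


# Target distribution for a sample labeled 1 (Drowsy Driving).
print([round(p, 2) for p in smoothed_target(1)])
```

Under this convention the true class receives a target of 0.92 and each other class 0.02, which discourages overconfident predictions.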
## Usage
```python
import torch
from model import DriverBehaviorModel  # requires model.py on the import path

# Load the model and its weights
model = DriverBehaviorModel(num_classes=5, pretrained=False)
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(checkpoint["model"])
model.eval()

# Placeholder input; replace with a real preprocessed clip.
# Shape: [1, 3, 30, 224, 224] (30 RGB frames at 224x224, normalized)
video_tensor = torch.randn(1, 3, 30, 224, 224)

# Inference
with torch.no_grad():
    output = model(video_tensor)
    prediction = output.argmax(dim=1)  # predicted class label (0-4)
```
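
The snippet above yields a class index. To turn the model output into a readable label and a confidence value, something like the following could be used. This is a minimal sketch that assumes the model returns raw logits, one per class; `CLASS_NAMES`, `softmax`, `describe_prediction`, and the example logits are hypothetical and not part of the released code.

```python
import math

# Index-to-name mapping taken from the Classes table.
CLASS_NAMES = [
    "Normal",
    "Drowsy Driving",
    "Reaching/Searching",
    "Phone Usage",
    "Driver Assault",
]


def softmax(logits):
    """Convert raw logits to probabilities (max subtracted for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(logits):
    """Return (label, class name, probability) for the top-scoring class."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, CLASS_NAMES[label], probs[label]


# Made-up logits for a single clip; with the real model these could come
# from output[0].tolist().
example_logits = [0.2, 3.1, -0.5, 0.0, -1.2]
label, name, confidence = describe_prediction(example_logits)
print(label, name, round(confidence, 3))
```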
## Dataset
- **Total Videos**: 243,979
- **Total Samples (windows)**: 1,371,062
- **Window Size**: 30 frames
- **Stride**: 15 frames
- **Resolution**: 224×224
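
With a 30-frame window and a 15-frame stride, consecutive samples overlap by half. The sketch below shows how window start positions could be derived for one video. It assumes that videos shorter than one window and trailing frames that do not fill a window are dropped; the actual preprocessing may pad or handle these cases differently. `window_starts` is a hypothetical helper.

```python
def window_starts(num_frames, window=30, stride=15):
    """Return the start frame index of every full window in a video."""
    if num_frames < window:
        return []
    return list(range(0, num_frames - window + 1, stride))


# A 90-frame video yields overlapping windows starting every 15 frames.
starts = window_starts(90)
print(starts, len(starts))

# Average number of windows per video implied by the dataset counts above.
total_videos = 243_979
total_windows = 1_371_062
windows_per_video = total_windows / total_videos
print(round(windows_per_video, 2))
```

Under these assumptions, a video with `N >= 30` frames contributes `(N - 30) // 15 + 1` samples.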
## Augmentation (Training)
- RandomResizedCrop (scale 0.8-1.0)
- HorizontalFlip (p=0.5)
- ColorJitter, HueSaturationValue
- Temporal Augmentation (speed change, frame drop)
- Mixup (α=0.4)
- CoarseDropout
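
For reference, Mixup in its standard formulation blends two training samples and their labels with a weight drawn from a Beta(α, α) distribution. The sketch below illustrates the idea on plain Python lists with α=0.4. It is not the training code used for this model, and the `mixup` function and its arguments are hypothetical.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Blend two samples and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


# Toy example: two flattened "inputs" with one-hot labels for classes 0 and 1.
rng = random.Random(0)  # seeded only to make the example repeatable
mixed_x, mixed_y, lam = mixup(
    [1.0, 0.0], [1, 0, 0, 0, 0],
    [0.0, 1.0], [0, 1, 0, 0, 0],
    rng=rng,
)
print(lam, mixed_x, mixed_y)
```

With α below 1, the Beta distribution concentrates near 0 and 1, so most mixed samples stay close to one of the two originals.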
## License
This model is for research purposes only.
## Citation
```
@misc{driver-behavior-detection-2026,
title={Driver Behavior Detection using Video Swin Transformer},
author={C-Team},
year={2026}
}
```