# Driver Behavior Detection Model (Epoch 2)

A Video Swin Transformer-based model for detecting abnormal driver behavior.

## Model Description

- **Architecture**: Video Swin Transformer Tiny (swin3d_t)
- **Backbone Pretrained**: Kinetics-400
- **Parameters**: 27.85M
- **Input**: [B, 3, 30, 224, 224] (batch, channels, frames, height, width)

## Classes (5)

| Label | Class | F1-Score |
|:-----:|-------|:--------:|
| 0 | Normal | 0.93 |
| 1 | Drowsy Driving | 0.98 |
| 2 | Reaching/Searching | 0.90 |
| 3 | Phone Usage | 0.88 |
| 4 | Driver Assault | 1.00 |

## Performance (Epoch 2)

| Metric | Value |
|--------|-------|
| **Accuracy** | 95.15% |
| **Macro F1** | 0.9392 |
| **Validation Samples** | 1,371,062 |

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Hardware | 2x NVIDIA RTX A6000 (48GB) |
| Distributed | DDP (DistributedDataParallel) |
| Batch Size | 32 (16 per GPU × 2 GPUs) |
| Gradient Accumulation | 4 steps |
| Effective Batch | 128 (32 × 4) |
| Optimizer | AdamW (lr=1e-3, wd=0.05) |
| Scheduler | OneCycleLR |
| Mixed Precision | FP16 |
| Loss | CrossEntropy + Label Smoothing (0.1) |
| Regularization | Mixup (α=0.4), Dropout (0.3) |

## Usage

```python
import torch
from model import DriverBehaviorModel

# Load the model and its trained weights
model = DriverBehaviorModel(num_classes=5, pretrained=False)
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(checkpoint["model"])
model.eval()

# Input: [1, 3, 30, 224, 224] (30 RGB frames at 224x224, normalized)
# Placeholder tensor for illustration; replace with a real preprocessed clip.
video_tensor = torch.randn(1, 3, 30, 224, 224)

# Inference
with torch.no_grad():
    output = model(video_tensor)       # one score per class
    prediction = output.argmax(dim=1)  # predicted class label (0-4)
```

## Dataset

- **Total Videos**: 243,979
- **Total Samples (windows)**: 1,371,062
- **Window Size**: 30 frames
- **Stride**: 15 frames
- **Resolution**: 224×224

## Augmentation (Training)

- RandomResizedCrop (scale 0.8-1.0)
- HorizontalFlip (p=0.5)
- ColorJitter, HueSaturationValue
- Temporal Augmentation (speed change, frame drop)
- Mixup (α=0.4)
- CoarseDropout

## License

This model is for research purposes only.

## Citation

```
@misc{driver-behavior-detection-2026,
  title={Driver Behavior Detection using Video Swin Transformer},
  author={C-Team},
  year={2026}
}
```
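
## Example: Window Sampling

The dataset is described as 30-frame windows taken with a 15-frame stride, so consecutive windows overlap by half. The sketch below shows one way such windows could be enumerated for a single video. The function name `window_starts` and the decision to drop a trailing partial window are assumptions made for illustration, not the authors' preprocessing code.

```python
from typing import List


def window_starts(num_frames: int, window: int = 30, stride: int = 15) -> List[int]:
    """Return the start index of every full window in a video.

    Assumption: a trailing segment shorter than `window` is dropped
    rather than padded. The original pipeline may handle it differently.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if num_frames < window:
        return []  # video too short for even one full window
    # The last valid start is num_frames - window, inclusive.
    return list(range(0, num_frames - window + 1, stride))


if __name__ == "__main__":
    # A 100-frame video yields windows starting at these frame indices.
    for start in window_starts(100):
        print(f"frames {start}..{start + 29}")
```

Each start index `s` selects frames `s` through `s + 29`, which can then be stacked into the `[3, 30, 224, 224]` layout the model expects.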