---
language:
- ko
license: apache-2.0
tags:
- video-classification
- driver-behavior-detection
- swin-transformer
- video-swin
- pytorch
datasets:
- custom
metrics:
- accuracy
- f1
pipeline_tag: video-classification
model-index:
- name: driver-behavior-swin3d-t
  results:
  - task:
      type: video-classification
      name: Video Classification
    metrics:
    - type: accuracy
      value: 0.9805
      name: Accuracy
    - type: f1
      value: 0.9757
      name: Macro F1
---

# Driver Behavior Detection Model (Epoch 7)

A Video Swin Transformer-based model for detecting abnormal driver behavior.

## Model Description

- **Architecture**: Video Swin Transformer Tiny (swin3d_t)
- **Backbone Pretrained**: Kinetics-400
- **Parameters**: 27.85M
- **Input**: `[B, 3, 30, 224, 224]` (batch, channels, frames, height, width)

## Classes (5)

| Label | Class | F1-Score |
|:-----:|-------|:--------:|
| 0 | 정상 (Normal) | 0.97 |
| 1 | 졸음운전 (Drowsy Driving) | 0.99 |
| 2 | 물건찾기 (Reaching/Searching) | 0.96 |
| 3 | 휴대폰 사용 (Phone Usage) | 0.96 |
| 4 | 운전자 폭행 (Driver Assault) | 1.00 |

## Performance (Epoch 7)

| Metric | Value |
|--------|-------|
| **Accuracy** | 98.05% |
| **Macro F1** | 0.9757 |
| **Validation Samples** | 1,371,062 |

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Hardware | 2x NVIDIA RTX A6000 (48GB) |
| Distributed | DDP (DistributedDataParallel) |
| Batch Size | 32 (16 x 2 GPUs) |
| Gradient Accumulation | 4 |
| Effective Batch | 128 |
| Optimizer | AdamW (lr=1e-3, wd=0.05) |
| Scheduler | OneCycleLR |
| Mixed Precision | FP16 |
| Loss | CrossEntropy + Label Smoothing (0.1) |
| Regularization | Mixup (α=0.4), Dropout (0.3) |

## Files

| File | Size | Description |
|------|:----:|-------------|
| `pytorch_model.bin` | 121 MB | PyTorch weights (FP32) |
| `model.onnx` | 164 MB | ONNX model for mobile deployment |
| `config.json` | 1.2 KB | Model configuration |
| `model.py` | 6.9 KB | Model architecture code |
| `convert_coreml_macos.py` | 2.2 KB | CoreML conversion script (macOS) |

## Platform-specific Usage

### PyTorch (Server/Desktop)

```python
import torch

from model import DriverBehaviorModel

model = DriverBehaviorModel(num_classes=5, pretrained=False)
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(checkpoint["model"])
model.eval()
```

### iOS (CoreML)

1. Copy `model.onnx` to macOS
2. Run the conversion script:
   ```bash
   python convert_coreml_macos.py
   ```
3. Add the generated `DriverBehavior.mlpackage` to your Xcode project

### Android (ONNX Runtime)

```kotlin
// build.gradle
implementation 'com.microsoft.onnxruntime:onnxruntime-android:1.16.0'
```

```kotlin
val session = OrtEnvironment.getEnvironment()
    .createSession(assetManager.open("model.onnx").readBytes())
val output = session.run(mapOf("video_input" to inputTensor))
```

## Preprocessing (All Platforms)

```
Input Shape:   [1, 3, 30, 224, 224] (batch, channels, frames, height, width)
Channel Order: RGB
Normalization: (pixel / 255.0 - mean) / std
  - mean = [0.485, 0.456, 0.406]
  - std  = [0.229, 0.224, 0.225]
Resize:        224x224 (bilinear)
Frames:        30 frames uniformly sampled
```

## Dataset

- **Total Videos**: 243,979
- **Total Samples (windows)**: 1,371,062
- **Window Size**: 30 frames
- **Stride**: 15 frames
- **Resolution**: 224x224

## Training Progress

| Epoch | Accuracy | Macro F1 |
|:-----:|:--------:|:--------:|
| 5 | 97.35% | 0.9666 |
| 6 | 97.74% | 0.9720 |
| **7** | **98.05%** | **0.9757** |

## License

This model is for research purposes only.

## Citation

```
@misc{driver-behavior-detection-2026,
  title={Driver Behavior Detection using Video Swin Transformer},
  author={C-Team},
  year={2026}
}
```
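## Appendix: Preprocessing Sketch (PyTorch)

The preprocessing spec above (uniform 30-frame sampling, RGB, bilinear resize to 224x224, ImageNet normalization) can be sketched in PyTorch. This is a minimal illustration, not code shipped in the repository; the helper names `sample_indices` and `preprocess` are assumptions.

```python
import numpy as np
import torch
import torch.nn.functional as F

# ImageNet statistics from the spec above, shaped for [C, T, H, W] broadcasting
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1, 1)


def sample_indices(total_frames: int, num_frames: int = 30) -> np.ndarray:
    """Uniformly sample num_frames indices across the whole clip."""
    return np.linspace(0, total_frames - 1, num_frames).astype(int)


def preprocess(frames: np.ndarray) -> torch.Tensor:
    """Convert decoded RGB frames [T, H, W, 3] uint8 to a model input [1, 3, 30, 224, 224]."""
    idx = sample_indices(len(frames))
    clip = torch.from_numpy(frames[idx]).float() / 255.0  # [30, H, W, 3]
    clip = clip.permute(0, 3, 1, 2)                       # [30, 3, H, W]
    clip = F.interpolate(clip, size=(224, 224),
                         mode="bilinear", align_corners=False)
    clip = clip.permute(1, 0, 2, 3)                       # [3, 30, 224, 224]
    clip = (clip - MEAN) / STD                            # ImageNet normalization
    return clip.unsqueeze(0)                              # [1, 3, 30, 224, 224]
```

Frame decoding (e.g. via OpenCV or PyAV, with BGR→RGB conversion if needed) is left to the caller; the sketch starts from an already-decoded RGB frame stack.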
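## Appendix: Postprocessing Sketch (PyTorch)

The model emits 5 logits that map back to the class table above. A hedged sketch of turning them into a label and confidence; the `predict` helper and the English-only label list are illustrative and not part of `model.py`.

```python
import torch

# Indices follow the class table above (English names only, for brevity)
CLASSES = ["Normal", "Drowsy Driving", "Reaching/Searching",
           "Phone Usage", "Driver Assault"]


@torch.no_grad()
def predict(model, clip: torch.Tensor) -> tuple[str, float]:
    """clip: preprocessed [1, 3, 30, 224, 224] tensor -> (label, probability)."""
    logits = model(clip)                     # [1, 5]
    probs = torch.softmax(logits, dim=1)[0]  # [5], sums to 1
    idx = int(probs.argmax())
    return CLASSES[idx], float(probs[idx])
```

For streaming inference, the same helper can be called on each 30-frame window (stride 15, matching the dataset windowing above).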