Upload 6 files
- README.md +25 -45
- config.json +7 -7
- model.onnx +2 -2
- pytorch_model.bin +1 -1
README.md
CHANGED
|
@@ -22,14 +22,14 @@ model-index:
|
|
| 22 |
name: Video Classification
|
| 23 |
metrics:
|
| 24 |
- type: accuracy
|
| 25 |
-
value: 0.
|
| 26 |
name: Accuracy
|
| 27 |
- type: f1
|
| 28 |
-
value: 0.
|
| 29 |
name: Macro F1
|
| 30 |
---
|
| 31 |
|
| 32 |
-
# Driver Behavior Detection Model (Epoch
|
| 33 |
|
| 34 |
A Video Swin Transformer-based model for detecting abnormal driver behavior.
|
| 35 |
|
|
@@ -44,18 +44,18 @@ model-index:
|
|
| 44 |
|
| 45 |
| Label | Class | F1-Score |
|
| 46 |
|:-----:|-------|:--------:|
|
| 47 |
-
| 0 | 정상 (Normal) | 0.
|
| 48 |
| 1 | 졸음운전 (Drowsy Driving) | 0.99 |
|
| 49 |
-
| 2 | 물건찾기 (Reaching/Searching) | 0.
|
| 50 |
-
| 3 | 휴대폰 사용 (Phone Usage) | 0.
|
| 51 |
| 4 | 운전자 폭행 (Driver Assault) | 1.00 |
|
| 52 |
|
| 53 |
-
## Performance (Epoch
|
| 54 |
|
| 55 |
| Metric | Value |
|
| 56 |
|--------|-------|
|
| 57 |
-
| **Accuracy** |
|
| 58 |
-
| **Macro F1** | 0.
|
| 59 |
| **Validation Samples** | 1,371,062 |
|
| 60 |
|
| 61 |
## Training Configuration
|
|
@@ -73,42 +73,6 @@ model-index:
|
|
| 73 |
| Loss | CrossEntropy + Label Smoothing (0.1) |
|
| 74 |
| Regularization | Mixup (a=0.4), Dropout (0.3) |
|
| 75 |
|
| 76 |
-
## Usage
|
| 77 |
-
|
| 78 |
-
```python
|
| 79 |
-
import torch
|
| 80 |
-
from model import DriverBehaviorModel
|
| 81 |
-
|
| 82 |
-
# Load model
|
| 83 |
-
model = DriverBehaviorModel(num_classes=5, pretrained=False)
|
| 84 |
-
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
|
| 85 |
-
model.load_state_dict(checkpoint["model"])
|
| 86 |
-
model.eval()
|
| 87 |
-
|
| 88 |
-
# Inference
|
| 89 |
-
# input: [1, 3, 30, 224, 224] - 30 frames, 224x224, RGB normalized
|
| 90 |
-
with torch.no_grad():
|
| 91 |
-
output = model(video_tensor)
|
| 92 |
-
prediction = output.argmax(dim=1)
|
| 93 |
-
```
|
| 94 |
-
|
| 95 |
-
## Dataset
|
| 96 |
-
|
| 97 |
-
- **Total Videos**: 243,979
|
| 98 |
-
- **Total Samples (windows)**: 1,371,062
|
| 99 |
-
- **Window Size**: 30 frames
|
| 100 |
-
- **Stride**: 15 frames
|
| 101 |
-
- **Resolution**: 224x224
|
| 102 |
-
|
| 103 |
-
## Training Progress
|
| 104 |
-
|
| 105 |
-
| Epoch | Accuracy | Macro F1 |
|
| 106 |
-
|:-----:|:--------:|:--------:|
|
| 107 |
-
| 2 | 95.15% | 0.9392 |
|
| 108 |
-
| 3 | 96.56% | 0.9568 |
|
| 109 |
-
| 4 | 96.83% | 0.9600 |
|
| 110 |
-
| **5** | **97.35%** | **0.9666** |
|
| 111 |
-
|
| 112 |
## Files
|
| 113 |
|
| 114 |
| File | Size | Description |
|
|
@@ -167,6 +131,22 @@ Resize: 224x224 (BILINEAR)
|
|
| 167 |
Frames: 30 frames uniformly sampled
|
| 168 |
```
|
| 169 |
|
| 170 |
## License
|
| 171 |
|
| 172 |
This model is for research purposes only.
|
|
|
|
| 22 |
name: Video Classification
|
| 23 |
metrics:
|
| 24 |
- type: accuracy
|
| 25 |
+
value: 0.9805
|
| 26 |
name: Accuracy
|
| 27 |
- type: f1
|
| 28 |
+
value: 0.9757
|
| 29 |
name: Macro F1
|
| 30 |
---
|
| 31 |
|
| 32 |
+
# Driver Behavior Detection Model (Epoch 7)
|
| 33 |
|
| 34 |
A Video Swin Transformer-based model for detecting abnormal driver behavior.
|
| 35 |
|
|
|
|
| 44 |
|
| 45 |
| Label | Class | F1-Score |
|
| 46 |
|:-----:|-------|:--------:|
|
| 47 |
+
| 0 | 정상 (Normal) | 0.97 |
|
| 48 |
| 1 | 졸음운전 (Drowsy Driving) | 0.99 |
|
| 49 |
+
| 2 | 물건찾기 (Reaching/Searching) | 0.96 |
|
| 50 |
+
| 3 | 휴대폰 사용 (Phone Usage) | 0.96 |
|
| 51 |
| 4 | 운전자 폭행 (Driver Assault) | 1.00 |
|
| 52 |
|
| 53 |
+
## Performance (Epoch 7)
|
| 54 |
|
| 55 |
| Metric | Value |
|
| 56 |
|--------|-------|
|
| 57 |
+
| **Accuracy** | 98.05% |
|
| 58 |
+
| **Macro F1** | 0.9757 |
|
| 59 |
| **Validation Samples** | 1,371,062 |
|
| 60 |
|
| 61 |
## Training Configuration
|
|
|
|
| 73 |
| Loss | CrossEntropy + Label Smoothing (0.1) |
|
| 74 |
| Regularization | Mixup (a=0.4), Dropout (0.3) |
|
| 75 |
|
| 76 |
## Files
|
| 77 |
|
| 78 |
| File | Size | Description |
|
|
|
|
| 131 |
Frames: 30 frames uniformly sampled
|
| 132 |
```
|
| 133 |
|
| 134 |
+
## Dataset
|
| 135 |
+
|
| 136 |
+
- **Total Videos**: 243,979
|
| 137 |
+
- **Total Samples (windows)**: 1,371,062
|
| 138 |
+
- **Window Size**: 30 frames
|
| 139 |
+
- **Stride**: 15 frames
|
| 140 |
+
- **Resolution**: 224x224
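
The window size and stride listed above determine how many samples one video contributes. The following is a minimal sketch of that arithmetic, assuming only full 30-frame windows are kept and trailing frames are dropped without padding (the model card does not say how the tail of a video is handled). `window_starts` is a hypothetical helper name, not part of the repository.

```python
def window_starts(num_frames, window=30, stride=15):
    """Return the start index of every full window in a video.

    Assumption: partial windows at the end of the video are discarded.
    """
    if num_frames < window:
        return []
    # The last valid start is num_frames - window, so the range stops just past it.
    return list(range(0, num_frames - window + 1, stride))


# A 90-frame clip with window=30 and stride=15 overlaps each window by half.
starts = window_starts(90)
num_windows = len(starts)
```

With a stride of half the window length, consecutive windows share 15 frames, which is consistent with the sample count (1,371,062) being several times the video count (243,979).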
|
| 141 |
+
|
| 142 |
+
## Training Progress
|
| 143 |
+
|
| 144 |
+
| Epoch | Accuracy | Macro F1 |
|
| 145 |
+
|:-----:|:--------:|:--------:|
|
| 146 |
+
| 5 | 97.35% | 0.9666 |
|
| 147 |
+
| 6 | 97.74% | 0.9720 |
|
| 148 |
+
| **7** | **98.05%** | **0.9757** |
|
| 149 |
+
|
| 150 |
## License
|
| 151 |
|
| 152 |
This model is for research purposes only.
|
config.json
CHANGED
|
@@ -11,9 +11,9 @@
|
|
| 11 |
"layers": ["LayerNorm(768)", "Dropout(0.3)", "Linear(768, 5)"]
|
| 12 |
},
|
| 13 |
"training": {
|
| 14 |
-
"epoch":
|
| 15 |
-
"accuracy": 0.
|
| 16 |
-
"macro_f1": 0.
|
| 17 |
"batch_size": 32,
|
| 18 |
"optimizer": "AdamW",
|
| 19 |
"learning_rate": 1e-3,
|
|
@@ -23,10 +23,10 @@
|
|
| 23 |
"augmentation": ["Mixup(0.4)", "RandomResizedCrop", "HorizontalFlip", "ColorJitter", "TemporalAugmentation"]
|
| 24 |
},
|
| 25 |
"performance": {
|
| 26 |
-
"정상": {"precision": 0.
|
| 27 |
-
"졸음운전": {"precision":
|
| 28 |
-
"물건찾기": {"precision": 0.
|
| 29 |
-
"휴대폰 사용": {"precision": 0.
|
| 30 |
"운전자 폭행": {"precision": 1.00, "recall": 1.00, "f1": 1.00}
|
| 31 |
}
|
| 32 |
}
|
|
|
|
| 11 |
"layers": ["LayerNorm(768)", "Dropout(0.3)", "Linear(768, 5)"]
|
| 12 |
},
|
| 13 |
"training": {
|
| 14 |
+
"epoch": 7,
|
| 15 |
+
"accuracy": 0.9805,
|
| 16 |
+
"macro_f1": 0.9757,
|
| 17 |
"batch_size": 32,
|
| 18 |
"optimizer": "AdamW",
|
| 19 |
"learning_rate": 1e-3,
|
|
|
|
| 23 |
"augmentation": ["Mixup(0.4)", "RandomResizedCrop", "HorizontalFlip", "ColorJitter", "TemporalAugmentation"]
|
| 24 |
},
|
| 25 |
"performance": {
|
| 26 |
+
"정상": {"precision": 0.97, "recall": 0.97, "f1": 0.97},
|
| 27 |
+
"졸음운전": {"precision": 1.00, "recall": 0.99, "f1": 0.99},
|
| 28 |
+
"물건찾기": {"precision": 0.95, "recall": 0.97, "f1": 0.96},
|
| 29 |
+
"휴대폰 사용": {"precision": 0.96, "recall": 0.96, "f1": 0.96},
|
| 30 |
"운전자 폭행": {"precision": 1.00, "recall": 1.00, "f1": 1.00}
|
| 31 |
}
|
| 32 |
}
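
The per-class figures in the updated `performance` block can be cross-checked against the reported `macro_f1`. The sketch below is one way to do that, assuming macro F1 is the unweighted mean of the per-class F1 scores and that the per-class values were rounded to two decimals; it inlines the values from this diff rather than reading `config.json`, and `f1_from_pr` is a hypothetical helper.

```python
import json

# Per-class metrics copied from the "performance" block added in this commit.
performance = json.loads("""
{
  "정상": {"precision": 0.97, "recall": 0.97, "f1": 0.97},
  "졸음운전": {"precision": 1.00, "recall": 0.99, "f1": 0.99},
  "물건찾기": {"precision": 0.95, "recall": 0.97, "f1": 0.96},
  "휴대폰 사용": {"precision": 0.96, "recall": 0.96, "f1": 0.96},
  "운전자 폭행": {"precision": 1.00, "recall": 1.00, "f1": 1.00}
}
""")


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro F1: unweighted mean over classes (assumed definition).
macro_f1 = sum(c["f1"] for c in performance.values()) / len(performance)

# Recompute each class F1 from its precision and recall, rounded like the config.
recomputed = {
    name: round(f1_from_pr(c["precision"], c["recall"]), 2)
    for name, c in performance.items()
}
```

The mean of the rounded per-class scores is about 0.976, which agrees with the reported 0.9757 to within the rounding of the inputs.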
|
model.onnx
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5b16e8969d749bb7754b2a42daa98a4f64a6e4c42082d028111457c3abed9759
|
| 3 |
+
size 171169172
|
pytorch_model.bin
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
size 126244047
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:db2e18ab37ceb942118a6390fce0e95220774048ec44eaca90ad5713fa1dce9c
|
| 3 |
size 126244047
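
The two binary files are stored as Git LFS pointers, so a downloaded copy can be checked against the `oid` and `size` recorded here. This is a minimal sketch using only the Python standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path passed to `matches_pointer` is whatever location the file was downloaded to.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and hash the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large model files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer for pytorch_model.bin as added in this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:db2e18ab37ceb942118a6390fce0e95220774048ec44eaca90ad5713fa1dce9c\n"
    "size 126244047\n"
)
```

Because the size of `pytorch_model.bin` is unchanged between the old and new pointers, the hash is the only field that distinguishes the two checkpoints.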
|