koreashin committed
Commit f5247c7 · verified · 1 Parent(s): d2f4227

Upload 6 files

Files changed (4)
  1. README.md +25 -45
  2. config.json +7 -7
  3. model.onnx +2 -2
  4. pytorch_model.bin +1 -1
README.md CHANGED
@@ -22,14 +22,14 @@ model-index:
22
  name: Video Classification
23
  metrics:
24
  - type: accuracy
25
- value: 0.9735
26
  name: Accuracy
27
  - type: f1
28
- value: 0.9666
29
  name: Macro F1
30
  ---
31
 
32
- # Driver Behavior Detection Model (Epoch 5)
33
 
34
  A Video Swin Transformer-based model for detecting abnormal driver behavior.
35
 
@@ -44,18 +44,18 @@ model-index:
44
 
45
  | Label | Class | F1-Score |
46
  |:-----:|-------|:--------:|
47
- | 0 | 정상 (Normal) | 0.95 |
48
  | 1 | 졸음운전 (Drowsy Driving) | 0.99 |
49
- | 2 | 물건찾기 (Reaching/Searching) | 0.95 |
50
- | 3 | 휴대폰 사용 (Phone Usage) | 0.94 |
51
  | 4 | 운전자 폭행 (Driver Assault) | 1.00 |
52
 
53
- ## Performance (Epoch 5)
54
 
55
  | Metric | Value |
56
  |--------|-------|
57
- | **Accuracy** | 97.35% |
58
- | **Macro F1** | 0.9666 |
59
  | **Validation Samples** | 1,371,062 |
60
 
61
  ## Training Configuration
@@ -73,42 +73,6 @@ model-index:
73
  | Loss | CrossEntropy + Label Smoothing (0.1) |
74
  | Regularization | Mixup (a=0.4), Dropout (0.3) |
75
 
76
- ## Usage
77
-
78
- ```python
79
- import torch
80
- from model import DriverBehaviorModel
81
-
82
- # Load model
83
- model = DriverBehaviorModel(num_classes=5, pretrained=False)
84
- checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
85
- model.load_state_dict(checkpoint["model"])
86
- model.eval()
87
-
88
- # Inference
89
- # input: [1, 3, 30, 224, 224] - 30 frames, 224x224, RGB normalized
90
- with torch.no_grad():
91
-     output = model(video_tensor)
92
-     prediction = output.argmax(dim=1)
93
- ```
94
-
95
- ## Dataset
96
-
97
- - **Total Videos**: 243,979
98
- - **Total Samples (windows)**: 1,371,062
99
- - **Window Size**: 30 frames
100
- - **Stride**: 15 frames
101
- - **Resolution**: 224x224
102
-
103
- ## Training Progress
104
-
105
- | Epoch | Accuracy | Macro F1 |
106
- |:-----:|:--------:|:--------:|
107
- | 2 | 95.15% | 0.9392 |
108
- | 3 | 96.56% | 0.9568 |
109
- | 4 | 96.83% | 0.9600 |
110
- | **5** | **97.35%** | **0.9666** |
111
-
112
  ## Files
113
 
114
  | File | Size | Description |
@@ -167,6 +131,22 @@ Resize: 224x224 (BILINEAR)
167
  Frames: 30 frames uniformly sampled
168
  ```
169
 
170
  ## License
171
 
172
  This model is for research purposes only.
 
22
  name: Video Classification
23
  metrics:
24
  - type: accuracy
25
+ value: 0.9805
26
  name: Accuracy
27
  - type: f1
28
+ value: 0.9757
29
  name: Macro F1
30
  ---
31
 
32
+ # Driver Behavior Detection Model (Epoch 7)
33
 
34
  A Video Swin Transformer-based model for detecting abnormal driver behavior.
35
 
 
44
 
45
  | Label | Class | F1-Score |
46
  |:-----:|-------|:--------:|
47
+ | 0 | 정상 (Normal) | 0.97 |
48
  | 1 | 졸음운전 (Drowsy Driving) | 0.99 |
49
+ | 2 | 물건찾기 (Reaching/Searching) | 0.96 |
50
+ | 3 | 휴대폰 사용 (Phone Usage) | 0.96 |
51
  | 4 | 운전자 폭행 (Driver Assault) | 1.00 |
52
 
53
+ ## Performance (Epoch 7)
54
 
55
  | Metric | Value |
56
  |--------|-------|
57
+ | **Accuracy** | 98.05% |
58
+ | **Macro F1** | 0.9757 |
59
  | **Validation Samples** | 1,371,062 |
60
 
61
  ## Training Configuration
 
73
  | Loss | CrossEntropy + Label Smoothing (0.1) |
74
  | Regularization | Mixup (a=0.4), Dropout (0.3) |
75
 
76
  ## Files
77
 
78
  | File | Size | Description |
 
131
  Frames: 30 frames uniformly sampled
132
  ```
133
 
134
+ ## Dataset
135
+
136
+ - **Total Videos**: 243,979
137
+ - **Total Samples (windows)**: 1,371,062
138
+ - **Window Size**: 30 frames
139
+ - **Stride**: 15 frames
140
+ - **Resolution**: 224x224
141
+
142
+ ## Training Progress
143
+
144
+ | Epoch | Accuracy | Macro F1 |
145
+ |:-----:|:--------:|:--------:|
146
+ | 5 | 97.35% | 0.9666 |
147
+ | 6 | 97.74% | 0.9720 |
148
+ | **7** | **98.05%** | **0.9757** |
149
+
150
  ## License
151
 
152
  This model is for research purposes only.
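The Dataset section added in this commit lists a 30-frame window and a 15-frame stride. As a rough illustration of how such windows could be cut from a video, here is a minimal sketch. The function name `window_starts` is hypothetical, and dropping a trailing partial window is an assumption; the README does not say how the author handles leftover frames.

```python
from typing import List


def window_starts(num_frames: int, window: int = 30, stride: int = 15) -> List[int]:
    """Return the start index of every full window in a video.

    Assumption: windows that would run past the end of the video are
    dropped rather than padded. The README does not specify this.
    """
    if num_frames < window:
        return []
    # The last valid start is num_frames - window, hence the +1 in the stop value.
    return list(range(0, num_frames - window + 1, stride))


def count_windows(num_frames: int, window: int = 30, stride: int = 15) -> int:
    """Number of full windows produced by window_starts."""
    return len(window_starts(num_frames, window, stride))


if __name__ == "__main__":
    # A 90-frame clip yields overlapping windows starting every 15 frames.
    print(window_starts(90))
    print(count_windows(90))
```

With a stride of half the window length, consecutive windows overlap by 15 frames, which would explain why the sample count (1,371,062) is several times the video count (243,979).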
config.json CHANGED
@@ -11,9 +11,9 @@
11
  "layers": ["LayerNorm(768)", "Dropout(0.3)", "Linear(768, 5)"]
12
  },
13
  "training": {
14
- "epoch": 5,
15
- "accuracy": 0.9735,
16
- "macro_f1": 0.9666,
17
  "batch_size": 32,
18
  "optimizer": "AdamW",
19
  "learning_rate": 1e-3,
@@ -23,10 +23,10 @@
23
  "augmentation": ["Mixup(0.4)", "RandomResizedCrop", "HorizontalFlip", "ColorJitter", "TemporalAugmentation"]
24
  },
25
  "performance": {
26
- "정상": {"precision": 0.95, "recall": 0.96, "f1": 0.95},
27
- "졸음운전": {"precision": 0.99, "recall": 0.99, "f1": 0.99},
28
- "물건찾기": {"precision": 0.94, "recall": 0.96, "f1": 0.95},
29
- "휴대폰 사용": {"precision": 0.95, "recall": 0.93, "f1": 0.94},
30
  "운전자 폭행": {"precision": 1.00, "recall": 1.00, "f1": 1.00}
31
  }
32
  }
 
11
  "layers": ["LayerNorm(768)", "Dropout(0.3)", "Linear(768, 5)"]
12
  },
13
  "training": {
14
+ "epoch": 7,
15
+ "accuracy": 0.9805,
16
+ "macro_f1": 0.9757,
17
  "batch_size": 32,
18
  "optimizer": "AdamW",
19
  "learning_rate": 1e-3,
 
23
  "augmentation": ["Mixup(0.4)", "RandomResizedCrop", "HorizontalFlip", "ColorJitter", "TemporalAugmentation"]
24
  },
25
  "performance": {
26
+ "정상": {"precision": 0.97, "recall": 0.97, "f1": 0.97},
27
+ "졸음운전": {"precision": 1.00, "recall": 0.99, "f1": 0.99},
28
+ "물건찾기": {"precision": 0.95, "recall": 0.97, "f1": 0.96},
29
+ "휴대폰 사용": {"precision": 0.96, "recall": 0.96, "f1": 0.96},
30
  "운전자 폭행": {"precision": 1.00, "recall": 1.00, "f1": 1.00}
31
  }
32
  }
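As a sanity check on the numbers in this diff, macro F1 is the unweighted mean of the per-class F1 scores. The sketch below recomputes it from the rounded per-class values in the `performance` block for both epochs. Because those values are rounded to two decimals, the result only approximates the reported `macro_f1`. The helper name `macro_f1` is illustrative and not part of the repository.

```python
from typing import Dict

# Per-class F1 values copied from the "performance" block (rounded to 2 decimals).
EPOCH_5_F1: Dict[str, float] = {
    "정상": 0.95,
    "졸음운전": 0.99,
    "물건찾기": 0.95,
    "휴대폰 사용": 0.94,
    "운전자 폭행": 1.00,
}
EPOCH_7_F1: Dict[str, float] = {
    "정상": 0.97,
    "졸음운전": 0.99,
    "물건찾기": 0.96,
    "휴대폰 사용": 0.96,
    "운전자 폭행": 1.00,
}


def macro_f1(per_class: Dict[str, float]) -> float:
    """Unweighted mean of per-class F1 scores."""
    return sum(per_class.values()) / len(per_class)


if __name__ == "__main__":
    # Reported values in config.json: 0.9666 (epoch 5) and 0.9757 (epoch 7).
    print(f"epoch 5: {macro_f1(EPOCH_5_F1):.4f}")
    print(f"epoch 7: {macro_f1(EPOCH_7_F1):.4f}")
```

Both recomputed means land within 0.001 of the reported figures, so the per-class table and the headline metric are consistent with each other.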
model.onnx CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:036ac05c4b782d2ea1e24eb61366f197e0692ba24abaddca6798d56c1a337cec
3
- size 171169182
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5b16e8969d749bb7754b2a42daa98a4f64a6e4c42082d028111457c3abed9759
3
+ size 171169172
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e6442f71cdbc76335fe751ada9eb7e9c4c6461d7beb82e52088bafa7e15107a5
3
  size 126244047
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:db2e18ab37ceb942118a6390fce0e95220774048ec44eaca90ad5713fa1dce9c
3
  size 126244047
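The `model.onnx` and `pytorch_model.bin` entries above are Git LFS pointer files, which record a SHA-256 digest and a byte size rather than the weights themselves. A downloaded file can be checked against its pointer with a short script like the following. This is a sketch using only the Python standard library; the function names `parse_lfs_pointer` and `verify_file` are hypothetical, and the local path to the downloaded file is whatever you choose.

```python
import hashlib
from pathlib import Path
from typing import Dict, Union

# New pointer for pytorch_model.bin, copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:db2e18ab37ceb942118a6390fce0e95220774048ec44eaca90ad5713fa1dce9c
size 126244047
"""


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def verify_file(path: Union[str, Path], pointer: Dict[str, Union[str, int]]) -> bool:
    """Return True if the file's size and digest match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["algo"], pointer["size"])
```

Note that `pytorch_model.bin` keeps the same size (126244047 bytes) across the commit while its digest changes, so the size check alone cannot tell the epoch 5 and epoch 7 weights apart; the digest comparison is what distinguishes them.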