---
language: ko
license: apache-2.0
tags:
- video-classification
- cnn-lstm
- pytorch
- confusion-detection
- daiSEE
datasets:
- daiSEE
metrics:
- accuracy
- f1
- precision
- recall
---
## Colab Notebook
You can run and test the model directly in the Colab notebook below.
[Open in Colab](https://colab.research.google.com/drive/1OkrPq9Cc0kXllSUFG24C-Xz6zZm45Izx?authuser=1#scrollTo=ixU-OYlQ2o79)
## Training Data (Dataset)
https://people.iith.ac.in/vineethnb/resources/daisee/index.html
- **DAiSEE Dataset** (classes: Boredom, Confusion, Engagement, Frustration; only Confusion is used)
- Label mapping:
  - Confusion = 0 → Not Confused (0)
  - Confusion = 1~3 → Confused (1)
- Input data: video frames (sequence_length=30, image_size=112×112)
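
The label mapping above can be sketched as a small helper. This is only an illustration: the function name `to_binary_label` and the `LABEL_NAMES` dictionary are hypothetical and are not taken from the model's training code.

```python
def to_binary_label(confusion_level: int) -> int:
    """Map a DAiSEE Confusion level (0-3) to the binary label described above."""
    if confusion_level not in (0, 1, 2, 3):
        raise ValueError(f"expected a level in 0..3, got {confusion_level}")
    # Level 0 stays "Not Confused" (0); levels 1-3 collapse to "Confused" (1)
    return 0 if confusion_level == 0 else 1


# Hypothetical lookup table for readable class names
LABEL_NAMES = {0: "Not Confused", 1: "Confused"}

binary = [to_binary_label(level) for level in (0, 1, 2, 3)]
print(binary)  # [0, 1, 1, 1]
```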
# Face Comprehension: Confusion Binary Classification
This is a **video understanding model** built on the **DAiSEE dataset**. It analyzes a person's facial expressions and behavior and performs binary classification of whether the person is in a state of **Confusion**.
---
## Model Architecture
- **Backbone**: MobileNetV2 (ImageNet-pretrained weights, with some layers frozen)
- **Sequence Modeling**: LSTM (hidden_dim=256, num_layers=2)
- **Attention Mechanism**: Temporal Attention
- **Classifier**: Fully-connected layer (2 classes: Not Confused / Confused)
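
To make the sequence part of this architecture concrete, here is a minimal sketch of an LSTM followed by temporal attention pooling and a linear classifier. It is not the actual `DAiSEEConfusionNet` code. The class name `TemporalAttentionHead`, the single-linear-layer attention scoring, the unidirectional LSTM, and the 1280-dimensional per-frame features (the size of torchvision's MobileNetV2 output) are all assumptions made for illustration. The backbone is left out and replaced by random features.

```python
import torch
import torch.nn as nn


class TemporalAttentionHead(nn.Module):
    """Hypothetical LSTM + temporal attention + classifier over per-frame features."""

    def __init__(self, feature_dim=1280, hidden_dim=256, num_layers=2, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, num_layers=num_layers, batch_first=True)
        self.attention = nn.Linear(hidden_dim, 1)  # one scalar score per time step
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, frame_features):
        # frame_features: (batch, seq_len, feature_dim)
        hidden_states, _ = self.lstm(frame_features)    # (batch, seq_len, hidden_dim)
        scores = self.attention(hidden_states)          # (batch, seq_len, 1)
        weights = torch.softmax(scores, dim=1)          # normalize across time steps
        context = (weights * hidden_states).sum(dim=1)  # (batch, hidden_dim)
        return self.classifier(context), weights.squeeze(-1)


head = TemporalAttentionHead()
features = torch.randn(2, 30, 1280)  # stand-in for backbone features of 30 frames
logits, attn = head(features)
print(logits.shape, attn.shape)  # torch.Size([2, 2]) torch.Size([2, 30])
```

The attention weights sum to 1 over the 30 frames of each clip, so they can be read as a rough indication of which frames contributed most to the prediction.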
---
## Training Procedure
- Optimizer: Adam (lr=0.001)
- Loss: Weighted CrossEntropyLoss (to correct for class imbalance)
  - Not Confused: 67.5%, Confused: 32.5% → weight ratio of about 1 : 2.1
- Scheduler: ReduceLROnPlateau (patience=3, factor=0.5)
- Epochs: 3~10 (configurable)
- Batch Size: 8~16
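
The settings above could be wired together roughly as follows. This is a sketch under assumptions, not the original training script: inverse-frequency weighting is one common way to arrive at a ratio near 1 : 2.1 (0.675 / 0.325 ≈ 2.08), the scheduler is assumed to monitor a loss (`mode="min"`), and the `nn.Linear` model and random batch are placeholders.

```python
import torch
import torch.nn as nn

# Class frequencies from the list above: Not Confused 67.5%, Confused 32.5%
class_freq = torch.tensor([0.675, 0.325])

# Inverse-frequency weights, rescaled so the majority class has weight 1
class_weights = 1.0 / class_freq
class_weights = class_weights / class_weights[0]  # approximately [1.00, 2.08]

criterion = nn.CrossEntropyLoss(weight=class_weights)

model = nn.Linear(16, 2)  # placeholder for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", patience=3, factor=0.5
)

# One illustrative optimization step on a random batch of size 8
inputs = torch.randn(8, 16)
targets = torch.randint(0, 2, (8,))
loss = criterion(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# In practice this would be called once per epoch with the validation loss
scheduler.step(loss.item())
```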
---
## How to Use
```python
import torch
from face import DAiSEEConfusionNet

# Load the model
model = DAiSEEConfusionNet()
model.load_state_dict(torch.load("confusion_binary_model.pth", map_location="cpu"))
model.eval()

# Input shape: (batch, seq_len, C, H, W)
dummy_input = torch.randn(1, 30, 3, 112, 112)

# Disable gradient tracking for inference
with torch.no_grad():
    outputs = model(dummy_input)

prediction = torch.argmax(outputs, dim=1).item()
print("Prediction:", "Confused" if prediction == 1 else "Not Confused")
```