---
language: ko
license: apache-2.0
tags:
- video-classification
- cnn-lstm
- pytorch
- confusion-detection
- daiSEE
datasets:
- daiSEE
metrics:
- accuracy
- f1
- precision
- recall
---
## Colab Notebook
You can run and test the model directly in the Colab notebook below.
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1OkrPq9Cc0kXllSUFG24C-Xz6zZm45Izx?authuser=1#scrollTo=ixU-OYlQ2o79)
## Training Data (Dataset)
https://people.iith.ac.in/vineethnb/resources/daisee/index.html
- **DAiSEE Dataset** (labels: Boredom, Confusion, Engagement, Frustration; only Confusion is used)
- Label mapping:
  - Confusion = 0 → Not Confused (0)
  - Confusion = 1 to 3 → Confused (1)
- Input data: video frames (sequence_length=30, image_size=112×112)
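The label mapping can be sketched as a small helper. This is only an illustration: the function name `binarize_confusion` is hypothetical and not taken from the repository, and it assumes the Confusion annotation is an integer level from 0 to 3.

```python
def binarize_confusion(level: int) -> int:
    """Map a DAiSEE Confusion level (0-3) to a binary label."""
    if level not in (0, 1, 2, 3):
        raise ValueError(f"expected a Confusion level in 0..3, got {level}")
    # Level 0 -> Not Confused (0); levels 1 to 3 -> Confused (1)
    return 0 if level == 0 else 1


labels = [binarize_confusion(level) for level in (0, 1, 2, 3)]
print(labels)  # [0, 1, 1, 1]
```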
# Face Comprehension: Confusion Binary Classification
์ด ๋ชจ๋ธ์€ **DAiSEE ๋ฐ์ดํ„ฐ์…‹**์„ ๊ธฐ๋ฐ˜์œผ๋กœ ์‚ฌ๋žŒ์˜ ํ‘œ์ •ยทํ–‰๋™์„ ๋ถ„์„ํ•˜์—ฌ **Confusion(ํ˜ผ๋ž€)** ์—ฌ๋ถ€๋ฅผ ์ด์ง„ ๋ถ„๋ฅ˜ํ•˜๋Š” **๋น„๋””์˜ค ์ดํ•ด ๋ชจ๋ธ**์ž…๋‹ˆ๋‹ค.
---
## ๋ชจ๋ธ ๊ตฌ์กฐ (Model Architecture)
- **Backbone**: MobileNetV2 (ImageNet ์‚ฌ์ „ํ•™์Šต ๊ฐ€์ค‘์น˜ ์‚ฌ์šฉ, ์ผ๋ถ€ ๋ ˆ์ด์–ด ๊ณ ์ •)
- **Sequence Modeling**: LSTM (hidden_dim=256, num_layers=2)
- **Attention Mechanism**: Temporal Attention ์ ์šฉ
- **Classifier**: Fully-connected layer (2 ํด๋ž˜์Šค: Not Confused / Confused)
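Temporal attention can be pictured as a softmax-weighted average of the per-frame LSTM outputs. The sketch below is a plain-Python illustration of that idea, not the model's actual code: the function name `temporal_attention_pool`, the scoring vector `w`, and the toy dimensions are all assumptions. In the real model the scoring parameters would presumably be learned inside the PyTorch module.

```python
import math


def temporal_attention_pool(hidden_states, w):
    """Pool per-frame vectors into one vector with softmax attention over time.

    hidden_states: list of T vectors (each a list of D floats), e.g. LSTM outputs.
    w: hypothetical scoring vector of length D (learned in a real model).
    Returns (pooled_vector, attention_weights).
    """
    # One scalar score per time step: dot product of the frame vector with w.
    scores = [sum(h_i * w_i for h_i, w_i in zip(h, w)) for h in hidden_states]
    # Numerically stable softmax over the time axis.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the frame vectors.
    dim = len(hidden_states[0])
    pooled = [
        sum(a * h[d] for a, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return pooled, weights


frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # T=3 toy frames, D=2
pooled, weights = temporal_attention_pool(frames, w=[0.5, 0.5])
print(weights, pooled)
```

Frames with higher scores contribute more to the pooled vector, which is then what a classifier layer would consume.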
---
## Training Procedure
- Optimizer: Adam (lr=0.001)
- Loss: Weighted CrossEntropyLoss (to correct for class imbalance)
  - Not Confused: 67.5%, Confused: 32.5% → weight ratio of about 1 : 2.1
- Scheduler: ReduceLROnPlateau (patience=3, factor=0.5)
- Epochs: 3-10 (configurable)
- Batch Size: 8-16
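The 1 : 2.1 weight ratio is consistent with inverse-frequency weighting of the class proportions listed above. The sketch below reproduces the arithmetic. The inverse-frequency scheme itself is an assumption, since the model card only states the resulting ratio.

```python
# Class frequencies reported in the training procedure.
class_freq = {"Not Confused": 0.675, "Confused": 0.325}

# Inverse-frequency weights, normalized so the majority class has weight 1.
# (Assumed scheme; the model card only states the resulting ratio.)
raw = {name: 1.0 / freq for name, freq in class_freq.items()}
base = raw["Not Confused"]
class_weights = {name: w / base for name, w in raw.items()}

print({name: round(w, 2) for name, w in class_weights.items()})
# {'Not Confused': 1.0, 'Confused': 2.08}
```

In PyTorch, weights like these would typically be passed as a tensor to the `weight` argument of `torch.nn.CrossEntropyLoss`.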
---
## How to Use
```python
import torch
from face import DAiSEEConfusionNet
# ๋ชจ๋ธ ๋กœ๋“œ
model = DAiSEEConfusionNet()
model.load_state_dict(torch.load("confusion_binary_model.pth", map_location="cpu"))
model.eval()
# ์ž…๋ ฅ: (batch, seq_len, C, H, W)
dummy_input = torch.randn(1, 30, 3, 112, 112)
outputs = model(dummy_input)
prediction = torch.argmax(outputs, dim=1).item()
print("์˜ˆ์ธก ๊ฒฐ๊ณผ:", "Confused" if prediction == 1 else "Not Confused")