combe4259/face-comprehension


combe4259 committed on Sep 21, 2025

Commit 4cf8a14 · verified · 1 Parent(s): 9e44157

Create README.md

Files changed (1)

README.md +79 -0

README.md ADDED

	@@ -0,0 +1,79 @@

+---
+language: ko
+license: apache-2.0
+tags:
+- video-classification
+- cnn-lstm
+- pytorch
+- confusion-detection
+- daiSEE
+datasets:
+- daiSEE
+metrics:
+- accuracy
+- f1
+- precision
+- recall
+---
+## Colab Notebook
+You can run and test the model directly in the Colab notebook below.
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1OkrPq9Cc0kXllSUFG24C-Xz6zZm45Izx?authuser=1#scrollTo=ixU-OYlQ2o79)
+# Face Comprehension: Confusion Binary Classification
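+This **video understanding model**, trained on the **DAiSEE dataset**, analyzes a person's facial expressions and behavior and classifies each clip as **Confused** or **Not Confused**.
+For training, the original Confusion labels (0-3) were collapsed into two classes: **0: Not Confused** (level 0) vs. **1: Confused** (levels 1-3).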
+---
+## Model Architecture
+- **Backbone**: MobileNetV2 (ImageNet-pretrained weights, with some layers frozen)
+- **Sequence Modeling**: LSTM (hidden_dim=256, num_layers=2)
+- **Attention Mechanism**: Temporal attention
+- **Classifier**: Fully connected layer (2 classes: Not Confused / Confused)
+---
+## Dataset
+- **DAiSEE Dataset** (of its four labels, Boredom, Confusion, Engagement, and Frustration, only Confusion is used)
+- Label mapping:
+  - Confusion = 0 → Not Confused (0)
+  - Confusion = 1-3 → Confused (1)
+- Input: video frames (sequence_length=30, image_size=112×112)
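The label mapping and the fixed sequence length above can be sketched in plain Python. The helper names `binarize_confusion` and `sample_frame_indices` are hypothetical, and uniform sampling across the clip is an assumption: the card does not say how the 30 frames are chosen.

```python
def binarize_confusion(level: int) -> int:
    """Map a DAiSEE Confusion level (0-3) to 0 (Not Confused) or 1 (Confused)."""
    if level not in (0, 1, 2, 3):
        raise ValueError(f"Confusion level must be 0-3, got {level}")
    return 0 if level == 0 else 1


def sample_frame_indices(num_frames: int, sequence_length: int = 30) -> list:
    """Pick `sequence_length` evenly spaced frame indices (assumed strategy).

    Clips shorter than `sequence_length` simply repeat some frames.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if sequence_length == 1:
        return [0]
    step = (num_frames - 1) / (sequence_length - 1)
    return [round(i * step) for i in range(sequence_length)]


labels = [binarize_confusion(level) for level in range(4)]
indices = sample_frame_indices(300)  # e.g. a hypothetical 300-frame clip
print(labels)        # [0, 1, 1, 1]
print(len(indices))  # 30
```

Each selected frame would then be resized to 112×112 and stacked into a `(30, 3, 112, 112)` tensor before batching.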
+---
+## Training Procedure
+- Optimizer: Adam (lr=0.001)
+- Loss: Weighted CrossEntropyLoss (to correct for class imbalance)
+  - Not Confused: 67.5%, Confused: 32.5% → weight ratio of about 1 : 2.1
+- Scheduler: ReduceLROnPlateau (patience=3, factor=0.5)
+- Epochs: 3-10 (configurable)
+- Batch Size: 8-16
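The settings above map onto standard PyTorch objects roughly as shown here. This is a sketch under assumptions: the weights are derived by inverse class frequency (which reproduces the stated ratio of about 1 : 2.1), the scheduler is assumed to monitor a loss (`mode="min"`), and `nn.Linear(8, 2)` is only a stand-in for the real network.

```python
import torch
import torch.nn as nn

# Class shares reported in the card: Not Confused 67.5%, Confused 32.5%.
class_share = torch.tensor([0.675, 0.325])
# Inverse-frequency weights, scaled so Not Confused has weight 1 (assumed scheme).
class_weights = (1.0 / class_share) * class_share[0]

model = nn.Linear(8, 2)  # stand-in for the real CNN-LSTM model
criterion = nn.CrossEntropyLoss(weight=class_weights)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=3
)


def train_step(inputs, targets):
    """Run one optimization step and return the weighted loss as a float."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()


# One step on random data with batch size 16 (the upper end of the stated range).
loss = train_step(torch.randn(16, 8), torch.randint(0, 2, (16,)))
# In a real loop this would be the validation loss, passed once per epoch.
scheduler.step(loss)
print(class_weights)
```

With `patience=3` and `factor=0.5`, the learning rate is halved once the monitored value has failed to improve for more than three consecutive epochs.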
+---
+## How to Use
+```python
+import torch
+from face import DAiSEEConfusionNet
+# Load the model
+model = DAiSEEConfusionNet()
+model.load_state_dict(torch.load("confusion_binary_model.pth", map_location="cpu"))
+model.eval()
+# Input: (batch, seq_len, C, H, W)
+dummy_input = torch.randn(1, 30, 3, 112, 112)
+# Disable gradient tracking for inference
+with torch.no_grad():
+    outputs = model(dummy_input)
+prediction = torch.argmax(outputs, dim=1).item()
+print("Prediction:", "Confused" if prediction == 1 else "Not Confused")