	---
	language: ko
	license: apache-2.0
	tags:
	- video-classification
	- cnn-lstm
	- pytorch
	- confusion-detection
	- daiSEE
	datasets:
	- daiSEE
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	---

	## Colab Notebook

	You can run and test the model directly in the Colab notebook below.

	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1OkrPq9Cc0kXllSUFG24C-Xz6zZm45Izx?authuser=1#scrollTo=ixU-OYlQ2o79)


	## Training Data
	https://people.iith.ac.in/vineethnb/resources/daisee/index.html

	- DAiSEE dataset (classes: Boredom, Confusion, Engagement, Frustration; only Confusion is used)
	- Label mapping:
	  - Confusion = 0 → Not Confused (0)
	  - Confusion = 1 to 3 → Confused (1)
	- Input: video frames (sequence_length=30, image_size=112×112)
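
The label mapping above can be sketched in a few lines of plain Python. The function name `binarize_confusion` and the `LABEL_NAMES` dictionary are hypothetical helpers introduced for illustration; they are not taken from the repository.

```python
def binarize_confusion(level: int) -> int:
    """Map a DAiSEE Confusion level (0-3) to a binary label.

    0 -> 0 (Not Confused), 1-3 -> 1 (Confused).
    """
    if level not in (0, 1, 2, 3):
        raise ValueError(f"Confusion level must be 0-3, got {level!r}")
    return 0 if level == 0 else 1


# Hypothetical lookup table for readable output.
LABEL_NAMES = {0: "Not Confused", 1: "Confused"}

levels = [0, 1, 2, 3]
binary = [binarize_confusion(v) for v in levels]
print(binary)  # [0, 1, 1, 1]
```

Collapsing levels 1 to 3 into one class is what produces the class imbalance that the weighted loss later compensates for.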


	# Face Comprehension: Confusion Binary Classification

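	This is a video understanding model trained on the DAiSEE dataset. It analyzes a person's facial expressions and behavior and performs binary classification of whether the person is confused.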


	---

	## Model Architecture

	- Backbone: MobileNetV2 (ImageNet-pretrained weights, some layers frozen)
	- Sequence Modeling: LSTM (hidden_dim=256, num_layers=2)
	- Attention Mechanism: temporal attention
	- Classifier: fully connected layer (2 classes: Not Confused / Confused)

	---



	## Training Procedure

	- Optimizer: Adam (lr=0.001)
	- Loss: weighted CrossEntropyLoss (to compensate for class imbalance)
	  - Not Confused: 67.5%, Confused: 32.5% → weight ratio of about 1 : 2.1
	- Scheduler: ReduceLROnPlateau (patience=3, factor=0.5)
	- Epochs: 3 to 10 (configurable)
	- Batch Size: 8 to 16
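
The settings above might be assembled as in the sketch below. It assumes the class weights come from inverse class frequency, which reproduces the stated ratio of about 1 : 2.1, and that the scheduler monitors a loss (`mode="min"`); the card confirms neither. The `nn.Linear` model and the random batch are placeholders so that the snippet runs on its own.

```python
import torch
import torch.nn as nn

# Class frequencies stated in the card: [Not Confused, Confused].
class_freq = torch.tensor([0.675, 0.325])

# Assumption: inverse-frequency weights, scaled so the majority class is 1.
class_weights = class_freq[0] / class_freq  # roughly [1.0, 2.08]

criterion = nn.CrossEntropyLoss(weight=class_weights)

# Placeholder model standing in for the real network.
model = nn.Linear(16, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", patience=3, factor=0.5
)

# One illustrative optimization step on random data (batch size 8).
logits = model(torch.randn(8, 16))
targets = torch.randint(0, 2, (8,))
loss = criterion(logits, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# In a real loop, call this once per epoch with the validation loss.
scheduler.step(loss.item())
```

With `patience=3` and `factor=0.5`, the learning rate is halved once the monitored value has failed to improve for more than three consecutive epochs.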

	---

	## How to Use

	```python
	import torch
	from face import DAiSEEConfusionNet

	# Load the model
	model = DAiSEEConfusionNet()
	model.load_state_dict(torch.load("confusion_binary_model.pth", map_location="cpu"))
	model.eval()

	# Input: (batch, seq_len, C, H, W)
	dummy_input = torch.randn(1, 30, 3, 112, 112)
	with torch.no_grad():  # inference only, so skip gradient tracking
	    outputs = model(dummy_input)

	prediction = torch.argmax(outputs, dim=1).item()
	print("Prediction:", "Confused" if prediction == 1 else "Not Confused")