Classroom Reaction Recognition - VGG16

A VGG16 classifier trained to recognize facial reactions of students from cropped person bounding boxes in classroom lecture videos.

Classes

| Index | Label | Description |
|---|---|---|
| 0 | Neutral | No visible expression |
| 1 | Confused | Furrowed brow, squinting |
| 2 | Smiling_Amused | Visible smile, laughter |
| 3 | Surprised | Raised eyebrows, open mouth |
| 4 | Bored_Tired | Yawning, blank stare |

Architecture

  • Backbone: VGG16 (ImageNet pre-trained, feature layers frozen)
  • Classifier head: final classifier layer replaced with nn.Linear(4096, 5)
  • Loss: Weighted CrossEntropyLoss to handle class imbalance
  • Optimizer: Adam (lr=0.001)
  • Input size: 224 x 224 RGB, ImageNet-normalized

Training

The model was trained on manually annotated person crops, extracted from classroom lecture videos with a YOLOv8-nano detector. The data was split 70/10/20 into train/validation/test sets, stratified by class.
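
A stratified 70/10/20 split keeps each class's share roughly equal across the three subsets. The sketch below shows one way to do this with the standard library only. The function name `stratified_split`, the seed, and the rounding rule are assumptions for illustration; the original splitting code is not part of this model card.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Return (train, val, test) index lists, splitting each class separately."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        # The remainder goes to the test set, so no sample is dropped.
        test += indices[n_train + n_val:]
    return train, val, test
```

Splitting per class rather than over the whole dataset matters most for the rare classes, which a plain random split could leave nearly absent from the validation or test set.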

Usage

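import torch
from torchvision import models

# Build VGG16 without ImageNet weights; the checkpoint supplies all parameters.
model = models.vgg16(weights=None)
# Replace the final classifier layer with a 5-class output to match the checkpoint.
model.classifier[6] = torch.nn.Linear(4096, 5)
model.load_state_dict(torch.load("best_vgg16.pth", map_location="cpu"))
model.eval()  # inference mode: disables dropout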