license: apache-2.0
language:
  - en
base_model:
  - WinKawaks/vit-tiny-patch16-224
datasets:
  - deanngkl/ferplus-7cls
  - deanngkl/raf-db-7emotions
  - deanngkl/affectnet_no_contempt
metrics:
  - accuracy

Vision Transformer for Facial Expression Classification

A deep learning project that fine-tunes a Vision Transformer (ViT-Tiny) model for 7-class facial emotion classification, using cleaned versions of the FER+, AffectNet, and RAF-DB datasets.

📌 Project Highlights

  • 🔍 7-class emotion classification: ['anger', 'disgust', 'fear', 'happiness', 'neutral', 'sadness', 'surprise']
  • 🧠 Model: ViT-Tiny (timm implementation)
  • 🎯 Achieved 82% validation accuracy on a blended hold-out set (8,377 images)
  • 📚 Cleaned & uploaded datasets to Hugging Face Datasets
  • 🧪 Integrated CutMix, cosine decay scheduler, and AMP for training
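
Two of the training techniques listed above, CutMix and cosine learning-rate decay, can be sketched in plain Python. This is a minimal illustration and not the project's training code: the function names `cutmix_box` and `cosine_lr`, the Beta parameter `alpha = 1.0`, the base learning rate `3e-4`, and the 100-step horizon are all assumptions chosen for the example. AMP is left out because it depends on the PyTorch runtime.

```python
import math
import random


def cutmix_box(height, width, lam, rng):
    """Sample a CutMix patch covering roughly (1 - lam) of the image."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h = int(height * cut_ratio)
    cut_w = int(width * cut_ratio)
    # Random patch centre; the patch is clipped at the image border.
    cy = rng.randrange(height)
    cx = rng.randrange(width)
    top = max(cy - cut_h // 2, 0)
    bottom = min(cy + cut_h // 2, height)
    left = max(cx - cut_w // 2, 0)
    right = min(cx + cut_w // 2, width)
    # Recompute the mixing weight from the clipped patch area.
    lam_adjusted = 1.0 - (bottom - top) * (right - left) / (height * width)
    return (top, left, bottom, right), lam_adjusted


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine decay from base_lr at step 0 to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


rng = random.Random(0)
lam = rng.betavariate(1.0, 1.0)  # alpha = 1.0 is an assumed value
box, lam_adj = cutmix_box(224, 224, lam, rng)
print("patch (top, left, bottom, right):", box, "label weight:", round(lam_adj, 3))

# base_lr = 3e-4 and 100 steps are assumed values for illustration only.
lrs = [cosine_lr(s, 100, base_lr=3e-4) for s in (0, 50, 100)]
print("learning rate at steps 0, 50, 100:", lrs)
```

In a CutMix step, the pixels inside the sampled box are replaced with the same region from another image in the batch, and the loss is a weighted sum of the two images' label losses, using `lam_adj` and `1 - lam_adj` as the weights.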

📦 Datasets

| Dataset | Link | Notes |
|---|---|---|
| FER+ | Hugging Face | Filtered to 7 basic emotions |
| AffectNet | Hugging Face | Removed 'contempt' class |
| RAF-DB | Hugging Face | Added proper emotion labels |

Combined dataset size and training-set class distribution:

Loaded 75398 training samples from 3 sources
Loaded 8377 validation samples from 3 sources
Training-set distribution:
  0: 0 : 9738
  1: 1 : 3385
  2: 2 : 4313
  3: 3 : 18315
  4: 4 : 20987
  5: 5 : 9289
  6: 6 : 9371
Emotion batch torch.Size([64, 3, 224, 224])
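
The counts in the log show a clear class imbalance. The sketch below computes each class's share of the training set and a "balanced" inverse-frequency weight of the form total / (num_classes × count). The index-to-label mapping is an assumption: the log prints only indices 0 to 6, and the code assumes they follow the order of the label list under Project Highlights. Whether the project used class weights during training is not stated; the weights here are only an illustration.

```python
# Per-class training counts copied from the log above (index -> count).
train_counts = {0: 9738, 1: 3385, 2: 4313, 3: 18315, 4: 20987, 5: 9289, 6: 9371}
val_total = 8377  # validation samples reported in the log

# Assumed index order: the label list from Project Highlights.
labels = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]

total = sum(train_counts.values())
num_classes = len(train_counts)

shares = {}
weights = {}
for idx, count in train_counts.items():
    shares[labels[idx]] = count / total
    # "Balanced" inverse-frequency weight: rarer classes get larger weights.
    weights[labels[idx]] = total / (num_classes * count)

# Ratio between the largest and the smallest class.
imbalance_ratio = max(train_counts.values()) / min(train_counts.values())
# Fraction of all loaded samples that is held out for validation.
val_fraction = val_total / (total + val_total)

for name in labels:
    print(f"{name:<10} share={shares[name]:.3f} weight={weights[name]:.2f}")
print(f"total={total} imbalance_ratio={imbalance_ratio:.2f} val_fraction={val_fraction:.3f}")
```

The per-class counts sum to the 75,398 training samples reported in the log, the largest class is about 6.2 times the size of the smallest, and the validation set is roughly 10% of all loaded samples.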

🙋‍♂️ Author

Dean Ng Kwan Lung

Blog : Portfolio
LinkedIn : LinkedIn
GitHub : GitHub
Email : kwanlung123@gmail.com