---
license: mit
language:
  - en
metrics:
  - accuracy
  - f1
  - precision
base_model:
  - bhadresh-savani/distilbert-base-uncased-emotion
pipeline_tag: audio-classification
tags:
  - speech
  - emotion
  - ser
  - classification
---

# TuBERT: Multimodal Speech Emotion Recognition for Real-Time Avatar Control

This project was developed for my senior thesis at Princeton University. A paper will be published soon.

## About

TuBERT is a multimodal speech emotion recognition model that runs in real-time and on-device. I designed it with PNGTubers in mind, but there are plenty of other applications for it as well!

To test the model for yourself using a GUI, see the GitHub repository for installation instructions.

## Usage

`tubert.pt` is the base TuBERT model trained on MELD, described in the paper and used by default for evaluation. `tubert_iemocap.pt` is the version of the model fine-tuned on IEMOCAP.
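
As a minimal sketch of picking between the two checkpoints: the filenames come from this card, but loading them as pickled PyTorch modules with `torch.load` (and running on CPU) is an assumption, since the card does not document the checkpoint format.

```python
import os

# Checkpoint filenames are from this model card; everything else here
# (pickled-module format, CPU loading) is an assumption.
ckpt = "tubert.pt"            # base model trained on MELD (default for evaluation)
# ckpt = "tubert_iemocap.pt"  # variant fine-tuned on IEMOCAP

if os.path.exists(ckpt):
    import torch

    model = torch.load(ckpt, map_location="cpu")  # assumed: full pickled module
    model.eval()
else:
    print(f"Download {ckpt} from this repository first.")
```

For the GUI and end-to-end usage, follow the installation instructions in the GitHub repository mentioned above.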