TuBERT: Multimodal Speech Emotion Recognition For Real-Time Avatar Control
This project was developed as my senior thesis at Princeton University. A paper describing it will be published soon.
About
TuBERT is a multimodal speech emotion recognition model that runs in real time and on-device. I designed it with PNGTubers in mind, but it has plenty of other applications as well!
To test the model for yourself using a GUI, see the GitHub repository for installation instructions.
Usage
tubert.pt is the base TuBERT model trained on MELD, described in the paper and used by default for evaluation. tubert_iemocap.pt is the same model fine-tuned on IEMOCAP.
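Since the repository does not show its loading code here, the sketch below is a hedged guess at how a PyTorch checkpoint like tubert.pt might be loaded for inference. The helper name load_tubert is hypothetical, and whether the .pt file holds a full serialized model or only a state_dict depends on how the checkpoint was saved; consult the repository's own evaluation scripts for the actual API.

```python
import torch

def load_tubert(checkpoint_path: str, device: str = "cpu"):
    """Hypothetical helper: load a TuBERT checkpoint for inference.

    Assumes the .pt file was saved with torch.save(). If it contains a
    state_dict rather than a full model object, it must instead be loaded
    into the model class defined in the repository.
    """
    # map_location lets a GPU-trained checkpoint load on CPU-only machines
    model = torch.load(checkpoint_path, map_location=device)
    if hasattr(model, "eval"):
        # Switch to inference mode (disables dropout, freezes batch-norm stats)
        model.eval()
    return model
```

Usage would then look like `model = load_tubert("tubert.pt")` for the MELD-trained base model, or `load_tubert("tubert_iemocap.pt")` for the IEMOCAP fine-tune.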