TuBERT: Multimodal Speech Emotion Recognition For Real-Time Avatar Control

This project was developed for my senior thesis at Princeton University. A paper will be published soon.

About

TuBERT is a multimodal speech emotion recognition model that runs in real-time and on-device. I designed it with PNGTubers in mind, but there are plenty of other applications for it as well!

To test the model for yourself using a GUI, see the GitHub repository for installation instructions.

Usage

tubert.pt is the base TuBERT model trained on MELD, described in the paper and used by default for evaluation. tubert_iemocap.pt is the version of the model fine-tuned on IEMOCAP.
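As a minimal sketch of how a `.pt` checkpoint like these is typically loaded and run in PyTorch: the example below saves and reloads a placeholder model to illustrate the pattern. `DummyModel`, its layer sizes, and the `checkpoint.pt` path are hypothetical stand-ins, not the actual TuBERT architecture or file layout; substitute the real model class and `tubert.pt` (or `tubert_iemocap.pt`). The 7 output classes mirror MELD's seven emotion labels.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real model class; TuBERT's actual
# architecture is described in the paper, not reproduced here.
class DummyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 7)  # 7 classes, matching MELD's emotion labels

    def forward(self, x):
        return self.fc(x)

# Save a state dict, then reload it -- the usual pattern for .pt checkpoints.
model = DummyModel()
torch.save(model.state_dict(), "checkpoint.pt")

reloaded = DummyModel()
reloaded.load_state_dict(torch.load("checkpoint.pt", map_location="cpu"))
reloaded.eval()  # inference mode for evaluation

with torch.no_grad():
    logits = reloaded(torch.zeros(1, 4))  # dummy input batch
print(logits.shape)  # torch.Size([1, 7])
```

If the checkpoints are instead full pickled modules or TorchScript archives, `torch.load("tubert.pt")` or `torch.jit.load("tubert.pt")` would replace the `load_state_dict` step.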
