Update README.md

```diff
--- a/README.md
+++ b/README.md
@@ -1,26 +1,28 @@
+---
+license: mit
+language:
+- en
+metrics:
+- accuracy
+- f1
+- precision
+base_model:
+- bhadresh-savani/distilbert-base-uncased-emotion
+pipeline_tag: audio-classification
+tags:
+- speech
+- emotion
+- ser
+- classification
+---
 # TuBERT: Multimodal Speech Emotion Recognition For Real-Time Avatar Control
 
-*This project was developed for my senior thesis at Princeton University.
-
-
+*This project was developed for my senior thesis at Princeton University. Paper being published soon.*
 
 ## About
 TuBERT is a multimodal speech emotion recognition model that runs in real-time and on-device. I designed it with PNGTubers in mind, but there are plenty of other applications for it as well!
 
-
-
-This repository also includes the code I used to train ([instructions](./model/setup/README.md)) and test ([instructions](./model/stats/README.md)) the model.
-
-## Installation
-If this project gets more attention I will also consider making a packaged version of TuBERT and the GUI you can install as an application with one click. Installation should be pretty painless either way, though.
-
-1. Clone the repository: `git clone https://github.com/YacoubKahkajian/TuBERT.git`
-2. Open the cloned folder: `cd TuBERT`
-3. Download a pre-trained TuBERT model from HuggingFace and place it in `/model`:
-4. `python -m venv .venv && source .venv/bin/activate`
-5. `pip install -e .`
+To test the model for yourself using a GUI, see the [GitHub repository](https://github.com/YacoubKahkajian/TuBERT) for installation instructions.
 
 ## Usage
-
-
-I have only tested the TuBERT model on Mac and Linux and the TuBERT GUI on Mac. I can't think of a reason TuBERT shouldn't be able to run on Windows, but let me know if there are any compatibility issues you run into regardless.
+`tubert.pt` is the base TuBERT model trained on MELD, described in the paper and used by default for evaluation. `tubert_iemocap.pt` is the version of the model fine-tuned on IEMOCAP.
```
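Since the checkpoints ship as `.pt` files, they can presumably be read with PyTorch's generic `torch.load`. The sketch below is an assumption, not the repository's actual API: the `load_tubert` helper and the `model/tubert.pt` path are made up for illustration, and the real code may instead construct the architecture first and load a state dict into it.

```python
import torch

def load_tubert(checkpoint_path: str, device: str = "cpu"):
    """Load a TuBERT checkpoint for inference (hypothetical helper).

    Handles both a pickled nn.Module and a plain state dict; in the
    state-dict case the caller must build the architecture and call
    load_state_dict themselves.
    """
    obj = torch.load(checkpoint_path, map_location=device, weights_only=False)
    if isinstance(obj, torch.nn.Module):
        obj.eval()  # disable dropout / batch-norm updates for inference
    return obj

# e.g. model = load_tubert("model/tubert.pt")
```

Loading onto `"cpu"` by default keeps the sketch consistent with the model's on-device, real-time goal; pass `device="cuda"` if a GPU is available.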