---
language: en
tags:
- audio
- speech-emotion-recognition
- pytorch
- cnn-bilstm
datasets:
- ravdess
- tess
metrics:
- accuracy
---

# Speech Emotion Recognition (CNN-BiLSTM-Attention)

This model was trained from scratch on the RAVDESS and TESS datasets.

## Model Architecture
- **Front-end**: 4-block CNN that extracts features from mel spectrograms.
- **Mid-section**: Bidirectional LSTM that models temporal dependencies.
- **Pooling**: Multi-head attention pooling.
- **Back-end**: Fully connected classifier.
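
The four stages above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the released implementation: the channel widths, kernel sizes, `n_mels=64`, hidden size, number of heads, dropout rate, and the class names `AttentionPooling` and `CnnBiLstmAttention` are all assumptions chosen only to make the example runnable.

```python
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    """Multi-head attention pooling (hypothetical variant): each head scores
    every time step, and the per-head weighted sums are concatenated."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.score = nn.Linear(dim, num_heads)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim)
        weights = torch.softmax(self.score(x), dim=1)      # (batch, time, heads)
        pooled = torch.einsum("bth,btd->bhd", weights, x)  # (batch, heads, dim)
        return pooled.flatten(1)                           # (batch, heads * dim)


class CnnBiLstmAttention(nn.Module):
    """Sketch of a CNN -> BiLSTM -> attention pooling -> classifier stack.
    All sizes below are assumed, not taken from the trained model."""

    def __init__(self, n_mels: int = 64, num_classes: int = 7,
                 hidden: int = 128, num_heads: int = 4):
        super().__init__()
        channels = [1, 16, 32, 64, 128]  # assumed widths for the 4 CNN blocks
        blocks = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            blocks += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(),
                nn.MaxPool2d(2),  # halves both the mel and the time axis
            ]
        self.cnn = nn.Sequential(*blocks)
        feat_dim = channels[-1] * (n_mels // 16)  # 4 poolings -> n_mels / 16
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.pool = AttentionPooling(2 * hidden, num_heads)
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden * num_heads, 128),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(128, num_classes),
        )

    def forward(self, mel: torch.Tensor) -> torch.Tensor:
        # mel: (batch, 1, n_mels, frames)
        x = self.cnn(mel)                     # (batch, C, n_mels/16, frames/16)
        x = x.permute(0, 3, 1, 2).flatten(2)  # (batch, time, C * n_mels/16)
        x, _ = self.lstm(x)                   # (batch, time, 2 * hidden)
        return self.classifier(self.pool(x))  # (batch, num_classes) logits


model = CnnBiLstmAttention().eval()
with torch.no_grad():
    logits = model(torch.randn(2, 1, 64, 128))  # two random "spectrograms"
```

With these assumed sizes, the input needs at least 16 time frames so that the four pooling steps leave at least one step for the LSTM.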

## Classes
0: neutral, 1: calm, 2: happy, 3: sad, 4: angry, 5: fearful, 6: disgust
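
To turn the classifier output into one of the labels above, the index mapping can be applied to the highest-scoring logit. The snippet below is a small sketch in plain Python; the names `ID2LABEL` and `decode_prediction` are hypothetical helpers, and it assumes the model emits one raw logit per class in the order listed.

```python
import math

# Index-to-label mapping copied from the class list above.
ID2LABEL = {
    0: "neutral",
    1: "calm",
    2: "happy",
    3: "sad",
    4: "angry",
    5: "fearful",
    6: "disgust",
}


def decode_prediction(logits):
    """Return (label, probability) for a sequence of 7 raw logits."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


label, confidence = decode_prediction([0.1, -1.2, 3.4, 0.0, 0.5, -0.3, 0.2])
```

For a PyTorch tensor of logits, calling `.tolist()` on a single row gives a list that this helper accepts.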