Text Emotion Recognition on MELD (BERT Embeddings + MLP Classifier)

This repository contains a text-only emotion classification model trained on the MELD (Multimodal EmotionLines Dataset) using pre-extracted BERT embeddings and a lightweight MLP classifier.

The model predicts one of MELD's seven emotion classes (anger, disgust, fear, joy, neutral, sadness, surprise) from a single utterance.


Model Overview

  • Base language model: bert-base-uncased
  • Embedding extraction: CLS token (no pooling)
  • Classifier: 2-layer MLP
  • Training strategy:
    • BERT encoder is frozen
    • Only the classifier is trained
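The classifier head described above can be sketched as follows. This is a minimal PyTorch sketch, not the released weights: the hidden width (256) and dropout rate (0.2) are assumptions, since the card only specifies a 2-layer MLP over 768-dim BERT CLS embeddings with 7 output classes.

```python
import torch
import torch.nn as nn

class EmotionMLP(nn.Module):
    """2-layer MLP over frozen BERT [CLS] embeddings (768-dim) -> 7 emotion logits.

    hidden_dim and dropout are illustrative assumptions, not the card's values.
    """
    def __init__(self, embed_dim: int = 768, hidden_dim: int = 256, num_classes: int = 7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 768) precomputed CLS embeddings; returns (batch, 7) logits
        return self.net(x)
```

Because the BERT encoder is frozen, only these linear layers receive gradients during training.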

Dataset

  • Name: MELD (declare-lab/MELD)
  • Modality: Text (utterances)
  • Setting: Multi-class emotion classification
  • Splits: Train / Validation / Test (official MELD splits)
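MELD uses a fixed set of seven emotion labels. A label mapping like the one below is needed to turn utterance annotations into class indices; the alphabetical ordering here is an assumption, and the index assignment used for the released classifier weights may differ.

```python
# MELD's seven emotion labels. The alphabetical id assignment below is an
# illustrative assumption; match it to the ordering used at training time.
EMOTIONS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]

label2id = {label: i for i, label in enumerate(EMOTIONS)}
id2label = {i: label for label, i in label2id.items()}
```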

Training Details

  • Loss: Cross-entropy
  • Optimizer: Adam
  • Max sequence length: 128
  • Batching: Embeddings are extracted offline in batches
  • Evaluation metrics:
    • Accuracy
    • Macro F1-score
    • Per-class F1-score
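The training setup above (Adam, cross-entropy over precomputed embeddings, accuracy and macro F1 for evaluation) can be sketched end-to-end. The synthetic tensors, learning rate, hidden width, and epoch count below are placeholders, not the card's actual hyperparameters.

```python
import torch
import torch.nn as nn
from sklearn.metrics import accuracy_score, f1_score

# Synthetic stand-ins for precomputed CLS embeddings and emotion labels.
X = torch.randn(64, 768)
y = torch.randint(0, 7, (64,))

# Illustrative classifier; hidden width and lr are assumptions.
model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 7))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(5):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    preds = model(X).argmax(dim=1)

acc = accuracy_score(y.numpy(), preds.numpy())
macro_f1 = f1_score(y.numpy(), preds.numpy(), average="macro")
per_class_f1 = f1_score(y.numpy(), preds.numpy(), average=None)
```

Note that only the MLP parameters are passed to the optimizer; the BERT encoder never appears in the training loop because its embeddings were computed offline.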

Important Notes

  • This model does not fine-tune BERT.
  • It relies on offline embedding extraction.
  • The provided weights correspond only to the classifier.



Intended Use

  • Text-only baseline for multimodal emotion recognition
  • Comparison with audio-only and fusion models
  • Research and educational purposes

Limitations

  • Context between utterances is not modeled
  • Speaker identity is not used
  • Performance depends on the quality of the extracted embeddings

Reproducibility

To reproduce the reported results, embeddings must be extracted using the same bert-base-uncased model with identical tokenization and maximum sequence length.
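The extraction step can be sketched with the `transformers` library: tokenize with `bert-base-uncased` at max length 128 and take the [CLS] token's hidden state without pooling. This is a sketch of the procedure the card describes; the exact batching and preprocessing details are assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"
MAX_LEN = 128  # must match the maximum sequence length used at training time

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
bert = AutoModel.from_pretrained(MODEL_NAME)
bert.eval()  # the encoder stays frozen; no gradients are needed

@torch.no_grad()
def extract_cls_embeddings(utterances):
    """Return the [CLS] token embedding (no pooling) for each utterance."""
    enc = tokenizer(
        utterances,
        padding=True,
        truncation=True,
        max_length=MAX_LEN,
        return_tensors="pt",
    )
    outputs = bert(**enc)
    # [CLS] is the first token of each sequence: (batch, 768)
    return outputs.last_hidden_state[:, 0, :]

emb = extract_cls_embeddings(["I can't believe it!", "That's fine."])
```

Feeding these embeddings to the released classifier only reproduces the reported results if the tokenizer, model checkpoint, and max length all match this setup.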

