This is the repository for the Gen AI final project.
# Transformer with Emotion Classification

## Overview

This is a Transformer-based model for **emotion classification** and **dialogue act recognition** on the [DailyDialog](http://yanran.li/dailydialog) dataset. It processes multi-turn dialogues to predict speakers' emotional states and communicative intentions. A **Stacked Autoencoder (SAE)** regularizes node usage, encouraging sparsity in the learned feature representations.

While the model predicts dialogue acts reliably, it struggles with emotion classification, often collapsing to binary labels (0 or 1), likely due to class imbalance in the data.

---

## Model Details

### Model Architecture
- **Transformer Encoder**: A standard Transformer encoder serves as the backbone, extracting contextual features from dialogues.
- **Batch Normalization**: Normalizes the extracted features.
- **Dropout**: Reduces overfitting.
- **Stacked Autoencoder (SAE)**: Regularizes feature representations by encouraging sparsity, adding a KL-divergence term to the training loss.
- **Classification Heads**:
  - **Dialogue Act Classifier**: Predicts communicative intentions (e.g., inform, question).
  - **Emotion Classifier**: Predicts one of the annotated emotions (e.g., happiness, sadness, anger).
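
The pieces above can be sketched in PyTorch as follows. This is a minimal illustration, not the trained model's actual configuration: the hyperparameters (`d_model`, class counts, SAE bottleneck size, sparsity target `rho`) are assumptions, and a single-layer autoencoder stands in for the stacked version.

```python
import torch
import torch.nn as nn

class DialogueClassifier(nn.Module):
    """Transformer encoder + BN + dropout + sparse autoencoder + two heads."""

    def __init__(self, vocab_size=30522, d_model=128, n_heads=4, n_layers=2,
                 n_acts=4, n_emotions=7, sae_dim=64, rho=0.05):
        super().__init__()
        self.rho = rho  # target mean activation for the sparse bottleneck
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.bn = nn.BatchNorm1d(d_model)
        self.dropout = nn.Dropout(0.1)
        self.sae_enc = nn.Linear(d_model, sae_dim)  # encoder half of the SAE
        self.sae_dec = nn.Linear(sae_dim, d_model)  # decoder half of the SAE
        self.act_head = nn.Linear(d_model, n_acts)
        self.emotion_head = nn.Linear(d_model, n_emotions)

    def forward(self, input_ids, attention_mask):
        x = self.encoder(self.embed(input_ids),
                         src_key_padding_mask=(attention_mask == 0))
        # mean-pool over the valid (unmasked) tokens
        m = attention_mask.unsqueeze(-1).float()
        pooled = (x * m).sum(dim=1) / m.sum(dim=1).clamp(min=1.0)
        pooled = self.dropout(self.bn(pooled))
        # KL divergence between the target sparsity rho and each hidden
        # unit's mean activation (Bernoulli-vs-Bernoulli KL, summed)
        h = torch.sigmoid(self.sae_enc(pooled))
        rho_hat = h.mean(dim=0).clamp(1e-6, 1 - 1e-6)
        kl_div = (self.rho * torch.log(self.rho / rho_hat)
                  + (1 - self.rho)
                  * torch.log((1 - self.rho) / (1 - rho_hat))).sum()
        feats = self.sae_dec(h)
        return self.act_head(feats), self.emotion_head(feats), kl_div

model = DialogueClassifier()
act_logits, emotion_logits, kl_div = model(
    torch.randint(0, 100, (2, 10)), torch.ones(2, 10, dtype=torch.long))
```

The KL term is returned alongside the logits so the training loop can weight it against the classification losses.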

### Input
- **input_ids**: Tokenized dialogue input sequences.
- **attention_mask**: Binary mask indicating which tokens in the input sequence are valid (non-padding).
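
A toy whitespace tokenizer makes the layout of these two tensors concrete (a real pipeline would use a subword tokenizer; the vocabulary here is a made-up stand-in):

```python
# Toy tokenizer showing the input_ids / attention_mask layout for a
# padded batch. The vocabulary is purely illustrative.
vocab = {"<pad>": 0, "hello": 1, "how": 2, "are": 3, "you": 4}

def encode_batch(texts, max_len=5):
    input_ids, attention_mask = [], []
    for text in texts:
        ids = [vocab[word] for word in text.split()]
        padding = max_len - len(ids)
        input_ids.append(ids + [0] * padding)  # pad with the <pad> id
        attention_mask.append([1] * len(ids) + [0] * padding)  # 1 = valid
    return input_ids, attention_mask

ids, mask = encode_batch(["hello", "how are you"])
```

Every sequence is padded to the same length, and the mask tells the encoder which positions to ignore.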

### Output
- **act_output**: Predicted dialogue act class.
- **emotion_output**: Predicted emotion class.
- **kl_div**: KL-divergence loss used for SAE regularization.

---

## Dataset: DailyDialog

The model is trained and evaluated on the [DailyDialog](http://yanran.li/dailydialog) dataset.

### Dataset Features
- **Size**: 13,118 dialogues, averaging about 8 turns per dialogue.
- **Annotations**:
  - **Dialogue Acts**: Intentions such as inform, question, directive, and commissive.
  - **Emotions**: Labels such as happiness, sadness, anger, surprise, and no emotion.
- **License**: [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).
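
In other words, each record carries parallel per-turn annotations. The sketch below shows that shape and how it flattens into per-utterance training examples; the field names and label ids follow common distributions of the dataset and are assumptions here, so check them against your copy.

```python
# One DailyDialog record: parallel lists with one utterance, one
# dialogue-act label, and one emotion label per turn (assumed layout).
record = {
    "dialog": ["Good morning!", "Morning! How are you?"],
    "act": [1, 2],      # e.g., 1 = inform, 2 = question
    "emotion": [4, 0],  # e.g., 4 = happiness, 0 = no emotion
}

def flatten(record):
    """Expand one dialogue into per-utterance (text, act, emotion) examples."""
    return list(zip(record["dialog"], record["act"], record["emotion"]))

examples = flatten(record)
```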

---

## Usage

### Training
1. **Dataset Preparation**: Preprocess the DailyDialog dataset with a tokenizer to produce `input_ids` and `attention_mask`.
2. **Training Steps**:
   - Run a forward pass of the input through the model.
   - Compute cross-entropy losses for the dialogue act and emotion classifiers.
   - Add the KL-divergence loss for SAE regularization.
   - Backpropagate and update the parameters.
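
One training step from the list above can be sketched as follows. The stand-in model, the dummy batch, and the regularizer weight `lambda_kl` are all illustrative assumptions, not values from this repo.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in model: returns act logits, emotion logits, and a scalar
# penalty playing the role of the SAE's KL term.
class TinyModel(nn.Module):
    def __init__(self, vocab=100, d=16, n_acts=4, n_emotions=7):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.act_head = nn.Linear(d, n_acts)
        self.emotion_head = nn.Linear(d, n_emotions)

    def forward(self, input_ids, attention_mask):
        m = attention_mask.unsqueeze(-1).float()
        pooled = (self.embed(input_ids) * m).sum(1) / m.sum(1)
        kl = pooled.abs().mean()  # placeholder for the real KL penalty
        return self.act_head(pooled), self.emotion_head(pooled), kl

model = TinyModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()
lambda_kl = 0.1  # assumed weight on the SAE regularizer

# Dummy batch standing in for a tokenized DailyDialog minibatch.
input_ids = torch.randint(0, 100, (8, 12))
attention_mask = torch.ones(8, 12, dtype=torch.long)
act_labels = torch.randint(0, 4, (8,))
emotion_labels = torch.randint(0, 7, (8,))

# Forward pass, both task losses, KL regularizer, then the update.
act_logits, emotion_logits, kl_div = model(input_ids, attention_mask)
loss = (ce(act_logits, act_labels)
        + ce(emotion_logits, emotion_labels)
        + lambda_kl * kl_div)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```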

### Inference
- **Input**: Tokenized text sequences and attention masks.
- **Output**: Predicted dialogue act and emotion classes.
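
At inference time, the predicted class for each task is simply the argmax over that head's logits (the logit values below are made up for illustration):

```python
import torch

# Made-up logits from the two heads for a single example.
act_logits = torch.tensor([[2.0, 0.1, -1.0, 0.3]])
emotion_logits = torch.tensor([[0.2, 1.5, -0.3, 0.0, 2.1, -1.0, 0.4]])

pred_act = act_logits.argmax(dim=-1)          # index of the top act logit
pred_emotion = emotion_logits.argmax(dim=-1)  # index of the top emotion logit
```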

---

## Limitations
- **Emotion Classification**: The model struggles to predict diverse emotional states, often collapsing to binary outputs (0 or 1).
- **Imbalanced Dataset**: Emotion labels in DailyDialog are unevenly distributed (most utterances carry no emotion), which degrades performance on the rarer classes.
- **Limited Domain**: The dataset covers everyday conversations, so the model may not generalize to other dialogue domains.
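
One common mitigation for the imbalance (not implemented in this repo, and with made-up class counts) is weighting the cross-entropy loss by inverse class frequency:

```python
import torch
import torch.nn as nn

# Hypothetical per-class utterance counts (NOT the real DailyDialog
# distribution); index 0 is "no emotion", which dominates in practice.
counts = torch.tensor([85000.0, 1000.0, 350.0, 1000.0, 12000.0, 150.0, 1800.0])

# Inverse-frequency weights: rarer classes get larger weights.
weights = counts.sum() / (len(counts) * counts)
criterion = nn.CrossEntropyLoss(weight=weights)
```

Passing `criterion` to the training loop in place of the unweighted loss makes mistakes on rare emotions cost proportionally more.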

---

## Citation

If you use this model or the DailyDialog dataset, please cite:

```bibtex
@inproceedings{li2017dailydialog,
  title={DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset},
  author={Li, Yanran and Su, Hui and Shen, Xiaoyu and Li, Wenjie and Cao, Ziqiang and Niu, Shuzi},
  booktitle={Proceedings of the Eighth International Joint Conference on Natural Language Processing (IJCNLP)},
  year={2017}
}
```

## Info
License: MIT