# RoBERTa Dialogue Sentiment Analysis: EmoWOZ Fine-Tuning

Fine-tunes `roberta-base` (or `roberta-large`) on the EmoWOZ dataset for **per-utterance emotion classification** in task-oriented dialogue, using a configurable sliding window of preceding dialogue history as context.

## Emotion Labels

| ID | Label        | Notes                       |
|----|--------------|-----------------------------|
| -1 | system       | Filtered out, not predicted |
| 0  | neutral      |                             |
| 1  | fearful      |                             |
| 2  | dissatisfied |                             |
| 3  | apologetic   |                             |
| 4  | abusive      |                             |
| 5  | excited      |                             |
| 6  | satisfied    |                             |

Only **user utterances** (emotion ≠ -1) are classified. System turns are retained as context but not predicted.

## Project Layout

```
roberta_emowoz/
├── configs/
│   └── default.yaml          # All hyperparameters (edit this)
├── data/
│   ├── dataset.py            # DialogueDataset + collator
│   └── preprocessing.py      # JSON → flat utterance samples
├── models/
│   ├── model.py              # RoBERTa wrapper with classification head
│   └── focal_loss.py         # Class-imbalance-aware loss
├── scripts/
│   ├── train.py              # Main training entry point
│   ├── evaluate.py           # Full eval with per-class metrics
│   └── predict.py            # Interactive / batch inference
├── outputs/                  # Checkpoints, logs, predictions (gitignored)
└── requirements.txt
```

## Quick Start

Conda is required, so install it first. If you are not familiar with Conda, look it up (for example, ask ChatGPT) before continuing.

```bash
# 1. Create the environment and install dependencies
conda create -n nst_v4 python=3.11
conda activate nst_v4
pip install -r requirements_normal.txt
pip install --index-url https://download.pytorch.org/whl/cu121 -r requirements_torch.txt

# 2. Place your data files in the project root (or update config paths)
#    set1_train.json  set1_val.json  set1_test.json

# 3. Train (uses defaults from DEFAULT_CONFIG in train.py, override any value)
#    On my RTX 3070 this takes about 2 hours to complete
python train.py

# Or with custom parameters:
python train.py --epochs 10 --batch_size 32 --history_window 4 --loss focal

# 4. Evaluate on test set
# python evaluate.py --checkpoint outputs/best_model

# 5. Interactive prediction
# python predict.py --checkpoint outputs/best_model --history_window 3
```

## Key Hyperparameter: `history_window`

`history_window` controls how many **preceding turns** (both user and system) are prepended as context before the current utterance.

```
history_window = 0  →  [CLS] current [SEP]
history_window = 2  →  [CLS] turn_t-2 [SEP] turn_t-1 [SEP] current [SEP]
history_window = 4  →  [CLS] turn_t-4 [SEP] turn_t-3 [SEP] turn_t-2 [SEP] turn_t-1 [SEP] current [SEP]
```

Turns are ordered oldest → newest. System turns are prefixed with `"SYS:"` and user turns with `"USR:"` to give the model speaker-role signals.

Recommended sweep: `[0, 2, 4, 6]`.

## Class Imbalance

EmoWOZ is heavily skewed toward class 0 (neutral). Two mitigation strategies are included and can be toggled in `configs/default.yaml`:

- **Focal Loss** (`loss: focal`): down-weights easy neutral examples.
- **Weighted Cross-Entropy** (`loss: weighted_ce`): per-class inverse-frequency weights computed from the training set.

## Outputs

After training, `outputs/` contains:

- `best_model/`: best checkpoint by macro-F1 on validation
- `last_model/`: final epoch checkpoint
- `training_log.jsonl`: epoch-level metrics
- `test_results.json`: per-class precision / recall / F1 + confusion matrix
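
To make the `history_window` behaviour concrete, here is a minimal sketch of how the context window and the system-turn filter could work together. It is not the code in `data/preprocessing.py`: the function names, the `"text"` / `"emotion"` dictionary keys, the space after each role prefix, and the example dialogue are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

SYSTEM_LABEL = -1  # emotion ID used for system turns in the label table


def build_context(turns: List[Dict], index: int, history_window: int) -> List[str]:
    """Role-prefixed segments for turns[index] and up to `history_window`
    preceding turns, ordered oldest to newest."""
    start = max(0, index - history_window)  # clamp at the start of the dialogue
    segments = []
    for turn in turns[start:index + 1]:
        role = "SYS:" if turn["emotion"] == SYSTEM_LABEL else "USR:"
        segments.append(f"{role} {turn['text']}")
    return segments


def make_samples(turns: List[Dict], history_window: int) -> List[Tuple[List[str], int]]:
    """One (segments, label) sample per user turn; system turns are kept
    only as context and never become prediction targets."""
    samples = []
    for i, turn in enumerate(turns):
        if turn["emotion"] == SYSTEM_LABEL:
            continue
        samples.append((build_context(turns, i, history_window), turn["emotion"]))
    return samples


def to_schematic(segments: List[str]) -> str:
    """Render segments in the [CLS] ... [SEP] notation used in this README."""
    return "[CLS] " + " [SEP] ".join(segments) + " [SEP]"


# Made-up three-turn dialogue (hypothetical data, not taken from EmoWOZ).
dialogue = [
    {"text": "I need a taxi to the station.", "emotion": 0},
    {"text": "Sorry, no taxis are available right now.", "emotion": -1},
    {"text": "That is really unhelpful.", "emotion": 2},
]

samples = make_samples(dialogue, history_window=2)
for segments, label in samples:
    print(label, to_schematic(segments))
```

With `history_window=2`, the second user turn is paired with both earlier turns, while the first user turn has no history to draw on and is encoded alone. A real pipeline would pass the segments to the RoBERTa tokenizer, which inserts its own special tokens.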
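
The two class-imbalance strategies can be illustrated with plain Python arithmetic. This is a sketch under stated assumptions, not the contents of `models/focal_loss.py`: the weight normalisation `total / (num_classes * count)`, the default `gamma=2.0`, and the function names are all choices made here for illustration. The focal term follows the standard form `-(1 - p)^gamma * log(p)`, where `p` is the probability the model assigns to the true class.

```python
import math
from collections import Counter
from typing import List


def inverse_frequency_weights(labels: List[int], num_classes: int = 7) -> List[float]:
    """Per-class weights proportional to 1 / class frequency.

    Assumption: normalised as total / (num_classes * count); classes that
    never occur in `labels` get weight 0.0.
    """
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def cross_entropy(p_true: float) -> float:
    """Cross-entropy for one example, given the true-class probability."""
    return -math.log(p_true)


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss for one example: cross-entropy scaled by (1 - p)^gamma."""
    return ((1.0 - p_true) ** gamma) * cross_entropy(p_true)


# Toy label distribution (hypothetical): 8 neutral, 2 dissatisfied.
toy_labels = [0] * 8 + [2] * 2
weights = inverse_frequency_weights(toy_labels)
print("class weights:", [round(w, 3) for w in weights])

# A confidently correct prediction is scaled down far more than an uncertain one.
for p in (0.9, 0.3):
    print(p, round(cross_entropy(p), 4), round(focal_loss(p), 4))
```

With `gamma=2.0`, an example predicted at 0.9 keeps only 1% of its cross-entropy loss, while one predicted at 0.3 keeps 49%, which is how focal loss shifts the training signal away from easy neutral utterances. Setting `gamma=0` recovers ordinary cross-entropy.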