---
language: en
license: mit
pipeline_tag: text-classification
tags:
- text-classification
- transformers
- pytorch
- multi-label-classification
- multi-class-classification
- emotion
- bert
- go_emotions
- emotion-classification
datasets:
- google-research-datasets/go_emotions
metrics:
- f1
- precision
- recall
widget:
- text: I’m just chilling today.
example_title: Neutral Example
- text: Thank you for saving my life!
example_title: Gratitude Example
- text: I’m nervous about my exam tomorrow.
example_title: Nervousness Example
base_model:
- google-bert/bert-base-uncased
---
# GoEmotions BERT Classifier
Fine-tuned [BERT-base-uncased](https://huggingface.co/bert-base-uncased) on [go_emotions](https://huggingface.co/datasets/go_emotions) for multi-label classification (28 emotions).
## Model Details
- **Architecture**: BERT-base-uncased (110M parameters)
- **Training Data**: [GoEmotions](https://huggingface.co/datasets/google-research-datasets/go_emotions) (58k Reddit comments, 28 emotions)
- **Loss Function**: Focal Loss (gamma=2)
- **Optimizer**: AdamW (lr=2e-5, weight_decay=0.01)
- **Epochs**: 5
- **Hardware**: Kaggle T4 x2 GPUs
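Focal loss down-weights easy examples so that rare emotion labels (e.g. grief, pride) contribute more to the gradient than abundant ones like neutral. A minimal sketch of a multi-label focal loss in PyTorch with gamma=2 as listed above (the exact formulation used in training is not specified on this card, so treat this as illustrative):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Multi-label focal loss: per-label BCE scaled by (1 - p_t)^gamma."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)  # probability of the true label
    return ((1 - p_t) ** gamma * bce).mean()

# Toy batch: 4 examples, 28 emotion labels
logits = torch.randn(4, 28)
targets = torch.zeros(4, 28)
targets[0, 3] = 1.0
loss = focal_loss(logits, targets)
```

With `gamma=0` the modulating factor vanishes and the loss reduces to plain binary cross-entropy.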
## Try It Out
For accurate predictions with optimized thresholds, use the [Gradio demo](https://logasanjeev-goemotions-bert-demo.hf.space).
## Performance
- **Micro F1**: 0.6025 (optimized thresholds)
- **Macro F1**: 0.5266
- **Precision**: 0.5425
- **Recall**: 0.6775
- **Hamming Loss**: 0.0372
- **Avg Positive Predictions**: 1.4564
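These aggregates can be reproduced from binary prediction/label matrices with scikit-learn; the sketch below uses tiny illustrative arrays (three labels, two examples), not the actual GoEmotions test set:

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score, hamming_loss

# Toy binary matrices: rows = examples, columns = labels
y_true = np.array([[1, 0, 0], [0, 1, 1]])
y_pred = np.array([[1, 0, 1], [0, 1, 0]])

micro_f1 = f1_score(y_true, y_pred, average="micro")
macro_f1 = f1_score(y_true, y_pred, average="macro")
precision = precision_score(y_true, y_pred, average="micro")
recall = recall_score(y_true, y_pred, average="micro")
hamming = hamming_loss(y_true, y_pred)          # fraction of wrong label cells
avg_positive = y_pred.sum(axis=1).mean()        # labels predicted per example
```

Micro averaging pools true/false positives across all 28 labels, while macro averaging weights each label equally, which is why the rare classes in the table below drag Macro F1 below Micro F1.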
### Class-Wise Performance
The following table shows per-class metrics on the test set using optimized thresholds (see `thresholds.json`):
| Emotion | F1 Score | Precision | Recall | Support |
|----------------|----------|-----------|--------|---------|
| admiration | 0.7022 | 0.6980 | 0.7063 | 504 |
| amusement | 0.8171 | 0.7692 | 0.8712 | 264 |
| anger | 0.5123 | 0.5000 | 0.5253 | 198 |
| annoyance | 0.3820 | 0.2908 | 0.5563 | 320 |
| approval | 0.4112 | 0.3485 | 0.5014 | 351 |
| caring | 0.4601 | 0.4045 | 0.5333 | 135 |
| confusion | 0.4488 | 0.4533 | 0.4444 | 153 |
| curiosity | 0.5721 | 0.4402 | 0.8169 | 284 |
| desire | 0.4068 | 0.6857 | 0.2892 | 83 |
| disappointment | 0.3476 | 0.3220 | 0.3775 | 151 |
| disapproval | 0.4126 | 0.3433 | 0.5169 | 267 |
| disgust | 0.4950 | 0.6329 | 0.4065 | 123 |
| embarrassment | 0.5000 | 0.7368 | 0.3784 | 37 |
| excitement | 0.4084 | 0.4432 | 0.3786 | 103 |
| fear | 0.6311 | 0.5078 | 0.8333 | 78 |
| gratitude | 0.9173 | 0.9744 | 0.8665 | 352 |
| grief | 0.2500 | 0.5000 | 0.1667 | 6 |
| joy | 0.6246 | 0.5798 | 0.6770 | 161 |
| love | 0.8110 | 0.7630 | 0.8655 | 238 |
| nervousness | 0.3830 | 0.3750 | 0.3913 | 23 |
| optimism | 0.5777 | 0.5856 | 0.5699 | 186 |
| pride | 0.4138 | 0.4615 | 0.3750 | 16 |
| realization | 0.2421 | 0.5111 | 0.1586 | 145 |
| relief | 0.5385 | 0.4667 | 0.6364 | 11 |
| remorse | 0.6797 | 0.5361 | 0.9286 | 56 |
| sadness | 0.5391 | 0.6900 | 0.4423 | 156 |
| surprise | 0.5724 | 0.5570 | 0.5887 | 141 |
| neutral | 0.6895 | 0.5826 | 0.8444 | 1787 |
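The per-class thresholds in `thresholds.json` are tuned rather than fixed at 0.5. One common way to obtain such thresholds is a per-class grid search that maximizes F1 on validation probabilities; the sketch below shows that approach (the actual tuning procedure for this model is not documented here):

```python
import numpy as np

def tune_thresholds(probs, y_true, grid=np.arange(0.1, 0.9, 0.05)):
    """Pick, per class, the threshold that maximizes F1 on validation data."""
    n_classes = probs.shape[1]
    best = np.full(n_classes, 0.5)
    for c in range(n_classes):
        best_f1 = -1.0
        for t in grid:
            pred = probs[:, c] >= t
            tp = np.sum(pred & (y_true[:, c] == 1))
            fp = np.sum(pred & (y_true[:, c] == 0))
            fn = np.sum(~pred & (y_true[:, c] == 1))
            f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
            if f1 > best_f1:
                best_f1, best[c] = f1, t
    return best
```

The resulting array would then be saved alongside the label list, mirroring the `thresholds.json` format consumed in the usage example below.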
## Usage
The model applies per-class thresholds (stored in `thresholds.json`) when converting sigmoid probabilities into predicted emotions. Example in Python:
```python
from transformers import BertForSequenceClassification, BertTokenizer
import torch
import json
import requests
# Load model and tokenizer
repo_id = "logasanjeev/goemotions-bert"
model = BertForSequenceClassification.from_pretrained(repo_id)
tokenizer = BertTokenizer.from_pretrained(repo_id)
# Load thresholds
thresholds_url = f"https://huggingface.co/{repo_id}/raw/main/thresholds.json"
thresholds_data = json.loads(requests.get(thresholds_url).text)
emotion_labels = thresholds_data["emotion_labels"]
thresholds = thresholds_data["thresholds"]
# Predict
text = "I’m just chilling today."
encodings = tokenizer(text, padding='max_length', truncation=True, max_length=128, return_tensors='pt')
with torch.no_grad():
    probs = torch.sigmoid(model(**encodings).logits).numpy()[0]
# Keep only emotions whose probability clears the per-class threshold
predictions = [(emotion_labels[i], float(p)) for i, (p, t) in enumerate(zip(probs, thresholds)) if p >= t]
print(sorted(predictions, key=lambda x: x[1], reverse=True))
# Output: [('neutral', 0.8147)]
```