---
license: mit
datasets:
- google-research-datasets/go_emotions
language:
- en
base_model:
- google/mobilebert-uncased
pipeline_tag: text-classification
library_name: transformers
---

#### Overview


This model is a fine-tuned version of [MobileBERT](https://huggingface.co/google/mobilebert-uncased), trained on the [go_emotions](https://huggingface.co/datasets/google-research-datasets/go_emotions) dataset for multi-label text classification.
<div>
  <a href=https://github.com/04AR/Senti target="_blank"><img src=https://img.shields.io/badge/Code-black.svg?logo=github height=22px></a>
  <a href=https://huggingface.co/AR04/Senti target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
  <a href=https://github.com/04AR/Senti/blob/main/Senti.ipynb target="_blank"><img src="https://img.shields.io/badge/Training%20Code-%20jupyter-orange"></a>
  <a href=https://huggingface.co/spaces/AR04/SentiDemo target="_blank"><img src="https://img.shields.io/badge/Demo-%20HuggingFaceDemo-orange"></a>  
</div>

#### Dataset used for the model

[go_emotions](https://huggingface.co/datasets/google-research-datasets/go_emotions) is built from Reddit data and has 28 labels (27 emotions plus `neutral`). It is a multi-label dataset: one or more labels may apply to any given input text. This model is therefore a multi-label classifier that outputs 28 probability scores, one per label, for each input text. Typically a threshold of 0.5 is applied to each probability to decide whether that label is predicted.
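As a minimal sketch of that thresholding step, assume the model emits one raw logit per label and that each logit is passed through its own sigmoid, which is the usual convention for multi-label heads. The helper names and the example logits below are illustrative and are not part of the model's API.

```python
import math

def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1), avoiding overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def predict_labels(logits, threshold=0.5):
    """Return one 0/1 decision per label; each label is decided independently."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]

# Hypothetical logits for three labels (made up for illustration).
example_logits = [2.9, -1.5, 0.1]
decisions = predict_labels(example_logits)
print(decisions)
```

Because every label is thresholded on its own, a text can end up with zero, one, or several predicted labels.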

#### 🏷️ Emotion Labels

| ID  | Emotion         | ID  | Emotion         | ID  | Emotion         | ID  | Emotion     |
|-----|----------------|-----|----------------|-----|----------------|-----|----------------|
| 0   | admiration     | 1   | amusement      | 2   | anger          | 3   | annoyance      |
| 4   | approval       | 5   | caring         | 6   | confusion      | 7   | curiosity      |
| 8   | desire         | 9   | disappointment | 10  | disapproval    | 11  | disgust        |
| 12  | embarrassment  | 13  | excitement     | 14  | fear           | 15  | gratitude      |
| 16  | grief          | 17  | joy            | 18  | love           | 19  | nervousness    |
| 20  | optimism       | 21  | pride          | 22  | realization    | 23  | relief         |
| 24  | remorse        | 25  | sadness        | 26  | surprise       | 27  | neutral        |
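The table can be turned into a lookup for decoding predictions. The sketch below copies the label names in ID order and assumes they match the `id2label` mapping stored in the model's config, which remains the authoritative source. `ID2LABEL`, `LABEL2ID`, and `decode` are hypothetical names chosen for this example.

```python
# Label names in ID order, copied from the table above.
ID2LABEL = [
    "admiration", "amusement", "anger", "annoyance",
    "approval", "caring", "confusion", "curiosity",
    "desire", "disappointment", "disapproval", "disgust",
    "embarrassment", "excitement", "fear", "gratitude",
    "grief", "joy", "love", "nervousness",
    "optimism", "pride", "realization", "relief",
    "remorse", "sadness", "surprise", "neutral",
]
LABEL2ID = {name: idx for idx, name in enumerate(ID2LABEL)}

def decode(multi_hot):
    """Turn a 28-element 0/1 vector into the list of active label names."""
    if len(multi_hot) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} values, got {len(multi_hot)}")
    return [ID2LABEL[i] for i, flag in enumerate(multi_hot) if flag]

# Example: a vector with IDs 0 and 18 switched on.
vector = [0] * len(ID2LABEL)
vector[0] = 1
vector[18] = 1
active = decode(vector)
print(active)
```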

#### How the model was created

The model was trained using `AutoModelForSequenceClassification.from_pretrained` with `problem_type="multi_label_classification"` for 3 epochs with a learning rate of 2e-5 and weight decay of 0.01.
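With `problem_type="multi_label_classification"`, Transformers sequence-classification heads compute a binary cross-entropy loss on the raw logits (PyTorch's `BCEWithLogitsLoss`) instead of a softmax cross-entropy, and they expect the targets as multi-hot float vectors. The pure-Python sketch below illustrates that loss for a single example. It is not the training code for this model, and `bce_with_logits` is a name made up for the illustration.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the numerically stable form: max(z, 0) - z * y + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# An undecided model (all logits 0) pays log(2) per label.
loss_undecided = bce_with_logits([0.0, 0.0], [1.0, 0.0])

# Confident and correct logits give a loss close to zero.
loss_confident = bce_with_logits([10.0, -10.0], [1.0, 0.0])

print(loss_undecided, loss_confident)
```

Each label contributes its own term, so the loss does not force the 28 probabilities to sum to one.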

#### Inference

There are multiple ways to use this model in Hugging Face Transformers. First install the dependencies:

```bash
pip install torch transformers
```

Possibly the simplest way to run inference is with a pipeline:

```python
from transformers import pipeline
classifier = pipeline(task="text-classification", model="AR04/Senti", top_k=None)
sentences = ["hi! u r looking beautiful today dear"]
model_outputs = classifier(sentences)
print(model_outputs[0])
# produces a list of dicts for each of the labels
```
Output:

```python
[{'label': 'admiration', 'score': 0.9517803192138672}, {'label': 'love', 'score': 0.18317067623138428}, {'label': 'joy', 'score': 0.03131399303674698}, {'label': 'neutral', 'score': 0.01567094214260578}, {'label': 'surprise', 'score': 0.009232419542968273}, {'label': 'approval', 'score': 0.007308646105229855}, {'label': 'excitement', 'score': 0.006345656234771013}, {'label': 'pride', 'score': 0.004945244640111923}, {'label': 'caring', 'score': 0.0038624939043074846}, {'label': 'realization', 'score': 0.0023580112028867006}, {'label': 'desire', 'score': 0.0017759536858648062}, {'label': 'optimism', 'score': 0.0013220690889284015}, {'label': 'sadness', 'score': 0.001188945840112865}, {'label': 'disappointment', 'score': 0.0009136834414675832}, {'label': 'gratitude', 'score': 0.0008250900427810848}, {'label': 'relief', 'score': 0.0005154621903784573}, {'label': 'amusement', 'score': 0.0004376845608931035}, {'label': 'fear', 'score': 0.00038696840056218207}, {'label': 'embarrassment', 'score': 0.0003084330528508872}, {'label': 'grief', 'score': 0.00019462488126009703}, {'label': 'confusion', 'score': 0.00018893269589170814}, {'label': 'annoyance', 'score': 0.0001587819424457848}, {'label': 'curiosity', 'score': 0.0001355114800389856}, {'label': 'remorse', 'score': 0.00011744408402591944}, {'label': 'anger', 'score': 0.00010586195276118815}, {'label': 'disgust', 'score': 9.386352030560374e-05}, {'label': 'nervousness', 'score': 7.547048153355718e-05}, {'label': 'disapproval', 'score': 3.7117086321813986e-05}]
```
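To turn scores like these into predicted labels, one option is to keep only the entries at or above a threshold. The sketch below uses three entries copied from the output above; `labels_above` is a hypothetical helper, and the 0.1 threshold is shown only to illustrate how the cut-off changes the result.

```python
# A few entries copied from the output above; the full list has 28 dicts.
sample_output = [
    {"label": "admiration", "score": 0.9517803192138672},
    {"label": "love", "score": 0.18317067623138428},
    {"label": "joy", "score": 0.03131399303674698},
]

def labels_above(scores, threshold=0.5):
    """Return the labels whose score meets or exceeds the threshold."""
    return [item["label"] for item in scores if item["score"] >= threshold]

predicted = labels_above(sample_output)               # default threshold of 0.5
lenient = labels_above(sample_output, threshold=0.1)  # lower, more permissive cut-off
print(predicted, lenient)
```

With the full pipeline result, the same helper can be applied to `model_outputs[0]`.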

#### Evaluation / metrics

Here are the evaluation results of **Senti** on the GoEmotions validation set:

| Metric       | Value |
|--------------|-------|
| Loss         | 0.085 |
| F1-score     | 0.586 |
| ROC AUC      | 0.752 |
| Accuracy     | 0.460 |
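
The card does not state how the F1-score and accuracy were aggregated across labels. One common convention for multi-label evaluation, assumed here purely for illustration, is micro-averaged F1 together with exact-match (subset) accuracy, where a prediction counts as correct only if all labels match. The function names and the toy data below are hypothetical.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def exact_match_accuracy(y_true, y_pred):
    """Fraction of examples whose full label vector is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)

# Toy multi-hot targets and predictions for three examples and three labels.
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]

f1 = micro_f1(y_true, y_pred)
acc = exact_match_accuracy(y_true, y_pred)
print(f1, acc)
```

Under this convention, exact-match accuracy is usually lower than F1, because a single wrong label makes the whole example count as a miss.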