all-MiniLM-L12-v2-go_emotions-classifier-onnx
- all-MiniLM-L12-v2-go_emotions-classifier-onnx/.gitattributes +35 -0
- all-MiniLM-L12-v2-go_emotions-classifier-onnx/README.md +124 -0
- all-MiniLM-L12-v2-go_emotions-classifier-onnx/model.onnx +3 -0
- all-MiniLM-L12-v2-go_emotions-classifier-onnx/source.txt +1 -0
- all-MiniLM-L12-v2-go_emotions-classifier-onnx/thresholds.json +1 -0
all-MiniLM-L12-v2-go_emotions-classifier-onnx/.gitattributes
ADDED
|
@@ -0,0 +1,35 @@
| 1 |
+
*.7z filter=lfs diff=lfs merge=lfs -text
|
| 2 |
+
*.arrow filter=lfs diff=lfs merge=lfs -text
|
| 3 |
+
*.bin filter=lfs diff=lfs merge=lfs -text
|
| 4 |
+
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
| 5 |
+
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
| 6 |
+
*.ftz filter=lfs diff=lfs merge=lfs -text
|
| 7 |
+
*.gz filter=lfs diff=lfs merge=lfs -text
|
| 8 |
+
*.h5 filter=lfs diff=lfs merge=lfs -text
|
| 9 |
+
*.joblib filter=lfs diff=lfs merge=lfs -text
|
| 10 |
+
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
| 11 |
+
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
| 12 |
+
*.model filter=lfs diff=lfs merge=lfs -text
|
| 13 |
+
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
| 14 |
+
*.npy filter=lfs diff=lfs merge=lfs -text
|
| 15 |
+
*.npz filter=lfs diff=lfs merge=lfs -text
|
| 16 |
+
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 17 |
+
*.ot filter=lfs diff=lfs merge=lfs -text
|
| 18 |
+
*.parquet filter=lfs diff=lfs merge=lfs -text
|
| 19 |
+
*.pb filter=lfs diff=lfs merge=lfs -text
|
| 20 |
+
*.pickle filter=lfs diff=lfs merge=lfs -text
|
| 21 |
+
*.pkl filter=lfs diff=lfs merge=lfs -text
|
| 22 |
+
*.pt filter=lfs diff=lfs merge=lfs -text
|
| 23 |
+
*.pth filter=lfs diff=lfs merge=lfs -text
|
| 24 |
+
*.rar filter=lfs diff=lfs merge=lfs -text
|
| 25 |
+
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
| 26 |
+
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
| 27 |
+
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
| 28 |
+
*.tar filter=lfs diff=lfs merge=lfs -text
|
| 29 |
+
*.tflite filter=lfs diff=lfs merge=lfs -text
|
| 30 |
+
*.tgz filter=lfs diff=lfs merge=lfs -text
|
| 31 |
+
*.wasm filter=lfs diff=lfs merge=lfs -text
|
| 32 |
+
*.xz filter=lfs diff=lfs merge=lfs -text
|
| 33 |
+
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
+
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
+
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
all-MiniLM-L12-v2-go_emotions-classifier-onnx/README.md
ADDED
|
@@ -0,0 +1,124 @@
| 1 |
+
---
|
| 2 |
+
language: en
|
| 3 |
+
tags:
|
| 4 |
+
- text-classification
|
| 5 |
+
- onnx
|
| 6 |
+
- emotions
|
| 7 |
+
- multi-class-classification
|
| 8 |
+
- multi-label-classification
|
| 9 |
+
datasets:
|
| 10 |
+
- go_emotions
|
| 11 |
+
license: mit
|
| 12 |
+
inference: false
|
| 13 |
+
widget:
|
| 14 |
+
- text: ONNX is so much faster, it's very handy!
|
| 15 |
+
---
|
| 16 |
+
|
| 17 |
+
### Overview
|
| 18 |
+
|
| 19 |
+
This is a multi-label, multi-class linear classifier for emotions that takes embeddings from [sentence-transformers/all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2) as input. It was trained on the [go_emotions](https://huggingface.co/datasets/go_emotions) dataset.
|
| 20 |
+
|
| 21 |
+
### Labels
|
| 22 |
+
|
| 23 |
+
The 28 labels from the [go_emotions](https://huggingface.co/datasets/go_emotions) dataset are:
|
| 24 |
+
```
|
| 25 |
+
['admiration', 'amusement', 'anger', 'annoyance', 'approval', 'caring', 'confusion', 'curiosity', 'desire', 'disappointment', 'disapproval', 'disgust', 'embarrassment', 'excitement', 'fear', 'gratitude', 'grief', 'joy', 'love', 'nervousness', 'optimism', 'pride', 'realization', 'relief', 'remorse', 'sadness', 'surprise', 'neutral']
|
| 26 |
+
```
|
| 27 |
+
|
| 28 |
+
### Metrics (exact match of labels per item)
|
| 29 |
+
|
| 30 |
+
This is a multi-label, multi-class dataset, so each label is effectively a separate binary classification. The metrics below are evaluated across all labels per item in the go_emotions test split.
|
| 31 |
+
|
| 32 |
+
With the threshold for each label chosen to maximise F1, the metrics (evaluated on the go_emotions test split) are:
|
| 33 |
+
|
| 34 |
+
- Precision: 0.378
|
| 35 |
+
- Recall: 0.438
|
| 36 |
+
- F1: 0.394
|
| 37 |
+
|
| 38 |
+
Weighted by the relative support of each label in the dataset, these become:
|
| 39 |
+
|
| 40 |
+
- Precision: 0.424
|
| 41 |
+
- Recall: 0.590
|
| 42 |
+
- F1: 0.481
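The difference between the unweighted and support-weighted figures can be illustrated with a minimal sketch. It assumes the unweighted numbers are a plain mean over labels and the weighted numbers are a mean weighted by each label's support; the helper name `macro_and_weighted` and the toy values are hypothetical and are not taken from the model's results.

```python
def macro_and_weighted(scores, supports):
    """Return (unweighted mean, support-weighted mean) of per-label scores."""
    if not scores or len(scores) != len(supports):
        raise ValueError("scores and supports must be non-empty and the same length")
    macro = sum(scores) / len(scores)
    total_support = sum(supports)
    weighted = sum(s * n for s, n in zip(scores, supports)) / total_support
    return macro, weighted


# Hypothetical toy values: one frequent, well-predicted label and one rare, poorly-predicted label
f1_scores = [0.8, 0.2]
label_support = [90, 10]

macro_f1, weighted_f1 = macro_and_weighted(f1_scores, label_support)
print(f"macro={macro_f1:.3f} weighted={weighted_f1:.3f}")
```

Because frequent labels dominate the weighted mean, a dataset with one very common label (such as neutral here) will tend to show a weighted figure that sits closer to that label's score.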
|
| 43 |
+
|
| 44 |
+
Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label, the metrics (evaluated on the go_emotions test split, and unweighted by support) are:
|
| 45 |
+
|
| 46 |
+
- Precision: 0.568
|
| 47 |
+
- Recall: 0.214
|
| 48 |
+
- F1: 0.260
|
| 49 |
+
|
| 50 |
+
### Metrics (per-label)
|
| 51 |
+
|
| 52 |
+
This is a multi-label, multi-class dataset, so each label is effectively a separate binary classification and metrics are better measured per label.
|
| 53 |
+
|
| 54 |
+
With the threshold for each label chosen to maximise F1, the metrics (evaluated on the go_emotions test split) are:
|
| 55 |
+
| | f1 | precision | recall | support | threshold |
|
| 56 |
+
| -------------- | ----- | --------- | ------ | ------- | --------- |
|
| 57 |
+
| admiration | 0.540 | 0.463 | 0.649 | 504 | 0.20 |
|
| 58 |
+
| amusement | 0.686 | 0.669 | 0.705 | 264 | 0.25 |
|
| 59 |
+
| anger | 0.419 | 0.373 | 0.480 | 198 | 0.15 |
|
| 60 |
+
| annoyance | 0.276 | 0.189 | 0.512 | 320 | 0.10 |
|
| 61 |
+
| approval | 0.299 | 0.260 | 0.350 | 351 | 0.15 |
|
| 62 |
+
| caring | 0.303 | 0.219 | 0.489 | 135 | 0.10 |
|
| 63 |
+
| confusion | 0.284 | 0.269 | 0.301 | 153 | 0.15 |
|
| 64 |
+
| curiosity | 0.365 | 0.310 | 0.444 | 284 | 0.15 |
|
| 65 |
+
| desire | 0.274 | 0.237 | 0.325 | 83 | 0.15 |
|
| 66 |
+
| disappointment | 0.188 | 0.292 | 0.139 | 151 | 0.20 |
|
| 67 |
+
| disapproval | 0.305 | 0.257 | 0.375 | 267 | 0.15 |
|
| 68 |
+
| disgust | 0.450 | 0.462 | 0.439 | 123 | 0.20 |
|
| 69 |
+
| embarrassment | 0.348 | 0.375 | 0.324 | 37 | 0.30 |
|
| 70 |
+
| excitement | 0.313 | 0.306 | 0.320 | 103 | 0.20 |
|
| 71 |
+
| fear | 0.550 | 0.505 | 0.603 | 78 | 0.25 |
|
| 72 |
+
| gratitude | 0.776 | 0.774 | 0.778 | 352 | 0.30 |
|
| 73 |
+
| grief | 0.353 | 0.273 | 0.500 | 6 | 0.70 |
|
| 74 |
+
| joy | 0.370 | 0.361 | 0.379 | 161 | 0.20 |
|
| 75 |
+
| love | 0.626 | 0.717 | 0.555 | 238 | 0.35 |
|
| 76 |
+
| nervousness | 0.308 | 0.276 | 0.348 | 23 | 0.55 |
|
| 77 |
+
| optimism | 0.436 | 0.432 | 0.441 | 186 | 0.20 |
|
| 78 |
+
| pride | 0.444 | 0.545 | 0.375 | 16 | 0.60 |
|
| 79 |
+
| realization | 0.171 | 0.146 | 0.207 | 145 | 0.10 |
|
| 80 |
+
| relief | 0.133 | 0.250 | 0.091 | 11 | 0.60 |
|
| 81 |
+
| remorse | 0.468 | 0.426 | 0.518 | 56 | 0.30 |
|
| 82 |
+
| sadness | 0.413 | 0.409 | 0.417 | 156 | 0.20 |
|
| 83 |
+
| surprise | 0.314 | 0.303 | 0.326 | 141 | 0.15 |
|
| 84 |
+
| neutral | 0.622 | 0.482 | 0.879 | 1787 | 0.25 |
|
| 85 |
+
|
| 86 |
+
The thresholds are stored in `thresholds.json`.
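One way to choose a per-label threshold that maximises F1 is a simple grid search over candidate values. The sketch below is an assumption about how such thresholds could be found, not a description of how the values in this repository were produced; the helper names, the 0.05 grid and the toy data are all hypothetical.

```python
import numpy as np


def f1_binary(y_true, y_pred):
    """F1 for one binary label; returns 0.0 when there are no positives at all."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, scores, candidates=None):
    """Return (threshold, f1) for the first candidate that gives the highest F1."""
    if candidates is None:
        # Assumed grid: 0.05, 0.10, ..., 0.95
        candidates = np.round(np.linspace(0.05, 0.95, 19), 2)
    best_t, best_f1 = None, -1.0
    for t in candidates:
        f1 = f1_binary(y_true, (scores >= t).astype(int))
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1


# Hypothetical positive-case scores and true labels for a single emotion
y_true = np.array([1, 1, 0, 0, 1, 0])
scores = np.array([0.92, 0.43, 0.37, 0.11, 0.62, 0.22])

threshold, f1 = best_threshold(y_true, scores)
print(threshold, f1)
```

In practice the search would be run once per label, ideally on a held-out split rather than on the split used for the final reported metrics.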
|
| 87 |
+
|
| 88 |
+
### Use with ONNXRuntime
|
| 89 |
+
|
| 90 |
+
The input to the model is called `logits` (it takes the sentence embeddings), and there is one output per label. Each output is a 2D array with one row per input row and two columns: the first is the probability of the negative case and the second is the probability of the positive case.
|
| 91 |
+
|
| 92 |
+
```python
|
| 93 |
+
# Assuming you have embeddings from all-MiniLM-L12-v2 for the input sentences
|
| 94 |
+
# E.g. produced from sentence-transformers such as:
|
| 95 |
+
# huggingface.co/sentence-transformers/all-MiniLM-L12-v2
|
| 96 |
+
# or from an ONNX version E.g. huggingface.co/Xenova/all-MiniLM-L12-v2
|
| 97 |
+
|
| 98 |
+
print(embeddings.shape) # E.g. a batch of 1 sentence
|
| 99 |
+
# > (1, 384)
|
| 100 |
+
|
| 101 |
+
import onnxruntime as ort
|
| 102 |
+
|
| 103 |
+
sess = ort.InferenceSession("path_to_model_dot_onnx", providers=['CPUExecutionProvider'])
|
| 104 |
+
|
| 105 |
+
outputs = [o.name for o in sess.get_outputs()] # list of labels, in the order of the outputs
|
| 106 |
+
preds_onnx = sess.run(outputs, {'logits': embeddings})
|
| 107 |
+
# preds_onnx is a list with 28 entries, one per label,
|
| 108 |
+
# each with a numpy array of shape (1, 2) given the input was a batch of 1
|
| 109 |
+
|
| 110 |
+
print(outputs[0])
|
| 111 |
+
# > surprise
|
| 112 |
+
print(preds_onnx[0])
|
| 113 |
+
# > array([[0.97136074, 0.02863926]], dtype=float32)
|
| 114 |
+
|
| 115 |
+
# load thresholds.json and use that (per label) to convert the positive case score to a binary prediction
|
| 116 |
+
```
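The final step mentioned in the last comment above, converting the positive-case scores to binary predictions with the per-label thresholds, might look like the following sketch. It assumes the output names match the keys in `thresholds.json`; the helper `apply_thresholds` is hypothetical, and stand-in arrays are used in place of real model outputs so the snippet runs without the model (the `neutral` scores are made up).

```python
import json

import numpy as np


def apply_thresholds(labels, preds, thresholds):
    """Map each label to a boolean array (one entry per input row).

    labels:     output names, in the same order as preds
    preds:      list of arrays shaped (batch, 2), one per label
    thresholds: dict of label name -> threshold on the positive-case score
    """
    results = {}
    for label, pred in zip(labels, preds):
        positive_scores = np.asarray(pred)[:, 1]  # second column is the positive case
        results[label] = positive_scores >= thresholds[label]
    return results


# In real use (assumed):
#   with open("thresholds.json") as f: thresholds = json.load(f)
#   labels, preds = outputs, preds_onnx
thresholds = json.loads('{"surprise": 0.2, "neutral": 0.25}')
labels = ["surprise", "neutral"]
preds = [
    np.array([[0.97136074, 0.02863926]], dtype=np.float32),
    np.array([[0.3, 0.7]], dtype=np.float32),  # hypothetical scores
]

binary = apply_thresholds(labels, preds, thresholds)
predicted = [label for label in labels if binary[label][0]]  # labels for the first input row
print(predicted)
```

Since each label is thresholded independently, an input can end up with several predicted labels or with none.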
|
| 117 |
+
|
| 118 |
+
### Commentary on the dataset
|
| 119 |
+
|
| 120 |
+
Some labels (e.g. gratitude, F1 0.776) perform very strongly when considered independently, whilst others (e.g. relief, F1 0.133) perform very poorly.
|
| 121 |
+
|
| 122 |
+
This is a challenging dataset. Labels such as relief have far fewer examples in the training data (fewer than 100 out of 40k+, and only 11 in the test split).
|
| 123 |
+
|
| 124 |
+
There are also ambiguities and/or labelling errors visible in the go_emotions training data that are suspected to constrain performance. Cleaning the dataset to reduce the mistakes, ambiguity, conflicts and duplication in the labelling would likely produce a higher-performing model.
|
all-MiniLM-L12-v2-go_emotions-classifier-onnx/model.onnx
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:da12e05903f911fe4e3792dbac3f7ff40163066aa6b92b7f73d5569cbea95ba2
|
| 3 |
+
size 117509
|
all-MiniLM-L12-v2-go_emotions-classifier-onnx/source.txt
ADDED
|
@@ -0,0 +1 @@
| 1 |
+
https://huggingface.co/SamLowe/all-MiniLM-L12-v2-go_emotions-classifier-onnx
|
all-MiniLM-L12-v2-go_emotions-classifier-onnx/thresholds.json
ADDED
|
@@ -0,0 +1 @@
| 1 |
+
{"admiration": 0.25, "amusement": 0.25, "anger": 0.15, "annoyance": 0.1, "approval": 0.15, "caring": 0.1, "confusion": 0.15, "curiosity": 0.15, "desire": 0.15, "disappointment": 0.15, "disapproval": 0.15, "disgust": 0.25, "embarrassment": 0.5, "excitement": 0.2, "fear": 0.35, "gratitude": 0.3, "grief": 0.55, "joy": 0.2, "love": 0.35, "nervousness": 0.5, "optimism": 0.25, "pride": 0.55, "realization": 0.1, "relief": 0.35, "remorse": 0.35, "sadness": 0.15, "surprise": 0.2, "neutral": 0.25}
|