Latvian Basic Emotion Classifier
A fine-tuned version of LVBERT for multi-label classification of Latvian text into Ekman’s six basic emotions plus a neutral class.
The model is trained on a combined dataset of go_emotions-lv and twitter_emotions-lv.
Predicted labels:
0: anger
1: disgust
2: fear
3: joy
4: sadness
5: surprise
6: neutral
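Because the task is multi-label, each label is scored independently rather than competing in a softmax, so a text can receive several labels or none. A minimal sketch of turning raw per-label logits into label names is shown here; the `decode` helper, the example logits, and the 0.5 threshold are assumptions for illustration, not settings stated in this card.

```python
import math

# Label order as listed above (index -> label name).
LABELS = ["anger", "disgust", "fear", "joy", "sadness", "surprise", "neutral"]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits, threshold=0.5):
    """Return every label whose sigmoid score reaches the threshold.

    The 0.5 threshold is an assumed default, not a value from this card.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Hypothetical logits for a single text.
example_logits = [-3.1, -4.0, -2.5, 1.2, -3.3, 0.4, -1.0]
print(decode(example_logits))
```

Lowering the threshold trades precision for recall, which may be worth exploring for labels whose recall is low.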
The random seed used for initialization was 42:
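    import random
    import numpy as np
    import torch

    def set_seed(seed=42):
        # Seed Python, NumPy, and PyTorch (CPU and all GPUs) for reproducibility.
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)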
Training parameters:
max_length: null
batch_size: 32
shuffle: True
num_workers: 4
pin_memory: False
drop_last: False
optimizer: adam
lr: 0.000005
weight_decay: 0
problem_type: multi_label_classification
num_epochs: 3
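In Hugging Face Transformers, `problem_type: multi_label_classification` makes the sequence-classification head use a binary cross-entropy loss over logits, with one multi-hot target vector per example. The following dependency-free sketch illustrates that loss for a single example; the function name and the example values are hypothetical, and this is not the training code used for this model.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# Hypothetical example: a text labelled with both joy (index 3) and surprise (index 5).
targets = [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0]
confident = [-4.0, -4.0, -4.0, 4.0, -4.0, 4.0, -4.0]  # agrees with the targets
uncertain = [0.0] * 7                                  # every label scored at 0.5

print(bce_with_logits(confident, targets))
print(bce_with_logits(uncertain, targets))
```

The confident prediction yields a much smaller loss than the uncertain one, which sits at log 2 per label.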
Evaluation
Evaluation results on the test split of go_emotions-lv:
|  | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| anger | 0.57 | 0.36 | 0.44 | 726 |
| disgust | 0.42 | 0.29 | 0.35 | 123 |
| fear | 0.59 | 0.43 | 0.50 | 98 |
| joy | 0.78 | 0.80 | 0.79 | 2104 |
| sadness | 0.65 | 0.42 | 0.51 | 379 |
| surprise | 0.62 | 0.38 | 0.47 | 677 |
| neutral | 0.66 | 0.58 | 0.62 | 1787 |
| micro avg | 0.70 | 0.59 | 0.64 | 5894 |
| macro avg | 0.61 | 0.46 | 0.52 | 5894 |
| weighted avg | 0.68 | 0.59 | 0.63 | 5894 |
| samples avg | 0.62 | 0.61 | 0.61 | 5894 |
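The averaged rows can be related to the per-class rows with a short calculation. The sketch below recomputes F1 as the harmonic mean of precision and recall, and derives the macro and weighted F1 averages from the per-class values reported for go_emotions-lv. The helper names are hypothetical, and because the inputs are already rounded to two decimals, recomputed values can differ from the reported ones in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the go_emotions-lv results.
ROWS = {
    "anger": (0.57, 0.36, 0.44, 726),
    "disgust": (0.42, 0.29, 0.35, 123),
    "fear": (0.59, 0.43, 0.50, 98),
    "joy": (0.78, 0.80, 0.79, 2104),
    "sadness": (0.65, 0.42, 0.51, 379),
    "surprise": (0.62, 0.38, 0.47, 677),
    "neutral": (0.66, 0.58, 0.62, 1787),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(rows):
    """Unweighted mean of the per-class F1 scores."""
    return sum(r[2] for r in rows.values()) / len(rows)


def weighted_f1(rows):
    """Mean of the per-class F1 scores, weighted by class support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[2] * r[3] for r in rows.values()) / total


total_support = sum(r[3] for r in ROWS.values())
print("total support:", total_support)
print("anger F1 from P and R:", round(f1(0.57, 0.36), 2))
print("macro F1:", round(macro_f1(ROWS), 2))
print("weighted F1:", round(weighted_f1(ROWS), 2))
```

The micro average cannot be recovered this way, since it needs the pooled true-positive, false-positive, and false-negative counts rather than the per-class ratios.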
Evaluation results on the test split of twitter_emotions-lv:
|  | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| anger | 0.94 | 0.87 | 0.90 | 12013 |
| disgust | 0.92 | 0.92 | 0.92 | 14117 |
| fear | 0.74 | 0.80 | 0.77 | 3342 |
| joy | 0.87 | 0.88 | 0.87 | 5913 |
| sadness | 0.81 | 0.80 | 0.81 | 4786 |
| surprise | 0.93 | 0.57 | 0.71 | 1510 |
| micro avg | 0.89 | 0.87 | 0.88 | 41681 |
| macro avg | 0.74 | 0.69 | 0.71 | 41681 |
| weighted avg | 0.89 | 0.87 | 0.88 | 41681 |
| samples avg | 0.86 | 0.87 | 0.86 | 41681 |
See also
https://huggingface.co/AiLab-IMCS-UL/mbert-lv-emotions-ekman
Acknowledgements
This work was supported by the EU Recovery and Resilience Facility project Language Technology Initiative (2.3.1.1.i.0/1/22/I/CFLA/002).