---
license: apache-2.0
base_model: vinai/bartpho-syllable
tags:
- vietnamese
- emotion-recognition
- text-classification
- VSMEC
datasets:
- VSMEC
metrics:
- accuracy
- macro-f1
model-index:
- name: bartpho
  results:
  - task:
      type: text-classification
      name: Emotion Recognition
    dataset:
      name: VSMEC
      type: VSMEC
    metrics:
      - type: accuracy
        value: 0.6378066378066378
      - type: macro-f1
        value: 0.6288407005570578
---

# bartpho: Emotion Recognition for Vietnamese Text

This model is a fine-tuned version of [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable) on the **VSMEC** dataset for emotion recognition in Vietnamese text.

## Model Details

* **Base Model**: vinai/bartpho-syllable
* **Description**: BartPho - Vietnamese BART
* **Dataset**: VSMEC (Vietnamese Social Media Emotion Corpus)
* **Fine-tuning Framework**: Hugging Face Transformers
* **Task**: Emotion Classification (7 classes)

### Hyperparameters

* Batch size: `32`
* Learning rate: `2e-5`
* Epochs: `100`
* Max sequence length: `256`
* Weight decay: `0.01`
* Warmup steps: `500`
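
As a rough illustration of how these values fit together, the sketch below collects them under the matching `transformers.TrainingArguments` parameter names and shows what 500 warmup steps mean for the learning rate. The card does not state which training loop or scheduler was used, so the linear warmup followed by linear decay, the single-device batch size mapping, and the `total_steps` value are all assumptions made for illustration.

```python
# Hyperparameters from the list above, keyed by the corresponding
# transformers.TrainingArguments parameter names.
training_kwargs = {
    "per_device_train_batch_size": 32,  # assumes training on a single device
    "learning_rate": 2e-5,
    "num_train_epochs": 100,
    "weight_decay": 0.01,
    "warmup_steps": 500,
}
MAX_SEQ_LENGTH = 256  # applied at tokenization time, not by the trainer


def linear_warmup_lr(step, total_steps, peak_lr=2e-5, warmup_steps=500):
    """Learning rate at `step`, assuming linear warmup then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# total_steps is a made-up value; the real number depends on the training set size.
for step in (0, 250, 500, 5000, 10000):
    print(step, linear_warmup_lr(step, total_steps=10000))
```

Under these assumptions the rate climbs from zero to `2e-5` over the first 500 optimizer steps and then falls back to zero by the final step.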

## Dataset

The model was trained on the **VSMEC** dataset, which contains 6,927 Vietnamese social media text samples annotated with emotion labels. The dataset includes the following emotion categories:

* **Enjoyment** (0): Positive emotions, joy, happiness
* **Sadness** (1): Sad, disappointed, gloomy feelings
* **Anger** (2): Angry, frustrated, irritated
* **Fear** (3): Scared, anxious, worried
* **Disgust** (4): Disgusted, repelled
* **Surprise** (5): Surprised, shocked, amazed
* **Other** (6): Neutral or unclassified emotions
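
The index-to-name mapping above can be kept as a pair of dictionaries, which is a minimal sketch of how label IDs are usually handled in classification code. The names `ID2LABEL`, `LABEL2ID`, and `decode` are hypothetical helpers, not part of the released model.

```python
# Class indices and names as listed in this card.
ID2LABEL = {
    0: "Enjoyment",
    1: "Sadness",
    2: "Anger",
    3: "Fear",
    4: "Disgust",
    5: "Surprise",
    6: "Other",
}
# Inverse mapping, useful when encoding string labels for training.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def decode(class_ids):
    """Convert a sequence of class indices to emotion names."""
    return [ID2LABEL[i] for i in class_ids]


print(decode([0, 3, 6]))  # ['Enjoyment', 'Fear', 'Other']
```

Mappings like these can also be passed as `id2label` and `label2id` when loading a model with `from_pretrained`, so that the configuration carries readable names. Whether this checkpoint's configuration already includes them is not stated here.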

## Results

The model was evaluated using the following metrics:

* **Accuracy**: `0.6378`
* **Macro-F1**: `0.6288`
* **Macro-Precision**: `0.6464`
* **Macro-Recall**: `0.6326`
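
For readers unfamiliar with macro averaging, the sketch below shows one common way such metrics are computed: precision, recall, and F1 are calculated per class and then averaged with equal weight per class. This is an illustrative reimplementation on made-up labels, not the evaluation script used for this model, and `macro_scores` is a hypothetical helper name.

```python
def macro_scores(y_true, y_pred, num_classes=7):
    """Return (accuracy, macro_precision, macro_recall, macro_f1)."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return (
        accuracy,
        sum(precisions) / num_classes,
        sum(recalls) / num_classes,
        sum(f1s) / num_classes,
    )


# Toy labels over three classes, invented for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(macro_scores(y_true, y_pred, num_classes=3))
```

Because F1 is averaged per class, Macro-F1 need not equal the harmonic mean of Macro-Precision and Macro-Recall.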

## Usage

The example below loads the model with the Hugging Face Transformers library and predicts the emotion of a single Vietnamese sentence:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hub repository name. "bartpho" is assumed from this card's model name;
# change it if the repository ID differs.
model_key = "bartpho"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(f"visolex/{model_key}")
model = AutoModelForSequenceClassification.from_pretrained(f"visolex/{model_key}")
model.eval()  # disable dropout for inference

# Example text ("I am very happy because the weather is nice today!")
text = "Tôi rất vui vì hôm nay trời đẹp!"

# Tokenize, truncating to the 256-token limit used during fine-tuning
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Predict without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()

# Map class index to emotion name
emotion_map = {
    0: "Enjoyment",
    1: "Sadness",
    2: "Anger",
    3: "Fear",
    4: "Disgust",
    5: "Surprise",
    6: "Other",
}

predicted_emotion = emotion_map[predicted_class]
print(f"Text: {text}")
print(f"Predicted emotion: {predicted_emotion}")
```
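
The usage example takes the argmax of the logits, which yields a single label. To see how confident the model is across all seven emotions, the logits can be passed through a softmax. The sketch below does this in plain Python on made-up logits; `softmax`, `top_k`, and `example_logits` are hypothetical names, and with a real model the input would come from something like `outputs.logits[0].tolist()`.

```python
import math

EMOTIONS = ["Enjoyment", "Sadness", "Anger", "Fear", "Disgust", "Surprise", "Other"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most probable (emotion, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(EMOTIONS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for illustration; not produced by the model.
example_logits = [2.0, 0.5, 0.1, -1.0, -0.5, 0.3, 1.0]
for emotion, prob in top_k(example_logits):
    print(f"{emotion}: {prob:.3f}")
```

Softmax probabilities from a fine-tuned classifier are not necessarily well calibrated, so they are best read as a relative ranking rather than as exact confidences.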

## Citation

If you use this model, please cite:

```bibtex
@misc{visolex_emotion_bartpho,
  title={BartPho - Vietnamese BART for Vietnamese Emotion Recognition},
  author={ViSoLex Team},
  year={2024},
  url={https://huggingface.co/visolex/bartpho}
}
```

## License

This model is released under the Apache-2.0 license.

## Acknowledgments

* Base model: [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable)
* Dataset: VSMEC (Vietnamese Social Media Emotion Corpus)
* ViSoLex Toolkit