---
license: apache-2.0
base_model: vinai/bartpho-syllable
tags:
- vietnamese
- emotion-recognition
- text-classification
- VSMEC
datasets:
- VSMEC
metrics:
- accuracy
- macro-f1
model-index:
- name: bartpho
  results:
  - task:
      type: text-classification
      name: Emotion Recognition
    dataset:
      name: VSMEC
      type: VSMEC
    metrics:
    - type: accuracy
      value: 0.6378066378066378
    - type: macro-f1
      value: 0.6288407005570578
---
# bartpho: Emotion Recognition for Vietnamese Text
This model is a fine-tuned version of [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable) on the **VSMEC** dataset for emotion recognition in Vietnamese text.
## Model Details
* **Base Model**: vinai/bartpho-syllable
* **Description**: BartPho - Vietnamese BART
* **Dataset**: VSMEC (Vietnamese Social Media Emotion Corpus)
* **Fine-tuning Framework**: HuggingFace Transformers
* **Task**: Emotion Classification (7 classes)
### Hyperparameters
* Batch size: `32`
* Learning rate: `2e-5`
* Epochs: `100`
* Max sequence length: `256`
* Weight decay: `0.01`
* Warmup steps: `500`
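To make the interaction between batch size, epochs, and warmup concrete, the sketch below computes the number of optimizer steps and the learning rate at a given step. It assumes a linear warmup followed by linear decay to zero (the default scheduler of the Transformers `Trainer`, though this card does not state which schedule was used) and no gradient accumulation. The training-set size of 5,000 is a made-up value for illustration, and the function names are hypothetical.

```python
import math

# Hyperparameters copied from the list above.
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
EPOCHS = 100
WARMUP_STEPS = 500


def total_training_steps(num_train_examples: int) -> int:
    """Optimizer steps for the whole run, assuming no gradient accumulation."""
    steps_per_epoch = math.ceil(num_train_examples / BATCH_SIZE)
    return steps_per_epoch * EPOCHS


def learning_rate_at(step: int, total_steps: int) -> float:
    """Linear warmup to LEARNING_RATE, then linear decay to zero (assumed schedule)."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - WARMUP_STEPS)


# Hypothetical training-set size, for illustration only.
total = total_training_steps(num_train_examples=5000)
for step in (0, 250, 500, total // 2, total):
    print(step, learning_rate_at(step, total))
```

With these settings the warmup covers only a small fraction of the run, so the peak learning rate of `2e-5` is reached early and most steps fall in the decay phase.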
## Dataset
The model was trained on the **VSMEC** dataset, which contains 6,927 Vietnamese social media text samples annotated with emotion labels. The dataset includes the following emotion categories:
* **Enjoyment** (0): Positive emotions, joy, happiness
* **Sadness** (1): Sad, disappointed, gloomy feelings
* **Anger** (2): Angry, frustrated, irritated
* **Fear** (3): Scared, anxious, worried
* **Disgust** (4): Disgusted, repelled
* **Surprise** (5): Surprised, shocked, amazed
* **Other** (6): Neutral or unclassified emotions
## Results
The model was evaluated using the following metrics:
* **Accuracy**: `0.6378`
* **Macro-F1**: `0.6288`
* **Macro-Precision**: `0.6464`
* **Macro-Recall**: `0.6326`
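For readers who want to reproduce these metrics on their own predictions, here is a minimal pure-Python sketch of accuracy and macro-averaged precision, recall, and F1. It assumes the common convention (used by scikit-learn, for example) in which macro-F1 is the unweighted mean of the per-class F1 scores; the card does not state which tool computed the reported numbers. The function name and the toy labels are hypothetical.

```python
def macro_scores(y_true, y_pred, num_classes=7):
    """Accuracy plus unweighted per-class averages of precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Undefined ratios (no predictions or no true samples) are scored as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "macro_precision": sum(precisions) / num_classes,
        "macro_recall": sum(recalls) / num_classes,
        "macro_f1": sum(f1s) / num_classes,
    }


# Toy labels for three classes, for illustration only.
scores = macro_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Under this convention macro-F1 generally differs from the harmonic mean of macro-precision and macro-recall, so the three macro figures cannot be derived from one another.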
## Usage
You can use this model for emotion recognition in Vietnamese text. Below is an example of how to use it with the HuggingFace Transformers library:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Assumption: the repository name matches this card's model name ("bartpho");
# adjust model_key if the model is published under a different ID.
model_key = "bartpho"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(f"visolex/{model_key}")
model = AutoModelForSequenceClassification.from_pretrained(f"visolex/{model_key}")
model.eval()  # disable dropout for inference

# Example text ("I'm very happy because the weather is nice today!")
text = "Tôi rất vui vì hôm nay trời đẹp!"

# Tokenize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Predict without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()

# Map to emotion name
emotion_map = {
    0: "Enjoyment",
    1: "Sadness",
    2: "Anger",
    3: "Fear",
    4: "Disgust",
    5: "Surprise",
    6: "Other",
}
predicted_emotion = emotion_map[predicted_class]
print(f"Text: {text}")
print(f"Predicted emotion: {predicted_emotion}")
```
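The usage example returns only the single most likely class. If per-class confidence is useful, the logits can be converted to probabilities with a softmax. The sketch below does this in plain Python on a list of seven logits; the logit values are made up for illustration, and the helper names are hypothetical. In practice the list would come from something like `outputs.logits[0].tolist()`.

```python
import math

# Label order follows the class indices listed in the Dataset section.
EMOTIONS = ["Enjoyment", "Sadness", "Anger", "Fear", "Disgust", "Surprise", "Other"]


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_emotions(logits, top_k=3):
    """Return the top_k (label, probability) pairs, most likely first."""
    probs = softmax(logits)
    ranked = sorted(zip(EMOTIONS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Made-up logits for illustration only.
example_logits = [2.1, 0.3, -0.5, -1.2, -0.8, 0.9, 0.1]
for label, prob in rank_emotions(example_logits):
    print(f"{label}: {prob:.3f}")
```

Softmax probabilities from a fine-tuned classifier are not necessarily well calibrated, so they are best read as a relative ranking rather than as exact confidence levels.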
## Citation
If you use this model, please cite:
```bibtex
@misc{visolex_emotion_bartpho,
  title={BartPho - Vietnamese BART for Vietnamese Emotion Recognition},
  author={ViSoLex Team},
  year={2024},
  url={https://huggingface.co/visolex/bartpho}
}
```
## License
This model is released under the Apache-2.0 license.
## Acknowledgments
* Base model: [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable)
* Dataset: VSMEC (Vietnamese Social Media Emotion Corpus)
* ViSoLex Toolkit