---
license: apache-2.0
base_model: vinai/bartpho-syllable
tags:
- vietnamese
- emotion-recognition
- text-classification
- VSMEC
datasets:
- VSMEC
metrics:
- accuracy
- macro-f1
model-index:
- name: bartpho
  results:
  - task:
      type: text-classification
      name: Emotion Recognition
    dataset:
      name: VSMEC
      type: VSMEC
    metrics:
    - type: accuracy
      value: 0.6378066378066378
    - type: macro-f1
      value: 0.6288407005570578
---

# bartpho: Emotion Recognition for Vietnamese Text

This model is a fine-tuned version of [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable) on the **VSMEC** dataset for emotion recognition in Vietnamese text.

## Model Details

* **Base Model**: vinai/bartpho-syllable
* **Description**: BartPho - Vietnamese BART
* **Dataset**: VSMEC (Vietnamese Social Media Emotion Corpus)
* **Fine-tuning Framework**: HuggingFace Transformers
* **Task**: Emotion Classification (7 classes)

### Hyperparameters

* Batch size: `32`
* Learning rate: `2e-5`
* Epochs: `100`
* Max sequence length: `256`
* Weight decay: `0.01`
* Warmup steps: `500`
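
For readers who want to set up a comparable fine-tuning run, the hyperparameters above map roughly onto keyword arguments of `transformers.TrainingArguments`. The sketch below only builds that mapping as a plain dictionary. The helper name `build_training_kwargs`, the `output_dir` value, and the reading of the batch size as per-device are assumptions for illustration; this card does not describe the actual training script. The maximum sequence length is not a training argument and would instead be applied at tokenization time.

```python
# Hyperparameters as listed in this model card.
HYPERPARAMETERS = {
    "batch_size": 32,
    "learning_rate": 2e-5,
    "epochs": 100,
    "max_seq_length": 256,  # applied when tokenizing, not in TrainingArguments
    "weight_decay": 0.01,
    "warmup_steps": 500,
}


def build_training_kwargs(output_dir="bartpho-vsmec"):
    """Map the card's hyperparameters to TrainingArguments keyword names.

    Assumption: the batch size of 32 is per device; the card does not say.
    The default output_dir is a hypothetical path.
    """
    hp = HYPERPARAMETERS
    return {
        "output_dir": output_dir,
        "per_device_train_batch_size": hp["batch_size"],
        "learning_rate": hp["learning_rate"],
        "num_train_epochs": hp["epochs"],
        "weight_decay": hp["weight_decay"],
        "warmup_steps": hp["warmup_steps"],
    }


training_kwargs = build_training_kwargs()
print(training_kwargs)
```

With Transformers installed, the resulting dictionary could be passed as `TrainingArguments(**training_kwargs)`.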

## Dataset

The model was trained on the **VSMEC** dataset, which contains 6,927 Vietnamese social media text samples annotated with emotion labels. The dataset includes the following emotion categories:

* **Enjoyment** (0): Positive emotions, joy, happiness
* **Sadness** (1): Sad, disappointed, gloomy feelings
* **Anger** (2): Angry, frustrated, irritated
* **Fear** (3): Scared, anxious, worried
* **Disgust** (4): Disgusted, repelled
* **Surprise** (5): Surprised, shocked, amazed
* **Other** (6): Neutral or unclassified emotions
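
The label scheme above can be written as a pair of lookup tables, which is handy when converting between class ids and emotion names. This is a minimal sketch built from the list in this section; the names `ID2LABEL`, `LABEL2ID`, and `decode_label` are illustrative, and whether the published model configuration already carries these names is not stated here.

```python
# Class ids and names exactly as listed in this model card.
ID2LABEL = {
    0: "Enjoyment",
    1: "Sadness",
    2: "Anger",
    3: "Fear",
    4: "Disgust",
    5: "Surprise",
    6: "Other",
}

# Reverse lookup: emotion name -> class id.
LABEL2ID = {label: class_id for class_id, label in ID2LABEL.items()}


def decode_label(class_id):
    """Return the emotion name for a class id, or raise on an unknown id."""
    if class_id not in ID2LABEL:
        raise ValueError(f"Unknown class id: {class_id}")
    return ID2LABEL[class_id]


print(decode_label(5))
print(LABEL2ID["Anger"])
```

Dictionaries of this shape can also be supplied as `id2label` and `label2id` when loading a sequence classification model with Transformers, so that predictions are reported by name.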

## Results

The model was evaluated with the following metrics:

* **Accuracy**: `0.6378`
* **Macro-F1**: `0.6288`
* **Macro-Precision**: `0.6464`
* **Macro-Recall**: `0.6326`
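
To make the macro-averaged metrics concrete, the sketch below computes accuracy and macro precision, recall, and F1 in plain Python on a made-up three-class example. Macro averaging takes the unweighted mean of the per-class scores, which is why a Macro-F1 value need not equal the harmonic mean of Macro-Precision and Macro-Recall. The function name `macro_scores`, the toy labels, and the choice to score an undefined ratio as `0.0` are assumptions; the evaluation code behind the numbers above is not shown in this card.

```python
def macro_scores(y_true, y_pred, num_classes):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    Per-class scores are averaged with equal weight. A ratio with a zero
    denominator is scored as 0.0, so a class that never occurs in either
    list contributes zeros to the averages (an assumption of this sketch).
    """
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "macro_precision": sum(precisions) / num_classes,
        "macro_recall": sum(recalls) / num_classes,
        "macro_f1": sum(f1s) / num_classes,
    }


# Toy labels for illustration only (not VSMEC data).
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
scores = macro_scores(toy_true, toy_pred, num_classes=3)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```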

## Usage

The example below shows how to run emotion recognition on Vietnamese text with the HuggingFace Transformers library:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer.
# Replace {model_key} with this model's repository name on the Hugging Face Hub.
model_id = "visolex/{model_key}"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

# Example text ("I am very happy because the weather is nice today!")
text = "Tôi rất vui vì hôm nay trời đẹp!"

# Tokenize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Predict without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()

# Map to emotion name
emotion_map = {
    0: "Enjoyment",
    1: "Sadness",
    2: "Anger",
    3: "Fear",
    4: "Disgust",
    5: "Surprise",
    6: "Other",
}

predicted_emotion = emotion_map[predicted_class]
print(f"Text: {text}")
print(f"Predicted emotion: {predicted_emotion}")
```
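
Beyond the single top prediction, it can be useful to see how confident the model is across all seven classes. The following sketch converts a list of logits into probabilities with a numerically stable softmax and ranks the emotions. It uses plain Python so it runs without a model; the logits are made-up values, and `softmax` and `top_emotions` are illustrative helper names. With a PyTorch model, `outputs.logits[0].tolist()` would yield a list of this shape.

```python
import math

# Emotion names in class-id order, as listed in this model card.
EMOTIONS = ["Enjoyment", "Sadness", "Anger", "Fear", "Disgust", "Surprise", "Other"]


def softmax(logits):
    """Convert raw logits to probabilities, subtracting the max for stability."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_emotions(logits, k=3):
    """Return the k most probable (emotion, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(EMOTIONS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for illustration; not actual model output.
example_logits = [2.1, 0.3, -0.5, -1.2, -0.8, 0.9, 0.1]
for name, prob in top_emotions(example_logits):
    print(f"{name}: {prob:.3f}")
```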

## Citation

If you use this model, please cite it as shown here, replacing `{model_key}` with this model's repository name and `{description}` with the description given under Model Details:

```bibtex
@misc{visolex_emotion_{model_key},
  title={ {description} for Vietnamese Emotion Recognition},
  author={ViSoLex Team},
  year={2024},
  url={https://huggingface.co/visolex/{model_key}}
}
```

## License

This model is released under the Apache-2.0 license.

## Acknowledgments

* Base model: [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable)
* Dataset: VSMEC (Vietnamese Social Media Emotion Corpus)
* ViSoLex Toolkit