---
license: apache-2.0
base_model: vinai/bartpho-syllable
tags:
- vietnamese
- emotion-recognition
- text-classification
- VSMEC
datasets:
- VSMEC
metrics:
- accuracy
- macro-f1
model-index:
- name: bartpho
  results:
  - task:
      type: text-classification
      name: Emotion Recognition
    dataset:
      name: VSMEC
      type: VSMEC
    metrics:
      - type: accuracy
        value: 0.6378066378066378
      - type: macro-f1
        value: 0.6288407005570578
---

# bartpho: Emotion Recognition for Vietnamese Text

This model is a fine-tuned version of [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable) on the **VSMEC** dataset for emotion recognition in Vietnamese text.

## Model Details

* **Base Model**: vinai/bartpho-syllable
* **Description**: BartPho - Vietnamese BART
* **Dataset**: VSMEC (Vietnamese Social Media Emotion Corpus)
* **Fine-tuning Framework**: HuggingFace Transformers
* **Task**: Emotion Classification (7 classes)

### Hyperparameters

* Batch size: `32`
* Learning rate: `2e-5`
* Epochs: `100`
* Max sequence length: `256`
* Weight decay: `0.01`
* Warmup steps: `500`
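
As a rough illustration, the values above could be handed to the Transformers `Trainer` along the following lines. This is a sketch under stated assumptions, not the authors' training script: mapping "batch size" to `per_device_train_batch_size`, the `output_dir` value, and the helper name `build_training_args` are all hypothetical.

```python
# Hyperparameters from this card, keyed by the matching TrainingArguments names.
# Assumption: the card's "batch size" is the per-device training batch size.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 32,
    "learning_rate": 2e-5,
    "num_train_epochs": 100,
    "weight_decay": 0.01,
    "warmup_steps": 500,
}

# The maximum sequence length is applied by the tokenizer, not by the trainer.
MAX_SEQ_LENGTH = 256


def build_training_args(output_dir="bartpho-vsmec"):
    """Hypothetical helper: build TrainingArguments from the values above."""
    # Imported lazily so the constants can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

`MAX_SEQ_LENGTH` would be passed to the tokenizer call (`truncation=True, max_length=MAX_SEQ_LENGTH`) rather than to `TrainingArguments`.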

## Dataset

The model was trained on the **VSMEC** dataset, which contains 6,927 Vietnamese social media text samples annotated with emotion labels. The dataset includes the following emotion categories:

* **Enjoyment** (0): Positive emotions, joy, happiness
* **Sadness** (1): Sad, disappointed, gloomy feelings
* **Anger** (2): Angry, frustrated, irritated
* **Fear** (3): Scared, anxious, worried
* **Disgust** (4): Disgusted, repelled
* **Surprise** (5): Surprised, shocked, amazed
* **Other** (6): Neutral or unclassified emotions
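
The index-to-label assignment above can be kept in a single place and reused for training and inference. A minimal sketch follows; the variable names are illustrative, and whether this checkpoint's config already stores these mappings is not stated here.

```python
# Emotion labels in index order, as listed on this card.
EMOTIONS = ["Enjoyment", "Sadness", "Anger", "Fear", "Disgust", "Surprise", "Other"]

# Forward and reverse lookups derived from the single list above.
id2label = dict(enumerate(EMOTIONS))
label2id = {label: idx for idx, label in id2label.items()}

print(id2label[0], label2id["Other"])  # prints "Enjoyment 6"
```

In Transformers, dictionaries of this shape can be supplied as the `id2label` and `label2id` configuration fields so that predictions carry readable names.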

## Results

The model was evaluated using the following metrics:

* **Accuracy**: `0.6378`
* **Macro-F1**: `0.6288`
* **Macro-Precision**: `0.6464`
* **Macro-Recall**: `0.6326`
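
Macro-averaged scores weight every emotion class equally, regardless of how many samples it has, which is why they can differ from accuracy. The sketch below shows one way to compute accuracy and macro-F1 in plain Python; the function names and the toy labels are illustrative only and are not the evaluation code or data behind the numbers above.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no gold and no predicted samples scores 0 here (a convention).
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example with three classes (not VSMEC data).
gold = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(f"accuracy={accuracy(gold, pred):.4f}, macro_f1={macro_f1(gold, pred, 3):.4f}")
```

For real evaluations, `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` computes the same quantity.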

## Usage

You can use this model for emotion recognition in Vietnamese text. Below is an example of how to use it with the HuggingFace Transformers library:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Repository name: assumed from this card's model name;
# replace it with the actual repository ID if it differs.
model_key = "bartpho"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(f"visolex/{model_key}")
model = AutoModelForSequenceClassification.from_pretrained(f"visolex/{model_key}")
model.eval()  # disable dropout for inference

# Example text ("I am very happy because the weather is nice today!")
text = "Tôi rất vui vì hôm nay trời đẹp!"

# Tokenize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Predict without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()

# Map to emotion name
emotion_map = {
    0: "Enjoyment",
    1: "Sadness",
    2: "Anger",
    3: "Fear",
    4: "Disgust",
    5: "Surprise",
    6: "Other"
}

predicted_emotion = emotion_map[predicted_class]
print(f"Text: {text}")
print(f"Predicted emotion: {predicted_emotion}")
```
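
The example above returns only the top class. When a confidence estimate is useful, the logits can be converted to probabilities with a softmax and ranked. The following is a self-contained sketch: `rank_emotions` is a hypothetical helper, and the logits are made-up values standing in for `outputs.logits[0].tolist()`.

```python
import math

# Emotion labels in index order, as listed on this card.
EMOTIONS = ["Enjoyment", "Sadness", "Anger", "Fear", "Disgust", "Surprise", "Other"]


def rank_emotions(logits):
    """Return (emotion, probability) pairs sorted from most to least likely."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(EMOTIONS, probs), key=lambda pair: pair[1], reverse=True)


# Made-up logits for illustration; in practice use outputs.logits[0].tolist().
example_logits = [2.0, 0.5, 0.1, -1.0, -0.5, 0.3, 1.0]
for emotion, prob in rank_emotions(example_logits)[:3]:
    print(f"{emotion}: {prob:.3f}")
```

Softmax probabilities from a fine-tuned classifier are not necessarily well calibrated, so they are best treated as relative scores.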

## Citation

If you use this model, please cite:

```bibtex
@misc{visolex_emotion_{model_key},
  title={BartPho - Vietnamese BART for Vietnamese Emotion Recognition},
  author={ViSoLex Team},
  year={2024},
  url={https://huggingface.co/visolex/{model_key}}
}
```

## License

This model is released under the Apache-2.0 license.

## Acknowledgments

* Base model: [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable)
* Dataset: VSMEC (Vietnamese Social Media Emotion Corpus)
* ViSoLex Toolkit