AnnyNguyen committed on
Commit
f54f3e6
verified
1 Parent(s): 094b4d6

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +131 -0
README.md ADDED
@@ -0,0 +1,131 @@
+ ---
+ license: apache-2.0
+ base_model: vinai/bartpho-syllable
+ tags:
+ - vietnamese
+ - emotion-recognition
+ - text-classification
+ - VSMEC
+ datasets:
+ - VSMEC
+ metrics:
+ - accuracy
+ - macro-f1
+ model-index:
+ - name: bartpho
+   results:
+   - task:
+       type: text-classification
+       name: Emotion Recognition
+     dataset:
+       name: VSMEC
+       type: VSMEC
+     metrics:
+     - type: accuracy
+       value: 0.6378066378066378
+     - type: macro-f1
+       value: 0.6288407005570578
+ ---
+
+ # bartpho: Emotion Recognition for Vietnamese Text
+
+ This model is a fine-tuned version of [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable) on the **VSMEC** dataset for emotion recognition in Vietnamese text.
+
+ ## Model Details
+
+ * **Base Model**: vinai/bartpho-syllable
+ * **Description**: BartPho - Vietnamese BART
+ * **Dataset**: VSMEC (Vietnamese Social Media Emotion Corpus)
+ * **Fine-tuning Framework**: HuggingFace Transformers
+ * **Task**: Emotion Classification (7 classes)
+
+ ### Hyperparameters
+
+ * Batch size: `32`
+ * Learning rate: `2e-5`
+ * Epochs: `100`
+ * Max sequence length: `256`
+ * Weight decay: `0.01`
+ * Warmup steps: `500`
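
To make the interaction between the learning rate and the warmup steps concrete, here is a minimal sketch. It assumes a linear warmup followed by linear decay to zero, which is the default schedule of the Transformers `Trainer`; this card does not state which scheduler was actually used. The function name `lr_at_step` and the value of `total_steps` are hypothetical and chosen only for illustration.

```python
def lr_at_step(step: int, total_steps: int,
               base_lr: float = 2e-5, warmup_steps: int = 500) -> float:
    """Learning rate at a given optimizer step.

    Assumed schedule (not confirmed by the model card): linear warmup
    from 0 to base_lr over warmup_steps, then linear decay back to 0.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: scale linearly from base_lr down to 0.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical total number of optimizer steps; the real value depends on
# the training-split size, batch size (32), and number of epochs (100).
total_steps = 10_000

for step in (0, 250, 500, 5_250, 10_000):
    print(f"step {step:>6}: lr = {lr_at_step(step, total_steps):.2e}")
```

Under this assumed schedule the peak rate of `2e-5` is reached exactly at step 500 and is never exceeded afterwards.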
+
+ ## Dataset
+
+ The model was trained on the **VSMEC** dataset, which contains 6,927 Vietnamese social media text samples annotated with emotion labels. The dataset includes the following emotion categories:
+
+ * **Enjoyment** (0): Positive emotions, joy, happiness
+ * **Sadness** (1): Sad, disappointed, gloomy feelings
+ * **Anger** (2): Angry, frustrated, irritated
+ * **Fear** (3): Scared, anxious, worried
+ * **Disgust** (4): Disgusted, repelled
+ * **Surprise** (5): Surprised, shocked, amazed
+ * **Other** (6): Neutral or unclassified emotions
+
+ ## Results
+
+ The model was evaluated using the following metrics:
+
+ * **Accuracy**: `0.6378`
+ * **Macro-F1**: `0.6288`
+ * **Macro-Precision**: `0.6464`
+ * **Macro-Recall**: `0.6326`
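
The macro-averaged metrics above can be illustrated with a small pure-Python sketch. It assumes the usual definition (per-class precision, recall, and F1 computed one-vs-rest, then averaged with equal weight per class); the card does not say which library produced the reported numbers. The function `macro_scores` and the toy labels are hypothetical and are not drawn from VSMEC.

```python
from typing import List, Sequence, Tuple


def macro_scores(y_true: Sequence[int], y_pred: Sequence[int],
                 num_classes: int) -> Tuple[float, float, float]:
    """Return (macro-precision, macro-recall, macro-F1)."""
    precisions: List[float] = []
    recalls: List[float] = []
    f1s: List[float] = []
    pairs = list(zip(y_true, y_pred))
    for c in range(num_classes):
        # One-vs-rest counts for class c.
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    # Macro average: every class counts equally, regardless of its size.
    return (sum(precisions) / num_classes,
            sum(recalls) / num_classes,
            sum(f1s) / num_classes)


# Toy example with three classes (hypothetical labels).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2, 2]
p, r, f1 = macro_scores(y_true, y_pred, num_classes=3)
print(f"macro-P={p:.4f}  macro-R={r:.4f}  macro-F1={f1:.4f}")
```

Because macro-F1 is the mean of the per-class F1 scores rather than the harmonic mean of macro-precision and macro-recall, it can fall below both of them. The toy example shows this, and it is consistent with the reported Macro-F1 of `0.6288` sitting below both `0.6464` and `0.6326`.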
+
+ ## Usage
+
+ You can use this model for emotion recognition in Vietnamese text. Below is an example of how to use it with the HuggingFace Transformers library:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ # Hub repository name. "bartpho" is an assumption taken from this card's
+ # model-index name; replace it with the actual repository name if it differs.
+ model_key = "bartpho"
+
+ # Load model and tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(f"visolex/{model_key}")
+ model = AutoModelForSequenceClassification.from_pretrained(f"visolex/{model_key}")
+ model.eval()  # disable dropout for inference
+
+ # Example text ("I am very happy because the weather is nice today!")
+ text = "Tôi rất vui vì hôm nay trời đẹp!"
+
+ # Tokenize
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
+
+ # Predict without tracking gradients
+ with torch.no_grad():
+     outputs = model(**inputs)
+ predicted_class = outputs.logits.argmax(dim=-1).item()
+
+ # Map to emotion name
+ emotion_map = {
+     0: "Enjoyment",
+     1: "Sadness",
+     2: "Anger",
+     3: "Fear",
+     4: "Disgust",
+     5: "Surprise",
+     6: "Other"
+ }
+
+ predicted_emotion = emotion_map[predicted_class]
+ print(f"Text: {text}")
+ print(f"Predicted emotion: {predicted_emotion}")
+ ```
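
The usage example reports only the top class. If a full probability distribution over the seven emotions is useful, the logits can be passed through a softmax. The sketch below does this in plain Python on made-up logits so that it runs without downloading the model; `rank_emotions` and `example_logits` are hypothetical names and values. With real model outputs, `torch.softmax(outputs.logits, dim=-1)` produces the same probabilities.

```python
import math
from typing import List, Tuple

# Label order follows the emotion categories listed in this card.
EMOTIONS = ["Enjoyment", "Sadness", "Anger", "Fear",
            "Disgust", "Surprise", "Other"]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_emotions(logits: List[float]) -> List[Tuple[str, float]]:
    """Pair each emotion with its probability, most likely first."""
    probs = softmax(logits)
    return sorted(zip(EMOTIONS, probs), key=lambda kv: kv[1], reverse=True)


# Hypothetical logits for one sentence (not real model output).
example_logits = [2.1, -0.3, 0.4, -1.2, -0.8, 0.9, 0.1]

for name, prob in rank_emotions(example_logits):
    print(f"{name:<10} {prob:.3f}")
```

Given the reported accuracy of about 0.64, inspecting the second-ranked emotion may be worthwhile when the top probability is low.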
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @misc{visolex_emotion_{model_key},
+   title={BartPho - Vietnamese BART for Vietnamese Emotion Recognition},
+   author={ViSoLex Team},
+   year={2024},
+   url={https://huggingface.co/visolex/{model_key}}
+ }
+ ```
+
+ ## License
+
+ This model is released under the Apache-2.0 license.
+
+ ## Acknowledgments
+
+ * Base model: [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable)
+ * Dataset: VSMEC (Vietnamese Social Media Emotion Corpus)
+ * ViSoLex Toolkit