tomasBernal committed on
Commit
9824a34
·
verified ·
1 Parent(s): f9a08d8

Create README.md

Files changed (1)
  1. README.md +217 -0
README.md ADDED
@@ -0,0 +1,217 @@
+ ---
+ language:
+ - en
+ license: mit
+ library_name: transformers
+ pipeline_tag: text-classification
+ tags:
+ - emotion-recognition
+ - speech-emotion-recognition
+ - text-classification
+ - english
+ - affective-computing
+ - umuteam
+ datasets:
+ - dair-ai/emotion
+ - go_emotions
+ - MELD
+ - ISEAR
+ metrics:
+ - accuracy
+ - f1
+
+ model-index:
+ - name: UMUTeam/roberta-emotion-en
+   results:
+   - task:
+       type: text-classification
+       name: Emotion Classification
+     dataset:
+       name: English Emotion Recognition Benchmark
+       type: custom
+     metrics:
+     - type: accuracy
+       value: 76.0842
+       name: Accuracy
+     - type: weighted-f1
+       value: 75.6852
+       name: Weighted F1
+     - type: macro-f1
+       value: 68.0266
+       name: Macro F1
+ ---
+
+ # UMUTeam/roberta-emotion-en
+
+ ## Model description
+
+ `UMUTeam/roberta-emotion-en` is an English text-based emotion recognition model developed as part of **speech-emotion**, an open-source multilingual and multimodal toolkit for emotion recognition from speech, text, and multimodal inputs.
+
+ This model performs **emotion classification from English text**.
+
+ The model is based on the RoBERTa Transformer architecture and was fine-tuned for emotion classification tasks in English.
+
+ It is designed to be used either as a standalone text-only classifier or as part of the broader `speech-emotion` framework, where textual representations can be combined with acoustic representations for multimodal emotion recognition.
+
+ The model predicts one of the following emotion labels:
+
+ - `angry`
+ - `disgust`
+ - `fear`
+ - `happy`
+ - `neutral`
+ - `sad`
+ - `surprise`
+
+ ## Intended use
+
+ This model is intended for research and applied scenarios involving English emotion recognition from text, such as:
+
+ - emotion analysis in transcribed speech
+ - conversational analysis
+ - affective computing research
+ - human-computer interaction
+ - educational or exploratory emotion analysis tools
+ - integration into multimodal speech emotion recognition pipelines
+
+ It can be used directly with the Hugging Face `transformers` library or through the `speech-emotion` toolkit.
+
+ ## Out-of-scope use
+
+ This model should not be used as the sole basis for high-stakes decisions, including but not limited to:
+
+ - clinical diagnosis
+ - mental health assessment
+ - employment, legal, or educational decisions
+ - biometric profiling or surveillance
+ - automated decisions affecting individuals without human oversight
+
+ Emotion recognition is inherently uncertain and context-dependent. Predictions should be interpreted as model estimates, not as definitive assessments of a person's emotional state.
+
+ ## Training data
+
+ The model was trained on the English text datasets used in the `speech-emotion` project.
+
+ The training data combines multiple publicly available English emotion recognition datasets, including:
+
+ - CARER
+ - GoEmotions
+ - ISEAR
+ - MELD
+
+ Because the original datasets use different emotion taxonomies, all datasets were harmonized into a unified seven-class emotion taxonomy:
+
+ - `angry`
+ - `disgust`
+ - `fear`
+ - `happy`
+ - `neutral`
+ - `sad`
+ - `surprise`
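
The harmonization described above can be pictured as a lookup from each source dataset's labels to the unified taxonomy. The sketch below is illustrative only: the mapping entries and the `harmonize` helper are hypothetical, and the rules actually used by the project are documented in its repository.

```python
# Unified seven-class taxonomy, as listed in this model card.
UNIFIED_LABELS = {"angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"}

# Hypothetical mapping from source labels to unified labels (assumption:
# the real pipeline may map, merge, or drop labels differently).
EXAMPLE_MAPPING = {
    "anger": "angry",
    "joy": "happy",
    "sadness": "sad",
    "fear": "fear",
    "surprise": "surprise",
    "disgust": "disgust",
    "neutral": "neutral",
}


def harmonize(source_label, mapping=EXAMPLE_MAPPING):
    """Return the unified label, or None when no mapping is defined."""
    unified = mapping.get(source_label.strip().lower())
    if unified is not None and unified not in UNIFIED_LABELS:
        raise ValueError(f"{unified!r} is not part of the unified taxonomy")
    return unified


print(harmonize("Joy"))   # happy
print(harmonize("love"))  # None: unmapped labels need an explicit decision
```

Returning `None` for unmapped labels keeps the decision to drop or remap them explicit instead of silently assigning a class.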
+
+ For the English text-based emotion recognition setup:
+
+ - Training samples: 93,525
+ - Validation samples: 11,691
+ - Test samples: 11,691
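
The counts above correspond to roughly an 80/10/10 split. A few lines of Python, using only the numbers listed in this card, make the proportions explicit:

```python
# Split sizes as reported in this model card.
splits = {"train": 93_525, "validation": 11_691, "test": 11_691}
total = sum(splits.values())

for name, count in splits.items():
    print(f"{name}: {count} ({count / total:.1%})")
print("total:", total)
```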
+
+ More details about the dataset preprocessing and label harmonization pipeline are available in the project repository:
+
+ https://github.com/NLP-UMUTeam/umuteam-speech-emotion
+
+ ## Evaluation
+
+ The model was evaluated on the English held-out test set used in the `speech-emotion` toolkit.
+
+ ### Performance comparison on English emotion recognition
+
+ | Configuration | Accuracy | Weighted Precision | Weighted F1 | Macro F1 |
+ |---|---:|---:|---:|---:|
+ | Speech-only | 95.1435 | 95.2700 | 95.1575 | 95.1679 |
+ | Text-only | 76.0842 | 75.5723 | 75.6852 | 68.0266 |
+ | Multimodal (Concat) | **96.0462** | **96.0880** | **96.0257** | **96.0462** |
+ | Multimodal (Mean) | 90.2870 | 90.5162 | 90.2334 | 90.2589 |
+ | Multimodal (Multihead) | 93.1567 | 93.2715 | 93.1898 | 93.2115 |
+
+ The text-only model reaches 76.08% accuracy on the English test set, well below the speech-only configuration (95.14%) and the multimodal configurations (90.29% to 96.05%). Concatenating acoustic and linguistic representations gives the best overall result. For the text-only model, the gap between weighted F1 (75.69) and macro F1 (68.03) shows that performance is uneven across the seven classes.
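
To make the difference between the weighted and macro F1 columns concrete, here is a minimal pure-Python sketch. The `f1_summary` helper and the toy labels are made up for illustration and are not the project's evaluation code. It returns fractions between 0 and 1, whereas the table reports percentages.

```python
from collections import Counter


def f1_summary(y_true, y_pred):
    """Return (accuracy, weighted F1, macro F1) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    per_class_f1 = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class_f1[label] = 2 * tp / denom if denom else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    # Weighted F1 averages per-class F1 by class frequency in y_true.
    weighted = sum(per_class_f1[l] * support[l] for l in labels) / len(y_true)
    # Macro F1 gives every class the same weight, however rare it is.
    macro = sum(per_class_f1.values()) / len(labels)
    return accuracy, weighted, macro


# Toy data (invented): "happy" is frequent and easy, "disgust" is rare and hard.
y_true = ["happy"] * 8 + ["disgust"] * 2
y_pred = ["happy"] * 8 + ["happy", "disgust"]

accuracy, weighted_f1, macro_f1 = f1_summary(y_true, y_pred)
print(f"accuracy={accuracy:.3f} weighted_f1={weighted_f1:.3f} macro_f1={macro_f1:.3f}")
```

In this toy case the rare class pulls macro F1 below weighted F1, the same pattern seen in the text-only row.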
+
+ ## How to use
+
+ ```python
+ from transformers import pipeline
+
+ # top_k=None returns the scores for all labels instead of only the best one.
+ classifier = pipeline(
+     "text-classification",
+     model="UMUTeam/roberta-emotion-en",
+     top_k=None,
+ )
+
+ text = "I was really happy to see you again."
+
+ predictions = classifier(text)
+
+ print(predictions)
+ ```
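
With `top_k=None` the pipeline returns one `{"label", "score"}` entry per class. Depending on the installed `transformers` version and on how the pipeline is called, the entries for a single input may be wrapped in an extra outer list. The sketch below picks the best entry in either case. The `best_prediction` helper is hypothetical, and the scores are hand-written rather than real model output.

```python
def best_prediction(predictions):
    """Return the highest-scoring {'label', 'score'} dict from pipeline output."""
    # Unwrap one level of nesting if the scores are wrapped in an outer list.
    if predictions and isinstance(predictions[0], list):
        predictions = predictions[0]
    return max(predictions, key=lambda item: item["score"])


# Hand-written example in the shape of text-classification output;
# the scores are invented for illustration.
example_output = [[
    {"label": "happy", "score": 0.91},
    {"label": "surprise", "score": 0.05},
    {"label": "neutral", "score": 0.04},
]]

best = best_prediction(example_output)
print(best["label"], best["score"])
```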
+
+ You can also use this model through the `speech-emotion` toolkit:
+
+ ```bash
+ pip install speech-emotion
+ ```
+
+ ```python
+ from speech_emotion import predict_emotion
+
+ emotion = predict_emotion(
+     text="I was really happy to see you again.",
+     language="en",
+     mode="text",
+     model_config_path="model.json",
+ )
+
+ print("Detected emotion:", emotion)
+ ```
+
+ Repository:
+
+ https://github.com/NLP-UMUTeam/umuteam-speech-emotion
+
+ ## Limitations
+
+ - The model is designed for English text and may not perform reliably on other languages.
+ - It predicts a single label from a fixed set of seven emotions.
+ - Emotion expression is subjective and highly context-dependent.
+ - Text-only emotion recognition may miss relevant acoustic or visual cues such as tone of voice, pauses, intensity, facial expressions, or interaction context.
+ - Performance may decrease on noisy transcriptions, informal language, code-switching, domain-specific language, or texts that differ substantially from the training data.
+
+ ## Bias and ethical considerations
+
+ Emotion recognition systems may reflect biases present in their training data, including differences related to language variety, register, demographics, topic, or annotation subjectivity.
+
+ Users should avoid interpreting predictions as objective truths about a person's internal emotional state. The model should be used with transparency, appropriate consent, and human oversight, especially in sensitive contexts.
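
One simple way to keep a human in the loop is to accept only confident predictions and flag the rest for review. The sketch below is an assumption-laden illustration: the `route` helper and the threshold value are hypothetical, and model scores are not guaranteed to be calibrated, so any threshold would need to be chosen on validation data.

```python
REVIEW_THRESHOLD = 0.6  # hypothetical value, not taken from the project


def route(label, score, threshold=REVIEW_THRESHOLD):
    """Accept confident predictions and flag the rest for human review."""
    if score >= threshold:
        return {"label": label, "needs_review": False}
    return {"label": None, "needs_review": True}


print(route("happy", 0.91))
print(route("sad", 0.42))
```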
+
+ ## Citation
+
+ If you use this model in your research, please cite the following work:
+
+ ### speech-emotion toolkit
+
+ ```bibtex
+ @article{PAN2026102677,
+   title = {speech-emotion: A multilingual and multimodal toolkit for emotion recognition from speech},
+   journal = {SoftwareX},
+   volume = {34},
+   pages = {102677},
+   year = {2026},
+   issn = {2352-7110},
+   doi = {10.1016/j.softx.2026.102677},
+   url = {https://www.sciencedirect.com/science/article/pii/S235271102600169X},
+   author = {Ronghao Pan and Tomás Bernal-Beltrán and José Antonio García-Díaz and Rafael Valencia-García},
+ }
+ ```
+
+ ## Acknowledgments
+
+ This work is part of the research project LaTe4PoliticES (PID2022-138099OB-I00), funded by MICIU/AEI/10.13039/501100011033 and the European Regional Development Fund (ERDF/EU - FEDER/UE), “A way of making Europe”.
+
+ Mr. Tomás Bernal-Beltrán is supported by the University of Murcia through the predoctoral programme.