arnab04 committed on
Commit 8ef7e6b · verified · 1 Parent(s): d490d1f

Update README.md

Files changed (1): README.md +167 -1
---
license: apache-2.0
datasets:
- alex-shvets/EmoPillars
language:
- en
metrics:
- f1
- precision
- recall
pipeline_tag: text-classification
library_name: transformers
tags:
- multi-label-classification
- fine-grained
- emotion-classification
model-index:
- name: roberta-base-emopillars-contextless
  results:
  - task:
      type: text-classification
      name: Multi-label Fine-Grained Emotion Classification
    dataset:
      type: alex-shvets/EmoPillars
      name: EmoPillars
      split: test
    metrics:
    - type: accuracy
      value: 0.85
      name: Accuracy (Hamming)
    - type: recall
      value: 0.68
      name: Recall-macro
    - type: f1
      value: 0.70
      name: F1-macro
---


## 🏷️ Model Details
This model is fine-tuned for fine-grained multi-label emotion classification of text.
It employs a hybrid training objective that combines similarity-based contrastive learning with a classification objective, rather than relying on the conventional binary cross-entropy (BCE) loss alone.
This approach enables the model to capture both the semantic alignment between text and emotion concepts and label-specific decision boundaries, resulting in improved performance on the EmoPillars dataset.

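The exact formulation of the hybrid objective is given in the paper; purely as an illustration, a dual objective of this kind (BCE plus a similarity-based contrastive term over label-concept embeddings) might be sketched as follows. All names, the `alpha` weighting, and the precise contrastive form here are assumptions, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def hybrid_loss(embeddings, logits, labels, label_embeddings,
                temperature=0.05, alpha=0.5):
    # Classification term: standard multi-label BCE over the logits.
    bce = F.binary_cross_entropy_with_logits(logits, labels.float())

    # Similarity term: cosine similarity of each text embedding to
    # every label-concept embedding, scaled by a temperature.
    sim = F.cosine_similarity(
        embeddings.unsqueeze(1), label_embeddings.unsqueeze(0), dim=-1
    ) / temperature                                  # (batch, num_labels)

    # Contrastive term: maximize the log-probability of the gold labels,
    # averaged over each example's positive labels.
    log_probs = F.log_softmax(sim, dim=-1)
    pos_mask = labels.float()
    per_example = -(log_probs * pos_mask).sum(-1) / pos_mask.sum(-1).clamp(min=1)
    contrastive = per_example.mean()

    # Weighted combination of both objectives (alpha is illustrative).
    return alpha * bce + (1 - alpha) * contrastive

batch, num_labels, dim = 4, 28, 16
loss = hybrid_loss(
    torch.randn(batch, dim),                       # text embeddings
    torch.randn(batch, num_labels),                # classifier logits
    (torch.rand(batch, num_labels) > 0.8).long(),  # multi-hot labels
    torch.randn(num_labels, dim),                  # label-concept embeddings
)
```

Here the contrastive term treats each text's gold emotion labels as positives under a softmax over label similarities; the `temperature` value matches the one in the training table below, but the rest is a sketch.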
*This model is the Model II (classifier-based) variant described in our paper, which achieved the best performance. Please see the paper for full details of the model architecture and training objectives.*

- **Developed by:** Subinoy Bera and Arnab Karmakar
- **Model type:** Transformers | RoBERTa-base
- **Language (NLP):** English
- **License:** Apache-2.0
- **Repository:** [GitHub](https://github.com/Hidden-States-AI-Labs/EmoAxis)
- **Research Paper:** [Do We Need a Classifier? Dual Objectives Go Beyond Baselines in Fine-Grained Emotion Classification.](https://zenodo.org/records/18123882?token=eyJhbGciOiJIUzUxMiJ9.eyJpZCI6IjhjNmQwMTYzLWFiYzEtNDBiZi05NTFkLTI2Mzg1YzhiYThhZSIsImRhdGEiOnt9LCJyYW5kb20iOiI5MDE1MDM1MTYxMTg1MzEyMTY3ZmY2YzNmY2NlYTM4OSJ9.JgOX4GlmZ8ad-PtjytzioPUPSJSGYp8wochqpTgMO78SE1oBq9R6yUor2_36oOaSUO04OPP0MJqBiYK0JK0NHA)


## ✅ Intended Usage
The model is intended for **fine-grained multi-label emotion classification from text** in both practical and research settings.
It can detect emotions in short to medium-length textual content such as social media posts, user comments, online discussions, reviews, and conversational text, where identifying fine-grained emotion categories gives deeper insight.

The model is suitable for **local and offline deployment** for tasks such as emotion-aware text analysis, affective computing research, and downstream NLP applications that benefit from fine-grained emotion signals.


## 📊 Dataset Used
[**EmoPillars**](https://huggingface.co/datasets/alex-shvets/EmoPillars) (2025): a large-scale multi-label emotion classification dataset of 300K synthetic English comments annotated with 27 emotion categories plus a neutral label. The dataset is diverse and representative of real-world emotional language, including informal grammar, sarcasm, and ambiguous or context-dependent cues. In this work, we adopt the full 28-label GoEmotions taxonomy for training and use a preprocessed subset of 100K examples.

## 📌 Model Performance (on Test)
The model is evaluated using standard multi-label metrics, with a focus on Macro-F1, widely regarded as the most informative metric for imbalanced multi-label emotion classification tasks.

- Macro-F1: 0.70
- Micro-F1: 0.78
- Precision: 0.78
- Recall: 0.68
- Accuracy (Hamming): 0.85

🏆 **To the best of our knowledge, there are no prior RoBERTa-based models trained on the EmoPillars dataset; our model <u>outperforms the baselines reported in our paper</u> and achieves <u>*state-of-the-art*</u> performance among open-source methods!** 🥇

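For reference, all of the metrics above can be computed with scikit-learn's standard multi-label utilities; the toy arrays below are illustrative, not the model's actual predictions:

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score, hamming_loss

# Toy multi-hot ground truth and predictions (3 samples, 4 labels).
y_true = np.array([[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]])
y_pred = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]])

macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
micro_f1 = f1_score(y_true, y_pred, average="micro")
precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
recall = recall_score(y_true, y_pred, average="macro", zero_division=0)

# "Accuracy (Hamming)" = 1 - Hamming loss, i.e. the fraction of
# label decisions (over all samples x labels) that are correct.
hamming_accuracy = 1 - hamming_loss(y_true, y_pred)
print(round(hamming_accuracy, 3))  # → 0.833
```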


## 🚀 Get Started with the Model

```python
import torch
from transformers import AutoTokenizer, AutoModel
from transformers import logging as transformers_logging
import warnings

warnings.filterwarnings("ignore")
transformers_logging.set_verbosity_error()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_id = "Hidden-States/roberta-base-emopillars-contextless"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
model.to(device).eval()

emotion_labels = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral"
]

def predict_emotions(text):
    inputs = tokenizer(
        text, truncation=True, max_length=128, padding=True,
        return_attention_mask=True, return_tensors="pt"
    ).to(device)
    with torch.no_grad():
        _, logits = model(**inputs)

    probs = torch.sigmoid(logits)
    preds = (probs >= 0.5).int()[0]

    predicted_emotions = [
        emotion_labels[i] for i, v in enumerate(preds) if v.item() == 1
    ]
    print(predicted_emotions)

text = "Honestly, same. I was miserable at my admin asst job."
predict_emotions(text)

# output: ['annoyance', 'disappointment', 'sadness']
```
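The fixed 0.5 threshold used above is a tunable choice: lowering it trades precision for recall. A minimal sketch of threshold-based decoding with dummy probabilities (not real model output):

```python
import torch

# Dummy sigmoid probabilities for four example labels.
probs = torch.tensor([0.91, 0.40, 0.62, 0.05])
labels = ["anger", "annoyance", "sadness", "joy"]

def decode(probs, labels, threshold=0.5):
    # Keep every label whose probability clears the threshold.
    return [l for l, p in zip(labels, probs.tolist()) if p >= threshold]

print(decode(probs, labels))        # → ['anger', 'sadness']
print(decode(probs, labels, 0.3))   # → ['anger', 'annoyance', 'sadness']
```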

## 🛠️ Training Hyperparameters and Details

| Parameter | Value |
|-----------|-------|
| encoder learning rate | 2.5e-5 |
| classifier learning rate | 1.5e-4 |
| optimizer | AdamW |
| lr scheduler | cosine with warmup |
| weight decay | 0.001 |
| warmup ratio | 0.1 |
| temperature | 0.05 |
| clipping constant | 0.05 |
| batch size | 64 |
| epochs | 8 |
| threshold | 0.5 (fixed) |

Check out our paper for the complete training details and objectives used: [Visit ↗️](https://zenodo.org/records/18123882?token=eyJhbGciOiJIUzUxMiJ9.eyJpZCI6IjhjNmQwMTYzLWFiYzEtNDBiZi05NTFkLTI2Mzg1YzhiYThhZSIsImRhdGEiOnt9LCJyYW5kb20iOiI5MDE1MDM1MTYxMTg1MzEyMTY3ZmY2YzNmY2NlYTM4OSJ9.JgOX4GlmZ8ad-PtjytzioPUPSJSGYp8wochqpTgMO78SE1oBq9R6yUor2_36oOaSUO04OPP0MJqBiYK0JK0NHA)
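As a sketch of how the hyperparameters above fit together, the two learning rates map naturally onto AdamW parameter groups with a cosine warmup schedule from `transformers`. The encoder/classifier modules below are stand-ins, not the actual model:

```python
import torch
from transformers import get_cosine_schedule_with_warmup

encoder = torch.nn.Linear(768, 768)    # stand-in for the RoBERTa encoder
classifier = torch.nn.Linear(768, 28)  # stand-in for the 28-label head

# Separate learning rates per module, per the hyperparameter table.
optimizer = torch.optim.AdamW(
    [
        {"params": encoder.parameters(), "lr": 2.5e-5},
        {"params": classifier.parameters(), "lr": 1.5e-4},
    ],
    weight_decay=0.001,
)

# Cosine schedule with a 0.1 warmup ratio over the total steps.
steps_per_epoch, epochs = 100, 8  # steps_per_epoch is illustrative
total_steps = steps_per_epoch * epochs
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * total_steps),
    num_training_steps=total_steps,
)
print(len(optimizer.param_groups))  # → 2
```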


## 💻 Compute Infrastructure
- **Inference**: any modern x86 CPU with a minimum of 4 GB RAM; a GPU is optional, not required.

- **Training / Fine-Tuning**: requires a GPU with at least 8–10 GB of VRAM. This model was trained in a Google Colab environment on a single T4 GPU.

- **Libraries / Modules**
  1. Transformers: 4.57.3
  2. PyTorch: 2.8.0+cu129
  3. Datasets: 4.4.1
  4. scikit-learn: 1.8.0
  5. NumPy: 2.3.5


## ⚠️ Out-of-Scope Use

The model cannot be used directly to detect emotions in multilingual or multimodal data, and it cannot predict emotions beyond the 28-label GoEmotions taxonomy.
While the proposed approach demonstrates strong empirical performance on benchmark datasets, it is not designed, evaluated, or validated for deployment in high-stakes or safety-critical applications.
The model may reflect dataset-specific biases, annotation subjectivity, and cultural limitations inherent in emotion datasets. Predictions should therefore be interpreted as approximate signals rather than definitive emotional states.

Users are responsible for ensuring that any downstream application complies with relevant ethical guidelines, legal regulations, and domain-specific standards.


## 🎗️ Community Support & Citation

**If you find this model useful, please consider liking this repository and giving a star to our GitHub repository.
Your support helps us improve and maintain this work!** ⭐

📝 **If you use our work in academic or research settings, please cite it accordingly.** 🙏😃

THANK YOU!! 🧑🤝💚
*- with regards*: Hidden States AI Labs