Multi-label emotion classifier for Russian text with a Mixture of Experts classification head.
The model is a standard BERT encoder with an MoE head in place of the usual plain linear classifier:
BERT encoder (rubert-tiny2)
↓
[CLS] embedding (312-dim)
↓
Router (linear) → softmax → top-2 of 4 experts
↓
Expert 1 │ Expert 2 │ Expert 3 │ Expert 4
(Linear → GELU → Dropout → Linear)
↓
Weighted sum of expert outputs
↓
Logits → sigmoid → multi-label output
The router selects 2 of the 4 experts for each input. An auxiliary loss penalizes uneven expert load, which encourages all experts to receive training signal. A weighted BCE loss compensates for class imbalance (in the training set, anger occurs about 4× less often than joy).
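The routing described above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the code from the repository's `modeling.py`: the class name `MoEHeadSketch`, the dropout rate, the renormalization of the top-2 weights, and the Switch-Transformer-style form of the auxiliary loss are all hypothetical choices made for the example. The dimensions (312-dim input, 256 hidden units, 4 experts, top-2, 5 labels) are taken from this card.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEHeadSketch(nn.Module):
    """Illustrative top-k MoE classification head (hypothetical, not the repo's class)."""

    def __init__(self, in_dim=312, hidden_dim=256, num_labels=5,
                 num_experts=4, top_k=2, dropout=0.1):  # dropout rate is an assumption
        super().__init__()
        self.num_experts = num_experts
        self.top_k = top_k
        self.router = nn.Linear(in_dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(in_dim, hidden_dim),
                nn.GELU(),
                nn.Dropout(dropout),
                nn.Linear(hidden_dim, num_labels),
            )
            for _ in range(num_experts)
        ])

    def forward(self, cls_embedding):
        # Router probabilities over all experts: (batch, num_experts)
        router_probs = F.softmax(self.router(cls_embedding), dim=-1)
        # Keep the top-k experts per example; renormalizing is an assumption
        top_w, top_idx = router_probs.topk(self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)
        # Dense gate matrix with zeros for the experts that were not selected
        gates = torch.zeros_like(router_probs).scatter(-1, top_idx, top_w)
        # For readability every expert is evaluated, then masked by the gates
        expert_out = torch.stack(
            [expert(cls_embedding) for expert in self.experts], dim=1
        )  # (batch, num_experts, num_labels)
        logits = (gates.unsqueeze(-1) * expert_out).sum(dim=1)
        # Assumed load-balancing term: share of routed slots times mean router prob
        load = (gates > 0).float().mean(dim=0) / self.top_k
        importance = router_probs.mean(dim=0)
        aux_loss = self.num_experts * (load * importance).sum()
        return logits, aux_loss


torch.manual_seed(0)
head = MoEHeadSketch().eval()
logits, aux_loss = head(torch.randn(8, 312))  # 8 fake [CLS] embeddings
print(tuple(logits.shape), float(aux_loss))
```

Evaluating all four experts and masking the result is wasteful but keeps the sketch short; with experts this small, the cost is negligible compared with the encoder.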
| Parameter | Value |
|---|---|
| Base model | cointegrated/rubert-tiny2 |
| Total parameters | ~29.5M |
| Number of experts | 4 |
| Active experts (top-k) | 2 |
| Expert hidden dim | 256 |
| Dataset | sagteam/cedr_v1 |
import torch
from transformers import AutoTokenizer
# Copy BertMoEForMultiLabelClassification and MoEClassificationHead
# from the repository (modeling.py)
LABELS = ['joy', 'sadness', 'surprise', 'fear', 'anger']
THRESHOLD = 0.5
tokenizer = AutoTokenizer.from_pretrained("ilyali034/rubert-emotion-moe-ru")
model = BertMoEForMultiLabelClassification.from_pretrained("ilyali034/rubert-emotion-moe-ru")
model.eval()
def predict(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        probs = torch.sigmoid(model(**inputs).logits)[0]
    return [
        (LABELS[i], round(float(p), 3))
        for i, p in enumerate(probs) if p > THRESHOLD
    ]
print(predict("Я очень рад, но немного боюсь!"))
# [('joy', 0.821), ('fear', 0.743)]
| Metric | Value |
|---|---|
| F1 micro | 0.7349 |
| F1 macro | 0.6903 |
| F1 weighted | 0.7442 |
| Precision micro | 0.6467 |
| Recall micro | 0.8510 |
| Class | F1 |
|---|---|
| 😄 joy | 0.8373 |
| 😢 sadness | 0.8091 |
| 😮 surprise | 0.6780 |
| 😨 fear | 0.6796 |
| 😠 anger | 0.4478 |
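The aggregate and per-class figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this card: micro F1 is the harmonic mean of micro precision and micro recall, and macro F1 is the unweighted mean of the per-class F1 scores. Both should agree with the reported values to within rounding.

```python
# Values copied from the tables in this card
precision_micro = 0.6467
recall_micro = 0.8510
per_class_f1 = {
    "joy": 0.8373,
    "sadness": 0.8091,
    "surprise": 0.6780,
    "fear": 0.6796,
    "anger": 0.4478,
}

# Micro F1: harmonic mean of micro precision and micro recall
f1_micro = 2 * precision_micro * recall_micro / (precision_micro + recall_micro)

# Macro F1: unweighted mean over the five classes
f1_macro = sum(per_class_f1.values()) / len(per_class_f1)

print(f"F1 micro ≈ {f1_micro:.4f}, F1 macro ≈ {f1_macro:.4f}")
```

Because macro F1 weights every class equally, the low anger score pulls it below the micro and weighted averages.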
CEDR is a Russian-language corpus with emotion annotations. Train: 7528 | Test: 1882 examples.
| Class | Emotion | Train examples |
|---|---|---|
| joy | 😄 Joy | 1569 |
| sadness | 😢 Sadness | 1417 |
| surprise | 😮 Surprise | 607 |
| fear | 😨 Fear | 589 |
| anger | 😠 Anger | 411 |
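The class counts above show why a weighted BCE loss is useful here. This card does not say how the weights were derived, so the sketch below assumes the common negatives-to-positives ratio per class, which is the convention expected by the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`. The variable names are illustrative.

```python
# Training-set figures copied from this card
train_size = 7528
positives = {"joy": 1569, "sadness": 1417, "surprise": 607, "fear": 589, "anger": 411}

# Assumed weighting scheme: negatives / positives for each class
pos_weight = {label: (train_size - n) / n for label, n in positives.items()}

for label, weight in pos_weight.items():
    print(f"{label:>8}: {positives[label]:>5} positives, pos_weight ≈ {weight:.2f}")

# Imbalance between the most and least frequent emotion
joy_to_anger = positives["joy"] / positives["anger"]
print(f"joy / anger ≈ {joy_to_anger:.2f}")
```

The resulting values, in `LABELS` order, could be passed as `torch.tensor([...])` to `BCEWithLogitsLoss(pos_weight=...)`. Under this scheme a missed anger example is penalized far more heavily than a missed joy example.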
If you use this model, please cite the CEDR dataset:
@inproceedings{cedr2021,
title={CEDR: Corpus for Emotions Detection in Russian},
author={Sboev, Alexander and Naumov, Artem and Rybka, Roman},
year={2021}
}