---
language:
- en
license: cc-by-nc-nd-4.0
library_name: generic
tags:
- emotion-recognition
- bayesian-deep-learning
- mc-dropout
- uncertainty-quantification
- multi-label-classification
datasets:
- go_emotions
metrics:
- precision
- recall
- f1
model-index:
- name: EmCoder (v1)
  results:
  - task:
      type: text-classification
      name: Multi-label Emotion Classification
    dataset:
      name: GoEmotions
      type: go_emotions
      split: test
    metrics:
    - name: Macro F1
      type: f1
      value: 0.440
    - name: Macro Precision
      type: precision
      value: 0.408
    - name: Macro Recall
      type: recall
      value: 0.495
---
# EmCoder

> **Probabilistic Emotion Recognition & Uncertainty Quantification**<br>**28-emotion multi-label classifier trained with MC Dropout**

Unlike standard classifiers, EmCoder quantifies what it doesn't know using Monte Carlo Dropout, making it suitable for high-stakes AI pipelines.<br>
EmCoder is optimized for **MC Dropout inference**.

## SOTA benchmark

### Evaluation on the GoEmotions test split (macro avg metrics)

EmCoder achieves a competitive macro F1-score (0.440 versus 0.450 to 0.500 for the baselines in the table) while being ~35% smaller than RoBERTa-base and ~45% smaller than ModernBERT-base, and it provides uncertainty estimates alongside its predictions.

| Model | Precision | Recall | F1-Score | Params |
| :--- | :--- | :--- | :--- | :--- |
| **EmCoder (v1)** | **0.408** | **0.495** | **0.440** | **82.1M** |
| Google BERT (Original) | 0.400 | 0.630 | 0.460 | 110M |
| RoBERTa-base | 0.575 | 0.396 | 0.450 | 125M |
| ModernBERT-base | 0.652 | 0.443 | 0.500 | 149M |
## How to use

> Since `.safetensors` files store only the model weights and not the class logic, you need the provided `emcoder.py` to enable **MC Dropout inference**.<br>EmCoder v1.0 requires the `roberta-base` tokenizer for correct token-to-embedding mapping.

### 1. Setup & Tokenization

```python
from transformers import AutoTokenizer
from emcoder import EmCoder  # Ensure emcoder.py is in your working directory

# Load the same tokenizer used during training
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

EMCODER_PATH = "path/to/emcoder"

# Initialize with the same config as training
model = EmCoder.from_pretrained(EMCODER_PATH)
```
### 2. Bayesian inference

To obtain probabilistic outputs and uncertainty metrics, use the `mc_forward` method:

```python
import torch

# Perform 50 stochastic passes
N_SAMPLES = 50

model.eval()
inputs = tokenizer("I am so happy you are here!", return_tensors="pt")

# mc_forward automatically keeps Dropout active, even after model.eval()
with torch.no_grad():  # no gradients are needed at inference time
    logits_mc = model.mc_forward(inputs["input_ids"], inputs["attention_mask"], n_samples=N_SAMPLES)

# Bayesian post-processing
# logits_mc shape: (n_samples, batch_size, 28)
probs_all = torch.sigmoid(logits_mc)
mean_probs = probs_all.mean(dim=0)   # Mean predicted probability
uncertainty = probs_all.std(dim=0)   # Epistemic uncertainty (standard deviation)
```
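To turn `mean_probs` and `uncertainty` into readable predictions, something like the following sketch may help. It is not part of `emcoder.py`: the helper name `summarize_predictions`, the `max_std` cut-off of 0.15, and the toy label list are assumptions for illustration, and the label order must match the one used during training, which this card does not state. The 0.5 decision threshold mirrors the one used in the Performance section.

```python
from typing import Dict, List, Sequence


def summarize_predictions(
    mean_probs: Sequence[float],
    uncertainty: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.5,   # decision threshold on the mean probability
    max_std: float = 0.15,    # hypothetical cut-off for "uncertain" predictions
) -> List[Dict[str, object]]:
    """Return the labels predicted for ONE text, most probable first."""
    results = []
    for label, prob, std in zip(labels, mean_probs, uncertainty):
        if prob >= threshold:
            results.append(
                {"label": label, "prob": prob, "std": std, "uncertain": std > max_std}
            )
    return sorted(results, key=lambda r: r["prob"], reverse=True)


# Toy values standing in for mean_probs[0].tolist() and uncertainty[0].tolist()
toy_labels = ["joy", "love", "neutral"]
demo = summarize_predictions([0.91, 0.55, 0.08], [0.03, 0.21, 0.02], toy_labels)
for row in demo:
    print(row)
```

With real model outputs, the call would look like `summarize_predictions(mean_probs[0].tolist(), uncertainty[0].tolist(), labels)`, where `labels` is the list of 28 emotion names in training order.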
## Model Architecture

```mermaid
flowchart LR
    subgraph InputGroup["Input Operations"]
        direction TB
        MCD_Loop(["MC-Inference Loop: N_samples"]):::LoopNode
        ids["Batch IDs"]
        mask["Batch Mask"]
    end
    subgraph EmCoderCore["EmCoder Core"]
        direction LR
        tok_emb["Token Embedding"]
        ln_in["Input LayerNorm"]
        Transformer["Transformer Encoder"]
        final_norm["Final LayerNorm"]
        Dropout1[("MC-Dropout")]
        Dropout2[("MC-Dropout")]
    end
    subgraph Row1[" "]
        direction LR
        InputGroup
        EmCoderCore
    end
    subgraph MLP["Classifier MLP"]
        L_lin["Linear 1"]
        Dropout3[("MC-Dropout")]
        GELU["GELU"]
        F_lin["Final Linear"]
    end
    subgraph ClassifierHead[" "]
        direction TB
        pool["Masked Mean Pooling"]
        MLP
    end
    subgraph Row2[" "]
        direction LR
        ClassifierHead
        Out(["Class LogitsMC<br>(n_samples, B, 28)"])
        Avg["Bayesian Post-processing"]
    end
    tok_emb ==> ln_in
    ln_in -.-> Dropout1
    Dropout1 ==> Transformer
    Transformer -.-> Dropout2
    Dropout2 ==> final_norm
    MCD_Loop -.-> ids
    ids ==> tok_emb
    final_norm ==> pool
    mask ==> pool
    pool ==> L_lin
    L_lin -.-> Dropout3
    Dropout3 ==> GELU
    GELU ==> F_lin
    F_lin ==> Out
    Out ==> Avg
    mask ==> Transformer
    classDef MCD fill:#424242,stroke:#fbc02d,stroke-width:2px,stroke-dasharray: 5 5,color:#fff
    classDef OutNode fill:#0d47a1,stroke:#1976d2,stroke-width:3px,color:#fff,font-weight:bold
    classDef BayesNode fill:#3e2723,stroke:#8d6e63,stroke-width:2px,stroke-dasharray: 3 3,color:#fff
    classDef LoopNode fill:#263238,stroke:#78909c,stroke-width:2px,color:#fff,font-style:italic
    classDef LightNode fill:#212121,stroke:#90a4ae,color:#fff
    class MCD_Loop LoopNode
    class ids,mask,tok_emb,ln_in,Transformer,final_norm,L_lin,GELU,F_lin,pool LightNode
    class Dropout1,Dropout2,Dropout3 MCD
    class Out OutNode
    class Avg BayesNode
    style InputGroup fill:#1a1a1a,stroke:#444,color:#fff
    style EmCoderCore fill:#2d1a2d,stroke:#6a1b9a,color:#fff
    style MLP fill:#212121,stroke:#455a64,color:#fff
    style ClassifierHead fill:#012a4a,stroke:#01497c,color:#fff
    style Row1 fill:none,stroke:none
    style Row2 fill:none,stroke:none
    linkStyle 2 stroke:#fbc02d,stroke-width:2px,fill:none
    linkStyle 5 stroke:#fbc02d,stroke-width:2px,fill:none
    linkStyle 11 stroke:#fbc02d,stroke-width:2px,fill:none
```
### Optimization

The model is trained using a weighted Bayesian binary cross-entropy loss, averaged over $T$ stochastic forward passes with logits $z^{(t)}$ and target labels $y$:

$$
\mathcal{L}_{\text{Bayesian}} = \frac{1}{T} \sum_{t=1}^{T} \text{BCEWithLogits}(z^{(t)}, y; w)
$$

The per-class weights $w_c$ are calculated from the numbers of negative and positive examples of class $c$, $N_{\text{neg},c}$ and $N_{\text{pos},c}$, using a logarithmic class-balancing scale clipped to $[0.1, 20]$ to handle extreme label imbalance:

$$
w_{c} = \max\left( 0.1, \min\left( 20, 1 + \ln \left( \frac{N_{\text{neg},c} + \epsilon}{N_{\text{pos},c} + \epsilon} \right) \right) \right)
$$
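The two formulas can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the training code: the card does not say how $w$ enters the loss, so the sketch assumes it scales only the positive term (the behaviour of `pos_weight` in `torch.nn.BCEWithLogitsLoss`), and the function names, the `eps` value, and the example counts are all hypothetical.

```python
import math
from typing import Sequence


def class_weight(n_pos: float, n_neg: float, eps: float = 1e-6) -> float:
    """Log-scaled class weight, clipped to [0.1, 20] as in the formula above."""
    raw = 1.0 + math.log((n_neg + eps) / (n_pos + eps))
    return max(0.1, min(20.0, raw))


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(z: float, y: float, w: float) -> float:
    """BCE on a logit z; ASSUMPTION: w scales only the positive term."""
    return w * y * softplus(-z) + (1.0 - y) * softplus(z)


def bayesian_loss(
    logits_mc: Sequence[Sequence[float]],  # shape (T, n_classes), one row per pass
    targets: Sequence[float],              # shape (n_classes,)
    weights: Sequence[float],              # shape (n_classes,)
) -> float:
    """Mean weighted BCE over T stochastic passes and all classes."""
    total, count = 0.0, 0
    for logits in logits_mc:
        for z, y, w in zip(logits, targets, weights):
            total += weighted_bce_with_logits(z, y, w)
            count += 1
    return total / count


# Hypothetical counts: a rare class (100 vs 900) and a very frequent one (1000 vs 10)
weights = [class_weight(100, 900), class_weight(1000, 10)]
loss = bayesian_loss([[0.0, 0.0], [0.0, 0.0]], targets=[1.0, 0.0], weights=weights)
print(weights, loss)
```

The rare class receives a weight of `1 + ln(9)`, while the frequent class would get a negative raw value and is clipped to the 0.1 floor. In PyTorch, the same assumption would correspond to passing the weight vector as `pos_weight` to `torch.nn.BCEWithLogitsLoss` and averaging the loss over the passes.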
## Performance

**Using a threshold of 0.5 for binarizing predictions**

|                | precision | recall | f1-score | support |
|:---------------|------------:|---------:|-----------:|----------:|
| micro avg | 0.494 | 0.596 | 0.540 | 6329 |
| macro avg | 0.408 | 0.495 | 0.440 | 6329 |
| weighted avg | 0.492 | 0.596 | 0.535 | 6329 |
| samples avg | 0.525 | 0.616 | 0.544 | 6329 |
|----------------|-------------|----------|------------|-----------|
| admiration | 0.541 | 0.673 | 0.599 | 504 |
| amusement | 0.688 | 0.909 | 0.783 | 264 |
| anger | 0.419 | 0.470 | 0.443 | 198 |
| annoyance | 0.310 | 0.250 | 0.277 | 320 |
| approval | 0.304 | 0.271 | 0.287 | 351 |
| caring | 0.229 | 0.281 | 0.252 | 135 |
| confusion | 0.260 | 0.497 | 0.342 | 153 |
| curiosity | 0.432 | 0.764 | 0.552 | 284 |
| desire | 0.453 | 0.518 | 0.483 | 83 |
| disappointment | 0.176 | 0.152 | 0.163 | 151 |
| disapproval | 0.279 | 0.404 | 0.330 | 267 |
| disgust | 0.447 | 0.545 | 0.491 | 123 |
| embarrassment | 0.325 | 0.351 | 0.338 | 37 |
| excitement | 0.288 | 0.427 | 0.344 | 103 |
| fear | 0.470 | 0.692 | 0.560 | 78 |
| gratitude | 0.834 | 0.943 | 0.885 | 352 |
| grief | 0.000 | 0.000 | 0.000 | 6 |
| joy | 0.445 | 0.652 | 0.529 | 161 |
| love | 0.724 | 0.895 | 0.801 | 238 |
| nervousness | 0.240 | 0.261 | 0.250 | 23 |
| optimism | 0.483 | 0.543 | 0.511 | 186 |
| pride | 0.667 | 0.375 | 0.480 | 16 |
| realization | 0.226 | 0.166 | 0.191 | 145 |
| relief | 0.222 | 0.182 | 0.200 | 11 |
| remorse | 0.516 | 0.857 | 0.644 | 56 |
| sadness | 0.405 | 0.545 | 0.464 | 156 |
| surprise | 0.429 | 0.539 | 0.478 | 141 |
| neutral | 0.602 | 0.695 | 0.645 | 1787 |

**Model uncertainty estimation**

![Uncertainty Estimation](testing/MC_confidence.png)

**Confusion matrix**

![Confusion Matrix](testing/confusion_matrix.png)
## Workflow

```mermaid
flowchart LR
    classDef StageNode fill:#121212,stroke:#546e7a,color:#fff;
    classDef HighlightNode fill:#4e342e,stroke:#ff7043,stroke-width:2px,color:#fff,font-weight:bold;
    subgraph PT ["Phase 1: Pre-training"]
        direction TB
        OWT[(OpenWebText)]:::StageNode --> MLM[Masked Language Modeling]:::StageNode
        MLM --> Core[Save EmCoderCore]:::StageNode
    end
    subgraph FT ["Phase 2: Fine-tuning"]
        direction TB
        Core --> Init[Init ClassificationHead]:::StageNode
        GE[(GoEmotions)]:::StageNode --> WBT[Bayesian Fine-tuning]:::HighlightNode
        WBT --> LogW[Log-weighted BCE Loss]:::StageNode
        LogW --> Freeze[Step 0-500: Encoder Frozen]:::StageNode
    end
    subgraph EV ["Phase 3: Testing & Inference"]
        direction TB
        Freeze --> MCD[MC Dropout Inference]:::HighlightNode
        MCD --> Unc[Uncertainty Estimation]:::HighlightNode
        subgraph Metrics ["Analysis"]
            Unc --> EPI[Epistemic: Model Confidence]:::StageNode
            Unc --> ALE[Aleatoric: Data Ambiguity]:::StageNode
            Unc --> CM[Test set metrics]:::StageNode
        end
    end
    style PT fill:#0d1b2a,stroke:#1b263b,color:#fff
    style FT fill:#2e1500,stroke:#5d2a00,color:#fff
    style EV fill:#1b2e1b,stroke:#2d4a2d,color:#fff
    style Metrics fill:#000,stroke:#333,color:#fff
    linkStyle default stroke:#aaa,stroke-width:2px;
```
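The workflow lists epistemic and aleatoric uncertainty as separate analyses, but the card does not show how the two are separated. One common option, sketched here as an assumption rather than as the author's method, is the entropy-based decomposition for a single label: total uncertainty is the entropy of the mean probability, the aleatoric part is the mean entropy of the individual passes, and the epistemic part is their difference. The function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli(p) variable, clamped for stability."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def decompose_uncertainty(probs: Sequence[float]) -> Dict[str, float]:
    """Split uncertainty for ONE label given its probability in each MC pass."""
    mean_p = sum(probs) / len(probs)
    total = binary_entropy(mean_p)                                 # predictive entropy
    aleatoric = sum(binary_entropy(p) for p in probs) / len(probs)  # expected entropy
    epistemic = max(0.0, total - aleatoric)                         # mutual information
    return {"total": total, "aleatoric": aleatoric, "epistemic": epistemic}


# Passes agree on an ambiguous input: uncertainty is mostly aleatoric
agree = decompose_uncertainty([0.5, 0.5, 0.5])
# Passes disagree strongly: uncertainty is mostly epistemic
disagree = decompose_uncertainty([0.01, 0.99])
print(agree)
print(disagree)
```

With real outputs, `probs` would be one column of the per-pass sigmoid probabilities, for example `probs_all[:, 0, c].tolist()` for the first text and class index `c`.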
### Note

This model was trained on the GoEmotions dataset (social-media domain) and may not generalize well to other domains.
## Citation

If you use this model, please cite it as follows:

```bibtex
@software{jez2026emcoder,
  author = {Václav Jež},
  title = {EmCoder: Probabilistic Emotion Recognition \& Uncertainty Quantification},
  year = {2026},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/yezdata/emcoder}},
  version = {1.0.0}
}
```