Update README.md
README.md
CHANGED
@@ -64,7 +64,8 @@ EmCoder achieves competitive F1-score with its compact size (~35% smaller than R
 
 ## How to use
 ### 1. Setup & Tokenization
-EmCoder uses the `roberta-base` tokenizer for correct token-to-embedding mapping.
+EmCoder uses the `roberta-base` tokenizer for correct token-to-embedding mapping.
+Ensure you allow remote code execution since it's a custom architecture.
 ```python
 import torch
 from transformers import AutoModel, AutoTokenizer
@@ -77,7 +78,6 @@ tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
 # Initialize with same config as training
 model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
 ```
-
 ### 2. Bayesian inference
 To obtain probabilistic outputs and uncertainty metrics, use the `mc_forward` method:
 ```python
@@ -91,8 +91,7 @@ model.eval()
 with torch.no_grad():
     # Automatically keeps Dropout active, even when in model.eval
     mc_logits = model.mc_forward(
-        inputs
-        inputs['attention_mask'],
+        **inputs,
         n_samples=N_SAMPLES,
         max_batch_size=MAX_BATCH_SIZE
     )
@@ -179,7 +178,7 @@ $$
 ### Entropy-based uncertainty quantification
 
 **Model uncertainty quantification on GoEmotions test set**
-Flattened emotion predictions
+Flattened emotion predictions
 | Mean probability vs Epistemic | Mean probability vs Aleatoric |
 | :---: | :---: |
 |  |  |
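
The README hunks above name `mc_logits`, epistemic uncertainty, and aleatoric uncertainty, but the formulas that connect them fall outside the diff context. The following is a minimal sketch of one common entropy-based decomposition, not necessarily the one EmCoder uses. It assumes each emotion is an independent sigmoid output (multi-label) and that the logits for one label of one input have been pulled out of `mc_logits` as a plain list, one value per stochastic pass. The helper names `sigmoid`, `binary_entropy`, and `decompose_uncertainty` are hypothetical and do not come from the repository.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binary_entropy(p: float, eps: float = 1e-12) -> float:
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def decompose_uncertainty(label_logits):
    """Split predictive entropy for ONE label of ONE input.

    label_logits: one logit per MC-dropout pass (hypothetical layout; the
    real mc_forward output shape is not shown in the diff).
    Returns (mean probability, epistemic, aleatoric).
    """
    probs = [sigmoid(z) for z in label_logits]
    mean_p = sum(probs) / len(probs)
    total = binary_entropy(mean_p)  # entropy of the averaged prediction
    aleatoric = sum(binary_entropy(p) for p in probs) / len(probs)  # mean per-pass entropy
    epistemic = total - aleatoric  # mutual information between prediction and dropout mask
    return mean_p, epistemic, aleatoric


# Passes that agree: almost all uncertainty is aleatoric.
agree = decompose_uncertainty([2.0, 2.0, 2.0, 2.0])
# Passes that disagree strongly: most uncertainty is epistemic.
disagree = decompose_uncertainty([4.0, -4.0, 4.0, -4.0])
```

Under this decomposition the epistemic term is close to zero when every pass returns the same logit, and it grows as the passes disagree, which is the quantity that "Mean probability vs Epistemic" would plot against `mean_p` if the README follows the same convention.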