# Polish Twitter Emotion Classifier (ONNX)
This is the ONNX FP32 version of yazoniak/twitter-emotion-pl-classifier, converted with Hugging Face Optimum. It delivers roughly 2x faster CPU inference than the PyTorch original while preserving full accuracy.
## Quick Links
- 🔗 Original PyTorch Model: yazoniak/twitter-emotion-pl-classifier
- 🚀 INT8 Quantized Version: yazoniak/twitter-emotion-pl-classifier-onnx-int8 (75% smaller, 3x faster)
- 📊 Dataset: yazoniak/TwitterEmo-PL-Refined
## Model Description
This model predicts 8 emotion and sentiment labels simultaneously for Polish text:
- Emotions: radość (joy), wstręt (disgust), gniew (anger), przeczuwanie (anticipation)
- Sentiment: pozytywny (positive), negatywny (negative), neutralny (neutral)
- Special: sarkazm (sarcasm)
## Model Details
| Attribute | Value |
|---|---|
| Base Model | PKOBP/polish-roberta-8k |
| Original Model | yazoniak/twitter-emotion-pl-classifier |
| Architecture | RoBERTa for Sequence Classification |
| Task | Multi-label text classification |
| Language | Polish |
| Format | ONNX (FP32) |
| ONNX Opset | 18 |
| Model Size | 1.7 GB |
| License | GPL-3.0 |
## Performance
### ONNX vs PyTorch Comparison
| Metric | PyTorch | ONNX FP32 | Improvement |
|---|---|---|---|
| Mean Latency (CPU) | 110.71 ms | 55.28 ms | 2.00x faster |
| P95 Latency | 116.11 ms | 56.70 ms | 2.05x faster |
| Throughput | 9.03/sec | 18.09/sec | 2.00x |
| Std Deviation | 5.25 ms | 0.69 ms | 7.6x more consistent |
| Model Size | 1.7 GB | 1.7 GB | Same |
Note: ONNX has slower cold start (2.6s vs 0.3s) but significantly faster inference.
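Latency figures like these can be reproduced with a small wall-clock harness. The sketch below is illustrative (the `benchmark` helper is not part of this repository); it discards warmup iterations so the slower cold start does not skew the reported numbers.

```python
import time
import numpy as np

def benchmark(run_fn, warmup=3, iters=20):
    """Time repeated calls to run_fn; return mean and p95 latency in ms."""
    for _ in range(warmup):  # discard cold-start iterations (e.g. ONNX session init)
        run_fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"mean_ms": float(np.mean(samples)),
            "p95_ms": float(np.percentile(samples, 95))}
```

Pass it a zero-argument closure that runs one forward pass, e.g. `benchmark(lambda: model(**inputs))`.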
### Model Accuracy
The ONNX model maintains the same accuracy as the original PyTorch model:
| Metric | Score |
|---|---|
| F1 Macro | 0.8500 |
| F1 Micro | 0.8900 |
| F1 Weighted | 0.8895 |
| Exact Match Accuracy | 0.5125 |
For detailed per-label performance, see the original model card.
### Numerical Validation
- ✅ Structural Validation: Passed ONNX checker
- ✅ Numerical Accuracy: All tests passed
- Max absolute difference: 5.65e-06
- Max relative difference: 1.93e-04
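The max-difference figures above come from comparing logits of the two backends on identical inputs. A minimal helper for such a comparison (the `logit_parity` name and tolerance are illustrative, not part of this repo):

```python
import numpy as np

def logit_parity(pt_logits, onnx_logits, atol=1e-4):
    """Compare two logit arrays; report max absolute/relative differences."""
    pt = np.asarray(pt_logits, dtype=np.float64)
    ox = np.asarray(onnx_logits, dtype=np.float64)
    abs_diff = np.abs(pt - ox)
    rel_diff = abs_diff / np.maximum(np.abs(pt), 1e-12)  # guard against division by zero
    return {"max_abs": float(abs_diff.max()),
            "max_rel": float(rel_diff.max()),
            "within_tol": bool(np.allclose(pt, ox, atol=atol))}
```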
## Installation
```bash
pip install "optimum[onnxruntime]" transformers numpy
```
For GPU support:
```bash
pip install "optimum[onnxruntime-gpu]" transformers numpy
```
## Usage
### Quick Start (Command Line)
```bash
# Download the inference scripts
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx/resolve/main/predict.py
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx/resolve/main/predict_calibrated.py

# Basic inference
python predict.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"

# Calibrated inference (recommended for best accuracy)
python predict_calibrated.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"
```
### Python API - Basic Inference
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
import re

# Load model and tokenizer
model_name = "yazoniak/twitter-emotion-pl-classifier-onnx"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = ORTModelForSequenceClassification.from_pretrained(
    model_name,
    provider="CPUExecutionProvider"  # or "CUDAExecutionProvider" for GPU
)

# Preprocess text (anonymize @mentions - IMPORTANT!)
def preprocess_text(text):
    return re.sub(r"@\w+", "@anonymized_account", text)

text = "@user To jest wspaniały dzień!"
processed_text = preprocess_text(text)

# Tokenize and run inference
inputs = tokenizer(processed_text, return_tensors="pt", truncation=True, max_length=8192)
outputs = model(**inputs)

# Get probabilities (sigmoid for multi-label)
logits = outputs.logits.squeeze().numpy()
probabilities = 1 / (1 + np.exp(-logits))

# Get labels above threshold
labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
threshold = 0.5
predictions = {labels[i]: float(probabilities[i])
               for i in range(len(labels)) if probabilities[i] > threshold}

print(predictions)
# Output: {'radość': 0.9758, 'pozytywny': 0.9856}
```
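For throughput-sensitive workloads, the same steps extend naturally to batches. Below is a sketch of a hypothetical `predict_batch` helper (not shipped in this repository) built on the snippet above; it pads the batch and applies a per-label sigmoid.

```python
import numpy as np

def predict_batch(texts, model, tokenizer, threshold=0.5, max_length=8192):
    """Multi-label prediction for a list of already-preprocessed texts."""
    inputs = tokenizer(texts, return_tensors="pt", padding=True,
                       truncation=True, max_length=max_length)
    logits = model(**inputs).logits.detach().numpy()
    probs = 1 / (1 + np.exp(-logits))  # sigmoid: labels are independent
    labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
    return [{labels[j]: float(row[j]) for j in range(len(labels)) if row[j] > threshold}
            for row in probs]
```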
### Python API - Calibrated Inference (Recommended)
For improved accuracy, apply per-label temperature scaling and optimal thresholds (this continues from the basic example above, which defines `labels` and `logits`):
```python
import json
from huggingface_hub import hf_hub_download

# Download calibration artifacts
calib_path = hf_hub_download(
    repo_id="yazoniak/twitter-emotion-pl-classifier-onnx",
    filename="calibration_artifacts.json"
)
with open(calib_path) as f:
    calib = json.load(f)

temperatures = calib["temperatures"]
optimal_thresholds = calib["optimal_thresholds"]

# Apply temperature scaling and per-label thresholds
calibrated_probs = {}
for i, label in enumerate(labels):
    temp = temperatures[label]
    thresh = optimal_thresholds[label]
    # Temperature scaling
    calibrated_logit = logits[i] / temp
    prob = 1 / (1 + np.exp(-calibrated_logit))
    if prob > thresh:
        calibrated_probs[label] = float(prob)

print(calibrated_probs)
```
### GPU Inference
```python
model = ORTModelForSequenceClassification.from_pretrained(
    "yazoniak/twitter-emotion-pl-classifier-onnx",
    provider="CUDAExecutionProvider"
)
```
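If CUDA may or may not be present at deployment time, the provider can be chosen defensively. A small sketch (the `pick_provider` helper is illustrative; in practice pass it `onnxruntime.get_available_providers()`):

```python
def pick_provider(available, preferred="CUDAExecutionProvider"):
    """Return the preferred ONNX Runtime provider if present, else fall back to CPU."""
    return preferred if preferred in available else "CPUExecutionProvider"
```

Usage: `pick_provider(onnxruntime.get_available_providers())`.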
## When to Use This Model
Use ONNX FP32 when:
- You need 2x faster inference than PyTorch
- You want full FP32 precision
- You're deploying on CPU servers
- You need cross-platform compatibility
Consider alternatives:
- Original PyTorch: For fine-tuning or GPU training
- ONNX INT8: For even faster inference (3x) and smaller size (75% reduction)
## Important Notes
### Text Preprocessing
⚠️ The model expects @mentions to be anonymized!
The model was trained with anonymized Twitter mentions. Always preprocess input text:
```python
text = re.sub(r"@\w+", "@anonymized_account", text)
```
The provided scripts (`predict.py`, `predict_calibrated.py`) handle this automatically.
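The effect of the anonymization step is easy to check in isolation:

```python
import re

def preprocess_text(text):
    """Replace @mentions with the placeholder token used during training."""
    return re.sub(r"@\w+", "@anonymized_account", text)

print(preprocess_text("@jan_kowalski Świetny mecz!"))
# @anonymized_account Świetny mecz!
```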
### Calibration
For best accuracy, use calibrated inference with:
- Temperature scaling (per-label)
- Optimized thresholds (per-label)
See predict_calibrated.py or the calibrated inference example above.
## Limitations
- Twitter-specific: Optimized for informal Polish social media text
- Sarcasm detection: Lower performance (F1: 0.53) - inherently difficult
- Context length: Optimal for tweet-length texts (up to 8,192 tokens)
- Formal text: May not generalize well to news or academic writing
For detailed limitations, see the original model card.
## Files in This Repository
| File | Size | Description |
|---|---|---|
| `model.onnx` | 1.7 GB | ONNX model weights (FP32) |
| `config.json` | 2 KB | Model configuration |
| `tokenizer.json` | 8.2 MB | Tokenizer vocabulary |
| `tokenizer_config.json` | 12 KB | Tokenizer settings |
| `calibration_artifacts.json` | 1 KB | Temperature scaling & optimal thresholds |
| `predict.py` | 4 KB | Simple inference script |
| `predict_calibrated.py` | 5 KB | Calibrated inference script (recommended) |
## Citation
```bibtex
@misc{yazoniak2025twitteremotionpl,
  title={Polish Twitter Emotion Classifier (RoBERTa-8k)},
  author={yazoniak},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/yazoniak/twitter-emotion-pl-classifier}
}
```
Also cite the dataset and the original TwitterEmo annotation paper:
```bibtex
@dataset{yazoniak_twitteremo_pl_refined_2025,
  title={TwitterEmo-PL-Refined: Polish Twitter Emotions (8 labels, refined)},
  author={yazoniak},
  year={2025},
  url={https://huggingface.co/datasets/yazoniak/TwitterEmo-PL-Refined}
}

@inproceedings{bogdanowicz2023twitteremo,
  title={TwitterEmo: Annotating Emotions and Sentiment in Polish Twitter},
  author={Bogdanowicz, S. and Cwynar, H. and Zwierzchowska, A. and Klamra, C. and Kiera{\'s}, W. and Kobyli{\'n}ski, {\L}.},
  booktitle={Computational Science -- ICCS 2023},
  series={Lecture Notes in Computer Science},
  volume={14074},
  publisher={Springer, Cham},
  year={2023},
  doi={10.1007/978-3-031-36021-3_20}
}
```
## License
This model is released under the GNU General Public License v3.0 (GPL-3.0), inherited from the training dataset.
License Chain:
- Base Model (PKOBP/polish-roberta-8k): Apache-2.0
- Training Dataset (TwitterEmo-PL-Refined): GPL-3.0
- Original Model (yazoniak/twitter-emotion-pl-classifier): GPL-3.0
- This ONNX Model: GPL-3.0
## Acknowledgments
- Original Model: yazoniak/twitter-emotion-pl-classifier
- Base Model: PKOBP/polish-roberta-8k
- Dataset: CLARIN-PL TwitterEmo
- Conversion: Hugging Face Optimum
Model Version: v1.0-onnx
Last Updated: 2026-01-29