# Polish Twitter Emotion Classifier (ONNX)
This is the ONNX FP32 version of yazoniak/twitter-emotion-pl-classifier, converted with Hugging Face Optimum. It delivers roughly 2x faster CPU inference than the PyTorch original while preserving full accuracy.
## Quick Links
- 🔗 Original PyTorch Model: yazoniak/twitter-emotion-pl-classifier
- 🚀 INT8 Quantized Version: yazoniak/twitter-emotion-pl-classifier-onnx-int8 (75% smaller, 3x faster)
- 📊 Dataset: yazoniak/TwitterEmo-PL-Refined
## Model Description
This model predicts 8 emotion and sentiment labels simultaneously for Polish text:
- Emotions: radość (joy), wstręt (disgust), gniew (anger), przeczuwanie (anticipation)
- Sentiment: pozytywny (positive), negatywny (negative), neutralny (neutral)
- Special: sarkazm (sarcasm)
## Model Details
| Attribute | Value |
|---|---|
| Base Model | PKOBP/polish-roberta-8k |
| Original Model | yazoniak/twitter-emotion-pl-classifier |
| Architecture | RoBERTa for Sequence Classification |
| Task | Multi-label text classification |
| Language | Polish |
| Format | ONNX (FP32) |
| ONNX Opset | 18 |
| Model Size | 1.7 GB |
| License | GPL-3.0 |
## Performance
### ONNX vs PyTorch Comparison
| Metric | PyTorch | ONNX FP32 | Improvement |
|---|---|---|---|
| Mean Latency (CPU) | 110.71 ms | 55.28 ms | 2.00x faster |
| P95 Latency | 116.11 ms | 56.70 ms | 2.05x faster |
| Throughput | 9.03/sec | 18.09/sec | 2.00x |
| Std Deviation | 5.25 ms | 0.69 ms | 7.6x more consistent |
| Model Size | 1.7 GB | 1.7 GB | Same |
Note: ONNX has slower cold start (2.6s vs 0.3s) but significantly faster inference.
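Latency figures like these can be reproduced with a small wall-clock harness. The sketch below is illustrative (the `benchmark` helper is not part of this repository); it discards warmup iterations so the slower cold start does not skew the reported numbers.

```python
import time
import numpy as np

def benchmark(run_fn, warmup=3, iters=20):
    """Time repeated calls to run_fn; return mean and p95 latency in ms."""
    for _ in range(warmup):  # discard cold-start iterations (e.g. ONNX session init)
        run_fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"mean_ms": float(np.mean(samples)),
            "p95_ms": float(np.percentile(samples, 95))}
```

Pass it a zero-argument closure that runs one forward pass, e.g. `benchmark(lambda: model(**inputs))`.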
### Model Accuracy
The ONNX model maintains the same accuracy as the original PyTorch model:
| Metric | Score |
|---|---|
| F1 Macro | 0.8500 |
| F1 Micro | 0.8900 |
| F1 Weighted | 0.8895 |
| Exact Match Accuracy | 0.5125 |
For detailed per-label performance, see the original model card.
### Numerical Validation
- ✅ Structural Validation: Passed ONNX checker
- ✅ Numerical Accuracy: All tests passed
- Max absolute difference: 5.65e-06
- Max relative difference: 1.93e-04
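The max-difference figures above come from comparing logits of the two backends on identical inputs. A minimal helper for such a comparison (the `logit_parity` name and tolerance are illustrative, not part of this repo):

```python
import numpy as np

def logit_parity(pt_logits, onnx_logits, atol=1e-4):
    """Compare two logit arrays; report max absolute/relative differences."""
    pt = np.asarray(pt_logits, dtype=np.float64)
    ox = np.asarray(onnx_logits, dtype=np.float64)
    abs_diff = np.abs(pt - ox)
    rel_diff = abs_diff / np.maximum(np.abs(pt), 1e-12)  # guard against division by zero
    return {"max_abs": float(abs_diff.max()),
            "max_rel": float(rel_diff.max()),
            "within_tol": bool(np.allclose(pt, ox, atol=atol))}
```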
## Installation
```bash
pip install "optimum[onnxruntime]" transformers numpy
```
For GPU support:
```bash
pip install "optimum[onnxruntime-gpu]" transformers numpy
```
## Usage
### Quick Start (Command Line)
```bash
# Download the inference scripts
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx/resolve/main/predict.py
wget https://huggingface.co/yazoniak/twitter-emotion-pl-classifier-onnx/resolve/main/predict_calibrated.py

# Basic inference
python predict.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"

# Calibrated inference (recommended for best accuracy)
python predict_calibrated.py "Wspaniały dzień! Jestem bardzo szczęśliwy :)"
```
### Python API - Basic Inference
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
import re

# Load model and tokenizer
model_name = "yazoniak/twitter-emotion-pl-classifier-onnx"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = ORTModelForSequenceClassification.from_pretrained(
    model_name,
    provider="CPUExecutionProvider"  # or "CUDAExecutionProvider" for GPU
)

# Preprocess text (anonymize @mentions - IMPORTANT!)
def preprocess_text(text):
    return re.sub(r"@\w+", "@anonymized_account", text)

text = "@user To jest wspaniały dzień!"
processed_text = preprocess_text(text)

# Tokenize and run inference
inputs = tokenizer(processed_text, return_tensors="pt", truncation=True, max_length=8192)
outputs = model(**inputs)

# Get probabilities (sigmoid for multi-label)
logits = outputs.logits.squeeze().numpy()
probabilities = 1 / (1 + np.exp(-logits))

# Get labels above threshold
labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
threshold = 0.5
predictions = {labels[i]: float(probabilities[i])
               for i in range(len(labels)) if probabilities[i] > threshold}

print(predictions)
# Output: {'radość': 0.9758, 'pozytywny': 0.9856}
```
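For throughput-sensitive workloads, the same steps extend naturally to batches. Below is a sketch of a hypothetical `predict_batch` helper (not shipped in this repository) built on the snippet above; it pads the batch and applies a per-label sigmoid.

```python
import numpy as np

def predict_batch(texts, model, tokenizer, threshold=0.5, max_length=8192):
    """Multi-label prediction for a list of already-preprocessed texts."""
    inputs = tokenizer(texts, return_tensors="pt", padding=True,
                       truncation=True, max_length=max_length)
    logits = model(**inputs).logits.detach().numpy()
    probs = 1 / (1 + np.exp(-logits))  # sigmoid: labels are independent
    labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
    return [{labels[j]: float(row[j]) for j in range(len(labels)) if row[j] > threshold}
            for row in probs]
```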
### Python API - Calibrated Inference (Recommended)
For improved accuracy, apply per-label temperature scaling and optimal thresholds (this continues from the basic example above, which defines `labels` and `logits`):
```python
import json
from huggingface_hub import hf_hub_download

# Download calibration artifacts
calib_path = hf_hub_download(
    repo_id="yazoniak/twitter-emotion-pl-classifier-onnx",
    filename="calibration_artifacts.json"
)
with open(calib_path) as f:
    calib = json.load(f)

temperatures = calib["temperatures"]
optimal_thresholds = calib["optimal_thresholds"]

# Apply temperature scaling and per-label thresholds
calibrated_probs = {}
for i, label in enumerate(labels):
    temp = temperatures[label]
    thresh = optimal_thresholds[label]
    # Temperature scaling
    calibrated_logit = logits[i] / temp
    prob = 1 / (1 + np.exp(-calibrated_logit))
    if prob > thresh:
        calibrated_probs[label] = float(prob)

print(calibrated_probs)
```
### GPU Inference
```python
model = ORTModelForSequenceClassification.from_pretrained(
    "yazoniak/twitter-emotion-pl-classifier-onnx",
    provider="CUDAExecutionProvider"
)
```
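If CUDA may or may not be present at deployment time, the provider can be chosen defensively. A small sketch (the `pick_provider` helper is illustrative; in practice pass it `onnxruntime.get_available_providers()`):

```python
def pick_provider(available, preferred="CUDAExecutionProvider"):
    """Return the preferred ONNX Runtime provider if present, else fall back to CPU."""
    return preferred if preferred in available else "CPUExecutionProvider"
```

Usage: `pick_provider(onnxruntime.get_available_providers())`.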
## When to Use This Model
Use ONNX FP32 when:
- You need 2x faster inference than PyTorch
- You want full FP32 precision
- You're deploying on CPU servers
- You need cross-platform compatibility
Consider alternatives:
- Original PyTorch: For fine-tuning or GPU training
- ONNX INT8: For even faster inference (3x) and smaller size (75% reduction)
## Important Notes
### Text Preprocessing
⚠️ The model expects @mentions to be anonymized!
The model was trained with anonymized Twitter mentions. Always preprocess input text:
```python
text = re.sub(r"@\w+", "@anonymized_account", text)
```
The provided scripts (`predict.py`, `predict_calibrated.py`) handle this automatically.
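The effect of the anonymization step is easy to check in isolation:

```python
import re

def preprocess_text(text):
    """Replace @mentions with the placeholder token used during training."""
    return re.sub(r"@\w+", "@anonymized_account", text)

print(preprocess_text("@jan_kowalski Świetny mecz!"))
# @anonymized_account Świetny mecz!
```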
### Calibration
For best accuracy, use calibrated inference with:
- Temperature scaling (per-label)
- Optimized thresholds (per-label)
See predict_calibrated.py or the calibrated inference example above.
## Limitations
- Twitter-specific: Optimized for informal Polish social media text
- Sarcasm detection: Lower performance (F1: 0.53) - inherently difficult
- Context length: Optimal for tweet-length texts (up to 8,192 tokens)
- Formal text: May not generalize well to news or academic writing
For detailed limitations, see the original model card.
## Files in This Repository
| File | Size | Description |
|---|---|---|
| `model.onnx` | 1.7 GB | ONNX model weights (FP32) |
| `config.json` | 2 KB | Model configuration |
| `tokenizer.json` | 8.2 MB | Tokenizer vocabulary |
| `tokenizer_config.json` | 12 KB | Tokenizer settings |
| `calibration_artifacts.json` | 1 KB | Temperature scaling & optimal thresholds |
| `predict.py` | 4 KB | Simple inference script |
| `predict_calibrated.py` | 5 KB | Calibrated inference script (recommended) |
## Citation
```bibtex
@misc{yazoniak2025twitteremotionpl,
  title={Polish Twitter Emotion Classifier (RoBERTa-8k)},
  author={yazoniak},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/yazoniak/twitter-emotion-pl-classifier}
}
```
Also cite the dataset and the original TwitterEmo annotation paper:
```bibtex
@dataset{yazoniak_twitteremo_pl_refined_2025,
  title={TwitterEmo-PL-Refined: Polish Twitter Emotions (8 labels, refined)},
  author={yazoniak},
  year={2025},
  url={https://huggingface.co/datasets/yazoniak/TwitterEmo-PL-Refined}
}

@inproceedings{bogdanowicz2023twitteremo,
  title={TwitterEmo: Annotating Emotions and Sentiment in Polish Twitter},
  author={Bogdanowicz, S. and Cwynar, H. and Zwierzchowska, A. and Klamra, C. and Kiera{\'s}, W. and Kobyli{\'n}ski, {\L}.},
  booktitle={Computational Science -- ICCS 2023},
  series={Lecture Notes in Computer Science},
  volume={14074},
  publisher={Springer, Cham},
  year={2023},
  doi={10.1007/978-3-031-36021-3_20}
}
```
## License
This model is released under the GNU General Public License v3.0 (GPL-3.0), inherited from the training dataset.
License Chain:
- Base Model (PKOBP/polish-roberta-8k): Apache-2.0
- Training Dataset (TwitterEmo-PL-Refined): GPL-3.0
- Original Model (yazoniak/twitter-emotion-pl-classifier): GPL-3.0
- This ONNX Model: GPL-3.0
## Acknowledgments
- Original Model: yazoniak/twitter-emotion-pl-classifier
- Base Model: PKOBP/polish-roberta-8k
- Dataset: CLARIN-PL TwitterEmo
- Conversion: Hugging Face Optimum
Model Version: v1.0-onnx
Last Updated: 2026-01-29