
Twitter Sentiment PL (base)

Twitter Sentiment PL (base) is a Polish-language sentiment analysis model fine-tuned from allegro/herbert-base-cased on a Polish translation of the TweetEval dataset (Barbieri et al., 2020). It predicts one of three sentiment classes for short, tweet-style Polish text.

Model Details

  • Developed by: bards.ai
  • Model type: Transformer encoder (BERT-style) fine-tuned for sequence classification
  • Language: Polish (pl)
  • License: CC BY 4.0 (inherited from the base model)
  • Finetuned from: allegro/herbert-base-cased
  • Labels: positive, negative, neutral

Intended Uses & Limitations

Intended uses

  • Sentiment analysis of Polish short-form social media text (tweets, comments, short posts).
  • Research and prototyping for Polish-language NLP applications.

Out-of-scope / limitations

  • The model was trained on a machine-translated version of TweetEval, so it inherits translation artifacts and may underperform on idiomatic Polish that differs in style from the translated training data.
  • Performance on long-form text, formal Polish (news, legal, medical), or non-Twitter domains is not guaranteed.
  • Like any sentiment model trained on social media, predictions may reflect biases present in the source data. Do not use as the sole signal in moderation, hiring, or other high-stakes decisions.

How to Use

With the pipeline API:

from transformers import pipeline

nlp = pipeline("sentiment-analysis", model="bardsai/twitter-sentiment-pl-base")
nlp("Nigdy przegrana nie sprawiła mi takiej radości. Szczęście i Opatrzność mają znaczenie Gratuluje @pzpn_pl")
# [{'label': 'positive', 'score': 0.9997233748435974}]

Or load the model and tokenizer directly:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bardsai/twitter-sentiment-pl-base")
model = AutoModelForSequenceClassification.from_pretrained("bardsai/twitter-sentiment-pl-base")
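
The snippet above only loads the checkpoint. A minimal sketch of running inference with it might look like the following. The helper names (softmax, top_label, classify) are illustrative and not part of the model or of Transformers; the label names are read from the checkpoint's config rather than assumed, and the Transformers and PyTorch imports are placed inside classify so the pure-Python helpers can be used on their own.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return the most likely label and its probability for one example."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}


def classify(texts, model_name="bardsai/twitter-sentiment-pl-base"):
    """Classify a list of texts; downloads the checkpoint on first use."""
    # Imported here so the helpers above work without torch installed.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Pad to the longest text and truncate anything beyond the model's limit.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Read label names from the checkpoint config instead of hard-coding them.
    id2label = model.config.id2label
    return [top_label(row.tolist(), id2label) for row in logits]
```

Calling classify(["..."]) with a list of Polish strings should return one dictionary per input, in the same label/score shape that the pipeline example produces.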

Training

Evaluation

The model was evaluated on the held-out test split of the translated TweetEval sentiment task, running on an NVIDIA RTX 3090.

| Metric             | Value |
|--------------------|-------|
| F1 (macro)         | 0.658 |
| Precision (macro)  | 0.655 |
| Recall (macro)     | 0.662 |
| Accuracy           | 0.662 |
| Samples per second | 129.9 |
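
For readers unfamiliar with macro averaging, the sketch below shows one common way to compute such metrics: precision, recall, and F1 are computed per class and then averaged with equal weight, so rare classes count as much as frequent ones. This is an illustration under that assumption, not the evaluation script behind the numbers above; the function name macro_metrics is hypothetical.

```python
def macro_metrics(y_true, y_pred, labels):
    """Compute accuracy and macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))

    for label in labels:
        # Per-class counts of true positives, false positives, false negatives.
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)

        # Define each ratio as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision_macro": sum(precisions) / n,
        "recall_macro": sum(recalls) / n,
        # Macro F1 here is the mean of per-class F1 scores.
        "f1_macro": sum(f1s) / n,
    }
```

Passing the gold labels, the predicted labels, and the list ["positive", "negative", "neutral"] returns a dictionary with the four metrics.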

License

This model is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, inherited from the base model allegro/herbert-base-cased, which is also distributed under CC BY 4.0.

You are free to share and adapt the model, including for commercial use, provided you give appropriate credit to:

  • HerBERT — Allegro ML Research and the Linguistic Engineering Group at the Institute of Computer Science, Polish Academy of Sciences.
  • Twitter Sentiment PL (base) — bards.ai.

Citation

If you use this model, please cite HerBERT and TweetEval:

@inproceedings{mroczkowski-etal-2021-herbert,
    title = "{H}er{BERT}: Efficiently Pretrained Transformer-based Language Model for {P}olish",
    author = "Mroczkowski, Robert and Rybak, Piotr and Wr{\'o}blewska, Alina and Gawlik, Ireneusz",
    booktitle = "Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing",
    year = "2021",
    publisher = "Association for Computational Linguistics",
    pages = "1--10",
}

@inproceedings{barbieri-etal-2020-tweeteval,
    title = "{T}weet{E}val: Unified Benchmark and Comparative Evaluation for Tweet Classification",
    author = "Barbieri, Francesco and Camacho-Collados, Jose and Espinosa Anke, Luis and Neves, Leonardo",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    pages = "1644--1650",
}

Changelog

  • 2022-12-01 — Initial release
  • 2023-07-19 — Improvement of translation quality
  • 2026-05-25 — Model card updated: license metadata (CC BY 4.0) and structure aligned with Hugging Face model card guidelines

About bards.ai

At bards.ai we focus on providing machine learning expertise to our partners, particularly in NLP, computer vision and time series analysis. Our team is based in Wrocław, Poland.

If you use our model, we'd love to hear about it. For questions or collaboration, contact us at info@bards.ai.
