Twitter Sentiment PL (base)
Twitter Sentiment PL (base) is a Polish-language sentiment analysis model fine-tuned from allegro/herbert-base-cased on a Polish translation of the TweetEval dataset (Barbieri et al., 2020). It predicts one of three sentiment classes for short, tweet-style Polish text.
Model Details
- Developed by: bards.ai
- Model type: Transformer encoder (BERT-style) fine-tuned for sequence classification
- Language: Polish (pl)
- License: CC BY 4.0 (inherited from the base model)
- Finetuned from: allegro/herbert-base-cased
- Labels: positive, negative, neutral
Intended Uses & Limitations
Intended uses
- Sentiment analysis of Polish short-form social media text (tweets, comments, short posts).
- Research and prototyping for Polish-language NLP applications.
Out-of-scope / limitations
- The model was trained on a machine-translated version of TweetEval, so it inherits translation artifacts and may underperform on idiomatic Polish that differs in style from the translated training data.
- Performance on long-form text, formal Polish (news, legal, medical), or non-Twitter domains is not guaranteed.
- Like any sentiment model trained on social media, predictions may reflect biases present in the source data. Do not use as the sole signal in moderation, hiring, or other high-stakes decisions.
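One way to avoid treating a prediction as the sole signal is to route low-confidence outputs to a human reviewer. The sketch below works on dictionaries shaped like the pipeline output shown in this card (`label` and `score` keys). The `REVIEW_THRESHOLD` value and the function names are illustrative assumptions, not part of the model or of Transformers.

```python
from typing import Dict, List

# Hypothetical cut-off: pick a value from validation data for your own use case.
REVIEW_THRESHOLD = 0.80


def needs_human_review(prediction: Dict[str, object],
                       threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when the top score falls below the threshold."""
    return float(prediction["score"]) < threshold


def triage(predictions: List[Dict[str, object]],
           threshold: float = REVIEW_THRESHOLD) -> Dict[str, List[Dict[str, object]]]:
    """Split predictions into auto-accepted ones and ones flagged for review."""
    accepted: List[Dict[str, object]] = []
    review: List[Dict[str, object]] = []
    for prediction in predictions:
        if needs_human_review(prediction, threshold):
            review.append(prediction)
        else:
            accepted.append(prediction)
    return {"accepted": accepted, "review": review}
```

A high softmax score is not a calibrated probability, so a threshold like this reduces, but does not remove, the need for other signals.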
How to Use
With the pipeline API:
from transformers import pipeline
nlp = pipeline("sentiment-analysis", model="bardsai/twitter-sentiment-pl-base")
nlp("Nigdy przegrana nie sprawiła mi takiej radości. Szczęście i Opatrzność mają znaczenie Gratuluje @pzpn_pl")
# [{'label': 'positive', 'score': 0.9997233748435974}]
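The pipeline also accepts a list of strings and returns one result per input. As a minimal sketch, the helper below summarizes such a batch as the share of each predicted label. `classify_batch` and `label_shares` are hypothetical names introduced here for illustration; the model is only downloaded when `classify_batch` is called.

```python
from collections import Counter
from typing import Dict, Iterable, List


def label_shares(predictions: Iterable[Dict[str, object]]) -> Dict[str, float]:
    """Return the fraction of predictions assigned to each label."""
    counts = Counter(str(p["label"]) for p in predictions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def classify_batch(texts: List[str]) -> List[Dict[str, object]]:
    """Run the sentiment pipeline on a list of texts (downloads the model)."""
    # Imported lazily so label_shares can be used without transformers installed.
    from transformers import pipeline

    nlp = pipeline("sentiment-analysis", model="bardsai/twitter-sentiment-pl-base")
    return nlp(texts)
```

For example, `label_shares(classify_batch(tweets))` gives a rough sentiment distribution for a list of tweets held in a variable `tweets`.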
Or loading the model and tokenizer directly:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("bardsai/twitter-sentiment-pl-base")
model = AutoModelForSequenceClassification.from_pretrained("bardsai/twitter-sentiment-pl-base")
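Loading the model directly returns raw logits rather than labels. The following sketch shows one way to turn them into a label and a score: apply a softmax and look up the winning index in `model.config.id2label`. The helper names `softmax`, `top_label`, and `classify` are assumptions made for this example, and it assumes PyTorch is installed.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the highest-probability label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text: str) -> Tuple[str, float]:
    """Classify one text with the directly loaded model (downloads weights)."""
    # Imported lazily so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    name = "bardsai/twitter-sentiment-pl-base"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # The index-to-label mapping is read from the model config, not hard-coded.
    return top_label(logits, model.config.id2label)
```

Reading the mapping from the config avoids guessing which index corresponds to positive, negative, or neutral.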
Training
- Base model: allegro/herbert-base-cased
- Training data: TweetEval (sentiment subset) machine-translated into Polish.
- Epochs: 10
- Hardware: Single NVIDIA RTX 3090
Evaluation
Evaluated on the held-out test split (translated TweetEval, sentiment task) on an RTX 3090.
| Metric | Value |
|---|---|
| F1 (macro) | 0.658 |
| Precision (macro) | 0.655 |
| Recall (macro) | 0.662 |
| Accuracy | 0.662 |
| Samples per second | 129.9 |
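For readers unfamiliar with macro averaging, the sketch below computes precision, recall, and F1 per class and then takes the unweighted mean over the three labels. It assumes macro F1 is the mean of per-class F1 scores (the scikit-learn convention); the card does not state which convention was used for the reported numbers. The function name `macro_scores` is introduced here for illustration.

```python
from typing import Dict, Sequence

LABELS = ("positive", "negative", "neutral")


def macro_scores(y_true: Sequence[str], y_pred: Sequence[str],
                 labels: Sequence[str] = LABELS) -> Dict[str, float]:
    """Compute macro precision, recall, F1 and plain accuracy."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "precision_macro": sum(precisions) / n,
        "recall_macro": sum(recalls) / n,
        "f1_macro": sum(f1s) / n,
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
    }
```

Because every class counts equally, a macro score penalizes a model that does well on frequent classes but poorly on a rare one.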
License
This model is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, inherited from the base model allegro/herbert-base-cased, which is also distributed under CC BY 4.0.
You are free to share and adapt the model, including for commercial use, provided you give appropriate credit to:
- HerBERT — Allegro ML Research and the Linguistic Engineering Group at the Institute of Computer Science, Polish Academy of Sciences.
- Twitter Sentiment PL (base) — bards.ai.
Citation
If you use this model, please cite HerBERT and TweetEval:
@inproceedings{mroczkowski-etal-2021-herbert,
title = "{H}er{BERT}: Efficiently Pretrained Transformer-based Language Model for {P}olish",
author = "Mroczkowski, Robert and Rybak, Piotr and Wr{\'o}blewska, Alina and Gawlik, Ireneusz",
booktitle = "Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing",
year = "2021",
publisher = "Association for Computational Linguistics",
pages = "1--10",
}
@inproceedings{barbieri-etal-2020-tweeteval,
title = "{T}weet{E}val: Unified Benchmark and Comparative Evaluation for Tweet Classification",
author = "Barbieri, Francesco and Camacho-Collados, Jose and Espinosa Anke, Luis and Neves, Leonardo",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
year = "2020",
publisher = "Association for Computational Linguistics",
pages = "1644--1650",
}
Changelog
- 2022-12-01 — Initial release
- 2023-07-19 — Improvement of translation quality
- 2026-05-25 — Model card updated: license metadata (CC BY 4.0) and structure aligned with Hugging Face model card guidelines
About bards.ai
At bards.ai we focus on providing machine learning expertise to our partners, particularly in NLP, computer vision and time series analysis. Our team is based in Wrocław, Poland.
If you use our model we'd love to hear about it. For questions or collaboration, contact us at info@bards.ai.