Bayan — Arabic Media Bias Classifier (ARBERTv2)
Bayan is a fine-tuned Arabic BERT model that classifies Arabic news
sentences as Biased or Non-biased. It is the inference model
behind the Bayan graduation project, which surfaces media-bias signals
inside a Chrome extension and a FastAPI service.
| Field | Value |
|---|---|
| Base model | UBC-NLP/ARBERTv2 |
| Task | Binary text classification (Arabic media bias) |
| Labels | 0 = Non-biased, 1 = Biased |
| Max sequence length | 128 tokens |
| Test accuracy | 0.818 |
| Test F1 (macro) | 0.817 |
| Training samples | 1,171 |
| Test samples | 297 |
How it was built
Cross-lingual transfer from English
There is no large, high-quality Arabic media-bias dataset publicly available. Bayan addresses this with a cross-lingual transfer approach:
- We started from established English media-bias datasets where each sentence is labelled biased / non-biased.
- Each English sentence was translated into Arabic by multiple LLM translators (GPT-4-class and Claude-class systems among others).
- We compared translations and kept only sentences where the strongest translators agreed — a consensus filter that trades dataset size for label fidelity.
- The resulting "GPT + Claude consensus" Arabic split was used to fine-tune ARBERTv2 with standard sequence-classification training.
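The consensus step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the card does not say how agreement between translators was measured, so the character-level similarity from `difflib`, the `0.8` threshold, and the row keys `gpt_ar`, `claude_ar`, and `label` are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a real agreement metric."""
    return SequenceMatcher(None, a, b).ratio()


def consensus_filter(rows, threshold=0.8):
    """Keep rows whose two translations are at least `threshold` similar.

    Each row is a dict with hypothetical keys 'gpt_ar', 'claude_ar', 'label'.
    The English label is carried over unchanged to the kept Arabic sentence.
    """
    kept = []
    for row in rows:
        if similarity(row["gpt_ar"], row["claude_ar"]) >= threshold:
            kept.append({"text": row["gpt_ar"], "label": row["label"]})
    return kept


# Illustrative rows only; these are not taken from the training data.
rows = [
    # Identical translations: kept.
    {"gpt_ar": "أعلنت الحكومة عن خطة جديدة.",
     "claude_ar": "أعلنت الحكومة عن خطة جديدة.",
     "label": 0},
    # Very different translations: dropped.
    {"gpt_ar": "فشلت الحكومة فشلاً ذريعاً.",
     "claude_ar": "لم تحقق الخطة أهدافها المعلنة حتى الآن.",
     "label": 1},
]

filtered = consensus_filter(rows)
print(len(filtered), "of", len(rows), "rows kept")
```

A stricter threshold shrinks the dataset further, which is the size-for-fidelity trade-off described above.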
Why GPT + Claude
We benchmarked several translator combinations during the project. The GPT + Claude consensus split produced the best downstream macro-F1, so it was used to train the final model released here.
Intended use
- Research and educational analysis of Arabic news writing.
- A signal, not a verdict, surfaced inside reading tools (e.g. browser extensions) to help readers reflect on framing.
- Downstream inference pipelines that already include human review.
Out of scope
- Automated moderation, suspension, ranking, or labelling of news outlets or journalists.
- Legal, defamation, or compliance decisions.
- Languages other than Modern Standard Arabic.
- Long-form documents. The model is trained at the sentence level (≤ 128 tokens). For longer text, split it into sentences, classify each sentence, and aggregate the per-sentence predictions.
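One way to handle longer text is sketched below. The card does not prescribe a splitter or an aggregation rule, so the punctuation-based regex and the choice to report both the mean and the maximum Biased probability are assumptions. The scorer is passed in as a function; `fake_biased_prob` is a hypothetical stub used only so the example runs without downloading the model.

```python
import re
from typing import Callable, List

# Split after Latin or Arabic sentence-ending punctuation followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?؟])\s+")


def split_sentences(text: str) -> List[str]:
    """Naive splitter; a real pipeline may want a proper Arabic segmenter."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def score_document(text: str, biased_prob: Callable[[str], float]) -> dict:
    """Score each sentence, then report the mean and maximum Biased probability."""
    sentences = split_sentences(text)
    probs = [biased_prob(s) for s in sentences]
    if not probs:
        return {"sentences": [], "mean": 0.0, "max": 0.0}
    return {
        "sentences": list(zip(sentences, probs)),
        "mean": sum(probs) / len(probs),
        "max": max(probs),
    }


def fake_biased_prob(sentence: str) -> float:
    """Stub scorer for demonstration only; it is NOT the model."""
    return 0.9 if "رغم" in sentence else 0.1


doc = "أعلن الوزير عن الخطة اليوم. أصر الوزير على نجاحها رغم المؤشرات! هل ستنجح؟"
result = score_document(doc, fake_biased_prob)
for sentence, prob in result["sentences"]:
    print(prob, sentence)
```

To use the real model, replace the stub with a function that runs the Quick start code on one sentence and returns the probability of the Biased class.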
Limitations
- Conservative bias. The model has very high precision on Biased but lower recall: it tends to say Non-biased when uncertain. Treat Non-biased outputs as "no clear signal", not "definitely neutral".
- Translation artefacts. Training data originated in English; subtle rhetorical bias unique to Arabic (e.g. classical register choices, dialect framing) may be under-represented.
- Sentence-level scope. It does not score whole articles — sentence-level predictions can disagree with a holistic read of a piece.
- Domain shift. Trained primarily on news-style sentences; performance degrades on social-media or opinion-blog text.
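A reading tool might reflect the first limitation in its wording, as in this small sketch. The function name `describe_prediction`, the message strings, and the `0.5` threshold are hypothetical choices for illustration and are not part of the released model or the Bayan extension.

```python
def describe_prediction(p_biased: float, threshold: float = 0.5) -> str:
    """Map the Biased probability to a reader-facing message.

    A low score is reported as "No clear signal" rather than as a claim
    that the sentence is neutral, because the model's recall on Biased is limited.
    """
    if not 0.0 <= p_biased <= 1.0:
        raise ValueError("p_biased must be a probability in [0, 1]")
    if p_biased >= threshold:
        return "Possible bias signal"
    return "No clear signal"


print(describe_prediction(0.9377))
print(describe_prediction(0.0623))
```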
Quick start
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

REPO_ID = "abjasser/bayan-arabic-bias-arbertv2"

# Load the tokenizer and the fine-tuned classifier; eval() disables dropout.
tok = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForSequenceClassification.from_pretrained(REPO_ID).eval()

text = "أصر الوزير على وصف السياسات بالناجحة رغم المؤشرات التي تشير إلى العكس."

# Truncate and pad to the model's 128-token maximum sequence length.
enc = tok(text, return_tensors="pt", truncation=True, padding="max_length", max_length=128)

# Run inference without tracking gradients and turn logits into probabilities.
with torch.no_grad():
    probs = torch.softmax(model(**enc).logits, dim=-1).squeeze().tolist()

# Map class indices to the label names stored in the model config.
id2label = model.config.id2label
print({id2label[i]: round(p, 4) for i, p in enumerate(probs)})
# -> {'Non-biased': 0.0623, 'Biased': 0.9377}
Evaluation
Held-out test set (297 sentences):
| Metric | Score |
|---|---|
| Accuracy | 0.8182 |
| Macro F1 | 0.8174 |
| Final train loss | 0.2051 |
(See model_card.json in this repo for the original training metadata.)
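For readers unfamiliar with the metrics, the sketch below shows how accuracy and macro F1 are conventionally computed for the two classes: macro F1 is the unweighted mean of the per-class F1 scores. The labels in the example are toy values, not the model's test predictions. As a side check, the reported accuracy of 0.8182 is consistent with 243 correct predictions out of 297, although the card does not state that count directly.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, computed from true/false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class F1 (0 = Non-biased, 1 = Biased)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels for illustration only.
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]

print(round(accuracy(y_true, y_pred), 4), round(macro_f1(y_true, y_pred), 4))

# Arithmetic check against the reported accuracy.
print(round(243 / 297, 4))
```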
Citation
If you use Bayan in academic work, please cite the project as:
@misc{bayan2026,
title = {Bayan: An Arabic Media Bias Classifier via Cross-Lingual Consensus Translation},
author = {Jasser, Abdulrahman B.},
year = {2026},
howpublished = {Hugging Face model repository},
url = {https://huggingface.co/abjasser/bayan-arabic-bias-arbertv2}
}
License
Released under the Apache 2.0 license. You're free to use, modify, and redistribute the model — please retain the citation and the limitations section.
The base model UBC-NLP/ARBERTv2 is governed by its own license; consult
its model card for details.