---
license: cc-by-nc-4.0
pipeline_tag: text-classification
library_name: transformers
language: [en]
tags:
- media-bias
- lexical-bias
- babe
- paper:2209.14557
datasets:
- mediabiasgroup/BABE
base_model: roberta-base
---

# RoBERTa - BABE - HA-FT

This repository provides a RoBERTa-base model fine-tuned on the BABE (Bias Annotations By Experts) dataset for sentence-level lexical (loaded-language) bias detection in English news text. BABE was introduced in the paper [Neural Media Bias Detection Using Distant Supervision With BABE – Bias Annotations By Experts](https://arxiv.org/abs/2209.14557).

## Labels
- `0` → neutral / non-lexical-bias
- `1` → lexical-bias
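
As a minimal sketch of how the two label ids above can be turned into a readable prediction, the helper below applies a softmax to a pair of logits and picks a label. The string names in `ID2LABEL` and the `threshold` parameter are assumptions made for illustration; the card itself only defines ids `0` and `1`.

```python
import math

# Label ids come from the model card; the string names are illustrative only.
ID2LABEL = {0: "neutral", 1: "lexical_bias"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, threshold=0.5):
    """Return (label_name, p_bias); `threshold` is a hypothetical tunable cut-off."""
    p_bias = softmax(logits)[1]
    label_id = 1 if p_bias >= threshold else 0
    return ID2LABEL[label_id], p_bias


print(decode([2.0, -1.0]))
print(decode([-0.5, 1.5], threshold=0.7))
```

Raising `threshold` above 0.5 trades recall for precision, which may be useful given that the model can over-flag emotionally charged wording.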

## Intended use & limitations
- Intended use: research and benchmarking of lexical bias at the sentence level on news-like English text.
- Out-of-scope: detection of informational/selection bias, stance, political leaning, or factuality; production deployments without human oversight.

## How to use
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

m = "mediabiasgroup/roberta-babe-ft"
tok = AutoTokenizer.from_pretrained(m)
model = AutoModelForSequenceClassification.from_pretrained(m)
model.eval()  # inference mode (disables dropout)

text = "Democrats shamelessly rammed the bill through Congress."
inputs = tok(text, return_tensors="pt", truncation=True)
with torch.no_grad():  # no gradients needed for inference
    logits = model(**inputs).logits
probs = logits.softmax(-1)[0].tolist()
# Index 0 = neutral, index 1 = lexical bias
print({"neutral": probs[0], "lexical_bias": probs[1]})
```
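
For more than a handful of sentences, scoring in batches is usually more convenient. The function below is a sketch under stated assumptions: `classify_batch` and its `batch_size` default are hypothetical names and values, and it expects a tokenizer and a two-label sequence-classification model to be passed in rather than loading them itself.

```python
import torch


def classify_batch(texts, tok, model, batch_size=16):
    """Return one {"neutral": p0, "lexical_bias": p1} dict per input sentence.

    `tok` and `model` are expected to be a Hugging Face tokenizer and a
    two-label sequence-classification model; `batch_size` is an arbitrary default.
    """
    results = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        # Pad to the longest sentence in the chunk and truncate over-long inputs.
        enc = tok(chunk, return_tensors="pt", padding=True, truncation=True)
        with torch.no_grad():  # inference only, so skip gradient tracking
            probs = model(**enc).logits.softmax(dim=-1)
        for p in probs.tolist():
            results.append({"neutral": p[0], "lexical_bias": p[1]})
    return results
```

With `tok` and `model` loaded as in the snippet above, `classify_batch(sentences, tok, model)` returns the probabilities in the same order as `sentences`.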

## Training data & setup
- Data: BABE (expert-annotated, sentence-level lexical bias).
- Backbone: `roberta-base` with a standard sequence-classification head.
- Training: single-run fine-tuning; the exact hyperparameters are not documented in this card.
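
To illustrate what a standard sequence-classification head on a RoBERTa backbone produces, the sketch below builds a deliberately tiny, randomly initialised `RobertaForSequenceClassification` (the class that `AutoModelForSequenceClassification` resolves to for RoBERTa checkpoints). The config sizes are assumptions chosen so the example runs offline; they are not the dimensions of `roberta-base`, and this is not the training setup used for this model.

```python
import torch
from transformers import RobertaConfig, RobertaForSequenceClassification

# Deliberately tiny, randomly initialised config so the sketch runs offline.
# These sizes are NOT those of roberta-base; only num_labels=2 mirrors the card.
config = RobertaConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
    num_labels=2,
)
model = RobertaForSequenceClassification(config)
model.eval()

# Fake token ids (avoiding the pad id 1) standing in for one tokenized sentence.
input_ids = torch.randint(3, 100, (1, 8))
with torch.no_grad():
    logits = model(input_ids=input_ids).logits

print(logits.shape)  # one row per sentence, one column per label
```

The head maps the encoder output for each sentence to two logits, one per label, which is why the usage snippet reads `probs[0]` and `probs[1]`.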

## Safety, bias & ethics
Media-bias perception is subjective and context-dependent. This model may over-flag emotionally charged wording. Keep a human in the loop and avoid punitive or outlet-level decisions without careful validation.
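
One way to keep a human in the loop is to route uncertain predictions to a reviewer instead of acting on them automatically. The following is a sketch only: `route_for_review` and the `low`/`high` cut-offs are hypothetical and would need validation on your own data, and the example probabilities are made up.

```python
def route_for_review(scored, low=0.3, high=0.7):
    """Split (sentence, p_bias) pairs into auto-neutral, needs-review, and flagged.

    `low` and `high` are hypothetical cut-offs, not values from the model card.
    Even "flagged" sentences are candidates for a human reviewer, not verdicts.
    """
    buckets = {"neutral": [], "review": [], "flagged": []}
    for sentence, p_bias in scored:
        if p_bias < low:
            buckets["neutral"].append(sentence)
        elif p_bias > high:
            buckets["flagged"].append(sentence)
        else:
            buckets["review"].append(sentence)
    return buckets


# Made-up probabilities purely for illustration.
example = [
    ("The senate passed the bill on Tuesday.", 0.05),
    ("Critics slammed the reckless proposal.", 0.91),
    ("The controversial plan drew mixed reactions.", 0.55),
]
print(route_for_review(example))
```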

## Citation
If you use this model or the dataset, please cite:

```bibtex
@article{spinde2022neural,
  title = {Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts},
  author = {Spinde, Timo and Plank, Manuel and Krieger, Jan-David and Ruas, Terry and Gipp, Bela and Aizawa, Akiko},
  journal = {arXiv preprint arXiv:2209.14557},
  year = {2022},
  url = {https://arxiv.org/abs/2209.14557}
}
```