MaximilianWeiland

	---
	language:
	- en
	license: mit
	tags:
	- roberta
	- token-classification
	- sequence-labeling
	- political-science
	- social-groups
	- parliamentary-debates
	base_model: FacebookAI/roberta-base
	pipeline_tag: token-classification
	library_name: transformers
	metrics:
	- seqeval
	---

	# RoBERTa Group Mention Detector

	A RoBERTa-base token classification model fine-tuned to detect social group mentions in political text.

	This model is part of the [`group-appeal-detector`](https://github.com/MaximilianWeiland/group_appeal_detector) package, which also provides stance classification and mention clustering.

	## Model Details

	- Base model: `FacebookAI/roberta-base`
	- Task: Sequence Labeling (BIO tagging)
	- Labels: `B-socialgroup`, `I-socialgroup`, `O`
	- Training data: 5,000 manually annotated sentences from UK House of Commons parliamentary debates (2010–2019), augmented with 25% synthetic paraphrases generated by a GPT-5-nano model
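
To make the BIO scheme above concrete, here is a minimal sketch of how a tag sequence maps back to mention spans. The word-level tokens and tags are hand-written for illustration, not actual model output (the model itself works on subword tokens), and `bio_to_spans` is a hypothetical helper, not part of the package.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) token index pairs (end exclusive), one per mention."""
    spans = []
    start = None  # index where the currently open mention began, if any
    for i, tag in enumerate(tags):
        if tag == "B-socialgroup":
            # A B- tag always opens a new mention; close any open one first.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "I-socialgroup":
            # Lenient choice (an assumption): a stray I- tag opens a mention.
            if start is None:
                start = i
        else:  # "O" closes any open mention
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Hand-written example, not model output.
tokens = ["Our", "party", "supports", "young", "people", "and", "working", "families", "."]
tags = ["O", "O", "O", "B-socialgroup", "I-socialgroup", "O", "B-socialgroup", "I-socialgroup", "O"]

mentions = [" ".join(tokens[s:e]) for s, e in bio_to_spans(tags)]
print(mentions)  # ['young people', 'working families']
```

Because a `B-socialgroup` tag always starts a new span, two adjacent mentions stay separate even when no `O` token sits between them.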

	## Performance

	Cross-validated performance, evaluated with [seqeval](https://github.com/chakki-works/seqeval). Brackets give 95% confidence intervals based on the estimated standard error across folds:

	| Metric (seqeval) | Score [95% CI] |
	|---|---|
	| F1 | 0.82 [0.82, 0.83] |
	| Precision | 0.80 [0.79, 0.81] |
	| Recall | 0.84 [0.83, 0.85] |
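
As a rough illustration of how an interval of this kind can be derived, the sketch below computes a mean and a normal-approximation 95% interval from per-fold scores. The fold scores are made-up placeholders, not this model's results. The number of folds and the 1.96 multiplier are assumptions, since the card does not state either. `mean_ci` is a hypothetical helper.

```python
import math
from statistics import mean, stdev


def mean_ci(fold_scores, z=1.96):
    """Return (mean, lower, upper) using the standard error across folds.

    z=1.96 assumes a normal approximation; a t-quantile may be more
    appropriate for a small number of folds.
    """
    m = mean(fold_scores)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    se = stdev(fold_scores) / math.sqrt(len(fold_scores))
    return m, m - z * se, m + z * se


# Placeholder fold scores for illustration only (NOT the model's results).
illustrative_f1 = [0.70, 0.72, 0.74, 0.71, 0.73]

m, lower, upper = mean_ci(illustrative_f1)
print(f"{m:.2f} [{lower:.2f}, {upper:.2f}]")
```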

	## Usage

	### Via `group-appeal-detector` package (recommended)

	```bash
	pip install group-appeal-detector
	```

	```python
	from group_appeal_detector import GroupAppealDetector

	detector = GroupAppealDetector(device="cpu")

	sentence = "Our party supports the interests of young people and working families."
	mentions = detector.detect_mentions(sentence)

	for m in mentions:
	    print(m["span"], m["start"], m["end"])
	```

	For batch processing:

	```python
	sentences = [
	    "Farmers must earn more money.",
	    "The government must do more to protect the women living in this country.",
	]
	results_df = detector.detect_mentions_batch(sentences, batch_size=8, as_df=True)
	```

	### Direct usage with Transformers

	```python
	from transformers import pipeline

	pipe = pipeline(
	    "token-classification",
	    model="maxwlnd/roberta_group_mention_detector",
	    aggregation_strategy="simple",
	)

	sentence = "Our party supports the interests of young people and working families."
	results = pipe(sentence)

	for r in results:
	    print(r["word"], r["entity_group"], round(r["score"], 3))
	```
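
With an aggregation strategy set, each pipeline result also carries `start` and `end` character offsets, so the mention text can be sliced from the original sentence instead of relying on the detokenized `word` field. The sketch below shows one way to do that and to drop low-confidence spans. `extract_mentions` and the `min_score` threshold are hypothetical, and `mock_results` is a hand-written stand-in for pipeline output with placeholder scores, so the snippet runs without downloading the model.

```python
def extract_mentions(sentence, results, min_score=0.5):
    """Slice mention text from the sentence using each result's character offsets."""
    mentions = []
    for r in results:
        # Skip spans below the (hypothetical) confidence threshold.
        if r["score"] < min_score:
            continue
        mentions.append(
            {
                "span": sentence[r["start"]:r["end"]],
                "start": r["start"],
                "end": r["end"],
                "score": float(r["score"]),
            }
        )
    return mentions


sentence = "Our party supports the interests of young people and working families."

# Hand-written stand-in for pipeline output; scores are placeholders.
mock_results = [
    {"entity_group": "socialgroup", "score": 0.99, "word": " young people", "start": 36, "end": 48},
    {"entity_group": "socialgroup", "score": 0.40, "word": " working families", "start": 53, "end": 69},
]

for m in extract_mentions(sentence, mock_results):
    print(m["span"], m["start"], m["end"])
```

In real use, `mock_results` would be replaced by `pipe(sentence)`. The default threshold of 0.5 is arbitrary and would need tuning against annotated data.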

	## Related Models

	This model is one of three models in the group appeal detection pipeline:

	| Model | Task |
	|---|---|
	| [`maxwlnd/roberta_group_mention_detector`](https://huggingface.co/maxwlnd/roberta_group_mention_detector) | Detect social group mentions (this model) |
	| [`maxwlnd/socialgroup_stance_classification_nli`](https://huggingface.co/maxwlnd/socialgroup_stance_classification_nli) | Classify stance toward a group as positive, negative, or neutral |
	| [`maxwlnd/cl_mention_embedding`](https://huggingface.co/maxwlnd/cl_mention_embedding) | Embed mentions for clustering into qualitative categories |

	## Conceptual Background

	The definitions of social group and social group appeal are inspired by [Lena Maria Huber and Alona O. Dolinsky](https://osf.io/preprints/osf/szaqw_v1) and [Will Horne, Alona O. Dolinsky and Lena Maria Huber](https://osf.io/preprints/osf/fp2h3_v3).

	A social group is a segment of society whose members share common sociodemographic traits that are ascriptive and/or acquired. A reference to a social group in text is a group mention. A group appeal is an intentional act by a political actor that associates them with a social group in a supportive or critical manner.

	## License

	MIT