---
language:
- en
license: mit
tags:
- roberta
- token-classification
- sequence-labeling
- political-science
- social-groups
- parliamentary-debates
base_model: FacebookAI/roberta-base
pipeline_tag: token-classification
library_name: transformers
metrics:
- seqeval
---
# RoBERTa Group Mention Detector
A RoBERTa-base token classification model fine-tuned to detect **social group mentions** in political text.
This model is part of the [`group-appeal-detector`](https://github.com/MaximilianWeiland/group_appeal_detector) package, which also provides stance classification and mention clustering.
## Model Details
- **Base model:** `roberta-base`
- **Task:** Sequence Labeling (BIO tagging)
- **Labels:** `B-socialgroup`, `I-socialgroup`, `O`
- **Training data:** 5,000 manually annotated sentences from UK House of Commons parliamentary debates (2010–2019), augmented with 25% synthetic paraphrases generated by a GPT-5-nano model
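
To illustrate how the three BIO labels encode mentions, the following minimal sketch groups a tag sequence into spans. The helper `bio_to_spans` and the hand-written tags are illustrative assumptions, not part of the model or the `group-appeal-detector` package, and the tags are not actual model output.

```python
from typing import List, Tuple

LABEL = "socialgroup"  # entity type used in the B-/I- tags listed above


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Group BIO tags into (start, end_exclusive, text) spans over token indices."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans = []
    start = None  # index of the first token of the currently open span
    for i, tag in enumerate(tags):
        begins = tag == f"B-{LABEL}"
        continues = tag == f"I-{LABEL}"
        # Close the open span when a new mention begins (B-) or on an O tag.
        if start is not None and not continues:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        # Open a span on B-, or on a stray I- with no open span (a lenient choice).
        if begins or (continues and start is None):
            start = i
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hand-written example tags (hypothetical, for illustration only)
tokens = "Our party supports the interests of young people and working families .".split()
tags = ["O"] * 6 + ["B-socialgroup", "I-socialgroup", "O",
                    "B-socialgroup", "I-socialgroup", "O"]
print(bio_to_spans(tokens, tags))
# [(6, 8, 'young people'), (9, 11, 'working families')]
```

Treating a stray `I-socialgroup` as the start of a new span is one possible convention; a stricter decoder could discard such tokens instead.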
## Performance
Cross-validated performance, evaluated with [seqeval](https://github.com/chakki-works/seqeval). Brackets give 95% confidence intervals based on the estimated standard error across folds:
| Metric (seqeval) | Score |
|-------------------|-------------|
| F1 | 0.82 [0.82, 0.83] |
| Precision | 0.80 [0.79, 0.81] |
| Recall | 0.84 [0.83, 0.85] |
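
An interval of this kind can be computed as the mean across folds plus or minus a critical value times the standard error. The sketch below is only an illustration under stated assumptions: it uses the normal critical value 1.96, and the fold scores are made up. The card does not state the number of folds or the exact critical value used.

```python
import math
import statistics


def fold_confidence_interval(scores, z=1.96):
    """Return (mean, lower, upper) for a list of per-fold scores.

    Assumes a normal critical value z; with few folds a t quantile
    would give a wider interval.
    """
    mean = statistics.mean(scores)
    # Standard error of the mean: sample standard deviation / sqrt(number of folds)
    se = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, mean - z * se, mean + z * se


# Hypothetical per-fold F1 scores, NOT the model's actual fold results
example_scores = [0.80, 0.82, 0.84]
mean, lower, upper = fold_confidence_interval(example_scores)
print(f"{mean:.2f} [{lower:.2f}, {upper:.2f}]")
```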
## Usage
### Via `group-appeal-detector` package (recommended)
```bash
pip install group-appeal-detector
```
```python
from group_appeal_detector import GroupAppealDetector

# Load the detector on CPU
detector = GroupAppealDetector(device="cpu")

sentence = "Our party supports the interests of young people and working families."
mentions = detector.detect_mentions(sentence)

# Print each mention's span with its start and end positions
for m in mentions:
    print(m["span"], m["start"], m["end"])
```
For batch processing:
```python
sentences = [
    "Farmers must earn more money.",
    "The government must do more to protect the women living in this country.",
]

# Process all sentences in batches and return the results as a DataFrame
results_df = detector.detect_mentions_batch(sentences, batch_size=8, as_df=True)
```
### Direct usage with Transformers
```python
from transformers import pipeline

# aggregation_strategy="simple" merges B-/I- tokens into one entity per mention
pipe = pipeline(
    "token-classification",
    model="maxwlnd/roberta_group_mention_detector",
    aggregation_strategy="simple",
)

sentence = "Our party supports the interests of young people and working families."
results = pipe(sentence)

# Print each detected mention with its label and confidence score
for r in results:
    print(r["word"], r["entity_group"], round(r["score"], 3))
```
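
With an aggregation strategy set, each pipeline result also carries `start` and `end` character offsets, which can be used to slice the exact surface text out of the input instead of relying on the reconstructed `word` field. The following sketch shows one way to do this. The helper `extract_mentions`, the `min_score` threshold, and the mocked results are assumptions for illustration. The mock only mirrors the shape of the pipeline output and is not actual model output.

```python
from typing import Dict, List


def extract_mentions(text: str, results: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep socialgroup entities above min_score and slice their text by offsets."""
    mentions = []
    for r in results:
        if r["entity_group"] != "socialgroup" or r["score"] < min_score:
            continue
        mentions.append({
            "span": text[r["start"]:r["end"]],  # exact substring of the input
            "start": r["start"],
            "end": r["end"],
            "score": float(r["score"]),
        })
    return mentions


sentence = "Our party supports the interests of young people and working families."


def _mock(phrase: str, score: float) -> Dict:
    """Build a hand-made result dict in the shape of the pipeline output."""
    start = sentence.index(phrase)
    return {"entity_group": "socialgroup", "score": score,
            "word": phrase, "start": start, "end": start + len(phrase)}


# Hypothetical scores, for illustration only
mock_results = [_mock("young people", 0.98), _mock("working families", 0.95)]

for m in extract_mentions(sentence, mock_results):
    print(m["span"], m["start"], m["end"])
```

In real use, `mock_results` would be replaced by `pipe(sentence)` from the snippet above.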
## Related Models
This is one of three models in the group appeal detection pipeline:
| Model | Task |
|---|---|
| [`maxwlnd/roberta_group_mention_detector`](https://huggingface.co/maxwlnd/roberta_group_mention_detector) | Detect social group mentions (this model) |
| [`maxwlnd/socialgroup_stance_classification_nli`](https://huggingface.co/maxwlnd/socialgroup_stance_classification_nli) | Classify stance toward a group as positive, negative, or neutral |
| [`maxwlnd/cl_mention_embedding`](https://huggingface.co/maxwlnd/cl_mention_embedding) | Embed mentions for clustering into qualitative categories |
## Conceptual Background
The definitions of social group and social group appeal are inspired by [Lena Maria Huber and Alona O. Dolinsky](https://osf.io/preprints/osf/szaqw_v1) and [Will Horne, Alona O. Dolinsky and Lena Maria Huber](https://osf.io/preprints/osf/fp2h3_v3).
A **social group** is a segment of society whose members share common sociodemographic traits that are ascriptive and/or acquired. A reference to a social group in text is a **group mention**. A **group appeal** is an intentional act by a political actor that associates them with a social group in a supportive or critical manner.
## License
MIT