---
language:
  - en
license: mit
tags:
  - roberta
  - token-classification
  - sequence-labeling
  - political-science
  - social-groups
  - parliamentary-debates
base_model: FacebookAI/roberta-base
pipeline_tag: token-classification
library_name: transformers
metrics:
  - seqeval
---

# RoBERTa Group Mention Detector

A RoBERTa-base token classification model fine-tuned to detect **social group mentions** in political text.

This model is part of the [`group-appeal-detector`](https://github.com/MaximilianWeiland/group_appeal_detector) package, which also provides stance classification and mention clustering.

## Model Details

- **Base model:** `FacebookAI/roberta-base`
- **Task:** Sequence Labeling (BIO tagging)
- **Labels:** `B-socialgroup`, `I-socialgroup`, `O`
- **Training data:** 5,000 manually annotated sentences from UK House of Commons parliamentary debates (2010–2019), augmented with 25% synthetic paraphrases generated by a GPT-5-nano model
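
To make the BIO scheme concrete, the following minimal sketch shows how a sequence of the three labels above can be decoded into mention spans. The helper `bio_to_spans` is hypothetical and not part of this model or the `group-appeal-detector` package, and the tags in the example are written by hand for illustration rather than produced by the model.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Convert BIO tags into (start, end) token index pairs, end exclusive."""
    spans = []
    start = None  # index where the currently open mention began, if any
    for i, tag in enumerate(tags):
        if tag == "B-socialgroup":
            # A B- tag always opens a new mention, closing any open one.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "I-socialgroup":
            # Lenient choice (an assumption): a stray I- tag opens a mention.
            if start is None:
                start = i
        else:
            # "O" closes any open mention.
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Hand-written example tags, not model output.
tokens = ["Our", "party", "supports", "young", "people", "and",
          "working", "families", "."]
tags = ["O", "O", "O", "B-socialgroup", "I-socialgroup", "O",
        "B-socialgroup", "I-socialgroup", "O"]

mentions = [" ".join(tokens[s:e]) for s, e in bio_to_spans(tags)]
print(mentions)  # ['young people', 'working families']
```

In practice the model predicts labels over subword tokens, so a real decoder also has to merge subwords back into words; the Transformers pipeline shown under Usage handles that step.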

## Performance

Scores are cross-validated and computed with [seqeval](https://github.com/chakki-works/seqeval); brackets show 95% confidence intervals based on the estimated standard error across folds:

| Metric (seqeval) | Score [95% CI]    |
|------------------|-------------------|
| F1               | 0.82 [0.82, 0.83] |
| Precision        | 0.80 [0.79, 0.81] |
| Recall           | 0.84 [0.83, 0.85] |
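
As a rough illustration of how an interval of this kind can be derived from per-fold scores, the sketch below computes the mean and a normal-approximation 95% interval from the standard error across folds. The exact procedure used for the table is not documented here, so the multiplier `z = 1.96`, the number of folds, and the fold scores themselves are assumptions made up for the example; they are not the model's actual results.

```python
import math
import statistics
from typing import List, Tuple


def fold_confidence_interval(
    scores: List[float], z: float = 1.96
) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using the standard error across folds."""
    mean = statistics.mean(scores)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, mean - z * se, mean + z * se


# Hypothetical per-fold F1 scores, invented for illustration only.
fold_f1 = [0.81, 0.82, 0.83, 0.82, 0.82]

mean, low, high = fold_confidence_interval(fold_f1)
print(f"{mean:.2f} [{low:.2f}, {high:.2f}]")
```

With only a handful of folds, a t-distribution multiplier would give a somewhat wider interval than the normal approximation used in this sketch.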

## Usage

### Via the `group-appeal-detector` package (recommended)

```bash
pip install group-appeal-detector
```

```python
from group_appeal_detector import GroupAppealDetector

detector = GroupAppealDetector(device="cpu")

sentence = "Our party supports the interests of young people and working families."
mentions = detector.detect_mentions(sentence)

for m in mentions:
    print(m["span"], m["start"], m["end"])
```

For batch processing:

```python
sentences = [
    "Farmers must earn more money.",
    "The government must do more to protect the women living in this country.",
]
results_df = detector.detect_mentions_batch(sentences, batch_size=8, as_df=True)
```

### Direct usage with Transformers

```python
from transformers import pipeline

pipe = pipeline(
    "token-classification",
    model="maxwlnd/roberta_group_mention_detector",
    # Merge consecutive B-/I- tokens into one result per mention.
    aggregation_strategy="simple",
)

sentence = "Our party supports the interests of young people and working families."
results = pipe(sentence)

for r in results:
    print(r["word"], r["entity_group"], round(r["score"], 3))
```
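
Each aggregated pipeline result also carries `start` and `end` character offsets into the input string. Because the `word` field is rebuilt from tokens and may not match the original text exactly (for example in whitespace), slicing the input by offsets is usually the safer way to recover a mention. The sketch below shows one way to do this; `extract_mentions` and its `min_score` parameter are hypothetical helpers, and `mock_results` is a hand-written stand-in for pipeline output with invented scores, so the snippet runs without downloading the model.

```python
from typing import Dict, List, Tuple


def extract_mentions(
    sentence: str, results: List[Dict], min_score: float = 0.0
) -> List[Tuple[str, int, int]]:
    """Slice mention text from the original sentence via character offsets."""
    mentions = []
    for r in results:
        if r["score"] < min_score:
            continue  # drop low-confidence predictions
        raw = sentence[r["start"]:r["end"]]
        text = raw.strip()
        if not text:
            continue
        # Shift the start offset past any leading whitespace that was trimmed.
        start = r["start"] + (len(raw) - len(raw.lstrip()))
        mentions.append((text, start, start + len(text)))
    return mentions


sentence = "Our party supports the interests of young people and working families."

# Hand-written stand-in for `pipe(sentence)`; scores are invented.
mock_results = [
    {"entity_group": "socialgroup", "score": 0.99,
     "word": " young people", "start": 36, "end": 48},
    {"entity_group": "socialgroup", "score": 0.90,
     "word": " working families", "start": 53, "end": 69},
]

for text, start, end in extract_mentions(sentence, mock_results):
    print(text, start, end)
```

With the real pipeline, `mock_results` would simply be replaced by `pipe(sentence)`.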

## Related Models

This model is one of three in the group appeal detection pipeline:

| Model | Task |
|---|---|
| [`maxwlnd/roberta_group_mention_detector`](https://huggingface.co/maxwlnd/roberta_group_mention_detector) | Detect social group mentions (this model) |
| [`maxwlnd/socialgroup_stance_classification_nli`](https://huggingface.co/maxwlnd/socialgroup_stance_classification_nli) | Classify stance toward a group as positive, negative, or neutral |
| [`maxwlnd/cl_mention_embedding`](https://huggingface.co/maxwlnd/cl_mention_embedding) | Embed mentions for clustering into qualitative categories |

## Conceptual Background

The definitions of social group and social group appeal are inspired by [Lena Maria Huber and Alona O. Dolinsky](https://osf.io/preprints/osf/szaqw_v1) and [Will Horne, Alona O. Dolinsky and Lena Maria Huber](https://osf.io/preprints/osf/fp2h3_v3).

A **social group** is a segment of society whose members share common sociodemographic traits that are ascriptive and/or acquired. A reference to a social group in text is a **group mention**. A **group appeal** is an intentional act by a political actor that associates that actor with a social group in a supportive or critical manner.

## License

MIT