---
language:
- en
license: mit
tags:
- roberta
- token-classification
- sequence-labeling
- political-science
- social-groups
- parliamentary-debates
base_model: FacebookAI/roberta-base
pipeline_tag: token-classification
library_name: transformers
metrics:
- seqeval
---

# RoBERTa Group Mention Detector

A RoBERTa-base token classification model fine-tuned to detect **social group mentions** in political text.

This model is part of the [`group-appeal-detector`](https://github.com/MaximilianWeiland/group_appeal_detector) package, which also provides stance classification and mention clustering.

## Model Details

- **Base model:** `roberta-base`
- **Task:** Sequence Labeling (BIO tagging)
- **Labels:** `B-socialgroup`, `I-socialgroup`, `O`
- **Training data:** 5,000 manually annotated sentences from UK House of Commons parliamentary debates (2010–2019), augmented with 25% synthetic paraphrases generated by a GPT-5-nano model

## Performance

Cross-validated performance evaluated with [seqeval](https://github.com/chakki-works/seqeval) (95% confidence intervals based on the estimated standard error across folds in brackets):

| Seqeval-Metric | Score |
|-------------------|-------------|
| F1 | 0.82 [0.82, 0.83] |
| Precision | 0.80 [0.79, 0.81] |
| Recall | 0.84 [0.83, 0.85] |

## Usage

### Via `group-appeal-detector` package (recommended)

```bash
pip install group-appeal-detector
```

```python
from group_appeal_detector import GroupAppealDetector

detector = GroupAppealDetector(device="cpu")

sentence = "Our party supports the interests of young people and working families."
mentions = detector.detect_mentions(sentence)

for m in mentions:
    print(m["span"], m["start"], m["end"])
```

For batch processing:

```python
sentences = [
    "Farmers must earn more money.",
    "The government must do more to protect the women living in this country.",
]

results_df = detector.detect_mentions_batch(sentences, batch_size=8, as_df=True)
```

### Direct usage with Transformers

```python
from transformers import pipeline

pipe = pipeline(
    "token-classification",
    model="maxwlnd/roberta_group_mention_detector",
    aggregation_strategy="simple",
)

sentence = "Our party supports the interests of young people and working families."
results = pipe(sentence)

for r in results:
    print(r["word"], r["entity_group"], round(r["score"], 3))
```

## Related Models

This model is one of three models in the group appeal detection pipeline:

| Model | Task |
|---|---|
| [`maxwlnd/roberta_group_mention_detector`](https://huggingface.co/maxwlnd/roberta_group_mention_detector) | Detect social group mentions (this model) |
| [`maxwlnd/socialgroup_stance_classification_nli`](https://huggingface.co/maxwlnd/socialgroup_stance_classification_nli) | Classify stance toward a group as positive, negative, or neutral |
| [`maxwlnd/cl_mention_embedding`](https://huggingface.co/maxwlnd/cl_mention_embedding) | Embed mentions for clustering into qualitative categories |

## Conceptual Background

The definitions of social group and social group appeal are inspired by [Lena Maria Huber and Alona O. Dolinsky](https://osf.io/preprints/osf/szaqw_v1) and [Will Horne, Alona O. Dolinsky and Lena Maria Huber](https://osf.io/preprints/osf/fp2h3_v3).

A **social group** is a segment of society whose members share common sociodemographic traits that are ascriptive and/or acquired. A reference to a social group in text is a **group mention**. A **group appeal** is an intentional act by a political actor that associates them with a social group in a supportive or critical manner.

## License

MIT
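
## Illustrative Sketch: Decoding BIO Tags

The label set listed under Model Details (`B-socialgroup`, `I-socialgroup`, `O`) follows the BIO scheme: `B-` opens a mention, `I-` continues it, and `O` marks tokens outside any mention. The sketch below shows one plausible way to group such tags into mention spans. It is not the decoding used by the `group-appeal-detector` package or by the Transformers pipeline; the helper `bio_to_spans`, the whitespace tokenization, and the hand-written tags are assumptions made purely for illustration and are not model output.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Group BIO tags into (start, end_exclusive, text) spans over token indices.

    Hypothetical helper for illustration only. A stray I- tag with no
    preceding B- tag is treated leniently as the start of a new span.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[int, int, str]] = []
    start = None  # index of the first token of the currently open span

    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # A new mention begins; close any span that is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag.startswith("I-"):
            # Continue the open span, or open one if none exists.
            if start is None:
                start = i
        else:
            # An O tag closes any open span.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None

    # Close a span that runs to the end of the sentence.
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hand-written example tags (an assumption, not predictions from the model).
tokens = "Our party supports the interests of young people and working families .".split()
tags = [
    "O", "O", "O", "O", "O", "O",
    "B-socialgroup", "I-socialgroup",
    "O",
    "B-socialgroup", "I-socialgroup",
    "O",
]

for start, end, text in bio_to_spans(tokens, tags):
    print(start, end, text)
```

In practice the model operates on subword tokens rather than whitespace-split words, so the mention-detection methods shown under Usage are the more convenient route to character-level spans.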