---
language:
- en
license: mit
tags:
- roberta
- token-classification
- sequence-labeling
- political-science
- social-groups
- parliamentary-debates
base_model: FacebookAI/roberta-base
pipeline_tag: token-classification
library_name: transformers
metrics:
- seqeval
---

# RoBERTa Group Mention Detector

A RoBERTa-base token classification model fine-tuned to detect **social group mentions** in political text.

This model is part of the [`group-appeal-detector`](https://github.com/MaximilianWeiland/group_appeal_detector) package, which also provides stance classification and mention clustering.

## Model Details

- **Base model:** `FacebookAI/roberta-base`
- **Task:** Sequence labeling (BIO tagging)
- **Labels:** `B-socialgroup`, `I-socialgroup`, `O`
- **Training data:** 5,000 manually annotated sentences from UK House of Commons parliamentary debates (2010–2019), augmented with 25% synthetic paraphrases generated by a GPT-5-nano model
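
To make the BIO scheme concrete, the sketch below decodes a tag sequence into mention spans. It is a minimal illustration, not the package's own decoding code: `bio_to_spans` is a hypothetical helper, the tags are written by hand rather than produced by the model, and whitespace tokens stand in for the subword tokens RoBERTa actually uses.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) token indices, end exclusive, for each mention."""
    spans = []
    start = None  # index where the currently open mention began
    for i, tag in enumerate(tags):
        if tag == "B-socialgroup":
            # A B- tag always opens a new mention and closes any open one.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "I-socialgroup":
            # Lenient choice (an assumption): a stray I- tag opens a mention.
            if start is None:
                start = i
        else:  # "O" closes any open mention
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Hand-written example tags, for illustration only.
tokens = ["Our", "party", "supports", "the", "interests", "of",
          "young", "people", "and", "working", "families", "."]
tags = ["O", "O", "O", "O", "O", "O",
        "B-socialgroup", "I-socialgroup", "O",
        "B-socialgroup", "I-socialgroup", "O"]

mentions = [" ".join(tokens[s:e]) for s, e in bio_to_spans(tags)]
print(mentions)  # ['young people', 'working families']
```

Two adjacent `B-socialgroup` tags yield two separate mentions, which is what distinguishes BIO from a plain inside/outside scheme.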

## Performance

Cross-validated performance, evaluated with [seqeval](https://github.com/chakki-works/seqeval). Brackets give 95% confidence intervals based on the estimated standard error across folds:

| Metric (seqeval) | Score [95% CI] |
|------------------|-------------------|
| F1 | 0.82 [0.82, 0.83] |
| Precision | 0.80 [0.79, 0.81] |
| Recall | 0.84 [0.83, 0.85] |
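
An interval of this kind can be derived from per-fold scores roughly as sketched below. This assumes a normal approximation (critical value 1.96) and the sample standard deviation across folds; the card states neither the number of folds nor the critical value, and the fold scores here are invented for illustration, not the model's actual results.

```python
import math
import statistics


def mean_with_ci(scores, z=1.96):
    """Mean and z-based confidence interval from per-fold scores."""
    mean = statistics.mean(scores)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, mean - z * se, mean + z * se


# Hypothetical per-fold F1 scores (NOT taken from the model's evaluation).
fold_f1 = [0.81, 0.82, 0.83, 0.82, 0.82]

mean, low, high = mean_with_ci(fold_f1)
print(f"{mean:.2f} [{low:.2f}, {high:.2f}]")
```

With few folds, a Student's t critical value would give a wider interval than 1.96; which one the authors used is not stated.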

## Usage

### Via `group-appeal-detector` package (recommended)

```bash
pip install group-appeal-detector
```

```python
from group_appeal_detector import GroupAppealDetector

# Load the detector on CPU
detector = GroupAppealDetector(device="cpu")

sentence = "Our party supports the interests of young people and working families."
mentions = detector.detect_mentions(sentence)

# Each mention carries the matched span and its start and end positions
for m in mentions:
    print(m["span"], m["start"], m["end"])
```

For batch processing:

```python
sentences = [
    "Farmers must earn more money.",
    "The government must do more to protect the women living in this country.",
]

results_df = detector.detect_mentions_batch(sentences, batch_size=8, as_df=True)
```

### Direct usage with Transformers

```python
from transformers import pipeline

# aggregation_strategy="simple" merges B-/I- tagged tokens into whole mentions
pipe = pipeline(
    "token-classification",
    model="maxwlnd/roberta_group_mention_detector",
    aggregation_strategy="simple",
)

sentence = "Our party supports the interests of young people and working families."
results = pipe(sentence)

for r in results:
    print(r["word"], r["entity_group"], round(r["score"], 3))
```
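
With an aggregation strategy set, each pipeline result is a dict that also carries `start` and `end` character offsets into the input. The sketch below uses them to slice clean spans from the original sentence and to drop low-confidence predictions. `extract_mentions` and the 0.5 threshold are illustrative assumptions, and the `results` list is a hand-written mock in the pipeline's output shape, not real model output.

```python
from typing import Dict, List


def extract_mentions(text: str, results: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep confident predictions and recover each span from the source text."""
    mentions = []
    for r in results:
        if r["score"] < min_score:
            continue  # skip low-confidence predictions
        mentions.append({
            # Slicing the text avoids whitespace artifacts that may appear in r["word"].
            "span": text[r["start"]:r["end"]],
            "start": r["start"],
            "end": r["end"],
            "score": float(r["score"]),
        })
    return mentions


sentence = "Our party supports the interests of young people and working families."


def mock_result(phrase: str, score: float) -> Dict:
    """Build a mock pipeline entry; scores are invented for illustration."""
    start = sentence.index(phrase)
    return {"entity_group": "socialgroup", "score": score,
            "word": " " + phrase, "start": start, "end": start + len(phrase)}


results = [mock_result("young people", 0.98), mock_result("working families", 0.42)]

for m in extract_mentions(sentence, results):
    print(m["span"], m["start"], m["end"], round(m["score"], 3))
```

The threshold trades recall for precision, so it is worth tuning on held-out data rather than relying on the value used here.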

## Related Models

This model is one of three models in the group appeal detection pipeline:

| Model | Task |
|---|---|
| [`maxwlnd/roberta_group_mention_detector`](https://huggingface.co/maxwlnd/roberta_group_mention_detector) | Detect social group mentions (this model) |
| [`maxwlnd/socialgroup_stance_classification_nli`](https://huggingface.co/maxwlnd/socialgroup_stance_classification_nli) | Classify stance toward a group as positive, negative, or neutral |
| [`maxwlnd/cl_mention_embedding`](https://huggingface.co/maxwlnd/cl_mention_embedding) | Embed mentions for clustering into qualitative categories |

## Conceptual Background

The definitions of social group and social group appeal are inspired by [Lena Maria Huber and Alona O. Dolinsky](https://osf.io/preprints/osf/szaqw_v1) and [Will Horne, Alona O. Dolinsky and Lena Maria Huber](https://osf.io/preprints/osf/fp2h3_v3).

A **social group** is a segment of society whose members share common sociodemographic traits that are ascriptive and/or acquired. A reference to a social group in text is a **group mention**. A **group appeal** is an intentional act by a political actor that associates them with a social group in a supportive or critical manner.

## License

MIT