---
library_name: transformers
tags:
- group-detection
- political-science
- multilingual
- multilabel-classification
- deberta
- group-appeals
language:
- en
- de
- nl
- da
- fr
- es
- it
- sv
base_model: microsoft/mdeberta-v3-base
---

# Model Card for mDeBERTa Group Detection

A multilingual classification model fine-tuned to classify social group tokens in political text into meaningful social group categories.

## Model Details

### Model Description

This model is a fine-tuned mDeBERTa-v3-base that performs multilabel classification, assigning social group tokens mentioned in political text to meaningful social group categories. A token can receive multiple category labels simultaneously, which supports intersectional group mentions. The model was trained on political manifesto data.

- **Developed by:** Will Horne, Alona O. Dolinsky and Lena Maria Huber
- **Model type:** Multilabel Sequence Classification
- **Language(s) (NLP):** English and German (training data); evaluated on translated text in Dutch, Danish, French, Spanish, Italian, and Swedish
- **Finetuned from model:** microsoft/mdeberta-v3-base

### Model Sources

- **Repository:** rwillh11/mdeberta_groups_2.0
- **Base Model:** [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base)

## Uses

### Direct Use

The model is designed for researchers analyzing political discourse to automatically classify **social group tokens or phrases** into meaningful social group categories. It takes individual group mentions (e.g., "workers", "students", "citizens") as input and outputs predictions for 44 different group categories:

- Adults, Caregivers, Children, Citizens, Civil servants, Consumers
- Crime victims, Criminals, Education professionals, Elderly people
- Employees and workers, Employers and business owners, Ethnic and national communities
- Families, Farmers, Health professionals, Homeless people, Homeowners and landowners
- Investors and stakeholders, Landlords, Law enforcement personnel, LGBTQI
- Lower class, Manual and service workers, Men, Middle class, Migrants and refugees
- Military personnel, Patients, People with disabilities, Politicians, Religious communities
- Road users, Rural communities, Sociocultural professionals, Students, Taxpayers
- Tenants, Unemployed, Upper class, White collar workers, Women, Young people
- and a residual category of "Other"

### Downstream Use

This model can be integrated into larger political text analysis pipelines for:
- **Step 2 of group analysis**: After extracting group mentions from text, classify them into meaningful categories
- Political manifestos analysis and group categorization
- Comparative political research across countries and languages
- Social group representation studies with consistent categorization

### Out-of-Scope Use

This model should not be used for:
- **Detecting group mentions within full text** (this model classifies pre-identified group tokens)
- General entity recognition or named entity recognition tasks
- Processing full sentences or paragraphs directly
- Real-time social media monitoring without human oversight
- Making decisions about individuals or groups
- Content moderation without additional validation

## Bias, Risks, and Limitations

### Technical Limitations
- Trained specifically on political manifesto text; performance may vary on other text types
- Limited to 44 predefined group categories
- Multilabel predictions may have dependencies between group categories

### Bias Considerations
- Training data consists of political manifestos from specific countries and time periods
- May reflect biases present in political discourse of training data

### Recommendations

Users should be aware that this model:
- Is designed for research purposes in political science
- Should be validated on specific domains before deployment
- May require human oversight for sensitive applications
- Performance may vary across different types of groups and political contexts

## How to Get Started with the Model

### Recommended Usage (Pipeline)

```python
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
model_repo = "rwillh11/mdeberta_groups_2.0"
tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained(model_repo)

# Create pipeline for multilabel classification
classifier = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    top_k=None,  # return scores for all labels (replaces the deprecated return_all_scores=True)
    device=0     # GPU index; use device=-1 (or omit) to run on CPU
)

# Example usage - classify group tokens/phrases
group_tokens = ["students", "workers", "teachers", "citizens", "elderly people"]

# Get predictions
predictions = classifier(group_tokens)

# Process results with 0.5 threshold
for token, prediction in zip(group_tokens, predictions):
    predicted_labels = [label_score['label'] for label_score in prediction if label_score['score'] > 0.5]
    print(f"'{token}' → {predicted_labels}")
```

### Manual Implementation

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "rwillh11/mdeberta_groups_2.0"
tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example group tokens
group_tokens = ["workers", "citizens", "students"]

for token in group_tokens:
    # Tokenize
    inputs = tokenizer(token, return_tensors="pt", truncation=True, max_length=128)

    # Get predictions
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.sigmoid(outputs.logits)

    # Apply threshold (0.5) to get binary predictions
    binary_predictions = (predictions > 0.5).cpu().numpy()

    # Get predicted label indices
    predicted_indices = [i for i, pred in enumerate(binary_predictions[0]) if pred]
    print(f"'{token}' predicted categories: {predicted_indices}")
```

## Training Details

### Training Data

The model was trained on political manifesto data containing:
- **Languages:** English and German
- **Text Type:** Political manifesto sentences and group mentions
- **Labels:** Multiple social group categories (multilabel classification)
- **Source:** `final_group_train.csv`
- **Training Size:** 2,454 examples (80% split)
- **Validation Size:** 614 examples (20% split)
- **Data processing:** MultiLabelBinarizer for one-hot encoding of group labels

### Training Procedure

#### Preprocessing
- Texts tokenized using mDeBERTa tokenizer with max length 128
- Multilabel binarization using scikit-learn's MultiLabelBinarizer
- Each text can have multiple group labels simultaneously
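
The binarization step above can be sketched with scikit-learn's `MultiLabelBinarizer`; the label sets here are toy examples, not the actual training data:

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical label sets for three group tokens: each token
# may carry more than one category label.
labels = [
    ["Employees and workers"],
    ["Students", "Young people"],
    ["Elderly people"],
]

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)  # one row per example, one column per category

print(list(mlb.classes_))  # sorted category names
print(y.tolist())          # binary indicator matrix
```

The resulting matrix is the multilabel target for binary cross-entropy training: each column is an independent yes/no decision for one category.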

#### Training Hyperparameters (Optimal from Optuna)
- **Training regime:** Mixed precision training with gradient accumulation
- **Optimizer:** AdamW
- **Learning rate:** 1.9432557585419205e-05 (optimized via Optuna)
- **Weight decay:** 0.11740203810285466 (optimized via Optuna)
- **Warmup ratio:** 0.018423412349675528 (optimized via Optuna)
- **Epochs:** 30
- **Batch size:** 8 (train and eval)
- **Gradient accumulation steps:** 2
- **Trials:** 7 Optuna trials for hyperparameter optimization
- **Metric for selection:** F1 Score
- **Seed:** 42 (partially deterministic training; only the Transformers seed was set)
- **Pruning:** MedianPruner with 5 warmup steps

#### Training Infrastructure
- **Hardware:** CUDA-enabled GPU (Google Colab)
- **Framework:** Transformers, PyTorch
- **Hyperparameter optimization:** Optuna with MedianPruner
- **Trial pruning:** MedianPruner with 5 warmup steps (underperforming trials stopped early)

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
- 20% holdout from original dataset
- Multilingual political manifesto sentences with group annotations

#### Factors
The model was evaluated across:
- **Languages:** English and German text
- **Group categories:** 44 different social group types
- **Multilabel performance:** Ability to predict multiple groups per text

#### Metrics
Primary metrics used for evaluation:
- **F1 Score:** Primary optimization metric for multilabel classification
- **Accuracy:** Overall prediction accuracy
- **Precision:** Precision across all labels
- **Recall:** Recall across all labels
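
For the multilabel setting, micro-averaging pools true/false positives and false negatives across all labels before computing each score. A minimal sketch with scikit-learn (toy values, not the reported results):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Toy gold labels and predictions for 3 examples x 3 group categories
# (illustrative values only, not the model's actual outputs).
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]])

# Micro-averaging pools counts over all label columns, so frequent
# categories contribute proportionally more to the final score.
p = precision_score(y_true, y_pred, average="micro")  # 3 TP, 0 FP -> 1.0
r = recall_score(y_true, y_pred, average="micro")     # 3 TP, 2 FN -> 0.6
f1 = f1_score(y_true, y_pred, average="micro")        # 2pr/(p+r) -> 0.75
```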

### Results

**Best Model Performance (Trial 4, Epoch 27):**
- **Accuracy:** 0.9942
- **F1 Score:** 0.8537
- **Precision:** 0.8633
- **Recall:** 0.8443

The model demonstrates strong multilabel group classification performance, with consistent results across hyperparameter trials and stable convergence during training.

Additional validation on held-out sets returned the following micro-averaged metrics, excluding the residual category "Other":

**English**
- **Precision:** 0.894
- **Recall:** 0.868
- **F1 Micro:** 0.881

**German (using texts translated from English)**
- **Precision:** 0.853
- **Recall:** 0.823
- **F1 Micro:** 0.838

**Dutch (using texts translated from English)**
- **Precision:** 0.833
- **Recall:** 0.789
- **F1 Micro:** 0.817

**Danish (using texts translated from English)**
- **Precision:** 0.845
- **Recall:** 0.789
- **F1 Micro:** 0.816

**Spanish (using texts translated from English)**
- **Precision:** 0.838
- **Recall:** 0.792
- **F1 Micro:** 0.815

**French (using texts translated from English)**
- **Precision:** 0.841
- **Recall:** 0.802
- **F1 Micro:** 0.821

**Italian (using texts translated from English)**
- **Precision:** 0.837
- **Recall:** 0.788
- **F1 Micro:** 0.811

**Swedish (using texts translated from English)**
- **Precision:** 0.837
- **Recall:** 0.774
- **F1 Micro:** 0.804

## Model Examination

The model uses a standard multilabel classification approach:
- Sigmoid activation for independent probability prediction per group
- Binary cross-entropy loss for multilabel training
- Threshold of 0.5 for binary predictions
- Supports detection of multiple groups simultaneously in a single text
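
The head described above can be sketched in plain PyTorch; the logits here are hypothetical, and three groups stand in for the model's 44:

```python
import torch

# Minimal sketch of the multilabel head (assumed setup): raw logits ->
# sigmoid probabilities -> 0.5 threshold, trained with BCE-with-logits
# against independent per-group binary targets.
logits = torch.tensor([[2.0, -1.0, 0.3]])   # one example, three groups
targets = torch.tensor([[1.0, 0.0, 1.0]])   # gold multilabel vector

loss = torch.nn.BCEWithLogitsLoss()(logits, targets)

probs = torch.sigmoid(logits)  # independent probability per group
preds = (probs > 0.5)          # binary multilabel decision

print(preds.tolist())  # [[True, False, True]]
```

Because each group gets its own sigmoid, predictions are independent across categories, which is what allows a single token to belong to several groups at once.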

## Environmental Impact

Training involved hyperparameter optimization with 7 trials, each training for 30 epochs.

- **Hardware Type:** CUDA-enabled GPU (Google Colab)
- **Hours used:** Approximately 27 GPU-hours total (6 completed trials at roughly 4.5 hours each)
- **Cloud Provider:** Google Colab
- **Compute Region:** Variable
- **Carbon Emitted:** Not precisely measured
- **Training Date:** February 24, 2025

## Technical Specifications

### Model Architecture and Objective
- **Base Architecture:** mDeBERTa-v3-base (278M parameters)
- **Task:** Multilabel sequence classification for group detection
- **Input:** Political text (max length 128 tokens)
- **Output:** Multi-dimensional binary vector for group presence
- **Objective:** Binary cross-entropy loss with F1 score optimization
- **Activation:** Sigmoid for independent probability prediction per group
- **Threshold:** 0.5 for binary predictions

### Compute Infrastructure

#### Hardware
- GPU-accelerated training (CUDA)
- Mixed precision training support

#### Software
- Transformers library
- PyTorch framework
- Optuna for hyperparameter optimization
- scikit-learn for metrics and multilabel encoding

## Citation

If you use this model in your research, please cite:

**BibTeX:**
```bibtex
@misc{mdeberta_groups_detection,
  title={mDeBERTa Group Detection Model for Political Text Analysis},
  author={Will Horne and Alona O. Dolinsky and Lena Maria Huber},
  year={2024},
  note={Multilingual model for detecting social groups in political discourse}
}
```

## Model Card Authors

Research team studying group appeals in political discourse.

## Model Card Contact

For questions about this model, please contact the research team through appropriate academic channels.