---
license: cc-by-4.0
---

# GuiltRoBERTa-en: A Two-Stage Classifier for Guilt-Assignment Rhetoric in English Political Texts

**GuiltRoBERTa-en** is a two-stage pipeline for detecting guilt-assignment rhetoric in English political discourse. It combines:

1. **Stage 1 – Emotion Pre-Filtering:** emotion labels from the [Babel Emotions 6 Tool](https://emotionsbabel.poltextlab.com/)
2. **Stage 2 – Guilt Classification:** a fine-tuned binary XLM-RoBERTa model trained on manually annotated English texts (`guilt` vs `no_guilt`)

The approach is grounded in political communication theory, which suggests that **guilt attribution often emerges in anger-laden contexts**. Thus, only texts labeled **"Anger"** in Stage 1 are passed to the guilt classifier.

---

## 🧩 Model Architecture

### Stage 1: Emotion Pre-Filtering (Babel Emotions Tool)

* **Tool:** [Emotions 6 Babel Machine](https://emotionsbabel.poltextlab.com/)
* **Task:** 6-class emotion classification (`Anger`, `Fear`, `Disgust`, `Sadness`, `Joy`, `None of them`)
* **Input:** CSV file with one text per row
* **Output:** CSV file with predicted labels and probabilities
* **Usage:** retain only rows with `emotion_predicted == "Anger"` for Stage 2

**The Babel Emotions Tool is not an API but a web-based interface.** Upload a CSV file, download the labeled results, and use them as input to the guilt classifier.

### Stage 2: Guilt Classification

* **Base model:** `xlm-roberta-base`
* **Task:** Binary classification (`guilt`, `no_guilt`)
* **Training data:** Sentence-level annotated English corpus
* **Optimization:** Class-weighted loss function to handle label imbalance
* **Recommended threshold:** τ = **0.15**

---

## Motivation

**Guilt assignment** — attributing moral responsibility or blame — is a key rhetorical strategy in political communication. Since guilt often appears alongside anger, direct one-stage classification risks conflating emotional tones. This two-stage pipeline improves precision by:

* Filtering anger-related contexts first
* Then applying a dedicated guilt detector only where relevant

---

## Evaluation

The model was evaluated on a held-out validation set (20% stratified split) with the following approach:

| Stage 1 Filter | Threshold (τ) | Precision | Recall | F1 | Accuracy |
|----------------|---------------|-----------|--------|-----|----------|
| Anger-only | 0.15 | optimized | optimized | optimized | optimized |

* **Best configuration:** Anger-only, τ = 0.15
* **Metrics:** Accuracy, Precision, Recall, F1-score, ROC-AUC, PR-AUC
* The two-stage model shows improved performance compared to single-stage baselines

---

## Usage Example

### Step 1: Get Emotion Predictions from Babel

1. Visit [https://emotionsbabel.poltextlab.com/](https://emotionsbabel.poltextlab.com/)
2. Upload your CSV file (one text per row)
3. Download the predictions (these include an `emotion_predicted` column); a quick sanity check on this output is sketched below
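Before applying the guilt classifier, it can help to check how much of your corpus the anger filter will retain. A minimal sketch, assuming the downloaded file is a CSV with the `emotion_predicted` column used in Step 2 (the file name is a placeholder):

```python
import pandas as pd

# Babel output downloaded in Step 1 (placeholder file name)
babel_df = pd.read_csv("your_data_with_emotion_predictions.csv")

# Distribution of predicted emotions and the share that will reach Stage 2
print(babel_df["emotion_predicted"].value_counts())
print(f"Anger share: {(babel_df['emotion_predicted'] == 'Anger').mean():.2%}")
```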
### Step 2: Apply Guilt Classifier

```python
import pandas as pd
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline

# Load Babel emotion predictions (CSV downloaded in Step 1)
df = pd.read_csv("your_data_with_emotion_predictions.csv")

# Keep only rows the emotion model labeled as 'Anger'
anger_df = df[df["emotion_predicted"] == "Anger"].copy()

# Load the guilt classifier
repo_id = "your-org/guiltroberta-en"  # Update with the actual repository path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)

# Apply guilt predictions with the recommended threshold
THRESHOLD = 0.15
anger_df["guilt_score"] = anger_df["text"].apply(
    lambda t: pipe(t)[0][1]["score"]  # score at index 1 corresponds to the 'guilt' label
)
anger_df["guilt_predicted"] = anger_df["guilt_score"] > THRESHOLD

# Save results
anger_df.to_csv("anger_with_guilt_predictions.csv", index=False)

# Statistics
print(f"Total anger sentences: {len(anger_df)}")
print(f"Predicted guilt: {anger_df['guilt_predicted'].sum()}")
print(f"Guilt ratio: {anger_df['guilt_predicted'].mean():.2%}")
```

### Alternative: Direct Inference

```python
import torch
from transformers import XLMRobertaTokenizer, XLMRobertaForSequenceClassification

# Load model
model_path = "your-org/guiltroberta-en"
tokenizer = XLMRobertaTokenizer.from_pretrained(model_path)
model = XLMRobertaForSequenceClassification.from_pretrained(model_path)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Example: anger-labeled sentence
text = "I'm furious at myself for letting this happen again."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512, padding=True)
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    prob_guilt = torch.softmax(logits, dim=-1)[0][1].item()

# Apply threshold
THRESHOLD = 0.15
prediction = "guilt" if prob_guilt > THRESHOLD else "no_guilt"

print(f"Guilt probability: {prob_guilt:.4f}")
print(f"Prediction: {prediction}")
```

---

## Training Configuration

```
Epochs: 4
Learning Rate: 2e-5
Batch Size: 8
Max Sequence Length: 512 tokens
Optimizer: AdamW
Scheduler: Linear warmup
Train/Validation Split: 80/20 (stratified)
Class Weighting: Applied to handle label imbalance
```
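For reference, a minimal sketch of how the class-weighted loss listed above can be combined with the Hugging Face `Trainer`. This is not the released training script: the dataset objects (`train_ds`, `val_ds`), the warmup ratio, and the class-weight values are placeholders.

```python
import torch
from torch import nn
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments


class WeightedLossTrainer(Trainer):
    """Trainer that applies class weights in the cross-entropy loss."""

    def __init__(self, class_weights, **kwargs):
        super().__init__(**kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        loss_fct = nn.CrossEntropyLoss(weight=self.class_weights.to(logits.device))
        loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
        return (loss, outputs) if return_outputs else loss


model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=2)

args = TrainingArguments(
    output_dir="guiltroberta-en",
    num_train_epochs=4,               # as in the configuration above
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    lr_scheduler_type="linear",       # linear schedule with warmup
    warmup_ratio=0.1,                 # assumed warmup share (not stated in this card)
)

# train_ds / val_ds: pre-tokenized datasets with input_ids, attention_mask and labels
# (placeholders); the class weights below are illustrative, e.g. inverse label frequencies.
trainer = WeightedLossTrainer(
    class_weights=torch.tensor([1.0, 4.0]),
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
)
trainer.train()
```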