---
license: cc-by-4.0
---

# GuiltRoBERTa-en: A Two-Stage Classifier for Guilt-Assignment Rhetoric in English Political Texts

**GuiltRoBERTa-en** is a two-stage pipeline for detecting guilt-assignment rhetoric in English political discourse. It combines:

1. **Stage 1 – Emotion Pre-Filtering:** emotion labels from the [Babel Emotions 6 Tool](https://emotionsbabel.poltextlab.com/)
2. **Stage 2 – Guilt Classification:** a fine-tuned binary XLM-RoBERTa model trained on manually annotated English texts (`guilt` vs `no_guilt`)

The approach is grounded in political communication theory, which suggests that **guilt attribution often emerges in anger-laden contexts**. Thus, only texts labeled **"Anger"** in Stage 1 are passed to the guilt classifier.

---

## 🧩 Model Architecture

### Stage 1: Emotion Pre-Filtering (Babel Emotions Tool)

* **Tool:** [Emotions 6 Babel Machine](https://emotionsbabel.poltextlab.com/)
* **Task:** 6-class emotion classification (`Anger`, `Fear`, `Disgust`, `Sadness`, `Joy`, `None of them`)
* **Input:** CSV file with one text per row
* **Output:** CSV file with predicted labels and probabilities
* **Usage:** retain only rows with `emotion_predicted == "Anger"` for Stage 2

**The Babel Emotions Tool is not an API but a web-based interface.** Upload a CSV file, download the labeled results, and use them as input to the guilt classifier.

### Stage 2: Guilt Classification

* **Base model:** `xlm-roberta-base`
* **Task:** Binary classification (`guilt`, `no_guilt`)
* **Training data:** Sentence-level annotated English corpus
* **Optimization:** Class-weighted loss function to handle label imbalance
* **Recommended threshold:** τ = **0.15**

---

## Motivation

**Guilt assignment** — attributing moral responsibility or blame — is a key rhetorical strategy in political communication. Since guilt often appears alongside anger, direct one-stage classification risks conflating emotional tones. This two-stage pipeline improves precision by:

* Filtering anger-related contexts first
* Then applying a dedicated guilt detector only where relevant

---

## Evaluation

The model was evaluated on a held-out validation set (20% stratified split) with the following approach:

| Stage 1 Filter | Threshold (τ) | Precision | Recall | F1 | Accuracy |
|----------------|---------------|-----------|--------|-----|----------|
| Anger-only | 0.15 | optimized | optimized | optimized | optimized |

* **Best configuration:** Anger-only, τ = 0.15
* **Metrics:** Accuracy, Precision, Recall, F1-score, ROC-AUC, PR-AUC
* The two-stage model shows improved performance compared to single-stage baselines

---

## Usage Example

### Step 1: Get Emotion Predictions from Babel

1. Visit [https://emotionsbabel.poltextlab.com/](https://emotionsbabel.poltextlab.com/)
2. Upload your CSV file (one text per row)
3. Download the predictions (these include an `emotion_predicted` column); a quick sanity check on this output is sketched below
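Before applying the guilt classifier, it can help to check how much of your corpus the anger filter will retain. A minimal sketch, assuming the downloaded file is a CSV with the `emotion_predicted` column used in Step 2 (the file name is a placeholder):

```python
import pandas as pd

# Babel output downloaded in Step 1 (placeholder file name)
babel_df = pd.read_csv("your_data_with_emotion_predictions.csv")

# Distribution of predicted emotions and the share that will reach Stage 2
print(babel_df["emotion_predicted"].value_counts())
print(f"Anger share: {(babel_df['emotion_predicted'] == 'Anger').mean():.2%}")
```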
### Step 2: Apply Guilt Classifier

```python
import pandas as pd
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline

# Load Babel emotion predictions (CSV downloaded in Step 1)
df = pd.read_csv("your_data_with_emotion_predictions.csv")

# Keep only rows the emotion model labeled as 'Anger'
anger_df = df[df["emotion_predicted"] == "Anger"].copy()

# Load the guilt classifier
repo_id = "your-org/guiltroberta-en"  # Update with the actual repository path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)

# Apply guilt predictions with the recommended threshold
THRESHOLD = 0.15
anger_df["guilt_score"] = anger_df["text"].apply(
    lambda t: pipe(t)[0][1]["score"]  # score at index 1 corresponds to the 'guilt' label
)
anger_df["guilt_predicted"] = anger_df["guilt_score"] > THRESHOLD

# Save results
anger_df.to_csv("anger_with_guilt_predictions.csv", index=False)

# Statistics
print(f"Total anger sentences: {len(anger_df)}")
print(f"Predicted guilt: {anger_df['guilt_predicted'].sum()}")
print(f"Guilt ratio: {anger_df['guilt_predicted'].mean():.2%}")
```

### Alternative: Direct Inference

```python
import torch
from transformers import XLMRobertaTokenizer, XLMRobertaForSequenceClassification

# Load model
model_path = "your-org/guiltroberta-en"
tokenizer = XLMRobertaTokenizer.from_pretrained(model_path)
model = XLMRobertaForSequenceClassification.from_pretrained(model_path)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Example: anger-labeled sentence
text = "I'm furious at myself for letting this happen again."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512, padding=True)
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    prob_guilt = torch.softmax(logits, dim=-1)[0][1].item()

# Apply threshold
THRESHOLD = 0.15
prediction = "guilt" if prob_guilt > THRESHOLD else "no_guilt"

print(f"Guilt probability: {prob_guilt:.4f}")
print(f"Prediction: {prediction}")
```

---

## Training Configuration

```
Epochs: 4
Learning Rate: 2e-5
Batch Size: 8
Max Sequence Length: 512 tokens
Optimizer: AdamW
Scheduler: Linear warmup
Train/Validation Split: 80/20 (stratified)
Class Weighting: Applied to handle label imbalance
```
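For reference, a minimal sketch of how the class-weighted loss listed above can be combined with the Hugging Face `Trainer`. This is not the released training script: the dataset objects (`train_ds`, `val_ds`), the warmup ratio, and the class-weight values are placeholders.

```python
import torch
from torch import nn
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments


class WeightedLossTrainer(Trainer):
    """Trainer that applies class weights in the cross-entropy loss."""

    def __init__(self, class_weights, **kwargs):
        super().__init__(**kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        loss_fct = nn.CrossEntropyLoss(weight=self.class_weights.to(logits.device))
        loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
        return (loss, outputs) if return_outputs else loss


model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=2)

args = TrainingArguments(
    output_dir="guiltroberta-en",
    num_train_epochs=4,               # as in the configuration above
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    lr_scheduler_type="linear",       # linear schedule with warmup
    warmup_ratio=0.1,                 # assumed warmup share (not stated in this card)
)

# train_ds / val_ds: pre-tokenized datasets with input_ids, attention_mask and labels
# (placeholders); the class weights below are illustrative, e.g. inverse label frequencies.
trainer = WeightedLossTrainer(
    class_weights=torch.tensor([1.0, 4.0]),
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
)
trainer.train()
```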