Argument Quality Ranking - RoBERTa v2
Fine-tuned RoBERTa-base for pairwise argument quality ranking. Given two arguments on the same debate topic, the model predicts which one is of higher quality.
Model Details
- Base model: roberta-base
- Task: Pairwise argument quality classification (A wins / B wins)
- Training data: IBM ArgQ corpus (3,587 pairs, 60 topics)
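- Input format (schematic; RoBERTa's own special tokens are <s> and </s>, which the tokenizer inserts automatically):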
[CLS] topic [SEP] arg_a [SEP] arg_b
Improvements over v1
- Topic prepended to input
- Pair-flipping data augmentation (doubles training data)
- Label smoothing (0.1)
- Layerwise learning rate decay (factor 0.9)
- Bottom 6 layers frozen
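The pair-flipping, label-smoothing, learning-rate-decay, and freezing items above can be illustrated with a small, framework-free sketch. It is not the author's training code: the tuple layout, the helper names, the base learning rate of 2e-5, the smoothing convention, and the choice to give the top layer the full rate are all assumptions made for illustration. The label convention (0 = A wins) follows the usage example in this card, and 12 is the number of encoder layers in roberta-base.

```python
from typing import List, Tuple

# Hypothetical record layout: (topic, arg_a, arg_b, label), label 0 = A wins, 1 = B wins.
Pair = Tuple[str, str, str, int]


def flip_pairs(pairs: List[Pair]) -> List[Pair]:
    """Return the original pairs plus a swapped copy of each, with the label inverted."""
    flipped = [(topic, b, a, 1 - label) for topic, a, b, label in pairs]
    return pairs + flipped


def smooth_label(label: int, eps: float = 0.1, num_classes: int = 2) -> List[float]:
    """Soft target under one common convention: (1 - eps) * one_hot + eps / num_classes."""
    return [
        (1.0 - eps) * (1.0 if k == label else 0.0) + eps / num_classes
        for k in range(num_classes)
    ]


def layerwise_lrs(base_lr: float, num_layers: int = 12,
                  decay: float = 0.9, frozen: int = 6) -> List[float]:
    """Learning rate per encoder layer, index 0 = bottom layer.

    The top layer gets base_lr and each layer below it is scaled by `decay`.
    A rate of 0.0 stands in for a frozen layer; real training code would
    typically set requires_grad = False on those parameters instead.
    """
    lrs = []
    for layer in range(num_layers):
        if layer < frozen:
            lrs.append(0.0)
        else:
            lrs.append(base_lr * decay ** (num_layers - 1 - layer))
    return lrs


if __name__ == "__main__":
    pairs = [("We should ban social media", "argument one", "argument two", 0)]
    print(len(flip_pairs(pairs)))      # twice as many examples as the input
    print(smooth_label(0))             # approximately [0.95, 0.05]
    print(layerwise_lrs(2e-5))         # six zeros, then rates rising to 2e-5
```

Under these assumptions, the lowest trainable layer trains at roughly 59% of the top layer's rate, since 0.9^5 ≈ 0.59.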
Performance
| Split | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| In-domain | 63.7% | 0.587 | 0.684 | 0.514 |
| Cross-topic | 62.8% | 0.594 | 0.688 | 0.522 |
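As a quick consistency check on the table, F1 is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The sketch below does only that arithmetic; which class is treated as positive is not stated in this card, and the helper name is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) copied from the performance table.
rows = {"In-domain": (0.684, 0.514), "Cross-topic": (0.688, 0.522)}

for split, (p, r) in rows.items():
    print(f"{split}: F1 = {f1_score(p, r):.3f}")
# In-domain: F1 = 0.587
# Cross-topic: F1 = 0.594
```

Both recomputed values match the reported F1 column to three decimals.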
Usage
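from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("SambhavSBU/argument-quality-roberta-v2")
model = AutoModelForSequenceClassification.from_pretrained("SambhavSBU/argument-quality-roberta-v2")

topic = "We should ban social media"
arg_a = "Social media spreads misinformation at an unprecedented scale."
arg_b = "Social media connects people across the world."

# Encode as a sentence pair: "topic [SEP] arg_a" is the first segment, arg_b the second.
# The " [SEP] " here is a literal string; the tokenizer adds RoBERTa's special tokens itself.
inputs = tokenizer(
    topic + " [SEP] " + arg_a, arg_b,
    return_tensors="pt", truncation=True, max_length=256
)

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    logits = model(**inputs).logits

# Class index 0 means A wins; index 1 means B wins.
winner = "A" if logits.argmax(dim=-1).item() == 0 else "B"
print(f"Higher quality argument: {winner}")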
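Because the model is trained on both orderings of each pair, one optional way to use it is to score the pair in both orders and average the result, which also yields a probability rather than a bare label. The following is a hedged sketch of that post-processing only, written in plain Python on raw logit values; the function names are hypothetical, and nothing in this card states that the reported numbers were obtained this way.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_orderings(logits_ab: Sequence[float],
                      logits_ba: Sequence[float]) -> List[float]:
    """Average P(A wins) across the (A, B) pass and the flipped (B, A) pass.

    Each logit pair is [first argument wins, second argument wins], so in
    the flipped pass "A wins" corresponds to index 1.
    """
    p_ab = softmax(logits_ab)
    p_ba = softmax(logits_ba)
    p_a = 0.5 * (p_ab[0] + p_ba[1])
    return [p_a, 1.0 - p_a]


if __name__ == "__main__":
    # Made-up logits for illustration; in practice these would come from
    # two model calls, e.g. model(**inputs).logits[0].tolist() per ordering.
    probs = combine_orderings([1.2, -0.3], [-0.8, 0.9])
    winner = "A" if probs[0] >= 0.5 else "B"
    print(winner, probs)
```

If the two passes disagree strongly, the averaged probability moves toward 0.5, which can serve as a rough signal that the prediction depends on argument order.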
Citation
Code and full experiments: https://github.com/Sambhav101/Argument-Quality-Ranking