Argument Quality Ranking - RoBERTa v2

Fine-tuned RoBERTa-base for pairwise argument quality ranking. Given two arguments on the same debate topic, the model predicts which is higher quality.

Model Details

  • Base model: roberta-base
  • Task: Pairwise argument quality classification (A wins / B wins)
  • Training data: IBM ArgQ corpus (3,587 pairs, 60 topics)
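  • Input format: [CLS] topic [SEP] arg_a [SEP] arg_b (schematic; RoBERTa's tokenizer uses <s> and </s> as its start and separator tokens rather than [CLS] and [SEP])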

Improvements over v1

  • Topic prepended to input
  • Pair-flipping data augmentation (doubles training data)
  • Label smoothing (0.1)
  • Layerwise learning rate decay (factor 0.9)
  • Bottom 6 layers frozen
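
Two of the items above, pair flipping and layerwise learning rate decay, can be illustrated with a short sketch. This is not the training code from the repository. It assumes each example is a (topic, arg_a, arg_b, label) tuple with label 0 when A wins and 1 when B wins, and that the 12 encoder layers of roberta-base are indexed from 0 (bottom) to 11 (top). The function names and the base learning rate of 2e-5 are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed example layout: (topic, arg_a, arg_b, label), label 0 = A wins, 1 = B wins.
Pair = Tuple[str, str, str, int]


def flip_pairs(pairs: List[Pair]) -> List[Pair]:
    """Return the originals plus a swapped copy of each pair with the label inverted."""
    flipped = [(topic, b, a, 1 - label) for topic, a, b, label in pairs]
    return pairs + flipped


def layerwise_lrs(base_lr: float, decay: float = 0.9,
                  num_layers: int = 12, num_frozen: int = 6) -> Dict[int, float]:
    """Learning rate per trainable encoder layer; frozen bottom layers are omitted."""
    top = num_layers - 1
    return {layer: base_lr * decay ** (top - layer)
            for layer in range(num_frozen, num_layers)}


pairs = [("We should ban social media", "arg one", "arg two", 0)]
augmented = flip_pairs(pairs)
print(len(pairs), "->", len(augmented))  # the augmented set is twice as large

lrs = layerwise_lrs(base_lr=2e-5)  # 2e-5 is a placeholder, not a reported value
for layer, lr in sorted(lrs.items(), reverse=True):
    print(f"layer {layer}: lr = {lr:.2e}")
```

Under these assumptions the top layer trains at the base rate and layer 6 at the base rate times 0.9^5, roughly 59% of it, while layers 0 to 5 receive no updates.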

Performance

Split        Accuracy  F1     Precision  Recall
In-domain    63.7%     0.587  0.684      0.514
Cross-topic  62.8%     0.594  0.688      0.522
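
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the F1 column can be recomputed from the other two. The helper name below is hypothetical; the numbers are copied from the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for each split, taken from the table above.
reported = {
    "In-domain": (0.684, 0.514, 0.587),
    "Cross-topic": (0.688, 0.522, 0.594),
}

for split, (p, r, f1) in reported.items():
    print(f"{split}: computed F1 = {f1_score(p, r):.3f}, reported = {f1:.3f}")
```

Both recomputed values agree with the reported ones to three decimal places.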

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("SambhavSBU/argument-quality-roberta-v2")
model = AutoModelForSequenceClassification.from_pretrained("SambhavSBU/argument-quality-roberta-v2")

topic = "We should ban social media"
arg_a = "Social media spreads misinformation at an unprecedented scale."
arg_b = "Social media connects people across the world."

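# Encode as a sentence pair: topic and arg_a form the first segment, arg_b the second.
# The " [SEP] " between topic and arg_a is a plain string joined before tokenization;
# the tokenizer inserts its own separator tokens between the two segments.
inputs = tokenizer(
    topic + " [SEP] " + arg_a, arg_b,
    return_tensors="pt", truncation=True, max_length=256
)
with torch.no_grad():  # inference only, no gradients needed
    logits = model(**inputs).logits
# Class 0 means argument A wins; class 1 means argument B wins.
winner = "A" if logits.argmax(dim=-1).item() == 0 else "B"
print(f"Higher quality argument: {winner}")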
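
Because the snippet above returns only a hard label, it can be useful to turn the two logits into a probability and, optionally, to average over both argument orders so that position does not influence the result. The following is a minimal sketch of that idea, not something the model card or repository prescribes. `score` is a hypothetical callable that wraps the tokenizer and model call shown above and returns the two logits as plain floats; the stub used here only stands in for it.

```python
import math
from typing import Callable, List

# Hypothetical signature: score(topic, first_arg, second_arg) -> [logit_first_wins, logit_second_wins]
ScoreFn = Callable[[str, str, str], List[float]]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prob_a_wins(score: ScoreFn, topic: str, arg_a: str, arg_b: str) -> float:
    """Average P(arg_a wins) over the (A, B) and (B, A) orderings."""
    p_forward = softmax(score(topic, arg_a, arg_b))[0]  # arg_a is in first position
    p_reverse = softmax(score(topic, arg_b, arg_a))[1]  # arg_a is in second position
    return 0.5 * (p_forward + p_reverse)


def stub_score(topic: str, first: str, second: str) -> List[float]:
    """Stand-in for the real model: always favours whichever argument comes first."""
    return [1.0, 0.0]


p = prob_a_wins(stub_score, "We should ban social media", "arg one", "arg two")
print(f"P(A wins) = {p:.2f}")
```

A scorer that always prefers the first position, like the stub, averages out to 0.5, which is the behaviour one would want from an order-insensitive comparison.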

Citation

Code and full experiments: https://github.com/Sambhav101/Argument-Quality-Ranking
