RADAR Detector (RoBERTa-large)

Adversarially trained AI-generated text detector based on the RADAR framework (Hu et al., NeurIPS 2023), extended with a multi-evasion attack pool for robust detection.

Training

  • Base model: roberta-large
  • Dataset: RAID (Dugan et al., ACL 2024)
  • Evasion attacks seen during training: t5_paraphrase, synonym_replacement, homoglyphs, article_deletion, misspelling
  • Best macro AUROC: 0.6897
  • Generators: chatgpt, gpt2, gpt3, gpt4, cohere, cohere-chat, llama-chat, mistral, mistral-chat, mpt, mpt-chat
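As an illustration of one attack in the pool, homoglyph substitution swaps Latin characters for visually identical Unicode look-alikes, which defeats naive surface-level detectors. A minimal sketch — the character map and function name below are illustrative, not the actual training implementation:

```python
# Cyrillic characters that render identically to their Latin counterparts.
# (Illustrative subset; a real attack pool uses a much larger map.)
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "p": "\u0440"}

def homoglyph_attack(text: str, chars: str = "aeop") -> str:
    """Replace selected Latin characters with Cyrillic homoglyphs."""
    return "".join(HOMOGLYPHS.get(c, c) if c in chars else c for c in text)

attacked = homoglyph_attack("a canoe")
print(attacked)               # renders like "a canoe" but uses Cyrillic а/е/о
print(attacked == "a canoe")  # False: the byte sequences differ
```

Training on text transformed this way (alongside paraphrase, synonym, deletion, and misspelling attacks) is what makes the detector robust to these evasions.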

Usage

from transformers import RobertaTokenizer, RobertaForSequenceClassification
import torch

tokenizer = RobertaTokenizer.from_pretrained("Shushant/adal-roberta-detector")
model     = RobertaForSequenceClassification.from_pretrained("Shushant/adal-roberta-detector")
model.eval()

text = "Your text here."
enc  = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    probs = torch.softmax(model(**enc).logits, dim=-1)[0]
print(f"P(human)={probs[1]:.3f}  P(AI)={probs[0]:.3f}")
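The softmax call above turns the model's two raw logits into a probability pair that sums to one. A standalone illustration with dummy logits standing in for `model(**enc).logits`:

```python
import torch

# Dummy logits of shape (1, 2), standing in for model(**enc).logits.
# Index 0 = AI-generated, index 1 = human-written (see label mapping below).
logits = torch.tensor([[1.5, -0.5]])
probs = torch.softmax(logits, dim=-1)[0]
print(f"P(AI)={probs[0]:.3f}  P(human)={probs[1]:.3f}")  # P(AI)=0.881  P(human)=0.119
```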

Label mapping

  • Index 0 → AI-generated
  • Index 1 → Human-written
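A small helper (hypothetical, not part of the released model) that applies this mapping to the softmax output, with an illustrative 0.5 decision threshold:

```python
def classify(probs, threshold: float = 0.5) -> str:
    """Map softmax output [P(AI), P(human)] to a label string.

    The 0.5 threshold is illustrative; in practice it should be
    tuned on a validation set for the target false-positive rate.
    """
    return "AI-generated" if probs[0] >= threshold else "Human-written"

print(classify([0.82, 0.18]))  # AI-generated
print(classify([0.10, 0.90]))  # Human-written
```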