Paper: RADAR: Robust AI-Text Detection via Adversarial Learning (arXiv:2307.03838)
Adversarially trained AI-generated text detector based on the RADAR framework (Hu et al., NeurIPS 2023), extended with a multi-evasion attack pool for robust detection.
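roberta-large
from transformers import RobertaTokenizer, RobertaForSequenceClassification
import torch

# Load the detector and its tokenizer from the Hugging Face Hub
tokenizer = RobertaTokenizer.from_pretrained("Shushant/adal-roberta-detector")
model = RobertaForSequenceClassification.from_pretrained("Shushant/adal-roberta-detector")
model.eval()  # inference mode: disables dropout

text = "Your text here."
enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():  # gradients are not needed for inference
    # Softmax over the class logits; [0] selects the single input in the batch
    probs = torch.softmax(model(**enc).logits, dim=-1)[0]

# Index 1 is the human class and index 0 is the AI class
print(f"P(human)={probs[1]:.3f} P(AI)={probs[0]:.3f}")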
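The snippet above prints two probabilities but leaves the final decision to the caller. One way to turn the logits into a label is sketched below, in plain Python so it runs without the model. The helper names `softmax` and `decide`, the `ai_threshold` parameter, and its 0.5 default are hypothetical and not part of this model card; the index mapping (0 = AI, 1 = human) is taken from the print statement above.

```python
import math

AI_INDEX = 0     # assumption: index 0 = AI-generated, as in the print statement above
HUMAN_INDEX = 1  # assumption: index 1 = human-written


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, ai_threshold=0.5):
    """Return (label, p_ai) for one pair of class logits.

    ai_threshold is an illustrative default, not a value tuned for this model.
    """
    p_ai = softmax(logits)[AI_INDEX]
    label = "AI" if p_ai >= ai_threshold else "human"
    return label, p_ai


# Made-up logits for illustration only; not real model output
label, p_ai = decide([2.0, 0.0])
print(label, round(p_ai, 3))
```

With the real model, the logits for one input could be passed as `decide(model(**enc).logits[0].tolist())`. Raising `ai_threshold` flags less human-written text as AI but lets more AI-generated text through, so the value is best chosen on held-out data.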