Director-AI RAGTruth token detector (ModernBERT)

A token classifier that flags hallucinated spans in a RAG response. It reads [context] [SEP] [response] and labels each response token supported or hallucinated, then a response is flagged when it contains a hallucinated span. Response/claim-level NLI grounding cannot isolate the short "baseless addition" spans that dominate RAGTruth; a token classifier learns them directly.

Results (RAGTruth test, 2700 examples: 943 hallucinated / 1757 grounded)

approach	example F1	balanced acc	FPR	precision
NLI / claim-decompose	0.366	—	0.347	—
this model	0.763	0.814	0.071	0.841

Operating point: token probability >= 0.95, at least one hallucinated token. Trained on the RAGTruth train split (15090 examples); context truncated to 1024 tokens during training.

Usage

from transformers import AutoModelForTokenClassification, AutoTokenizer
tok = AutoTokenizer.from_pretrained("anulum/director-ragtruth-token-modernbert")
model = AutoModelForTokenClassification.from_pretrained(
    "anulum/director-ragtruth-token-modernbert")
# index 1 = hallucinated

Downloads last month: 19

Safetensors

Model size

0.1B params

Tensor type

F32

Model tree for anulum/director-ragtruth-token-modernbert

Base model

answerdotai/ModernBERT-base

Finetuned

(1342)

this model

anulum
/

director-ragtruth-token-modernbert

Director-AI RAGTruth token detector (ModernBERT)

Results (RAGTruth test, 2700 examples: 943 hallucinated / 1757 grounded)

Usage

Model tree for anulum/director-ragtruth-token-modernbert

Dataset used to train anulum/director-ragtruth-token-modernbert