# Hallucination Detector (DeBERTa-v3-base)
A fine-tuned DeBERTa-v3-base model for detecting hallucinations in LLM-generated text.
## Model Description
This model classifies whether an LLM-generated response is factual or hallucinated given a knowledge context. It was fine-tuned on the HaluEval benchmark.
- Base model: microsoft/deberta-v3-base (184M parameters)
- Task: Binary classification (Factual vs Hallucinated)
- Training data: HaluEval (21,000 samples across QA, Dialogue, Summarization)
## Performance
| Metric | Score |
|---|---|
| Accuracy | 0.9127 |
| Precision | 0.8819 |
| Recall | 0.9505 |
| F1 Score | 0.9149 |
| AUROC | 0.9771 |
### Performance by Task
| Task | F1 Score |
|---|---|
| QA | 0.97 |
| Summarization | 0.96 |
| Dialogue | 0.82 |
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("varunteja99/hallucination-detector-deberta")
model = AutoModelForSequenceClassification.from_pretrained("varunteja99/hallucination-detector-deberta")
model.eval()

# Prepare input in the Knowledge/Question/Answer format
text = """Knowledge: The Eiffel Tower is located in Paris, France.
Question: Where is the Eiffel Tower?
Answer: The Eiffel Tower is in London."""
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Predict
with torch.no_grad():
    outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=-1).item()
print("Hallucinated" if prediction == 1 else "Factual")
# Output: Hallucinated
```
## Input Format

The model expects input in the following format:

```
Knowledge: [relevant context/facts]
Question: [the query or prompt]
Answer: [the LLM-generated response to verify]
```
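The three fields can be assembled with a small helper before tokenization. This function is an illustrative convenience, not part of the model's API; the newline-separated layout matches the template above:

```python
def format_input(knowledge: str, question: str, answer: str) -> str:
    """Build the Knowledge/Question/Answer prompt the detector expects."""
    return f"Knowledge: {knowledge}\nQuestion: {question}\nAnswer: {answer}"

text = format_input(
    "The Eiffel Tower is located in Paris, France.",
    "Where is the Eiffel Tower?",
    "The Eiffel Tower is in London.",
)
print(text)
```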
## Labels
| ID | Label | Description |
|---|---|---|
| 0 | Factual | Response is supported by knowledge |
| 1 | Hallucinated | Response contradicts or is unsupported |
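Beyond the hard label from `argmax`, a softmax over the two logits yields a confidence score. The sketch below uses plain Python and made-up logit values for illustration (`torch.softmax` on `outputs.logits` does the same in practice):

```python
import math

# Raw logits for one example, in [Factual, Hallucinated] order;
# these values are made up for illustration.
logits = [-1.2, 2.3]

# Softmax turns logits into probabilities.
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

labels = ["Factual", "Hallucinated"]
pred = max(range(len(probs)), key=probs.__getitem__)
print(f"{labels[pred]} (p={probs[pred]:.3f})")
```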
## Training Details
- Epochs: 3
- Learning rate: 2e-5
- Batch size: 8
- Warmup steps: 788
- Precision: float32 (for training stability)
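These hyperparameters map onto Hugging Face `TrainingArguments` roughly as follows. This is a hypothetical reconstruction, not the authors' actual training script; `output_dir` is an illustrative assumption:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="hallucination-detector-deberta",  # illustrative path
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    warmup_steps=788,
    fp16=False,  # float32 for training stability
)
```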
## Citation

```bibtex
@misc{chundru2026hallucination,
  author    = {Chundru, Varun and Biswas, Debasmita},
  title     = {Domain-Specific Hallucination Detection in Large Language Models},
  year      = {2026},
  publisher = {GitHub},
  url       = {https://github.com/varunteja99/hallucination-detection-nlp}
}
```
## Acknowledgments
- Course: CS 593 NLP, Purdue University, Spring 2026
- Dataset: HaluEval (Li et al., EMNLP 2023)