Hallucination Classifier — DeBERTa-v3-base

A fine-tuned DeBERTa-v3-base model for detecting hallucinations in LLM outputs. Given a claim and optional context, it classifies the claim as grounded or hallucinated.

Model Details

  • Base model: microsoft/deberta-v3-base
  • Task: Binary sequence classification
  • Labels: grounded (0), hallucinated (1)
  • Training data: HaluEval + TruthfulQA (~10K balanced samples)
  • Author: Amritanshu Yadav (github.com/TechNxt05)

Performance (Test Set)

Metric     Score
Accuracy   0.9371
F1         0.9372
Precision  0.9365
Recall     0.9379
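
As a quick consistency check on the scores above, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The sketch below does only that arithmetic; the function and variable names are illustrative and are not part of the model's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported in the Performance section.
reported_precision = 0.9365
reported_recall = 0.9379

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 4))  # 0.9372, matching the reported F1
```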

Usage

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="amritanshu05/hallucination-classifier-deberta"
)

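# Input layout used here: "<claim> [SEP] <context or question>"
result = classifier("The Eiffel Tower is located in Berlin. [SEP] Where is the Eiffel Tower?")
print(result)
# Example output (the exact score may differ):
# [{'label': 'hallucinated', 'score': 0.97}]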
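
The example above joins the claim and the question with a literal [SEP] and prints the raw pipeline output. A small wrapper can make that input layout explicit and reduce the result to a boolean flag. This is a minimal sketch, not the author's code: `build_input`, `is_hallucinated`, `check_claim`, and `fake_classifier` are hypothetical names, the 0.5 threshold is an arbitrary assumption, and the input layout is inferred from the single example above. A stand-in classifier is used so the sketch runs without downloading the model.

```python
from typing import Callable, Optional

SEP = " [SEP] "  # separator copied from the usage example above

def build_input(claim: str, context: Optional[str] = None) -> str:
    """Join a claim and optional context as '<claim> [SEP] <context>'."""
    claim = claim.strip()
    if context is None or not context.strip():
        return claim
    return claim + SEP + context.strip()

def is_hallucinated(prediction: dict, threshold: float = 0.5) -> bool:
    """True if the label is 'hallucinated' with at least `threshold` confidence."""
    return prediction["label"] == "hallucinated" and prediction["score"] >= threshold

def check_claim(classifier: Callable, claim: str, context: Optional[str] = None,
                threshold: float = 0.5) -> bool:
    """Run any pipeline-like callable on one claim and return the flag."""
    predictions = classifier(build_input(claim, context))
    return is_hallucinated(predictions[0], threshold)

# Stand-in for the real pipeline (assumption: same list-of-dicts output shape
# as the example output above). Swap in the `classifier` object from the
# usage example to score real inputs.
def fake_classifier(text: str) -> list:
    return [{"label": "hallucinated", "score": 0.97}]

print(build_input("The Eiffel Tower is located in Berlin.", "Where is the Eiffel Tower?"))
print(check_claim(fake_classifier, "The Eiffel Tower is located in Berlin.",
                  "Where is the Eiffel Tower?"))
```

Passing the classifier in as an argument keeps the helpers testable offline. The threshold should be tuned on held-out data rather than left at the placeholder value.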

Integration

This model powers the hallucination detection layer in AuditAI — a production LLM observability platform.

Model size: 0.2B params (Safetensors, tensor type F32)