DeBERTa-v3-Small – Factuality / Misinformation Classifier

A lightweight DeBERTa-v3-Small model fine-tuned to classify statements as factual vs. non-factual, trained on TruthfulQA and FEVER.
Part of the Army of Safeguards research project.


Model Details

| Property | Value |
|---|---|
| Base model | microsoft/deberta-v3-small |
| Architecture | Encoder-only Transformer (≈ 86 M params) |
| Task | Binary text classification (0 = factual, 1 = non-factual) |
| Language | English |
| Fine-tuning framework | Hugging Face Transformers v4.44 |
| Trained by | Ajith Bondili |
| Hardware | NVIDIA T4 (Google Colab) |
| Epochs | 3 |
| Batch size | 16 |
| Learning rate | 2e-5 |
| Max sequence length | 256 tokens |
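
The hyperparameters above correspond to a Trainer configuration along these lines. This is a sketch, not the original training script; `output_dir` is an assumption, and the 256-token limit is applied at tokenization time rather than in `TrainingArguments`:

```python
from transformers import TrainingArguments

# Sketch of the fine-tuning configuration from the table above.
# output_dir is assumed; epochs, batch size, and learning rate come from the card.
args = TrainingArguments(
    output_dir="deberta-v3-factuality-small",  # assumption, not from the card
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)
# Max sequence length (256) is enforced via the tokenizer:
#   tok(text, truncation=True, max_length=256)
```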

Training Data

Merged and balanced from two open-source datasets:

  1. TruthfulQA (generation) – Q/A pairs labeled truthful vs false.
  2. FEVER v1.0 – Real-world claims labeled Supported, Refuted, or Not Enough Info (mapped to binary 0/1).

≈ 20 000 combined examples after cleaning.
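
The FEVER-to-binary mapping can be sketched as a small helper. Note the card does not specify how Not Enough Info was handled; grouping it with Refuted here is an illustrative assumption:

```python
# Sketch: map FEVER's three-way labels to this model's binary scheme.
FEVER_TO_BINARY = {
    "SUPPORTS": 0,         # factual
    "REFUTES": 1,          # non-factual
    "NOT ENOUGH INFO": 1,  # assumption: unverifiable -> non-factual
}

def to_binary(fever_label: str) -> int:
    """Convert a FEVER label string to the 0/1 scheme used by this model."""
    return FEVER_TO_BINARY[fever_label.upper()]

print(to_binary("SUPPORTS"))  # 0
print(to_binary("REFUTES"))   # 1
```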


Evaluation Results

| Metric | Base Model (M₀) | Fine-Tuned (M₁) | Δ |
|---|---|---|---|
| Accuracy | 0.52 | 0.80 | +0.28 |
| F1 Score | 0.00 | 0.79 | +0.79 |
| Eval Loss | 0.69 | 0.35 | −0.34 |

Confusion Matrix

| | Pred Factual | Pred Non-Factual |
|---|---|---|
| True Factual | 838 | 205 |
| True Non-Factual | 204 | 753 |
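
The reported accuracy and F1 can be re-derived from the confusion matrix, treating non-factual as the positive class:

```python
# Confusion-matrix counts from the table above.
tn, fp = 838, 205  # true factual:     predicted factual / non-factual
fn, tp = 204, 753  # true non-factual: predicted factual / non-factual

total = tn + fp + fn + tp            # 2000 eval examples
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Matches the reported 0.80 accuracy and 0.79 F1 after rounding.
print(f"accuracy={accuracy:.4f}, f1={f1:.4f}")
```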

Intended Use

Acts as a truth-checking critic for large-language-model outputs.

Input

Free-form English text (e.g., an LLM response or claim)

Output

{
  "label": "non-factual",
  "confidence": 0.81,
  "probs": { "factual": 0.19, "non-factual": 0.81 }
}
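
Given the two class probabilities, this payload can be assembled with a small helper. The function name is illustrative; the keys follow the model's binary labels (factual / non-factual):

```python
def format_output(p_factual: float, p_non_factual: float) -> dict:
    """Build the classifier's output payload from the two class probabilities."""
    label = "non-factual" if p_non_factual >= p_factual else "factual"
    return {
        "label": label,
        "confidence": round(max(p_factual, p_non_factual), 2),
        "probs": {
            "factual": round(p_factual, 2),
            "non-factual": round(p_non_factual, 2),
        },
    }

print(format_output(0.19, 0.81))
# {'label': 'non-factual', 'confidence': 0.81, 'probs': {'factual': 0.19, 'non-factual': 0.81}}
```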

Out of Scope

  • Non-English text
  • Numerical facts requiring external databases (e.g., live statistics or financial data)
  • Ethical or opinion-based classification tasks

Bias · Risks · Limitations

  • Trained only on English corpora; may mis-score culturally specific or multilingual statements.
  • Can misclassify sarcasm, humor, or figurative speech as “non-factual.”
  • Should be used as one critic in a multi-agent safeguard system, not as a standalone truth detector.

Usage Example

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

repo = "ajithbondili/deberta-v3-factuality-small"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)
model.eval()

id2label = {0: "factual", 1: "non-factual"}  # label scheme used by this model

text = "The Moon is made of cheese."
# Truncate to the 256-token limit the model was trained with.
inputs = tok(text, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():
    logits = model(**inputs).logits
probs = F.softmax(logits, dim=-1).squeeze(0)
pred = int(torch.argmax(probs))

print({
    "label": id2label[pred],
    "confidence": round(probs[pred].item(), 2),
    "probs": {id2label[i]: round(p.item(), 2) for i, p in enumerate(probs)},
})

Citation

@software{bondili_2025_factuality,
  author = {Ajith Bondili},
  title  = {DeBERTa-v3-Small Factuality / Misinformation Classifier},
  year   = {2025},
  url    = {https://huggingface.co/ajith-bondili/deberta-v3-factuality-small}
}
