license: mit
base_model: microsoft/deberta-v3-base
tags:
  - token-classification
  - hallucination-detection
  - tool-use
datasets:
  - drond0174/RAGTruth-Hallucinations

Hallucination detection artifacts (ToolACE / RAGTruth-style)

Checkpoints and test predictions for span-level hallucination detection in tool-augmented answers.

Contents

| Path | Description |
|---|---|
| deberta_contradiction_tuned/ | Tool-aware DeBERTa fine-tuned on mixed train (contradiction oversample ×3); best run |
| deberta_mixed/ | Earlier/alternate DeBERTa mixed checkpoint (no contradiction oversampling) |
| predictions/ | mixed_test span predictions (DeBERTa, LookBack, Lettuce) |
| lookback/lookback_mixed_classifier.joblib | Sklearn head for LookBackLens (TinyLlama features) |
| lookback/lookback_mixed_train_features.npz | Cached train attention features (~1.1 GB) |
| lookback/lookback_mixed_val_features.npz | Cached validation attention features (~164 MB) |

Dataset: drond0174/RAGTruth-Hallucinations

Load DeBERTa

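from transformers import AutoModelForTokenClassification, AutoTokenizer

# The checkpoint lives in a subfolder of the Hub repo. A Hub repo id may only
# be "namespace/name", so select the checkpoint with `subfolder` instead of
# appending it to the id. (With a local clone, pass the folder path directly.)
repo_id = "drond0174/hallucination_detection"
subfolder = "deberta_contradiction_tuned"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForTokenClassification.from_pretrained(repo_id, subfolder=subfolder)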

See deberta_contradiction_tuned/run_meta.json for threshold, best epoch, and validation F1.
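
The threshold stored in run_meta.json is presumably applied to per-token hallucination probabilities before tokens are merged into spans. The sketch below shows one plausible way to do that; it is not taken from this repository's code. It assumes you already have one probability per token (for example, a softmax over the model logits, taking whichever label index denotes "hallucinated") and the character offsets returned by the tokenizer with `return_offsets_mapping=True`. The function name `probs_to_spans` and the example values are hypothetical.

```python
from typing import List, Tuple


def probs_to_spans(
    offsets: List[Tuple[int, int]],
    probs: List[float],
    threshold: float,
) -> List[Tuple[int, int]]:
    """Merge runs of flagged tokens into (start, end) character spans.

    A token is flagged when its probability is >= threshold. Tokens with an
    empty offset (start == end), such as special tokens, are never flagged.
    """
    spans: List[Tuple[int, int]] = []
    run_start = None  # character start of the current flagged run, if any
    run_end = 0
    for (start, end), prob in zip(offsets, probs):
        flagged = end > start and prob >= threshold
        if flagged:
            if run_start is None:
                run_start = start
            run_end = end
        elif run_start is not None:
            # An unflagged token closes the current run.
            spans.append((run_start, run_end))
            run_start = None
    if run_start is not None:
        spans.append((run_start, run_end))
    return spans


# Hypothetical example: six tokens, the first and last being special tokens.
example_offsets = [(0, 0), (0, 3), (4, 9), (10, 12), (13, 20), (0, 0)]
example_probs = [0.90, 0.10, 0.80, 0.70, 0.20, 0.95]
example_spans = probs_to_spans(example_offsets, example_probs, threshold=0.5)
print(example_spans)  # [(4, 12)]
```

Keeping the merge step free of framework dependencies makes it easy to reuse on the saved files in predictions/ if they store per-token scores, though their exact format is not documented here.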

LookBack feature caches

Download lookback/*_features.npz to skip re-running TinyLlama feature extraction. Point train_cache_path / val_cache_path in lookback_baseline.py to the downloaded files.
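
Because the caches are large, it can help to sanity-check a download before pointing lookback_baseline.py at it. An .npz file is a zip archive holding one .npy member per array, so the standard library can list its contents without reading the arrays into memory. This is a minimal sketch: the helper name `list_npz_arrays` is hypothetical, and the array names stored inside these particular caches are not documented here.

```python
import zipfile
from typing import Dict


def list_npz_arrays(path: str) -> Dict[str, int]:
    """Return {array_name: uncompressed_size_in_bytes} for an .npz archive."""
    sizes: Dict[str, int] = {}
    with zipfile.ZipFile(path) as archive:
        for info in archive.infolist():
            name = info.filename
            # NumPy stores each array as "<name>.npy" inside the archive.
            if name.endswith(".npy"):
                name = name[: -len(".npy")]
            sizes[name] = info.file_size
    return sizes


# Usage (path is wherever you saved the download):
# for name, size in list_npz_arrays("lookback_mixed_val_features.npz").items():
#     print(f"{name}: {size / 1e6:.1f} MB")
```

If the listed sizes are far below the ~1.1 GB (train) or ~164 MB (validation) quoted above, the download was probably truncated.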