CERT Hallucination Detection Without Another LLM

by AI-that-works

CERT uses embedding geometry to detect hallucinations; no second model is
required for context-grounded responses. This Space benchmarks it against
Vectara HHEM-2.1-Open on the same examples so you can see where they agree,
where they disagree, and why disagreement is actually informative (Type III
hallucinations: factually wrong responses that occupy the geometrically
correct embedding region).
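As a rough illustration of scoring grounding from embedding geometry alone, here is a minimal sketch. The `grounding_score` function and the toy vectors below are assumptions for illustration only; this is not the actual DGI formula, just the general idea that a grounded response should sit near the context in embedding space, with no second LLM in the loop.

```python
import numpy as np

def grounding_score(context_vec, response_vec):
    """Toy geometric grounding score: cosine similarity between a
    context embedding and a response embedding. NOT the published DGI
    formula -- an illustrative stand-in for geometry-based grounding."""
    c = context_vec / np.linalg.norm(context_vec)
    r = response_vec / np.linalg.norm(response_vec)
    return float(c @ r)

# Toy embeddings: a grounded response points near the context vector,
# an off-topic response points elsewhere.
context   = np.array([1.0, 0.2, 0.1])
grounded  = np.array([0.9, 0.3, 0.1])
off_topic = np.array([0.0, 0.1, 1.0])

assert grounding_score(context, grounded) > grounding_score(context, off_topic)
```

In this sketch the decision is a single vector comparison, which is why a geometry-based check can run in milliseconds rather than a full classifier forward pass.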

Dashboard: cert-framework.com


Example: "Seasons are caused by Earth's 23.5-degree axial tilt"

CERT DGI: Grounded (0.4227), 22 ms
HHEM: Hallucinated (0.0178), 108 ms

CERT is correct here. HHEM produces a false positive on a factually
accurate response. This is the Type III boundary in practice:
geometric displacement correctly identifies a grounding pattern
that the classifier misses, while HHEM takes roughly 5x the latency
and still gets it wrong.

More about the DGI (Directional Grounding Index) here: https://arxiv.org/abs/2602.13224
