Note: These are Phase 1 research artifacts: per-token activation EBMs that detect hallucination confidence signals from LLM hidden states. For production use of the full Carnot EBM framework (constraint verification, guided decoding, energy-based repair), see:

pip install carnot

Source and documentation: https://github.com/ianblenke/carnot

Important: Research Artifact, Not a Production Detector

This model achieves 86.8% accuracy on held-out TruthfulQA test sets, but in practical deployment (8 real questions), activation-based EBMs agreed with ground truth only 50% of the time. The EBM detects model confidence, not correctness: confident hallucinations receive low energy (look fine), while correct-but-hedging answers get flagged.

This model is a research artifact documenting activation-space structure. It is NOT a reliable hallucination detector for production use.

For practical verification, use structural constraints (test execution, SAT solving) rather than activation analysis. See the Carnot technical report for 41 experiments and 14 principles learned.

per-token-ebm-gemma4-e2b-nothink

Per-token hallucination detection EBM for google/gemma-4-E2B.

| Metric | Value |
|---|---|
| Test accuracy | 86.8% |
| Energy gap | 3.8514 |
| Source model | google/gemma-4-E2B |
| Hidden dim | 1536 |
| Architecture | Gibbs [1536 → 512 → 128 → 1], SiLU |
| Training tokens | 15,822 |
| Thinking | disabled |
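The architecture row above can be read as a plain feed-forward energy head. The sketch below is an illustrative reconstruction under that reading, not the Carnot implementation: only the layer sizes and the SiLU activation come from the table; the class name, random weight initialization, and the absence of an activation on the final scalar are assumptions.

```python
import numpy as np

def silu(x):
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

class EnergyHead:
    """Feed-forward energy head [1536 -> 512 -> 128 -> 1] with SiLU.

    Illustrative sketch only: random weights stand in for the trained ones.
    """

    def __init__(self, dims=(1536, 512, 128, 1), seed=0):
        rng = np.random.default_rng(seed)
        self.layers = [
            (rng.standard_normal((d_in, d_out)) / np.sqrt(d_in),  # weight
             np.zeros(d_out))                                     # bias
            for d_in, d_out in zip(dims[:-1], dims[1:])
        ]

    def energy(self, h):
        # h: one token's hidden-state vector (dim 1536) from the source model
        x = np.asarray(h, dtype=np.float64)
        for i, (w, b) in enumerate(self.layers):
            x = x @ w + b
            if i < len(self.layers) - 1:  # no activation on the final scalar
                x = silu(x)
        return float(x[0])

ebm = EnergyHead()
e = ebm.energy(np.zeros(1536))  # scalar energy for one token's activation
```

A trained head of this shape maps each token's activation to a single scalar, which is what lets the detector score tokens independently.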

Usage

from carnot.inference.ebm_loader import load_ebm

ebm = load_ebm("per-token-ebm-gemma4-e2b-nothink")
# activation_vector: a 1536-dim hidden-state vector from google/gemma-4-E2B
energy = float(ebm.energy(activation_vector))
# Low energy = likely correct; high energy = likely hallucination
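In practice, per-token energies are turned into flags by thresholding. The helper below is a hypothetical sketch: `flag_tokens` and the threshold value are not part of the model release, and the reported 3.8514 energy gap only suggests that correct and hallucinated tokens are separated in energy; the operating point is deployment-specific.

```python
def flag_tokens(energies, threshold):
    """Return indices of tokens whose energy exceeds a threshold.

    `threshold` is a hypothetical tuning parameter, not a value
    shipped with the model.
    """
    return [i for i, e in enumerate(energies) if e > threshold]

flag_tokens([0.2, 4.1, 0.5, 3.9], threshold=2.0)  # -> [1, 3]
```

Given the 50% practical agreement noted above, any such threshold should be validated on in-domain data before being trusted.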

Trained with Carnot.
