Note: These are Phase 1 research artifacts: per-token activation EBMs that detect hallucination confidence signals from LLM hidden states. For production use of the full Carnot EBM framework (constraint verification, guided decoding, energy-based repair), see:

```
pip install carnot
```

Source and documentation: https://github.com/ianblenke/carnot
## Important: Research Artifact, Not a Production Detector
This model achieves 86.8% accuracy on held-out TruthfulQA test sets, but in practical deployment (8 real questions), activation-based EBMs agreed with ground truth only 50% of the time. The EBM detects model confidence, not correctness: confident hallucinations get low energy (they look fine), while correct-but-hedging answers get flagged.
This model is a research artifact documenting activation-space structure. It is NOT a reliable hallucination detector for production use.
For practical verification, use structural constraints (test execution, SAT solving) rather than activation analysis. See the Carnot technical report for 41 experiments and 14 principles learned.
# per-token-ebm-gemma4-e2b-nothink
Per-token hallucination detection EBM for google/gemma-4-E2B.
| Metric | Value |
|---|---|
| Test accuracy | 86.8% |
| Energy gap | 3.8514 |
| Source model | google/gemma-4-E2B |
| Hidden dim | 1536 |
| Architecture | Gibbs MLP [1536 → 512 → 128 → 1], SiLU |
| Training tokens | 15,822 |
| Thinking | disabled |
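The architecture row above describes a small MLP energy head over 1536-dim activations. A minimal NumPy sketch of that shape follows; the random weights, and the `silu` and `energy` names, are illustrative assumptions, not the released checkpoint:

```python
import numpy as np

def silu(x):
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Layer widths follow the card: 1536 -> 512 -> 128 -> 1
dims = [1536, 512, 128, 1]
weights = [rng.standard_normal((i, o)) / np.sqrt(i) for i, o in zip(dims, dims[1:])]
biases = [np.zeros(o) for o in dims[1:]]

def energy(h):
    """Map one 1536-dim activation vector to a scalar energy."""
    x = h
    for w, b in zip(weights[:-1], biases[:-1]):
        x = silu(x @ w + b)           # hidden layers use SiLU
    return float(x @ weights[-1] + biases[-1])  # final layer is linear

e = energy(rng.standard_normal(1536))
```

The real head is trained so that tokens from hallucinated answers receive higher energy than tokens from correct ones; the 3.8514 energy gap in the table is the separation between those two classes on the test set.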
## Usage
```python
from carnot.inference.ebm_loader import load_ebm

ebm = load_ebm("per-token-ebm-gemma4-e2b-nothink")

# activation_vector: a 1536-dim hidden-state vector from google/gemma-4-E2B
energy = float(ebm.energy(activation_vector))
# Low energy = likely correct; high energy = likely hallucination
```
Trained with Carnot.