Photon-17B — a local model that knows when it's inventing an entity

Lucidia family · Photon line · honesty.tools

Scope, read first. Photon catches entities that do not exist at all (made-up drugs, companies, people, papers, case citations). It is at chance on everything else — fluent lies about real entities and confident reasoning errors. It does not make the model answer better; it makes it abstain better, with a receipt. Route real-entity lies and reasoning errors to a separate verifier.

Photon-17B is a package over stock models, not a fine-tune. It is Qwen2.5-14B-Instruct, unchanged (same weights, same answers, same capability), wrapped by an honesty layer. No LoRA, no merge, nothing trained. The "17" is the approximate parameter sum of the two stock models it orchestrates (a 14B answerer + a 3.8B independent lens, about 17.8B nominal). It adds zero capability; the novelty is the architecture of the check.

How it works

A cross-family independent lens: Phi-3.5-mini-instruct, from a different model family than the answerer. A verifier from the same mind only confirms; a different one checks. If either model refuses in its own words, Photon abstains. This is the universal spine: it holds out-of-distribution and on private entities.
The Grounded Atlas: Lucidia's offline existence index (~16.7M real-entity names, compiled from public article-title corpora), baked into the package. No network, no API call; swappable for your own domain index. Presence is a prior, not proof; absence is not proof of non-existence.
A signed receipt: every verdict ships an Ed25519-signed record plus its null (what the detector reads on shuffled labels, ≈0.5). Auditable, not asserted.
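
The three pieces above combine into a small decision rule. The following is a minimal sketch of that rule, not the shipped implementation: the `refused` keyword heuristic, the `Verdict` type, and the "unverified" route label are all hypothetical, and the real package may weigh the atlas prior differently. Only the two route strings "dual-family refusal" and "grounded" come from this card.

```python
from dataclasses import dataclass

# Hypothetical refusal phrases. The package's real refusal detector is not
# documented in this card; this list is only a stand-in.
REFUSAL_MARKERS = (
    "i'm not aware of",
    "i am not aware of",
    "i don't have information",
    "i couldn't find",
    "no record of",
)


def refused(answer: str) -> bool:
    """Crude stand-in for 'the model refuses in its own words'."""
    text = answer.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


@dataclass(frozen=True)
class Verdict:
    off_map: bool
    route: str


def decide(answerer_text: str, lens_text: str, in_atlas: bool) -> Verdict:
    # Rule stated in the card: a refusal from either family means abstain.
    if refused(answerer_text) or refused(lens_text):
        return Verdict(off_map=True, route="dual-family refusal")
    # Atlas presence is treated as a prior in favour of the entity, not proof.
    if in_atlas:
        return Verdict(off_map=False, route="grounded")
    # Assumption: neither model refused and the atlas has no entry. Absence is
    # not proof of non-existence, so this sketch does not abstain here.
    return Verdict(off_map=False, route="unverified")


fake = decide("I'm not aware of a medicine called Velodose.", "Velodose is a drug.", False)
real = decide("Marie Curie was a physicist.", "Marie Curie won two Nobel Prizes.", True)
print(fake)
print(real)
```

Note that the refusal check runs before the atlas lookup, so a refusal on an atlas-covered real entity still abstains; that ordering is one plausible reading of "if either model refuses, Photon abstains".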

The honest numbers

Measured across four held-out batteries (frozen thresholds, each ships its null):

setting	fabrication recall	over-abstain
dual-family hedge + Grounded Atlas (Ollama tier)	~77–97%	2–10%
disjoint held-out (full package)	95–96.6%	3.7% (reals with an atlas entry)
out-of-distribution (deployed fusion)	~90–96%	~10% (≈ the base model's own refusal rate)

The out-of-distribution ~10% over-abstain is essentially what bare Qwen-14B already refuses on obscure reals — Photon adds ≈0. On a HalluLens-protocol head-to-head it cuts base Qwen-14B's fabricated-entity false-acceptance roughly 10–16× (in-distribution, on the model's easier domains — not a cross-model leaderboard claim).
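
For readers checking the table against their own batteries, here is a hedged sketch of how the two columns and the "N×" reduction could be computed. It assumes fabrication recall is the fraction of fabricated entities flagged, over-abstain is the fraction of real entities abstained on, and false acceptance is one minus recall; the counts are illustrative, not Photon measurements.

```python
def rates(fab_flagged: int, fab_total: int, real_abstained: int, real_total: int):
    """Return (fabrication recall, over-abstain rate) from raw counts."""
    recall = fab_flagged / fab_total
    over_abstain = real_abstained / real_total
    return recall, over_abstain


def false_acceptance_reduction(base_recall: float, wrapped_recall: float) -> float:
    """How many times fewer fabricated entities are accepted after wrapping.

    Assumes false acceptance = 1 - recall on the same fabricated set.
    """
    return (1.0 - base_recall) / (1.0 - wrapped_recall)


# Illustrative counts only: 200 fabricated prompts, 300 real prompts.
recall, over_abstain = rates(fab_flagged=190, fab_total=200,
                             real_abstained=30, real_total=300)
print(f"recall={recall:.2%} over-abstain={over_abstain:.2%}")

# A base model that flags 40% of fabrications versus a wrapper that flags 95%
# accepts about 12 times fewer fabricated entities.
print(f"reduction={false_acceptance_reduction(0.40, 0.95):.1f}x")
```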

Never read these as more than they are: the in-distribution figures isolate the mechanism; plan a deployment around the out-of-distribution one. Full methodology (including where the layer is weakest) is open at honesty.tools.
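
The "null" that ships with each battery can be reproduced in spirit with a label-shuffling check. The sketch below assumes the detector's headline statistic is AUROC-like (the card only says the null reads ≈0.5) and uses synthetic scores; `auroc` and `shuffled_null` are illustrative names, not part of the package.

```python
import random


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shuffled_null(scores, labels, rounds=200, seed=0):
    """Mean AUROC after repeatedly shuffling the labels against fixed scores."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        total += auroc(scores, shuffled)
    return total / rounds


# Synthetic detector scores: fabricated entities (label 1) score higher on average.
rng = random.Random(1)
labels = [1] * 50 + [0] * 50
scores = ([rng.gauss(2.0, 1.0) for _ in range(50)]
          + [rng.gauss(0.0, 1.0) for _ in range(50)])

print(f"AUROC on true labels:     {auroc(scores, labels):.3f}")
print(f"AUROC on shuffled labels: {shuffled_null(scores, labels):.3f}")
```

If the shuffled-label figure sits well away from 0.5, the detector is reading something other than the labels (leakage, ordering, or a scoring bug), which is the failure the null is meant to expose.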

Scope & limits (load-bearing, not fine print)

Catches: entities (people, places, papers, works, businesses, species, drug/company/case names) that do not exist at all, and off-distribution inputs.
Blind to: fluent lies about real entities, and confident reasoning errors — geometry charts familiarity, not truth. Complementary to semantic-entropy / SelfCheckGPT, which own that axis.
Per-model calibration is a hard dependency; swapping the answerer needs re-calibration.
Not-in-atlas entities: on real entities outside the Grounded Atlas (private/enterprise), the reliable signal is the model's own refusal (the hedge leg); swap in a domain index to ground them.

Run it (Ollama)

ollama pull qwen2.5:14b-instruct
ollama pull phi3.5
python photon_ollama.py --answerer qwen2.5:14b-instruct --lens phi3.5 \
       --bloom grounded_atlas.bloom "Tell me about the medicine Velodose"
# -> off_map: true, route: "dual-family refusal"   (a fabricated drug)
python photon_ollama.py ... "Tell me about Marie Curie"
# -> off_map: false, route: "grounded"             (a real, atlas-covered entity)

grounded_atlas.bloom (~33 MB) is the offline existence index, included. The dual-family hedge + atlas is the universal spine (words-only, runs in Ollama/llama.cpp). A stronger Python-served tier adds a residual-stream probe (+recall on atlas-covered domains) — see honesty.tools.
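
As a rough sanity check on the index size, the sketch below assumes `grounded_atlas.bloom` is a standard Bloom filter (suggested by the file name and the `--bloom` flag, but not confirmed here) and that "~33 MB" means about 33,000,000 bytes. `bloom_fpr` and `TinyBloom` are hypothetical helpers, and the case-folding of names is an assumption, not the package's documented behaviour.

```python
import hashlib
import math


def bloom_fpr(n_items: int, size_bytes: int):
    """Return (near-optimal hash count k, theoretical false-positive rate)."""
    m = size_bytes * 8  # filter size in bits
    k = max(1, round(math.log(2) * m / n_items))
    return k, (1.0 - math.exp(-k * n_items / m)) ** k


class TinyBloom:
    """Toy Bloom filter using double hashing over one SHA-256 digest."""

    def __init__(self, m_bits: int, k: int):
        self.m, self.k = m_bits, k
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, name: str):
        # Assumption: names are matched case-insensitively.
        digest = hashlib.sha256(name.casefold().encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step for double hashing
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, name: str) -> None:
        for p in self._positions(name):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, name: str) -> bool:
        return all(self.bits[p // 8] >> (p % 8) & 1 for p in self._positions(name))


# Sizes quoted in this card: ~16.7M names in ~33 MB, about 15.8 bits per name.
k, fpr = bloom_fpr(n_items=16_700_000, size_bytes=33_000_000)
print(f"k={k}, theoretical false-positive rate ~ {fpr:.1e}")

atlas = TinyBloom(m_bits=1024, k=k)
atlas.add("Marie Curie")
print("Marie Curie" in atlas, "Velodose" in atlas)
```

Under those assumptions the filter never misses a name that was inserted, but it can report a name it never saw, which is one more reason to treat atlas presence as a prior rather than proof.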

Provenance

Built on stock Qwen2.5-14B-Instruct + Phi-3.5-mini-instruct + the Grounded Atlas. No weights were modified, merged, or trained. Methodology, batteries, and the where-it's-weakest writeups are open at honesty.tools.
