Photon-17B: a local model that knows when it's inventing an entity
Lucidia family · Photon line · honesty.tools
Scope, read first. Photon catches entities that do not exist at all (made-up drugs, companies, people, papers, case citations). It is at chance on everything else: fluent lies about real entities and confident reasoning errors. It does not make the model answer better; it makes it abstain better, with a receipt. Route real-entity lies and reasoning errors to a separate verifier.
Photon-17B is a package over stock models, not a fine-tune. It is Qwen2.5-14B-Instruct, unchanged (same weights, same answers, same capability), wrapped by an honesty layer. No LoRA, no merge, nothing trained. The "17" is the approximate parameter sum of the two stock models it orchestrates (a 14B answerer + a 3.8B independent lens). It adds zero capability; the novelty is the architecture of the check.
How it works
- A cross-family independent lens: Phi-3.5-mini-instruct, a different model family from the answerer. A verifier from the same mind only confirms; a different one checks. If either model refuses in its own words, Photon abstains. This is the universal spine: it holds out-of-distribution and on private entities.
- The Grounded Atlas: Lucidia's offline existence index (~16.7M real-entity names, compiled from public article-title corpora) baked into the package. No network, no API call, swappable for your own domain. Presence is a prior, not proof; absence is not proof of non-existence.
- A signed receipt: every verdict ships an ed25519-signed record plus its null (what the detector reads on shuffled labels, ≈0.5). Auditable, not asserted.
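The first two legs can be sketched as a small decision rule. This is a minimal illustration, not Photon's code: the refusal markers, the `Verdict` type, the plain `set` standing in for the atlas, and the `"not-in-atlas"` route label are all assumptions made for the example. Only the `"dual-family refusal"` and `"grounded"` routes appear in this README.

```python
from dataclasses import dataclass

# Hypothetical refusal markers; a real deployment would use a calibrated detector.
REFUSAL_MARKERS = (
    "i'm not aware of",
    "i am not aware of",
    "i don't have information",
    "i do not have information",
    "does not appear to exist",
)


@dataclass
class Verdict:
    off_map: bool  # True means the wrapper abstains
    route: str     # which leg produced the decision


def refuses(text: str) -> bool:
    """Crude stand-in for 'the model refuses in its own words'."""
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def verdict(answerer_text: str, lens_text: str, entity: str, atlas: set[str]) -> Verdict:
    # Hedge leg: a refusal from either model family is enough to abstain.
    if refuses(answerer_text) or refuses(lens_text):
        return Verdict(off_map=True, route="dual-family refusal")
    # Atlas leg: presence is a prior, not proof.
    if entity.lower() in atlas:
        return Verdict(off_map=False, route="grounded")
    # Absence is not proof of non-existence, so this sketch does not abstain
    # on absence alone. The route label below is hypothetical.
    return Verdict(off_map=False, route="not-in-atlas")


atlas = {"marie curie"}  # toy stand-in for the offline existence index
fabricated = verdict(
    "I'm not aware of a medicine called Velodose.",
    "Velodose is a once-daily tablet.",
    "Velodose",
    atlas,
)
real = verdict(
    "Marie Curie was a physicist and chemist.",
    "Marie Curie won two Nobel Prizes.",
    "Marie Curie",
    atlas,
)
```

Note the asymmetry: one refusal outweighs a confident answer from the other family, which is what lets the hedge leg work on entities the atlas has never seen.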
The honest numbers
Measured across four held-out batteries (frozen thresholds, each ships its null):
| setting | fabrication recall | over-abstain |
|---|---|---|
| dual-family hedge + Grounded Atlas (Ollama tier) | ~77–97% | 2–10% |
| disjoint held-out (full package) | 95–96.6% | 3.7% (reals with an atlas entry) |
| out-of-distribution (deployed fusion) | ~90–96% | ~10% (≈ the base model's own refusal rate) |
The out-of-distribution ~10% over-abstain is essentially what bare Qwen-14B already refuses on obscure reals; Photon adds ≈0. On a HalluLens-protocol head-to-head it cuts base Qwen-14B's fabricated-entity false-acceptance roughly 10–16× (in-distribution, on the model's easier domains; this is not a cross-model leaderboard claim).
Never read these as more than they are: the in-distribution figures isolate the mechanism; plan a deployment around the out-of-distribution one. Full methodology (including where the layer is weakest) is open at honesty.tools.
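To make the two table columns and the "null" concrete, here is a toy sketch on synthetic integer scores. The function names, the scores, and the threshold of 50 are invented for illustration, and AUROC is used here as one plausible reading of "what the detector reads on shuffled labels"; the real batteries and metric definitions are the ones published at honesty.tools.

```python
import random


def recall_and_over_abstain(is_fabricated, abstained):
    """Both arguments are parallel lists of booleans."""
    fab = [a for f, a in zip(is_fabricated, abstained) if f]
    real = [a for f, a in zip(is_fabricated, abstained) if not f]
    recall = sum(fab) / len(fab)          # fabricated entities that were caught
    over_abstain = sum(real) / len(real)  # real entities wrongly abstained on
    return recall, over_abstain


def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shuffled_null(labels, scores, trials=200, seed=0):
    """Mean AUROC after shuffling the labels; a sound detector reads about 0.5 here."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(shuffled)
        total += auroc(shuffled, scores)
    return total / trials


# Synthetic data: 50 fabricated entities scoring 40..89, 50 real ones scoring 10..59.
labels = [True] * 50 + [False] * 50
scores = list(range(40, 90)) + list(range(10, 60))
abstained = [s >= 50 for s in scores]  # frozen toy threshold

recall, over_abstain = recall_and_over_abstain(labels, abstained)
signal = auroc(labels, scores)
null = shuffled_null(labels, scores)
```

If the shuffled-label figure drifted well away from 0.5, the detector would be reading something other than the labels, which is the failure the receipt is meant to expose.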
Scope & limits (load-bearing, not fine print)
- Catches: entities (people, places, papers, works, businesses, species, drug/company/case names) that do not exist at all, and off-distribution inputs.
- Blind to: fluent lies about real entities, and confident reasoning errors. The geometry charts familiarity, not truth. Complementary to semantic-entropy / SelfCheckGPT, which own that axis.
- Per-model calibration is a hard dependency; swapping the answerer needs re-calibration.
- Not-in-atlas entities: on real entities outside the Grounded Atlas (private/enterprise), the reliable signal is the model's own refusal (the hedge leg); swap a domain index to ground them.
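Why re-calibration is a hard dependency can be shown with a toy threshold fit. This sketch assumes a scalar "unfamiliarity" score per entity and an abstain-when-above-threshold rule; both are illustrative assumptions, and `calibrate_threshold` and `over_abstain_rate` are hypothetical helpers, not part of the package.

```python
import math


def calibrate_threshold(real_scores, target_over_abstain=0.05):
    """Smallest threshold whose over-abstain rate on known-real calibration
    entities stays at or below the target (rule: abstain when score > threshold)."""
    ordered = sorted(real_scores)
    n = len(ordered)
    # At most floor(target * n) real entities may land above the threshold.
    allowed = math.floor(target_over_abstain * n)
    return ordered[n - allowed - 1]


def over_abstain_rate(real_scores, threshold):
    return sum(s > threshold for s in real_scores) / len(real_scores)


# Toy scores for the same real entities under two different answerers.
answerer_a = list(range(100))                # scores 0..99
answerer_b = [s + 20 for s in range(100)]    # same shape, shifted by 20

threshold_a = calibrate_threshold(answerer_a)
threshold_b = calibrate_threshold(answerer_b)

# Reusing A's frozen threshold on B inflates over-abstention.
reused = over_abstain_rate(answerer_b, threshold_a)
refit = over_abstain_rate(answerer_b, threshold_b)
```

In this toy case the reused threshold abstains on a quarter of the real entities, while the refit one holds the 5% target.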
Run it (Ollama)
ollama pull qwen2.5:14b-instruct
ollama pull phi3.5
python photon_ollama.py --answerer qwen2.5:14b-instruct --lens phi3.5 \
--bloom grounded_atlas.bloom "Tell me about the medicine Velodose"
# -> off_map: true, route: "dual-family refusal" (a fabricated drug)
python photon_ollama.py ... "Tell me about Marie Curie"
# -> off_map: false, route: "grounded" (a real, atlas-covered entity)
grounded_atlas.bloom (~33 MB) is the offline existence index, included. The dual-family hedge + atlas is the universal spine (words-only, runs in Ollama/llama.cpp). A stronger Python-served tier adds a residual-stream probe (+recall on atlas-covered domains); see honesty.tools.
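The `.bloom` extension and the `--bloom` flag suggest the index is a Bloom filter, though the on-disk format and hashing scheme are not documented here. The sketch below is a generic toy Bloom filter under that assumption, not the shipped file's format. It also runs the standard false-positive approximation on the sizes quoted above, reading "~33 MB" as 33e6 bytes and assuming an optimal hash count.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name: str):
        # Double hashing from one SHA-256 digest (an illustrative choice).
        digest = hashlib.sha256(name.lower().encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, name: str) -> None:
        for pos in self._positions(name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, name: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(name))


def false_positive_rate(num_bits: int, num_items: int, num_hashes: int) -> float:
    """Standard approximation: (1 - exp(-k*n/m)) ** k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


# Toy usage.
toy = BloomFilter(num_bits=1024, num_hashes=5)
toy.add("Marie Curie")
known = "marie curie" in toy

# Back-of-envelope for the quoted sizes (assumed: 33e6 bytes, 16.7M names).
bits = 33_000_000 * 8
names = 16_700_000
k = round(bits / names * math.log(2))  # optimal hash count for this ratio
estimated_fp = false_positive_rate(bits, names, k)
```

Under those assumptions the estimate comes out on the order of 5 in 10,000 lookups. A membership structure of this kind can report an absent name as present but never the reverse, which fits treating presence as a prior rather than proof.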
Provenance
Built on stock Qwen2.5-14B-Instruct + Phi-3.5-mini-instruct + the Grounded Atlas. No weights were modified, merged, or trained. Methodology, batteries, and the where-it's-weakest writeups are open at honesty.tools.