---
license: mit
language:
- en
tags:
- zero-shot
- natural-language-inference
- self-reflection
- logic
- reasoning
- evaluation
- trignum
- trignumentality
---
# 🧲 TRIGNUM-300M
### The Pre-Flight Check for Autonomous AI
[License: MIT](https://opensource.org/licenses/MIT)
[Python](https://www.python.org/downloads/)
[Benchmarks](#-benchmark-results)
[DOI: 10.5281/zenodo.18672142](https://doi.org/10.5281/zenodo.18672142)
> **"You wouldn't let a plane take off without a pre-flight check.**
> **Why are we letting AI agents act without one?"**
---
## What Is This?
TRIGNUM-300M is a **zero-model reasoning integrity validator** for LLM outputs. It catches structural logic failures — contradictions, circular reasoning, non-sequiturs — before an AI agent acts on them.
```python
from trignum_core.subtractive_filter import SubtractiveFilter

sf = SubtractiveFilter()
result = sf.apply(agent_output)

if result.illogics_found:
    # T-CHIP glows RED 🔴 → Human review required
    agent.halt(reason=result.illogics_found)
else:
    # T-CHIP glows BLUE 🔵 → Cleared for takeoff
    agent.execute()
```
**No LLM. No API. No training data. ~300 lines of Python. <1ms.**
---
## 🔬 Benchmark Results
We expanded our evaluation to **58,000+ real LLM outputs** including a new **517-sample curated dataset** for structural reasoning. Honest results:
| Benchmark | Samples | Precision | Recall | F1 | Speed |
| ---------------------------- | ------- | --------- | ------ | --------- | ----- |
| **Structural illogic (curated)** | **517** | **100%** | **98.9%** | **99.5%** | **<1ms** |
| HaluEval (full dataset) | 58,293 | 60% | 2.1% | 4.0% | 706ms |
### What this means:
- **99.5% F1 on structural reasoning failures** — contradictions, circular logic, unsupported conclusions
- **4.0% F1 on factual hallucinations** — we don't catch wrong facts
**That's the point.** There are plenty of tools for fact-checking. There are **none for reasoning-checking.** Until now.
### Per-Task Breakdown (HaluEval)
| Task | n | Precision | Recall | F1 |
| ------------- | ------ | --------- | ------ | ----- |
| QA | 18,316 | 83.3% | 0.25% | 0.50% |
| Dialogue | 19,977 | 60.1% | 4.38% | 8.16% |
| Summarization | 20,000 | 57.4% | 1.60% | 3.11% |
**Throughput: 146,866 samples/second** — orders of magnitude faster than LLM-based validation.
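To make the throughput claim reproducible in spirit, here is a minimal, self-contained microbenchmark of a single pattern-based check. The `check` function is a stand-in, not the actual Subtractive Filter, and the measured rate is hardware-dependent; this only illustrates why regex-level validation is orders of magnitude faster than model inference.

```python
import re
import time

# Stand-in for one pattern-based layer (NOT the real SubtractiveFilter):
# flags a crude "always ... never" contradiction with one compiled regex.
CONTRADICTION = re.compile(r"\balways\b.*\bnever\b", re.IGNORECASE | re.DOTALL)

def check(text: str) -> bool:
    return bool(CONTRADICTION.search(text))

samples = (["X is always true. X is never true."] * 10_000
           + ["The sky is blue."] * 10_000)

start = time.perf_counter()
flags = [check(s) for s in samples]
elapsed = time.perf_counter() - start

print(f"{len(samples) / elapsed:,.0f} samples/second")  # hardware-dependent
```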
---
## ✈️ The Pre-Flight Check Analogy
A pre-flight checklist doesn't verify that London exists. It verifies that:
- ✅ Instruments don't **contradict** each other
- ✅ There are no **circular faults** (sensor A confirms B confirms A)
- ✅ The flight computer draws **conclusions from actual data**
- ✅ Systems are **logically consistent**
The Subtractive Filter does the same for AI reasoning:
```
LLM Output → Subtractive Filter → [PASS] 🔵 → Agent Executes
→ [FAIL] 🔴 → Agent Halts → Human Review
```
---
## 🤖 The Missing "Agentic Validator"
With the recent shift toward **Agentic Reasoning**, autonomous LLMs are moving from static prompts to dynamic _thought-action_ loops involving planning, tool use, and multi-agent collaboration.
Current systems rely heavily on probabilistic models to act as the "Critic/Evaluator" or use "Validator-Driven Feedback" via unit tests for code or simulators for robotics. **But there has been no validator for pure logic.** If an agent hallucinates a non-sequitur or circular justification during its internal planning phase, the error cascades.
TRIGNUM-300M fills this exact gap. It acts as a deterministic, <1ms **Validator-Driven Feedback** gate. It halts execution if the agent's internal thought (`z_t`) contains a structural illogic, providing an immediate failure signal (`r_t = 0`) _before_ the agent commits to an irreversible external action (`a_t`).
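The gate described above can be sketched as a plain function between thought and action. This is an illustrative, self-contained example: `has_structural_illogic` is a trivial placeholder for the real filter, and the `GateResult` shape is an assumption, not the library's API.

```python
from dataclasses import dataclass

def has_structural_illogic(thought: str) -> bool:
    """Placeholder for the real SubtractiveFilter — here, a trivial
    contradiction check (an assumption, not the library's logic)."""
    t = thought.lower()
    return "always" in t and "never" in t

@dataclass
class GateResult:
    reward: int          # r_t: 0 on structural failure, 1 otherwise
    action_allowed: bool
    reason: str = ""

def validate_thought(z_t: str) -> GateResult:
    """Deterministic gate between the agent's thought z_t and action a_t."""
    if has_structural_illogic(z_t):
        return GateResult(reward=0, action_allowed=False,
                          reason="structural illogic in plan")
    return GateResult(reward=1, action_allowed=True)

gate = validate_thought("This tool always fails, so it never fails; run it.")
print(gate.action_allowed)  # False → agent halts before a_t
```

The key design point is determinism: the same thought always produces the same verdict, so the gate adds no stochastic noise to the agent loop.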
---
## 🔺 Core Architecture
### The Trignum Pyramid
Three faces acting as magnetic poles for data separation:
| Face | Role | What It Does |
| --------------- | --------------- | ----------------------------------------------------- |
| **α (Logic)** | Truth detection | Identifies structurally sound reasoning |
| **β (Illogic)** | Error detection | Catches contradictions, circular logic, non-sequiturs |
| **γ (Context)** | Human grounding | Anchors output to human intent |
### T-CHIP: The Tensor Character
```
╔═══════════════════════════════════════════════════════╗
║ T-CHIP [v.300M] ║
║ ║
║ 🔵 Blue = Logic Stable (Cleared for Takeoff) ║
║ 🔴 Red = Illogic Detected (THE FREEZE) ║
║ 🟡 Gold = Human Pulse Locked (Sovereign Override) ║
║ ║
║ Response time: <1ms | False alarms: 0% (structural) ║
╚═══════════════════════════════════════════════════════╝
```
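The three glow states map naturally onto a small enum. The sketch below is illustrative only; the state names follow the diagram above, but the function name and precedence rule (override beats the automatic verdict) are assumptions.

```python
from enum import Enum

class Glow(Enum):
    BLUE = "logic stable — cleared for takeoff"
    RED = "illogic detected — the freeze"
    GOLD = "human pulse locked — sovereign override"

def glow_state(illogics_found: bool, human_override: bool = False) -> Glow:
    # Sovereign override takes precedence over the automatic verdict.
    if human_override:
        return Glow.GOLD
    return Glow.RED if illogics_found else Glow.BLUE

print(glow_state(illogics_found=True))  # Glow.RED
```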
### The Subtractive Filter
Four detection layers, all pattern-based:
| Layer | Catches | Method |
| ------------------ | ------------------------------------ | -------------------------------- |
| **Contradiction** | "X is always true. X is never true." | Antonym pairs, negation patterns |
| **Circular Logic** | A proves B proves A | Reference chain analysis |
| **Non-Sequitur** | "Therefore X" without premises | Causal connective analysis |
| **Depth Check** | Claims without any reasoning | Assertion density scoring |
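Two of the four layers can be sketched in a few lines each: contradiction via negation-pair matching and circular logic via cycle detection over a "claim → supporting claim" map. These are deliberately minimal stand-ins; the real filter's pattern sets and chain analysis are assumed to be far richer.

```python
# Layer 1 sketch: negation pairs (the real pattern set is assumed larger).
NEGATION_PAIRS = [("always", "never"), ("is true", "is false")]

def find_contradiction(text: str) -> bool:
    """Flag text that asserts both halves of a negation pair."""
    t = text.lower()
    return any(a in t and b in t for a, b in NEGATION_PAIRS)

def find_circular_chain(references: dict[str, str]) -> bool:
    """Layer 2 sketch: detect a cycle in claim-reference edges
    (A proves B proves A)."""
    for start in references:
        seen, node = set(), start
        while node in references:
            if node in seen:
                return True
            seen.add(node)
            node = references[node]
    return False

print(find_contradiction("X is always true. X is never true."))  # True
print(find_circular_chain({"A": "B", "B": "A"}))                 # True
```

Note how both checks are pure string/graph operations: no embeddings, no inference, which is what keeps the whole filter under a millisecond per sample.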
---
## 📦 Repository Structure
```
TRIGNUM-300M-TCHIP/
├── src/
│ └── trignum_core/ # Core Python library
│ ├── pyramid.py # Trignum Pyramid (3 magnetic faces)
│ ├── tchip.py # T-CHIP (glow states)
│ ├── subtractive_filter.py # ★ The Subtractive Filter
│ ├── human_pulse.py # Human sovereignty layer
│ └── magnetic_trillage.py # Data separation
├── tests/ # 34 unit tests (all passing)
├── benchmarks/
│ ├── hallucination_benchmark.py # Curated structural test
│ ├── full_halueval_benchmark.py # Full 58K HaluEval test
│ ├── results.json # Structural benchmark results
│ └── full_halueval_results.json # Full HaluEval results
├── demo/
│ └── index.html # Three.js 3D interactive demo
├── paper/
│ └── TRIGNUM_300M_Position_Paper.md # Position paper
├── docs/
│ └── theory/ # 6 foundational theory documents
├── T-CHIP CLEARED FOR TAKEOFF.md # The pitch
└── ROADMAP.md # 2-quarter development plan
```
---
## 🚀 Quick Start
```bash
# Clone
git clone https://github.com/trace-on-lab/trignum-300m.git
cd trignum-300m
# Install
pip install -r requirements.txt
pip install -e .
# Run the structural benchmark
python benchmarks/hallucination_benchmark.py
# Run the full HaluEval benchmark (downloads ~13MB of data)
python benchmarks/full_halueval_benchmark.py
# Run tests
pytest tests/ -v
```
---
## 🌐 Prior Art: Nobody Is Doing This
We searched arXiv, ResearchGate, ACL Anthology, and Semantic Scholar. Every existing reasoning validation system requires model inference:
| System | Requires Model | Validates Reasoning |
| ---------------------------- | :-------------: | :-----------------: |
| VerifyLLM (2025) | ✅ Yes | Partially |
| ContraGen | ✅ Yes | Partially |
| Process Supervision (OpenAI) | ✅ Yes | Yes |
| Guardrails AI | ✅ Configurable | No (content) |
| **Subtractive Filter** | **❌ No** | **✅ Yes** |
> **Existing work uses LLMs to check LLMs. TRIGNUM uses logic to check LLMs.**
Read the full analysis in our [position paper](paper/TRIGNUM_300M_Position_Paper.md).
---
## ⚛️ Quantum Integration: TQPE
[DOI: 10.5281/zenodo.18751914](https://doi.org/10.5281/zenodo.18751914)
TRIGNUM-300M serves as Phase 1 ("Technical A Priori Validation") for **Trignumental Quantum Phase Estimation (TQPE)**.
In a case study estimating the ground state energy of the **H₂ molecule**, TRIGNUM validated the physical consistency and structural logic of the quantum circuit _before execution_. By acting as a preliminary gatekeeper, it ensured that no quantum resources were wasted on structurally ill-formed configurations, supporting an epistemic confidence score of **82.8%** on the final estimate (-1.1384 Ha).
Read the full `BUILDING THE BRIDGE` paper on Trignumentality and TQPE in the foundational [Trignumentality](https://github.com/Codfski/trignumentality) repository.
---
## 📚 Documentation
| Document | Description |
| ---------------------------------------------------------------- | ----------------------------------- |
| [Core Postulate](docs/theory/01_core_postulate.md) | The fundamental axioms of Trignum |
| [Three Faces](docs/theory/02_three_faces.md) | α (Logic), β (Illogic), γ (Context) |
| [Magnetic Trillage](docs/theory/03_magnetic_trillage.md) | Data separation mechanism |
| [T-CHIP Spec](docs/theory/04_tchip_spec.md) | The Tensor Character in detail |
| [Cold State Hardware](docs/theory/05_cold_state_hardware.md) | Hardware implications |
| [Hallucination Paradox](docs/theory/06_hallucination_paradox.md) | Reframing the "Big Monster" |
| [Position Paper](paper/TRIGNUM_300M_Position_Paper.md) | Full academic paper with benchmarks |
| [Roadmap](ROADMAP.md) | 2-quarter development plan |
---
## 💎 The Golden Gems
| Gem | Wisdom |
| ----- | --------------------------------------- |
| GEM 1 | "The Human Pulse is the Master Clock" |
| GEM 2 | "The Illogic is the Compass" |
| GEM 3 | "Magnetic Trillage Over Brute Force" |
| GEM 4 | "The Hallucination is the Raw Material" |
| GEM 5 | "T-CHIP is the Mirror" |
---
## 🤝 Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
---
## 📄 License
MIT License — see [LICENSE](LICENSE).
---
## 📞 Contact
**TRACE ON LAB**
📧 traceonlab@proton.me
---
## 🛡️ The Call
> _"The most dangerous AI failure is not a wrong fact. It is reasoning that sounds right but isn't."_
```
╔═══════════════════════════════════════════════════════╗
║ 🧲 TRACE ON LAB — TRIGNUM-300M — v.300M ║
║ ║
║ The Pre-Flight Check for Autonomous AI. ║
║ Zero models. Zero API calls. 146,866 samples/second. ║
║ ║
║ 🔵 T-CHIP: CLEARED FOR TAKEOFF. ║
╚═══════════════════════════════════════════════════════╝
```
⭐ **Star this repo if you believe AI should check its logic before it acts.**