---
language:
- en
license: apache-2.0
tags:
- pytorch
- neural-controller
- llm-agent
- kv-cache
- mamba
- lora
- token-efficiency
- tool-routing
- belief-tracking
library_name: metacontrol
pipeline_tag: text-generation
---
# NEXUS: Neural EXecution & Understanding Substrate
---
## Model Description
NEXUS is a **6.29M-parameter neural controller** that runs alongside a frozen LLM during agent task execution. It replaces verbose token-based communication (system prompts, tool definitions, history re-injection) with compressed vector signals injected directly into the LLM's KV-cache.
NEXUS comprises six subsystems:
| Component | Parameters | Role |
|---|---:|---|
| Protocol Cortex (TSM) | 4,474,624 | KV-cache task vector injection |
| Belief Engine (BTBS) | 601,234 | Mamba SSM particle filter; tracks P(done) |
| Resource Router (FSM-NHC) | 184,213 | 7-class tool classifier |
| SAC Corrector | 454,273 | Semantic drift correction patches |
| Adapter Switch (TALoRA) | 42,373 | LoRA routing by sub-task type |
| Drift Sentinel | 33,287 | Drift detection from trajectory buffer |
| **Total** | **6,293,434** | |
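The Protocol Cortex's role, injecting a compressed task vector into the LLM's KV-cache instead of spending prompt tokens, can be sketched as below. This is a minimal illustration, not the actual Protocol Cortex API: the class name, `task_dim`, and prefix length are assumptions, and the KV dimensions (22 layers, 4 KV heads, head dim 64) merely mirror a TinyLlama-like base model.

```python
import torch
import torch.nn as nn

class KVPrefixInjector(nn.Module):
    """Illustrative sketch: project a compressed task vector into
    per-layer key/value prefixes that can be prepended to an LLM's
    past_key_values, replacing token-based prompt overhead."""

    def __init__(self, task_dim=256, n_layers=22, n_heads=4,
                 head_dim=64, prefix_len=8):
        super().__init__()
        self.n_layers, self.n_heads = n_layers, n_heads
        self.head_dim, self.prefix_len = head_dim, prefix_len
        out = n_layers * 2 * prefix_len * n_heads * head_dim  # K and V per layer
        self.proj = nn.Linear(task_dim, out)

    def forward(self, task_vec):  # task_vec: (batch, task_dim)
        b = task_vec.size(0)
        kv = self.proj(task_vec).view(
            b, self.n_layers, 2, self.n_heads, self.prefix_len, self.head_dim
        )
        # One (key, value) pair per layer, each (batch, heads, prefix_len, head_dim)
        return [(kv[:, i, 0], kv[:, i, 1]) for i in range(self.n_layers)]

injector = KVPrefixInjector()
prefixes = injector(torch.randn(1, 256))
print(len(prefixes), prefixes[0][0].shape)
```

The key property is that the prefix occupies cache slots, not context tokens, which is what drives the token-overhead numbers reported below.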
---
## Intended Uses
- **Agent system controllers** – drop-in controller layer for LLM agent frameworks
- **Token efficiency research** – KV-cache prefix injection for overhead elimination
- **Belief tracking** – probabilistic task-state estimation for agentic loops
- **Tool routing** – lightweight 7-class classifier replacing LLM tool-selection reasoning
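The tool-routing use case amounts to replacing a reasoning step with a cheap classifier head. A minimal sketch, assuming a pooled LLM hidden state as input; the tool names (other than `BROWSER_RELAY`, which appears in the sample output below) and the layer sizes are illustrative, not the Resource Router's actual definitions:

```python
import torch
import torch.nn as nn

# Hypothetical 7-class routing head; tool list and dimensions are assumed.
TOOLS = ["BROWSER_RELAY", "CODE_EXEC", "FILE_IO", "SEARCH",
         "MEMORY", "MESSAGE", "NOOP"]
router = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, len(TOOLS)))

hidden = torch.randn(1, 128)              # stand-in for a pooled LLM hidden state
probs = router(hidden).softmax(dim=-1)    # class probabilities over tools
tool = TOOLS[int(probs.argmax())]
print(tool, probs.shape)
```

A single forward pass through a head this size costs microseconds, versus an LLM generation round-trip for prompted tool selection.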
### Out of Scope
- Standalone LLM inference (NEXUS requires a base LLM)
- Non-agent text generation tasks
---
## Training Data
Checkpoints are trained on **synthetic data** approximating Chatp production agent interaction patterns:
| Component | Training samples | Distribution |
|---|---:|---|
| Resource Router | 5,600 train / 840 val | Gaussian clusters per tool class |
| Sub-task Classifier | 4,000 train / 600 val | Gaussian clusters per sub-task |
| Belief Engine | ~1,700 train / 300 val | Sigmoid completion ramps |
| Drift Sentinel | Synthetic trajectories | Orthogonal rotation drift injection |
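The "Gaussian clusters per tool class" recipe can be reproduced with a few lines. This is an assumed sketch, not the repo's actual data script; 7 classes × 800 samples matches the 5,600-sample Resource Router train split:

```python
import torch

def make_tool_dataset(n_per_class=800, n_classes=7, dim=128, sigma=0.5, seed=0):
    """Draw one Gaussian cluster per tool class: a random center per class,
    plus isotropic noise around it. Dimensions and sigma are illustrative."""
    g = torch.Generator().manual_seed(seed)
    centers = torch.randn(n_classes, dim, generator=g)         # class centers
    feats = centers.repeat_interleave(n_per_class, dim=0)      # one row per sample
    feats = feats + sigma * torch.randn(feats.shape, generator=g)
    labels = torch.arange(n_classes).repeat_interleave(n_per_class)
    return feats, labels

X, y = make_tool_dataset()
print(X.shape, y.shape)
```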
---
## Training Procedure
All components trained with AdamW + CosineAnnealingLR on NVIDIA RTX 4060 (CUDA 12.8):
```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# `params` is the parameter iterable of the component being trained
optimizer = AdamW(params, lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
```
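Wired into a full loop, the recipe looks roughly like the following. The stand-in model and data are illustrative; only the optimizer, scheduler, epoch count (30), and batch size (64, per the hardware table) come from the document:

```python
import torch
import torch.nn as nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

model = nn.Linear(128, 7)                       # stand-in for a classifier head
X, y = torch.randn(640, 128), torch.randint(0, 7, (640,))
optimizer = AdamW(model.parameters(), lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(30):
    for i in range(0, len(X), 64):              # mini-batches of 64
        optimizer.zero_grad()
        loss = loss_fn(model(X[i:i + 64]), y[i:i + 64])
        loss.backward()
        optimizer.step()
    scheduler.step()                            # one cosine step per epoch

print(optimizer.param_groups[0]["lr"])          # annealed down to eta_min
```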
---
## Evaluation Results
Evaluated on a 200-task synthetic benchmark (T=15 steps, 35% drift fraction, seed 2026):
| Metric | Baseline (random init) | NEXUS (trained) | Δ |
|---|---:|---:|---:|
| Belief Calibration Error ↓ | 0.392 | **0.313** | −20.2% |
| Controller Efficiency Ratio ↑ | 1.000 | **1.101** | +10.1% |
| Token Overhead Ratio ↓ | 99.95% | **0.00%** | −100% |
| Tool Routing Accuracy ↑ | 13.5% | **14.0%** | +3.7% |
**Training results (val metrics):**
| Component | Metric | Value |
|---|---|---|
| Resource Router | Val accuracy | **95.5%** |
| Sub-task Classifier | Val accuracy | **99.8%** |
| Belief Engine | Val loss (MSE) | **7×10⁻⁵** |
| Drift Sentinel | Val accuracy | **69.0%** |
---
## How to Use
```python
import torch
from metacontrol.core.config import MetacontrolConfig
from metacontrol.pipeline.metacontrol_pipeline import MetacontrolPipeline
cfg = MetacontrolConfig()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline = MetacontrolPipeline(cfg).to(device)
pipeline.load_checkpoint("checkpoints/")
pipeline.reset(batch_size=1, device=device)
# Each step: pass LLM hidden states + goal embedding
llm_hidden = torch.randn(1, 8, cfg.base_llm_d_model, device=device)
goal_embed = torch.randn(1, cfg.base_llm_d_model, device=device)
result = pipeline.step(llm_hidden, goal_embed)
print(result["state_summary"])
# === Step 1 ===
# Phase: EXECUTE (conf=0.50)
# Tool: BROWSER_RELAY (conf=0.91)
# P(done): 0.300
# Drift: 0.766
```
### Quick Start
```bash
git clone https://github.com/brian-Lab-0/nexus
cd nexus
pip install -e ".[dev]"
python -m pytest # 155 tests, should all pass
python scripts/evaluate.py --n-tasks 200 --device cuda
```
---
## Hardware Requirements
| Configuration | VRAM | Notes |
|---|---|---|
| Controller only (no LLM) | ~24 MB | Inference only |
| + TinyLlama 1.1B fp16 | ~2.1 GB | Full pipeline |
| + Training (classifiers) | ~1 GB | 30 epochs, batch 64 |
| + Training (Belief Engine) | ~2 GB | BPTT over T=20 steps |
Tested on: NVIDIA RTX 4060 Laptop GPU (8 GB VRAM), CUDA 12.8, PyTorch 2.9.1+cu128.
---
## Limitations
- **Synthetic training gap** – TTCS/DRP metrics require real production traces to differentiate from baseline
- **LTVI pending** – the KV-cache injection path is functional, but the Protocol Cortex needs training on real traces for coherent generation
- **Drift Sentinel** – 69% accuracy; natural drift is more varied than synthetic orthogonal rotation
---
## Citation
```bibtex
@software{langay2026nexus,
author = {Langay, Brian},
title = {{NEXUS}: Neural {EXecution} \& {Understanding} {Substrate}},
year = {2026},
publisher = {OpenBnet},
url = {https://github.com/brian-Lab-0/nexus},
note = {6.29M-parameter neural controller for LLM agent systems}
}
```
---
*Created and maintained by **Brian Langay** – [support@openbnet.com](mailto:support@openbnet.com) · [services@openbnet.cloud](mailto:services@openbnet.cloud)*
*OpenBnet · 2026*
**Paper:** [NEXUS: Design, Implementation, and Empirical Evaluation of a Lightweight Neural Controller for LLM Agent Systems](https://github.com/brian-Lab-0/nexus/blob/main/NEXUS_Implementation_Report.pdf)
**Code:** [github.com/brian-Lab-0/nexus](https://github.com/brian-Lab-0/nexus)
**License:** Apache 2.0