---
language:
- en
license: apache-2.0
tags:
- pytorch
- neural-controller
- llm-agent
- kv-cache
- mamba
- lora
- token-efficiency
- tool-routing
- belief-tracking
library_name: metacontrol
pipeline_tag: text-generation
---
# NEXUS: Neural EXecution & Understanding Substrate

## Model Description
NEXUS is a 6.29M-parameter neural controller that runs alongside a frozen LLM during agent task execution. It replaces verbose token-based communication (system prompts, tool definitions, history re-injection) with compressed vector signals injected directly into the LLM's KV-cache.
NEXUS comprises the following subsystems:

| Component | Parameters | Role |
|---|---|---|
| Protocol Cortex (TSM) | 4,474,624 | KV-cache task vector injection |
| Belief Engine (BTBS) | 601,234 | Mamba SSM particle filter; tracks P(done) |
| Resource Router (FSM-NHC) | 184,213 | 7-class tool classifier |
| SAC Corrector | 454,273 | Semantic drift correction patches |
| Adapter Switch (TALoRA) | 42,373 | LoRA routing by sub-task type |
| Drift Sentinel | 33,287 | Drift detection from trajectory buffer |
| **Total** | **6,293,434** | |
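To make the KV-cache injection idea concrete, here is a minimal sketch of projecting a compact task vector into per-layer key/value prefix tensors. All dimensions, the class name, and the single-linear-projection design are illustrative assumptions, not the actual Protocol Cortex implementation:

```python
import torch
import torch.nn as nn

class TaskVectorInjector(nn.Module):
    """Hypothetical sketch: map a compact task vector to per-layer key/value
    prefixes that can be prepended to a frozen LLM's KV-cache."""
    def __init__(self, d_task=256, n_layers=22, n_heads=4, d_head=64, prefix_len=8):
        super().__init__()
        self.n_layers, self.n_heads = n_layers, n_heads
        self.prefix_len, self.d_head = prefix_len, d_head
        # One linear map producing K and V prefixes for every layer at once.
        self.proj = nn.Linear(d_task, n_layers * 2 * prefix_len * n_heads * d_head)

    def forward(self, task_vec):  # (batch, d_task)
        b = task_vec.shape[0]
        kv = self.proj(task_vec).view(
            b, self.n_layers, 2, self.n_heads, self.prefix_len, self.d_head)
        # One (key, value) pair per layer, each (batch, heads, prefix_len, d_head),
        # i.e. the shape convention used by typical past_key_values caches.
        return [(kv[:, l, 0], kv[:, l, 1]) for l in range(self.n_layers)]

inj = TaskVectorInjector()
past = inj(torch.randn(2, 256))
print(len(past), past[0][0].shape)  # 22 torch.Size([2, 4, 8, 64])
```

The frozen LLM would then attend over these prefix slots exactly as if the task description had been tokenized into the prompt, but at zero token cost.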
## Intended Uses

- Agent system controllers – drop-in controller layer for LLM agent frameworks
- Token efficiency research – KV-cache prefix injection for overhead elimination
- Belief tracking – probabilistic task-state estimation for agentic loops
- Tool routing – lightweight 7-class classifier replacing LLM tool-selection reasoning
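A 7-class tool router of this size is essentially a small MLP over the LLM's hidden state. The following sketch shows the shape of such a component; the tool names, layer sizes, and mean-pooling choice are assumptions for illustration, not the FSM-NHC architecture:

```python
import torch
import torch.nn as nn

# Illustrative class names only; the actual tool set is not documented here.
TOOLS = ["BROWSER_RELAY", "CODE_EXEC", "FILE_IO", "SEARCH",
         "CALCULATOR", "MEMORY", "NONE"]

class ToolRouter(nn.Module):
    """Hypothetical sketch of a lightweight 7-class tool classifier that
    replaces token-level tool-selection reasoning."""
    def __init__(self, d_model=512, n_tools=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, 128), nn.GELU(), nn.Linear(128, n_tools))

    def forward(self, hidden):       # (batch, seq, d_model) LLM hidden states
        pooled = hidden.mean(dim=1)  # mean-pool over the sequence dimension
        return self.net(pooled)      # (batch, n_tools) logits

router = ToolRouter()
logits = router(torch.randn(1, 8, 512))
probs = torch.softmax(logits, dim=-1)
print(TOOLS[probs.argmax(-1).item()])
```

One forward pass through a network this small costs microseconds, versus an entire LLM generation turn for prompt-based tool selection.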
## Out of Scope

- Standalone LLM inference (NEXUS requires a base LLM)
- Non-agent text generation tasks
## Training Data

Checkpoints are trained on synthetic data approximating Chatp production agent interaction patterns:
| Component | Training samples | Distribution |
|---|---|---|
| Resource Router | 5,600 train / 840 val | Gaussian clusters per tool class |
| Sub-task Classifier | 4,000 train / 600 val | Gaussian clusters per sub-task |
| Belief Engine | ~1,700 train / 300 val | Sigmoid completion ramps |
| Drift Sentinel | Synthetic trajectories | Orthogonal rotation drift injection |
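The "Gaussian clusters per class" and "sigmoid completion ramps" recipes above can be sketched as follows. The function names, dimensions, and spread/steepness parameters are illustrative assumptions; only the sample counts mirror the table:

```python
import torch

def gaussian_cluster_dataset(n_per_class=800, n_classes=7, dim=512, spread=0.5, seed=0):
    """Hypothetical sketch: each class gets a random center; samples are
    center + isotropic Gaussian noise (the 'Gaussian clusters per class' recipe)."""
    g = torch.Generator().manual_seed(seed)
    centers = torch.randn(n_classes, dim, generator=g)
    xs, ys = [], []
    for c in range(n_classes):
        xs.append(centers[c] + spread * torch.randn(n_per_class, dim, generator=g))
        ys.append(torch.full((n_per_class,), c))
    return torch.cat(xs), torch.cat(ys)

def sigmoid_completion_ramp(T=15, midpoint=8.0, steepness=1.2):
    """Hypothetical P(done) targets for the Belief Engine: a sigmoid ramp
    rising from ~0 to ~1 over the task's steps."""
    t = torch.arange(T, dtype=torch.float32)
    return torch.sigmoid(steepness * (t - midpoint))

X, y = gaussian_cluster_dataset()   # 7 classes x 800 = 5,600 router samples
ramp = sigmoid_completion_ramp()    # monotone targets in [0, 1]
print(X.shape, y.shape, ramp[-1].item())
```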
## Training Procedure

All components were trained with AdamW + CosineAnnealingLR on an NVIDIA RTX 4060 (CUDA 12.8):

```python
optimizer = AdamW(params, lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
```
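A minimal end-to-end loop under this recipe might look like the following; the toy linear model and random batch are stand-ins for the actual components and datasets:

```python
import torch
import torch.nn as nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# Stand-in model and data; only the optimizer/scheduler settings come from the card.
model = nn.Linear(32, 7)
optimizer = AdamW(model.parameters(), lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(64, 32), torch.randint(0, 7, (64,))
for epoch in range(30):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # cosine-anneals lr from 3e-4 down to 3e-6 over 30 epochs

print(f"final lr: {scheduler.get_last_lr()[0]:.2e}")  # 3.00e-06
```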
## Evaluation Results

Evaluated on a 200-task synthetic benchmark (T=15 steps, 35% drift fraction, seed 2026):

| Metric | Baseline (random init) | NEXUS (trained) | Δ |
|---|---|---|---|
| Belief Calibration Error ↓ | 0.392 | 0.313 | −20.2% |
| Controller Efficiency Ratio ↑ | 1.000 | 1.101 | +10.1% |
| Token Overhead Ratio ↓ | 99.95% | 0.00% | −100% |
| Tool Routing Accuracy ↑ | 13.5% | 14.0% | +3.7% |
Training results (validation metrics):

| Component | Metric | Value |
|---|---|---|
| Resource Router | Val accuracy | 95.5% |
| Sub-task Classifier | Val accuracy | 99.8% |
| Belief Engine | Val loss (MSE) | 7×10⁻⁵ |
| Drift Sentinel | Val accuracy | 69.0% |
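For readers unfamiliar with belief calibration, one common way to compute a calibration error for P(done) is a binned gap between predicted probability and the empirical completion rate. The sketch below uses that standard binned definition; the repository's exact metric may be defined differently:

```python
import torch

def belief_calibration_error(p_done, done, n_bins=10):
    """Hypothetical sketch: weighted mean, over confidence bins, of
    |average predicted P(done) - empirical done rate| in each bin."""
    bce, edges = 0.0, torch.linspace(0, 1, n_bins + 1)
    total = len(p_done)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # The last bin is closed on the right so that p_done == 1.0 is counted.
        mask = (p_done >= lo) & (p_done < hi) if hi < 1 else (p_done >= lo)
        if mask.any():
            gap = (p_done[mask].mean() - done[mask].float().mean()).abs()
            bce += (mask.sum() / total) * gap
    return float(bce)

# Perfectly calibrated toy predictions give zero error.
p = torch.tensor([0.0, 1.0, 0.0, 1.0])
d = torch.tensor([0, 1, 0, 1])
print(belief_calibration_error(p, d))  # 0.0
```

Lower is better; a drop from 0.392 to 0.313 means the trained Belief Engine's P(done) estimates track actual task completion more faithfully.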
## How to Use

```python
import torch
from metacontrol.core.config import MetacontrolConfig
from metacontrol.pipeline.metacontrol_pipeline import MetacontrolPipeline

cfg = MetacontrolConfig()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

pipeline = MetacontrolPipeline(cfg).to(device)
pipeline.load_checkpoint("checkpoints/")
pipeline.reset(batch_size=1, device=device)

# Each step: pass LLM hidden states + goal embedding
llm_hidden = torch.randn(1, 8, cfg.base_llm_d_model, device=device)
goal_embed = torch.randn(1, cfg.base_llm_d_model, device=device)

result = pipeline.step(llm_hidden, goal_embed)
print(result["state_summary"])
# === Step 1 ===
# Phase: EXECUTE (conf=0.50)
# Tool: BROWSER_RELAY (conf=0.91)
# P(done): 0.300
# Drift: 0.766
```
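In a full agent loop, the controller's P(done) estimate would typically gate termination. The sketch below shows that control flow with a stub in place of the real pipeline, since the `result` keys beyond `state_summary` are not documented here; `FakeController`, its `p_done` key, and the threshold are all assumptions:

```python
import torch

class FakeController:
    """Stub standing in for MetacontrolPipeline: P(done) rises with step count."""
    def __init__(self):
        self.t = 0

    def step(self, llm_hidden, goal_embed):
        self.t += 1
        p_done = torch.sigmoid(torch.tensor(self.t - 5.0)).item()
        return {"p_done": p_done, "tool": "BROWSER_RELAY"}

controller, done_threshold = FakeController(), 0.9
goal = torch.randn(1, 256)
for step in range(20):
    hidden = torch.randn(1, 8, 256)  # would come from the frozen LLM each turn
    result = controller.step(hidden, goal)
    if result["p_done"] >= done_threshold:
        break  # belief says the task is complete; stop the agent loop

print(f"stopped at step {step + 1}, P(done)={result['p_done']:.3f}")
```

The key point is that the loop's stopping decision comes from the controller's belief state rather than from asking the LLM "are you done?" in tokens.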
## Quick Start

```bash
git clone https://github.com/brian-Lab-0/nexus
cd nexus
pip install -e ".[dev]"
python -m pytest                                        # 155 tests, should all pass
python scripts/evaluate.py --n-tasks 200 --device cuda
```
## Hardware Requirements
| Configuration | VRAM | Notes |
|---|---|---|
| Controller only (no LLM) | ~24 MB | Inference only |
| + TinyLlama 1.1B fp16 | ~2.1 GB | Full pipeline |
| + Training (classifiers) | ~1 GB | 30 epochs, batch 64 |
| + Training (Belief Engine) | ~2 GB | BPTT over T=20 steps |
Tested on: NVIDIA RTX 4060 Laptop GPU (8 GB VRAM), CUDA 12.8, PyTorch 2.9.1+cu128.
## Limitations

- Synthetic training gap – TTCS/DRP metrics require real production traces to differentiate from the baseline
- LTVI pending – the KV-cache injection path is functional, but the Protocol Cortex needs training on real traces for coherent generation
- Drift Sentinel – 69% validation accuracy; natural drift is more varied than the synthetic orthogonal-rotation drift used in training
## Citation

```bibtex
@software{langay2026nexus,
  author    = {Langay, Brian},
  title     = {{NEXUS}: Neural {EXecution} \& {Understanding} {Substrate}},
  year      = {2026},
  publisher = {OpenBnet},
  url       = {https://github.com/brian-Lab-0/nexus},
  note      = {6.29M-parameter neural controller for LLM agent systems}
}
```
Created and maintained by Brian Langay – support@openbnet.com · services@openbnet.cloud

OpenBnet · 2026
Paper: NEXUS: Design, Implementation, and Empirical Evaluation of a Lightweight Neural Controller for LLM Agent Systems
Code: github.com/brian-Lab-0/nexus
License: Apache 2.0