---
language:
  - en
license: apache-2.0
tags:
  - pytorch
  - neural-controller
  - llm-agent
  - kv-cache
  - mamba
  - lora
  - token-efficiency
  - tool-routing
  - belief-tracking
library_name: metacontrol
pipeline_tag: text-generation
---

# NEXUS — Neural EXecution & Understanding Substrate

---

## Model Description

NEXUS is a **6.29M-parameter neural controller** that runs alongside a frozen LLM during agent task execution. It replaces verbose token-based communication (system prompts, tool definitions, history re-injection) with compressed vector signals injected directly into the LLM's KV-cache.

NEXUS comprises six subsystems:

| Component | Parameters | Role |
|---|---:|---|
| Protocol Cortex (TSM) | 4,474,624 | KV-cache task vector injection |
| Belief Engine (BTBS) | 601,234 | Mamba SSM particle filter; tracks P(done) |
| Resource Router (FSM-NHC) | 184,213 | 7-class tool classifier |
| SAC Corrector | 454,273 | Semantic drift correction patches |
| Adapter Switch (TALoRA) | 42,373 | LoRA routing by sub-task type |
| Drift Sentinel | 33,287 | Drift detection from trajectory buffer |
| **Total** | **6,293,434** | |

---

## Intended Uses

- **Agent system controllers** — drop-in controller layer for LLM agent frameworks
- **Token efficiency research** — KV-cache prefix injection for overhead elimination
- **Belief tracking** — probabilistic task-state estimation for agentic loops
- **Tool routing** — lightweight 7-class classifier replacing LLM tool-selection reasoning

### Out of Scope

- Standalone LLM inference (NEXUS requires a base LLM)
- Non-agent text generation tasks

---

## Training Data

Checkpoints are trained on **synthetic data** approximating Chatp production agent interaction patterns:

| Component | Training samples | Distribution |
|---|---:|---|
| Resource Router | 5,600 train / 840 val | Gaussian clusters per tool class |
| Sub-task Classifier | 4,000 train / 600 val | Gaussian clusters per sub-task |
| Belief Engine | ~1,700 train / 300 val | Sigmoid completion ramps |
| Drift Sentinel | Synthetic trajectories | Orthogonal rotation drift injection |
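The "Gaussian clusters per tool class" distribution can be sketched as follows. This is a minimal illustration of the idea, not the repository's actual generator: the function name `make_gaussian_cluster_dataset`, the feature dimensionality, and the cluster spread are all illustrative assumptions.

```python
import torch

def make_gaussian_cluster_dataset(n_classes=7, n_per_class=800, dim=64,
                                  cluster_std=0.5, seed=0):
    """One Gaussian cluster per tool class: a random unit-norm mean per
    class, with isotropic noise drawn around it (shapes are assumptions)."""
    g = torch.Generator().manual_seed(seed)
    means = torch.randn(n_classes, dim, generator=g)
    means = means / means.norm(dim=-1, keepdim=True)
    xs, ys = [], []
    for c in range(n_classes):
        noise = torch.randn(n_per_class, dim, generator=g) * cluster_std
        xs.append(means[c] + noise)
        ys.append(torch.full((n_per_class,), c, dtype=torch.long))
    return torch.cat(xs), torch.cat(ys)

features, labels = make_gaussian_cluster_dataset()
print(features.shape, labels.shape)  # torch.Size([5600, 64]) torch.Size([5600])
```

With 7 classes and 800 samples each, this yields the 5,600 training samples listed for the Resource Router.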

---

## Training Procedure

All components are trained with AdamW + CosineAnnealingLR on an NVIDIA RTX 4060 (CUDA 12.8):

```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

optimizer = AdamW(params, lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
```

---

## Evaluation Results

Evaluated on a 200-task synthetic benchmark (T=15 steps, 35% drift fraction, seed 2026):

| Metric | Baseline (random init) | NEXUS (trained) | Δ |
|---|---:|---:|---:|
| Belief Calibration Error ↓ | 0.392 | **0.313** | −20.2% |
| Controller Efficiency Ratio ↑ | 1.000 | **1.101** | +10.1% |
| Token Overhead Ratio ↓ | 99.95% | **0.00%** | −100% |
| Tool Routing Accuracy ↑ | 13.5% | **14.0%** | +3.7% |

**Training results (val metrics):**

| Component | Metric | Value |
|---|---|---|
| Resource Router | Val accuracy | **95.5%** |
| Sub-task Classifier | Val accuracy | **99.8%** |
| Belief Engine | Val loss (MSE) | **7×10⁻⁵** |
| Drift Sentinel | Val accuracy | **69.0%** |

---

## How to Use

```python
import torch

from metacontrol.core.config import MetacontrolConfig
from metacontrol.pipeline.metacontrol_pipeline import MetacontrolPipeline

cfg = MetacontrolConfig()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

pipeline = MetacontrolPipeline(cfg).to(device)
pipeline.load_checkpoint("checkpoints/")
pipeline.reset(batch_size=1, device=device)

# Each step: pass LLM hidden states + goal embedding
llm_hidden = torch.randn(1, 8, cfg.base_llm_d_model, device=device)
goal_embed = torch.randn(1, cfg.base_llm_d_model, device=device)

result = pipeline.step(llm_hidden, goal_embed)
print(result["state_summary"])
# === Step 1 ===
# Phase: EXECUTE (conf=0.50)
# Tool: BROWSER_RELAY (conf=0.91)
# P(done): 0.300
# Drift: 0.766
```

### Quick Start

```bash
git clone https://github.com/brian-Lab-0/nexus
cd nexus
pip install -e ".[dev]"
python -m pytest  # 155 tests, should all pass
python scripts/evaluate.py --n-tasks 200 --device cuda
```
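The 0% token-overhead figure above rests on the prefix-injection idea: controller signals occupy KV-cache slots instead of prompt tokens, so every query can attend to them without any text being tokenized. Below is a minimal single-head attention sketch of that general mechanism; it is a toy, not the Protocol Cortex, and all dimensions are arbitrary assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d = 16          # head dimension (assumed)
prefix_len = 4  # controller-written cache slots (assumed)
seq_len = 8     # real prompt tokens (assumed)

# A "task vector" prefix written straight into the KV-cache: it costs
# zero prompt tokens, yet behaves like earlier, never-tokenized positions.
prefix_k = torch.randn(prefix_len, d)
prefix_v = torch.randn(prefix_len, d)

# Ordinary queries/keys/values for the real tokens.
q = torch.randn(seq_len, d)
k = torch.randn(seq_len, d)
v = torch.randn(seq_len, d)

# Prepend the injected entries before attention is computed.
k_all = torch.cat([prefix_k, k], dim=0)
v_all = torch.cat([prefix_v, v], dim=0)

attn = F.softmax(q @ k_all.T / d**0.5, dim=-1)  # (seq_len, prefix_len + seq_len)
out = attn @ v_all                              # (seq_len, d)
print(attn.shape, out.shape)
```

Each of the 8 token positions attends over 12 key/value slots (4 injected + 8 real), which is the sense in which the controller communicates "for free" in tokens.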

---

## Hardware Requirements

| Configuration | VRAM | Notes |
|---|---|---|
| Controller only (no LLM) | ~24 MB | Inference only |
| + TinyLlama 1.1B fp16 | ~2.1 GB | Full pipeline |
| + Training (classifiers) | ~1 GB | 30 epochs, batch 64 |
| + Training (Belief Engine) | ~2 GB | BPTT over T=20 steps |

Tested on: NVIDIA RTX 4060 Laptop GPU (8 GB VRAM), CUDA 12.8, PyTorch 2.9.1+cu128.

---

## Limitations

- **Synthetic training gap** — TTCS/DRP metrics require real production traces to differentiate from baseline
- **LTVI pending** — KV-cache injection path is functional, but the Protocol Cortex needs training on real traces for coherent generation
- **Drift Sentinel** — 69% accuracy; natural drift is more varied than synthetic orthogonal rotation

---

## Citation

```bibtex
@software{langay2026nexus,
  author    = {Langay, Brian},
  title     = {{NEXUS}: Neural {EXecution} \& {Understanding} {Substrate}},
  year      = {2026},
  publisher = {OpenBnet},
  url       = {https://github.com/brian-Lab-0/nexus},
  note      = {6.29M-parameter neural controller for LLM agent systems}
}
```

---

*Created and maintained by **Brian Langay** — [support@openbnet.com](mailto:support@openbnet.com) · [services@openbnet.cloud](mailto:services@openbnet.cloud)*

*OpenBnet · 2026*

**Paper:** [NEXUS: Design, Implementation, and Empirical Evaluation of a Lightweight Neural Controller for LLM Agent Systems](https://github.com/brian-Lab-0/nexus/blob/main/NEXUS_Implementation_Report.pdf)

**Code:** [github.com/brian-Lab-0/nexus](https://github.com/brian-Lab-0/nexus)

**License:** Apache 2.0