---
language:
  - en
license: apache-2.0
tags:
  - pytorch
  - neural-controller
  - llm-agent
  - kv-cache
  - mamba
  - lora
  - token-efficiency
  - tool-routing
  - belief-tracking
library_name: metacontrol
pipeline_tag: text-generation
---

# NEXUS — Neural EXecution & Understanding Substrate


## Model Description

NEXUS is a 6.29M-parameter neural controller that runs alongside a frozen LLM during agent task execution. It replaces verbose token-based communication (system prompts, tool definitions, history re-injection) with compressed vector signals injected directly into the LLM's KV-cache.
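The injection mechanism can be illustrated with a toy attention computation. This is a sketch, not the NEXUS API: the shapes, the single layer, and the variable names are all illustrative assumptions. The key idea is that controller-produced key/value pairs are concatenated onto the cache's sequence axis, so the frozen LLM attends to them at zero prompt-token cost.

```python
import torch

# Toy illustration (not the NEXUS API): a controller emits a short sequence of
# compressed task vectors that are prepended to the LLM's key/value cache.
d_model, n_prefix, n_tokens = 64, 4, 8

# Hypothetical controller output: n_prefix key/value pairs for one layer.
prefix_k = torch.randn(1, n_prefix, d_model)
prefix_v = torch.randn(1, n_prefix, d_model)

# Frozen-LLM side: queries/keys/values for the actual input tokens.
token_k = torch.randn(1, n_tokens, d_model)
token_v = torch.randn(1, n_tokens, d_model)
q = torch.randn(1, n_tokens, d_model)

# "Injection" = concatenation along the sequence axis of the cache.
k = torch.cat([prefix_k, token_k], dim=1)   # (1, n_prefix + n_tokens, d)
v = torch.cat([prefix_v, token_v], dim=1)

attn = torch.softmax(q @ k.transpose(-1, -2) / d_model**0.5, dim=-1)
out = attn @ v
print(out.shape)  # torch.Size([1, 8, 64])
```

The output sequence length is unchanged; only the attention context grows by `n_prefix` cache entries, which is why no tokens are "spent" on the injected signal.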

NEXUS comprises five subsystems:

| Component | Parameters | Role |
| --- | --- | --- |
| Protocol Cortex (TSM) | 4,474,624 | KV-cache task vector injection |
| Belief Engine (BTBS) | 601,234 | Mamba SSM particle filter; tracks P(done) |
| Resource Router (FSM-NHC) | 184,213 | 7-class tool classifier |
| SAC Corrector | 454,273 | Semantic drift correction patches |
| Adapter Switch (TALoRA) | 42,373 | LoRA routing by sub-task type |
| Drift Sentinel | 33,287 | Drift detection from trajectory buffer |
| **Total** | **6,293,434** | |

## Intended Uses

- **Agent system controllers** — drop-in controller layer for LLM agent frameworks
- **Token efficiency research** — KV-cache prefix injection for overhead elimination
- **Belief tracking** — probabilistic task-state estimation for agentic loops
- **Tool routing** — lightweight 7-class classifier replacing LLM tool-selection reasoning
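The tool-routing use case amounts to replacing a chain of LLM reasoning tokens with a single forward pass of a small classifier. A minimal sketch, in which the layer sizes, pooling, and the `N_TOOLS = 7` class count are taken from the card but everything else (names, hidden width) is a hypothetical stand-in for the actual Resource Router:

```python
import torch
import torch.nn as nn

N_TOOLS = 7  # 7-class tool routing, per the model card

# Hypothetical stand-in for the Resource Router: a small MLP mapping a pooled
# LLM hidden state directly to a tool class. Sizes are illustrative only.
router = nn.Sequential(
    nn.Linear(512, 128),
    nn.GELU(),
    nn.Linear(128, N_TOOLS),
)

hidden = torch.randn(1, 512)       # pooled hidden state from the frozen LLM
logits = router(hidden)
tool_id = logits.argmax(dim=-1)    # index into the agent's tool table
```

One forward pass of a network this size costs microseconds, versus tens of generated tokens for prompted tool selection.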

## Out of Scope

- Standalone LLM inference (NEXUS requires a base LLM)
- Non-agent text generation tasks

## Training Data

The released checkpoints are trained on synthetic data approximating Chatp production agent interaction patterns:

| Component | Training samples | Distribution |
| --- | --- | --- |
| Resource Router | 5,600 train / 840 val | Gaussian clusters per tool class |
| Sub-task Classifier | 4,000 train / 600 val | Gaussian clusters per sub-task |
| Belief Engine | ~1,700 train / 300 val | Sigmoid completion ramps |
| Drift Sentinel | Synthetic trajectories | Orthogonal rotation drift injection |
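The two synthetic data families in the table can be sketched as follows. The real generators are not published here, so dimensions, cluster spread, and ramp parameters are assumptions; only the distribution shapes (Gaussian clusters per class, sigmoid completion ramps) come from the card.

```python
import torch

# (1) Gaussian clusters: one mean vector per tool class, samples drawn
# around it. 7 classes x 800 samples gives the 5,600-sample train split.
n_classes, dim, per_class = 7, 64, 800
means = torch.randn(n_classes, dim) * 3.0            # assumed cluster spread
X = torch.cat([means[c] + torch.randn(per_class, dim) for c in range(n_classes)])
y = torch.arange(n_classes).repeat_interleave(per_class)

# (2) Sigmoid completion ramps: P(done) rising smoothly over a T-step
# episode, used as regression targets for the Belief Engine.
T = 20
t = torch.arange(T, dtype=torch.float32)
midpoint, slope = 12.0, 1.5                          # assumed ramp parameters
p_done = torch.sigmoid((t - midpoint) / slope)
```

Varying `midpoint` and `slope` per episode would yield a family of ramps rather than a single fixed target curve.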

## Training Procedure

All components are trained with AdamW + CosineAnnealingLR on an NVIDIA RTX 4060 (CUDA 12.8):

```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

optimizer = AdamW(params, lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
```
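Put together, the recipe is an ordinary per-epoch loop. The model and data below are stand-ins (a toy linear classifier on random tensors), but the optimizer, schedule, and the 30-epoch horizon implied by `T_max=30` match the snippet above:

```python
import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# Stand-in model and data; only the optimizer recipe mirrors the card.
model = nn.Linear(16, 7)
X, y = torch.randn(256, 16), torch.randint(0, 7, (256,))
loss_fn = nn.CrossEntropyLoss()

optimizer = AdamW(model.parameters(), lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)

for epoch in range(30):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    scheduler.step()   # one scheduler step per epoch -> lr anneals to eta_min

final_lr = scheduler.get_last_lr()[0]
print(f"final lr: {final_lr:.2e}")
```

Stepping the scheduler once per epoch means the learning rate follows a single cosine arc from `3e-4` down to `3e-6` over the run.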

## Evaluation Results

Evaluated on a 200-task synthetic benchmark (T=15 steps, 35% drift fraction, seed 2026):

| Metric | Baseline (random init) | NEXUS (trained) | Δ |
| --- | --- | --- | --- |
| Belief Calibration Error ↓ | 0.392 | 0.313 | −20.2% |
| Controller Efficiency Ratio ↑ | 1.000 | 1.101 | +10.1% |
| Token Overhead Ratio ↓ | 99.95% | 0.00% | −100% |
| Tool Routing Accuracy ↑ | 13.5% | 14.0% | +3.7% |
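The card does not define Belief Calibration Error. A plausible reading, stated here as an assumption rather than the paper's definition, is the mean absolute gap between the controller's P(done) estimate and the binary ground-truth completion signal:

```python
import torch

# Assumed metric definition (not confirmed by the card): mean absolute
# difference between predicted belief and ground-truth completion.
def belief_calibration_error(p_done: torch.Tensor, done: torch.Tensor) -> float:
    return (p_done - done.float()).abs().mean().item()

p = torch.tensor([0.1, 0.4, 0.8, 0.95])   # P(done) over four steps
gt = torch.tensor([0, 0, 1, 1])            # task actually completes at step 3
bce = belief_calibration_error(p, gt)
print(bce)
```

Under this reading, lower values mean the belief trajectory hugs the true completion signal more tightly, consistent with the ↓ arrow in the table.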

Training results (val metrics):

| Component | Metric | Value |
| --- | --- | --- |
| Resource Router | Val accuracy | 95.5% |
| Sub-task Classifier | Val accuracy | 99.8% |
| Belief Engine | Val loss (MSE) | 7×10⁻⁵ |
| Drift Sentinel | Val accuracy | 69.0% |

## How to Use

```python
import torch
from metacontrol.core.config import MetacontrolConfig
from metacontrol.pipeline.metacontrol_pipeline import MetacontrolPipeline

cfg = MetacontrolConfig()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

pipeline = MetacontrolPipeline(cfg).to(device)
pipeline.load_checkpoint("checkpoints/")
pipeline.reset(batch_size=1, device=device)

# Each step: pass LLM hidden states + goal embedding
llm_hidden = torch.randn(1, 8, cfg.base_llm_d_model, device=device)
goal_embed = torch.randn(1, cfg.base_llm_d_model, device=device)

result = pipeline.step(llm_hidden, goal_embed)
print(result["state_summary"])
# === Step 1 ===
#   Phase:   EXECUTE (conf=0.50)
#   Tool:    BROWSER_RELAY (conf=0.91)
#   P(done): 0.300
#   Drift:   0.766
```

## Quick Start

```bash
git clone https://github.com/brian-Lab-0/nexus
cd nexus
pip install -e ".[dev]"
python -m pytest          # 155 tests, should all pass
python scripts/evaluate.py --n-tasks 200 --device cuda
```

## Hardware Requirements

| Configuration | VRAM | Notes |
| --- | --- | --- |
| Controller only (no LLM) | ~24 MB | Inference only |
| + TinyLlama 1.1B fp16 | ~2.1 GB | Full pipeline |
| + Training (classifiers) | ~1 GB | 30 epochs, batch 64 |
| + Training (Belief Engine) | ~2 GB | BPTT over T=20 steps |
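The controller-only figure is consistent with the parameter count: 6,293,434 fp32 weights at 4 bytes each come to roughly 24 MiB (activations and optimizer state excluded), as a quick check confirms:

```python
# Sanity check of the "controller only" VRAM row: parameter weights alone,
# fp32 (4 bytes each), ignoring activations and optimizer state.
n_params = 6_293_434
mib = n_params * 4 / 2**20
print(f"{mib:.1f} MiB")  # 24.0 MiB
```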

Tested on: NVIDIA RTX 4060 Laptop GPU (8 GB VRAM), CUDA 12.8, PyTorch 2.9.1+cu128.


## Limitations

- **Synthetic training gap** — TTCS/DRP metrics require real production traces to differentiate from baseline
- **LTVI pending** — the KV-cache injection path is functional, but the Protocol Cortex needs training on real traces for coherent generation
- **Drift Sentinel** — 69% accuracy; natural drift is more varied than synthetic orthogonal rotation
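The "orthogonal rotation" drift named above can be sketched as follows: embeddings are rotated by a random orthogonal matrix, which preserves their norms while moving them off the original task manifold, and the Drift Sentinel must flag the change. The interpolation weight `alpha` and the QR construction are assumptions, not the published generator:

```python
import torch

# Sketch of synthetic drift injection: blend each embedding with a randomly
# rotated copy of itself. alpha=0 is no drift, alpha=1 is a pure rotation.
def inject_drift(x: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    d = x.shape[-1]
    q, _ = torch.linalg.qr(torch.randn(d, d))   # random orthogonal matrix
    return (1 - alpha) * x + alpha * (x @ q)

x = torch.randn(8, 64)
x_drifted = inject_drift(x)
```

Because a pure rotation leaves vector norms and pairwise distances intact, a detector trained only on this drift family sees a narrower signature than real distribution shift, which is one plausible reason for the 69% accuracy ceiling noted above.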

## Citation

```bibtex
@software{langay2026nexus,
  author    = {Langay, Brian},
  title     = {{NEXUS}: Neural {EXecution} \& {Understanding} {Substrate}},
  year      = {2026},
  publisher = {OpenBnet},
  url       = {https://github.com/brian-Lab-0/nexus},
  note      = {6.29M-parameter neural controller for LLM agent systems}
}
```

Created and maintained by Brian Langay — support@openbnet.com · services@openbnet.cloud
OpenBnet · 2026
Paper: *NEXUS: Design, Implementation, and Empirical Evaluation of a Lightweight Neural Controller for LLM Agent Systems*
Code: github.com/brian-Lab-0/nexus
License: Apache 2.0