---
language:
- en
license: apache-2.0
tags:
- pytorch
- neural-controller
- llm-agent
- kv-cache
- mamba
- lora
- token-efficiency
- tool-routing
- belief-tracking
library_name: metacontrol
pipeline_tag: text-generation
---
# NEXUS β€” Neural EXecution & Understanding Substrate
---
## Model Description
NEXUS is a **6.29M-parameter neural controller** that runs alongside a frozen LLM during agent task execution. It replaces verbose token-based communication (system prompts, tool definitions, history re-injection) with compressed vector signals injected directly into the LLM's KV-cache.
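The injection idea can be sketched conceptually. The snippet below is a minimal illustration, not the `metacontrol` API: it assumes a standard per-layer attention cache of shape `(batch, heads, seq, head_dim)` and simply prepends learned prefix keys/values, so the LLM attends to the compressed signal at zero prompt-token cost.

```python
import numpy as np

# Hypothetical shapes: batch=1, 8 heads, 32 cached positions, 4 "virtual tokens".
batch, heads, seq, head_dim, prefix_len = 1, 8, 32, 64, 4

# Existing KV-cache entries for one attention layer.
k_cache = np.random.randn(batch, heads, seq, head_dim)
v_cache = np.random.randn(batch, heads, seq, head_dim)

# Learned prefix (random here; stands in for the controller's task vector).
k_prefix = np.random.randn(batch, heads, prefix_len, head_dim)
v_prefix = np.random.randn(batch, heads, prefix_len, head_dim)

# Injection = concatenation along the sequence axis: attention now spans
# prefix_len extra positions carrying the compressed signal.
k_injected = np.concatenate([k_prefix, k_cache], axis=2)
v_injected = np.concatenate([v_prefix, v_cache], axis=2)

print(k_injected.shape)  # (1, 8, 36, 64)
```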
NEXUS comprises the following subsystems:
| Component | Parameters | Role |
|---|---:|---|
| Protocol Cortex (TSM) | 4,474,624 | KV-cache task vector injection |
| Belief Engine (BTBS) | 601,234 | Mamba SSM particle filter; tracks P(done) |
| Resource Router (FSM-NHC) | 184,213 | 7-class tool classifier |
| SAC Corrector | 454,273 | Semantic drift correction patches |
| Adapter Switch (TALoRA) | 42,373 | LoRA routing by sub-task type |
| Drift Sentinel | 33,287 | Drift detection from trajectory buffer |
| **Total** | **6,293,434** | |
---
## Intended Uses
- **Agent system controllers** β€” drop-in controller layer for LLM agent frameworks
- **Token efficiency research** β€” KV-cache prefix injection for overhead elimination
- **Belief tracking** β€” probabilistic task-state estimation for agentic loops
- **Tool routing** β€” lightweight 7-class classifier replacing LLM tool-selection reasoning
### Out of Scope
- Standalone LLM inference (NEXUS requires a base LLM)
- Non-agent text generation tasks
---
## Training Data
Checkpoints are trained on **synthetic data** approximating Chatp production agent interaction patterns:
| Component | Training samples | Distribution |
|---|---:|---|
| Resource Router | 5,600 train / 840 val | Gaussian clusters per tool class |
| Sub-task Classifier | 4,000 train / 600 val | Gaussian clusters per sub-task |
| Belief Engine | ~1,700 train / 300 val | Sigmoid completion ramps |
| Drift Sentinel | Synthetic trajectories | Orthogonal rotation drift injection |
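The "orthogonal rotation drift injection" row can be illustrated with a toy sketch (assumptions: NumPy and a QR-derived rotation; this is not the project's actual generator). An orthogonal matrix rotates hidden-state vectors while preserving their norms, mimicking semantic drift that pure magnitude checks would miss:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16

# Random orthogonal matrix via QR decomposition of a Gaussian matrix.
q, _ = np.linalg.qr(rng.standard_normal((d, d)))

h = rng.standard_normal(d)   # clean hidden state
h_drifted = q @ h            # injected drift: rotated, but same norm

# Norm is preserved, so a norm-based detector would see nothing.
print(np.allclose(np.linalg.norm(h), np.linalg.norm(h_drifted)))  # True
```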
---
## Training Procedure
All components were trained with AdamW and CosineAnnealingLR on an NVIDIA RTX 4060 (CUDA 12.8):
```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

optimizer = AdamW(params, lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
```
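For reference, `CosineAnnealingLR` follows the closed form η_t = η_min + ½(η₀ − η_min)(1 + cos(πt/T_max)). A plain-Python sketch of the schedule configured above:

```python
import math

lr0, eta_min, t_max = 3e-4, 3e-6, 30

def cosine_lr(epoch: int) -> float:
    """Learning rate at a given epoch under cosine annealing."""
    return eta_min + 0.5 * (lr0 - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

print(cosine_lr(0))   # 3e-4: starts at the base learning rate
print(cosine_lr(30))  # 3e-6: decays to eta_min at T_max
```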
---
## Evaluation Results
Evaluated on a 200-task synthetic benchmark (T=15 steps, 35% drift fraction, seed 2026):
| Metric | Baseline (random init) | NEXUS (trained) | Ξ” |
|---|---:|---:|---:|
| Belief Calibration Error ↓ | 0.392 | **0.313** | βˆ’20.2% |
| Controller Efficiency Ratio ↑ | 1.000 | **1.101** | +10.1% |
| Token Overhead Ratio ↓ | 99.95% | **0.00%** | βˆ’100% |
| Tool Routing Accuracy ↑ | 13.5% | **14.0%** | +3.7% |
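The Δ column is consistent with relative change, (trained − baseline) / baseline, which is why a 0.5-point gain in routing accuracy reads as +3.7%:

```python
def rel_delta(baseline: float, trained: float) -> float:
    """Relative change from baseline, as a percentage."""
    return 100.0 * (trained - baseline) / baseline

print(round(rel_delta(0.392, 0.313), 1))  # -20.2 (belief calibration error)
print(round(rel_delta(1.000, 1.101), 1))  # 10.1  (efficiency ratio)
print(round(rel_delta(0.135, 0.140), 1))  # 3.7   (tool routing accuracy)
```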
**Training results (val metrics):**
| Component | Metric | Value |
|---|---|---|
| Resource Router | Val accuracy | **95.5%** |
| Sub-task Classifier | Val accuracy | **99.8%** |
| Belief Engine | Val loss (MSE) | **7Γ—10⁻⁡** |
| Drift Sentinel | Val accuracy | **69.0%** |
---
## How to Use
```python
import torch
from metacontrol.core.config import MetacontrolConfig
from metacontrol.pipeline.metacontrol_pipeline import MetacontrolPipeline
cfg = MetacontrolConfig()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline = MetacontrolPipeline(cfg).to(device)
pipeline.load_checkpoint("checkpoints/")
pipeline.reset(batch_size=1, device=device)
# Each step: pass LLM hidden states + goal embedding
llm_hidden = torch.randn(1, 8, cfg.base_llm_d_model, device=device)
goal_embed = torch.randn(1, cfg.base_llm_d_model, device=device)
result = pipeline.step(llm_hidden, goal_embed)
print(result["state_summary"])
# === Step 1 ===
# Phase: EXECUTE (conf=0.50)
# Tool: BROWSER_RELAY (conf=0.91)
# P(done): 0.300
# Drift: 0.766
```
### Quick Start
```bash
git clone https://github.com/brian-Lab-0/nexus
cd nexus
pip install -e ".[dev]"
python -m pytest # 155 tests, should all pass
python scripts/evaluate.py --n-tasks 200 --device cuda
```
---
## Hardware Requirements
| Configuration | VRAM | Notes |
|---|---|---|
| Controller only (no LLM) | ~24 MB | Inference only |
| + TinyLlama 1.1B fp16 | ~2.1 GB | Full pipeline |
| + Training (classifiers) | ~1 GB | 30 epochs, batch 64 |
| + Training (Belief Engine) | ~2 GB | BPTT over T=20 steps |
Tested on: NVIDIA RTX 4060 Laptop GPU (8 GB VRAM), CUDA 12.8, PyTorch 2.9.1+cu128.
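The ~24 MB controller-only footprint is consistent with fp32 weights alone (back-of-envelope only; activations and CUDA context add overhead):

```python
n_params = 6_293_434        # total controller parameters from the table above
bytes_fp32 = n_params * 4   # 4 bytes per fp32 weight

print(bytes_fp32 / 2**20)   # ~24.0 MiB
```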
---
## Limitations
- **Synthetic training gap** β€” TTCS/DRP metrics require real production traces to differentiate from baseline
- **LTVI pending** β€” KV-cache injection path is functional but Protocol Cortex needs training on real traces for coherent generation
- **Drift sentinel** β€” 69% accuracy; natural drift is more varied than synthetic orthogonal rotation
---
## Citation
```bibtex
@software{langay2026nexus,
  author    = {Langay, Brian},
  title     = {{NEXUS}: Neural {EXecution} \& {Understanding} {Substrate}},
  year      = {2026},
  publisher = {OpenBnet},
  url       = {https://github.com/brian-Lab-0/nexus},
  note      = {6.29M-parameter neural controller for LLM agent systems}
}
```
---
*Created and maintained by **Brian Langay** β€” [support@openbnet.com](mailto:support@openbnet.com) Β· [services@openbnet.cloud](mailto:services@openbnet.cloud)*
*OpenBnet Β· 2026*
**Paper:** [NEXUS: Design, Implementation, and Empirical Evaluation of a Lightweight Neural Controller for LLM Agent Systems](https://github.com/brian-Lab-0/nexus/blob/main/NEXUS_Implementation_Report.pdf)
**Code:** [github.com/brian-Lab-0/nexus](https://github.com/brian-Lab-0/nexus)
**License:** Apache 2.0