---
language:
- en
license: apache-2.0
tags:
- pytorch
- neural-controller
- llm-agent
- kv-cache
- mamba
- lora
- token-efficiency
- tool-routing
- belief-tracking
library_name: metacontrol
pipeline_tag: text-generation
---
# NEXUS β€” Neural EXecution & Understanding Substrate
---
## Model Description
NEXUS is a **6.29M-parameter neural controller** that runs alongside a frozen LLM during agent task execution. It replaces verbose token-based communication (system prompts, tool definitions, history re-injection) with compressed vector signals injected directly into the LLM's KV-cache.
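The injection idea can be sketched conceptually. The snippet below is a minimal illustration, not the `metacontrol` API: it assumes a standard per-layer attention cache of shape `(batch, heads, seq, head_dim)` and simply prepends learned prefix keys/values, so the LLM attends to the compressed signal at zero prompt-token cost.

```python
import numpy as np

# Hypothetical shapes: batch=1, 8 heads, 32 cached positions, 4 "virtual tokens".
batch, heads, seq, head_dim, prefix_len = 1, 8, 32, 64, 4

# Existing KV-cache entries for one attention layer.
k_cache = np.random.randn(batch, heads, seq, head_dim)
v_cache = np.random.randn(batch, heads, seq, head_dim)

# Learned prefix (random here; stands in for the controller's task vector).
k_prefix = np.random.randn(batch, heads, prefix_len, head_dim)
v_prefix = np.random.randn(batch, heads, prefix_len, head_dim)

# Injection = concatenation along the sequence axis: attention now spans
# prefix_len extra positions carrying the compressed signal.
k_injected = np.concatenate([k_prefix, k_cache], axis=2)
v_injected = np.concatenate([v_prefix, v_cache], axis=2)

print(k_injected.shape)  # (1, 8, 36, 64)
```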
NEXUS comprises the following subsystems:
| Component | Parameters | Role |
|---|---:|---|
| Protocol Cortex (TSM) | 4,474,624 | KV-cache task vector injection |
| Belief Engine (BTBS) | 601,234 | Mamba SSM particle filter; tracks P(done) |
| Resource Router (FSM-NHC) | 184,213 | 7-class tool classifier |
| SAC Corrector | 454,273 | Semantic drift correction patches |
| Adapter Switch (TALoRA) | 42,373 | LoRA routing by sub-task type |
| Drift Sentinel | 33,287 | Drift detection from trajectory buffer |
| **Total** | **6,293,434** | |
---
## Intended Uses
- **Agent system controllers** β€” drop-in controller layer for LLM agent frameworks
- **Token efficiency research** β€” KV-cache prefix injection for overhead elimination
- **Belief tracking** β€” probabilistic task-state estimation for agentic loops
- **Tool routing** β€” lightweight 7-class classifier replacing LLM tool-selection reasoning
### Out of Scope
- Standalone LLM inference (NEXUS requires a base LLM)
- Non-agent text generation tasks
---
## Training Data
Checkpoints are trained on **synthetic data** approximating Chatp production agent interaction patterns:
| Component | Training samples | Distribution |
|---|---:|---|
| Resource Router | 5,600 train / 840 val | Gaussian clusters per tool class |
| Sub-task Classifier | 4,000 train / 600 val | Gaussian clusters per sub-task |
| Belief Engine | ~1,700 train / 300 val | Sigmoid completion ramps |
| Drift Sentinel | Synthetic trajectories | Orthogonal rotation drift injection |
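The "orthogonal rotation drift injection" row can be illustrated with a toy sketch (assumptions: NumPy and a QR-derived rotation; this is not the project's actual generator). An orthogonal matrix rotates hidden-state vectors while preserving their norms, mimicking semantic drift that pure magnitude checks would miss:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16

# Random orthogonal matrix via QR decomposition of a Gaussian matrix.
q, _ = np.linalg.qr(rng.standard_normal((d, d)))

h = rng.standard_normal(d)   # clean hidden state
h_drifted = q @ h            # injected drift: rotated, but same norm

# Norm is preserved, so a norm-based detector would see nothing.
print(np.allclose(np.linalg.norm(h), np.linalg.norm(h_drifted)))  # True
```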
---
## Training Procedure
All components were trained with AdamW and CosineAnnealingLR on an NVIDIA RTX 4060 (CUDA 12.8):
```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

optimizer = AdamW(params, lr=3e-4, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
```
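For reference, `CosineAnnealingLR` follows the closed form η_t = η_min + ½(η₀ − η_min)(1 + cos(πt/T_max)). A plain-Python sketch of the schedule configured above:

```python
import math

lr0, eta_min, t_max = 3e-4, 3e-6, 30

def cosine_lr(epoch: int) -> float:
    """Learning rate at a given epoch under cosine annealing."""
    return eta_min + 0.5 * (lr0 - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

print(cosine_lr(0))   # 3e-4: starts at the base learning rate
print(cosine_lr(30))  # 3e-6: decays to eta_min at T_max
```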
---
## Evaluation Results
Evaluated on a 200-task synthetic benchmark (T=15 steps, 35% drift fraction, seed 2026):
| Metric | Baseline (random init) | NEXUS (trained) | Ξ” |
|---|---:|---:|---:|
| Belief Calibration Error ↓ | 0.392 | **0.313** | βˆ’20.2% |
| Controller Efficiency Ratio ↑ | 1.000 | **1.101** | +10.1% |
| Token Overhead Ratio ↓ | 99.95% | **0.00%** | βˆ’100% |
| Tool Routing Accuracy ↑ | 13.5% | **14.0%** | +3.7% |
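The Δ column is consistent with relative change, (trained − baseline) / baseline, which is why a 0.5-point gain in routing accuracy reads as +3.7%:

```python
def rel_delta(baseline: float, trained: float) -> float:
    """Relative change from baseline, as a percentage."""
    return 100.0 * (trained - baseline) / baseline

print(round(rel_delta(0.392, 0.313), 1))  # -20.2 (belief calibration error)
print(round(rel_delta(1.000, 1.101), 1))  # 10.1  (efficiency ratio)
print(round(rel_delta(0.135, 0.140), 1))  # 3.7   (tool routing accuracy)
```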
**Training results (val metrics):**
| Component | Metric | Value |
|---|---|---|
| Resource Router | Val accuracy | **95.5%** |
| Sub-task Classifier | Val accuracy | **99.8%** |
| Belief Engine | Val loss (MSE) | **7Γ—10⁻⁡** |
| Drift Sentinel | Val accuracy | **69.0%** |
---
## How to Use
```python
import torch
from metacontrol.core.config import MetacontrolConfig
from metacontrol.pipeline.metacontrol_pipeline import MetacontrolPipeline
cfg = MetacontrolConfig()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline = MetacontrolPipeline(cfg).to(device)
pipeline.load_checkpoint("checkpoints/")
pipeline.reset(batch_size=1, device=device)
# Each step: pass LLM hidden states + goal embedding
llm_hidden = torch.randn(1, 8, cfg.base_llm_d_model, device=device)
goal_embed = torch.randn(1, cfg.base_llm_d_model, device=device)
result = pipeline.step(llm_hidden, goal_embed)
print(result["state_summary"])
# === Step 1 ===
# Phase: EXECUTE (conf=0.50)
# Tool: BROWSER_RELAY (conf=0.91)
# P(done): 0.300
# Drift: 0.766
```
### Quick Start
```bash
git clone https://github.com/brian-Lab-0/nexus
cd nexus
pip install -e ".[dev]"
python -m pytest # 155 tests, should all pass
python scripts/evaluate.py --n-tasks 200 --device cuda
```
---
## Hardware Requirements
| Configuration | VRAM | Notes |
|---|---|---|
| Controller only (no LLM) | ~24 MB | Inference only |
| + TinyLlama 1.1B fp16 | ~2.1 GB | Full pipeline |
| + Training (classifiers) | ~1 GB | 30 epochs, batch 64 |
| + Training (Belief Engine) | ~2 GB | BPTT over T=20 steps |
Tested on: NVIDIA RTX 4060 Laptop GPU (8 GB VRAM), CUDA 12.8, PyTorch 2.9.1+cu128.
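The ~24 MB controller-only footprint is consistent with fp32 weights alone (back-of-envelope only; activations and CUDA context add overhead):

```python
n_params = 6_293_434        # total controller parameters from the table above
bytes_fp32 = n_params * 4   # 4 bytes per fp32 weight

print(bytes_fp32 / 2**20)   # ~24.0 MiB
```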
---
## Limitations
- **Synthetic training gap** β€” TTCS/DRP metrics require real production traces to differentiate from baseline
- **LTVI pending** β€” KV-cache injection path is functional but Protocol Cortex needs training on real traces for coherent generation
- **Drift sentinel** β€” 69% accuracy; natural drift is more varied than synthetic orthogonal rotation
---
## Citation
```bibtex
@software{langay2026nexus,
  author    = {Langay, Brian},
  title     = {{NEXUS}: Neural {EXecution} \& {Understanding} {Substrate}},
  year      = {2026},
  publisher = {OpenBnet},
  url       = {https://github.com/brian-Lab-0/nexus},
  note      = {6.29M-parameter neural controller for LLM agent systems}
}
```
---
*Created and maintained by **Brian Langay** β€” [support@openbnet.com](mailto:support@openbnet.com) Β· [services@openbnet.cloud](mailto:services@openbnet.cloud)*
*OpenBnet Β· 2026*
**Paper:** [NEXUS: Design, Implementation, and Empirical Evaluation of a Lightweight Neural Controller for LLM Agent Systems](https://github.com/brian-Lab-0/nexus/blob/main/NEXUS_Implementation_Report.pdf)
**Code:** [github.com/brian-Lab-0/nexus](https://github.com/brian-Lab-0/nexus)
**License:** Apache 2.0