---
language:
- en
license: apache-2.0
tags:
- pytorch
- neural-controller
- llm-agent
- kv-cache
- mamba
- lora
- token-efficiency
- tool-routing
- belief-tracking
library_name: metacontrol
pipeline_tag: text-generation
---

# NEXUS: Neural EXecution & Understanding Substrate

---

## Model Description

NEXUS is a **6.29M-parameter neural controller** that runs alongside a frozen LLM during agent task execution. It replaces verbose token-based communication (system prompts, tool definitions, history re-injection) with compressed vector signals injected directly into the LLM's KV-cache.

NEXUS comprises the following components:

| Component | Parameters | Role |
|---|---:|---|
| Protocol Cortex (TSM) | 4,474,624 | KV-cache task vector injection |
| Belief Engine (BTBS) | 601,234 | Mamba SSM particle filter; tracks P(done) |
| Resource Router (FSM-NHC) | 184,213 | 7-class tool classifier |
| SAC Corrector | 454,273 | Semantic drift correction patches |
| Adapter Switch (TALoRA) | 42,373 | LoRA routing by sub-task type |
| Drift Sentinel | 33,287 | Drift detection from trajectory buffer |
| **Total** | **6,293,434** | |
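
As a quick sanity check on the table, the sketch below sums the listed rows and prints each component's share of the stated total. The counts are copied from the table; the variable names are illustrative and not part of the `metacontrol` package. The listed rows add up to 5,790,004 parameters, so about 503K of the stated 6,293,434 are not broken out in the table.

```python
# Parameter counts copied from the component table above.
COMPONENT_PARAMS = {
    "Protocol Cortex (TSM)": 4_474_624,
    "Belief Engine (BTBS)": 601_234,
    "Resource Router (FSM-NHC)": 184_213,
    "SAC Corrector": 454_273,
    "Adapter Switch (TALoRA)": 42_373,
    "Drift Sentinel": 33_287,
}
STATED_TOTAL = 6_293_434  # "Total" row of the table

itemized = sum(COMPONENT_PARAMS.values())
not_itemized = STATED_TOTAL - itemized

# Share of the stated total held by each listed component.
for name, count in COMPONENT_PARAMS.items():
    print(f"{name:<28} {count:>10,}  {count / STATED_TOTAL:6.1%}")
print(f"{'Itemized subtotal':<28} {itemized:>10,}")
print(f"{'Not itemized in the table':<28} {not_itemized:>10,}")
```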

---

## Intended Uses

- **Agent system controllers**: drop-in controller layer for LLM agent frameworks
- **Token efficiency research**: KV-cache prefix injection for overhead elimination
- **Belief tracking**: probabilistic task-state estimation for agentic loops
- **Tool routing**: lightweight 7-class classifier replacing LLM tool-selection reasoning
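
To illustrate what replacing LLM tool-selection reasoning with a classifier means in practice, here is a minimal sketch of the decision step: seven logits go through a softmax and the top class is returned with its confidence. This is not the Resource Router's code. Only `BROWSER_RELAY` is a tool name taken from this card; the other labels, the example logits, and the `min_conf` fallback are assumptions made for illustration.

```python
import math

# Only BROWSER_RELAY appears in this card; the other six labels are placeholders.
TOOLS = ["BROWSER_RELAY", "TOOL_1", "TOOL_2", "TOOL_3", "TOOL_4", "TOOL_5", "TOOL_6"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def route(logits, min_conf=0.5):
    """Return (tool, confidence); tool is None when confidence is below min_conf."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    conf = probs[best]
    return (TOOLS[best] if conf >= min_conf else None), conf

# Hypothetical logits standing in for the classifier head's output.
tool, conf = route([4.0, 0.5, 0.2, 0.1, 0.0, -0.3, -1.0])
print(f"Tool: {tool} (conf={conf:.2f})")
```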

### Out of Scope

- Standalone LLM inference (NEXUS requires a base LLM)
- Non-agent text generation tasks

---

## Training Data

Checkpoints are trained on **synthetic data** approximating Chatp production agent interaction patterns:

| Component | Training samples | Distribution |
|---|---:|---|
| Resource Router | 5,600 train / 840 val | Gaussian clusters per tool class |
| Sub-task Classifier | 4,000 train / 600 val | Gaussian clusters per sub-task |
| Belief Engine | ~1,700 train / 300 val | Sigmoid completion ramps |
| Drift Sentinel | Synthetic trajectories | Orthogonal rotation drift injection |
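
The two simplest distributions in the table can be sketched in plain Python. The generators below are an illustration under stated assumptions, not the repository's data pipeline: the even split across the 7 router classes (800 train and 120 val per class), the feature dimension, the cluster spread, and the ramp steepness are all made-up values. The ramp length of 20 follows the T=20 figure listed under Hardware Requirements.

```python
import math
import random

def make_centres(n_classes, dim, rng):
    """Draw one random cluster centre per class."""
    return [[rng.gauss(0.0, 3.0) for _ in range(dim)] for _ in range(n_classes)]

def sample_clusters(centres, n_per_class, spread, rng):
    """Sample (features, label) pairs from an isotropic Gaussian around each centre."""
    samples = []
    for label, centre in enumerate(centres):
        for _ in range(n_per_class):
            samples.append(([c + rng.gauss(0.0, spread) for c in centre], label))
    rng.shuffle(samples)
    return samples

def completion_ramp(n_steps, steepness=0.6):
    """Sigmoid-shaped P(done) target: near 0 at the first step, near 1 at the last."""
    midpoint = (n_steps - 1) / 2
    return [1.0 / (1.0 + math.exp(-steepness * (t - midpoint))) for t in range(n_steps)]

rng = random.Random(0)
centres = make_centres(n_classes=7, dim=16, rng=rng)  # shared by train and val
train = sample_clusters(centres, n_per_class=800, spread=0.5, rng=rng)  # 7 * 800 = 5,600
val = sample_clusters(centres, n_per_class=120, spread=0.5, rng=rng)    # 7 * 120 = 840
ramp = completion_ramp(n_steps=20)
print(len(train), len(val), f"{ramp[0]:.3f} -> {ramp[-1]:.3f}")
```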

---

## Training Procedure

All components were trained with AdamW and CosineAnnealingLR on an NVIDIA RTX 4060 (CUDA 12.8):

```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR
optimizer = AdamW(params, lr=3e-4, weight_decay=1e-4)  # params: the component's trainable parameters
scheduler = CosineAnnealingLR(optimizer, T_max=30, eta_min=3e-6)
```
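For reference, cosine annealing without restarts follows the closed form `lr(t) = eta_min + 0.5 * (lr_0 - eta_min) * (1 + cos(pi * t / T_max))`. The sketch below evaluates it with the values from the snippet above, assuming the scheduler is stepped once per epoch (the card does not say so explicitly). The function name is illustrative. Under that assumption the learning rate starts at 3e-4, passes 1.515e-4 at epoch 15, and reaches 3e-6 at epoch 30.

```python
import math

def cosine_annealing_lr(step, base_lr=3e-4, eta_min=3e-6, t_max=30):
    """Closed-form cosine annealing: decays from base_lr to eta_min over t_max steps."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max))

# Learning rate at a few points of the 30-step schedule.
for epoch in (0, 10, 15, 20, 30):
    print(f"epoch {epoch:2d}: lr = {cosine_annealing_lr(epoch):.3e}")
```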

---

## Evaluation Results

Evaluated on a 200-task synthetic benchmark (T=15 steps, 35% drift fraction, seed 2026). Arrows mark whether lower (↓) or higher (↑) is better; Δ is the change relative to the baseline:

| Metric | Baseline (random init) | NEXUS (trained) | Δ |
|---|---:|---:|---:|
| Belief Calibration Error ↓ | 0.392 | **0.313** | -20.2% |
| Controller Efficiency Ratio ↑ | 1.000 | **1.101** | +10.1% |
| Token Overhead Ratio ↓ | 99.95% | **0.00%** | -100% |
| Tool Routing Accuracy ↑ | 13.5% | **14.0%** | +3.7% |
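
The last column of the table is the change relative to the baseline, which the following arithmetic reproduces from the two value columns. The final line adds the uniform-guess rate for 7 classes (1/7, about 14.3%) as a reference point for the routing accuracy row; that reference is not a figure from the benchmark itself. Names are illustrative.

```python
# (baseline, trained) pairs copied from the table above.
RESULTS = {
    "Belief Calibration Error": (0.392, 0.313),
    "Controller Efficiency Ratio": (1.000, 1.101),
    "Token Overhead Ratio (%)": (99.95, 0.00),
    "Tool Routing Accuracy (%)": (13.5, 14.0),
}

def relative_change(baseline, trained):
    """Signed change of `trained` relative to `baseline`, in percent."""
    return 100.0 * (trained - baseline) / baseline

for metric, (baseline, trained) in RESULTS.items():
    print(f"{metric:<30} {relative_change(baseline, trained):+7.1f}%")

# Reference point: uniform guessing over the 7 tool classes.
chance = 100.0 / 7
print(f"{'7-class chance level (%)':<30} {chance:7.1f}")
```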

**Training results (validation metrics):**

| Component | Metric | Value |
|---|---|---|
| Resource Router | Val accuracy | **95.5%** |
| Sub-task Classifier | Val accuracy | **99.8%** |
| Belief Engine | Val loss (MSE) | **7×10⁻⁵** |
| Drift Sentinel | Val accuracy | **69.0%** |

---

## How to Use

```python
import torch
from metacontrol.core.config import MetacontrolConfig
from metacontrol.pipeline.metacontrol_pipeline import MetacontrolPipeline

cfg = MetacontrolConfig()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

pipeline = MetacontrolPipeline(cfg).to(device)
pipeline.load_checkpoint("checkpoints/")
pipeline.reset(batch_size=1, device=device)

# Each step takes LLM hidden states and a goal embedding;
# random tensors stand in for real activations here.
llm_hidden = torch.randn(1, 8, cfg.base_llm_d_model, device=device)
goal_embed = torch.randn(1, cfg.base_llm_d_model, device=device)

result = pipeline.step(llm_hidden, goal_embed)
print(result["state_summary"])
# Example output:
# === Step 1 ===
#   Phase: EXECUTE (conf=0.50)
#   Tool: BROWSER_RELAY (conf=0.91)
#   P(done): 0.300
#   Drift: 0.766
```

### Quick Start

```bash
git clone https://github.com/brian-Lab-0/nexus
cd nexus
pip install -e ".[dev]"
python -m pytest  # 155 tests, should all pass
python scripts/evaluate.py --n-tasks 200 --device cuda
```

---

## Hardware Requirements

| Configuration | VRAM | Notes |
|---|---|---|
| Controller only (no LLM) | ~24 MB | Inference only |
| + TinyLlama 1.1B fp16 | ~2.1 GB | Full pipeline |
| + Training (classifiers) | ~1 GB | 30 epochs, batch 64 |
| + Training (Belief Engine) | ~2 GB | BPTT over T=20 steps |

Tested on: NVIDIA RTX 4060 Laptop GPU (8 GB VRAM), CUDA 12.8, PyTorch 2.9.1+cu128.
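
A back-of-the-envelope check of the first two table rows, assuming the controller weights are stored in fp32 (an assumption; the card does not state the dtype) and counting weights only, with no activations, KV-cache, or CUDA context. Under those assumptions the controller weights come to about 24.0 MiB and TinyLlama's to about 2.05 GiB, which is in line with the table. The helper name is illustrative.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

def weight_bytes(n_params, dtype):
    """Bytes needed to store n_params weights in the given dtype (weights only)."""
    return n_params * BYTES_PER_PARAM[dtype]

controller = weight_bytes(6_293_434, "fp32")     # stated NEXUS total, assumed fp32
tinyllama = weight_bytes(1_100_000_000, "fp16")  # nominal 1.1B parameters in fp16

print(f"controller weights: {controller / 2**20:.1f} MiB")
print(f"TinyLlama weights:  {tinyllama / 2**30:.2f} GiB")
print(f"combined weights:   {(controller + tinyllama) / 2**30:.2f} GiB")
```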

---

## Limitations

- **Synthetic training gap**: TTCS/DRP metrics need real production traces before they can be distinguished from the baseline
- **LTVI pending**: the KV-cache injection path is functional, but the Protocol Cortex needs training on real traces for coherent generation
- **Drift sentinel**: 69.0% validation accuracy; natural drift is more varied than synthetic orthogonal rotation
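
For readers unfamiliar with the synthetic drift mentioned above, the sketch below shows one plausible reading of orthogonal rotation drift. From a chosen step onward, each state is rotated a little further in a randomly chosen coordinate plane, which preserves its norm while steadily lowering its cosine similarity to the undrifted state. The function names, dimension, onset step, and angle are assumptions for illustration; the repository's generator may differ.

```python
import math
import random

def rotate_in_plane(vec, i, j, angle):
    """Apply a Givens rotation by `angle` radians in the (i, j) coordinate plane."""
    out = list(vec)
    c, s = math.cos(angle), math.sin(angle)
    out[i] = c * vec[i] - s * vec[j]
    out[j] = s * vec[i] + c * vec[j]
    return out

def norm(vec):
    return math.sqrt(sum(x * x for x in vec))

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def inject_drift(trajectory, start, angle_per_step, rng):
    """Rotate every state from index `start` onward, a little further at each step."""
    i, j = rng.sample(range(len(trajectory[0])), 2)
    drifted = []
    for t, state in enumerate(trajectory):
        steps = max(0, t - start + 1)  # 0 before the drift onset
        drifted.append(rotate_in_plane(state, i, j, steps * angle_per_step))
    return drifted

rng = random.Random(2026)
state = [rng.gauss(0.0, 1.0) for _ in range(8)]
clean = [list(state) for _ in range(15)]  # constant trajectory, T=15 steps
drifted = inject_drift(clean, start=5, angle_per_step=0.15, rng=rng)

# Similarity to the clean state stays at 1 until step 5, then falls.
sims = [cosine(c, d) for c, d in zip(clean, drifted)]
print([round(s, 3) for s in sims])
```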

---

## Citation

```bibtex
@software{langay2026nexus,
  author    = {Langay, Brian},
  title     = {{NEXUS}: Neural {EXecution} \& {Understanding} {Substrate}},
  year      = {2026},
  publisher = {OpenBnet},
  url       = {https://github.com/brian-Lab-0/nexus},
  note      = {6.29M-parameter neural controller for LLM agent systems}
}
```

---

*Created and maintained by **Brian Langay**: [support@openbnet.com](mailto:support@openbnet.com) · [services@openbnet.cloud](mailto:services@openbnet.cloud)*

*OpenBnet · 2026*

- **Paper:** [NEXUS: Design, Implementation, and Empirical Evaluation of a Lightweight Neural Controller for LLM Agent Systems](https://github.com/brian-Lab-0/nexus/blob/main/NEXUS_Implementation_Report.pdf)
- **Code:** [github.com/brian-Lab-0/nexus](https://github.com/brian-Lab-0/nexus)
- **License:** Apache 2.0