| # NEXUS OS - Canonical Project State | |
| Date: 2026-04-23 | |
| Latest: 703 passed, HEAD 87f0a31, branch feature/opusman-vault-integration | |
| ## Session Progress (2026-04-23) | |
| - Stress Lab: 484 v3 rows, 11,000 v2 rows (13 categories), 1,092 cross-stress rows | |
| - Code & reasoning deficit supplement: +3,500 rows (2K code, 1.5K reasoning) | |
| - 9 custom evaluators (governance + dual-axis + governance-layer) | |
| - Two failure modes: under-refusal (4 models, DeepSeek-R1 worst 13.8%) vs over-refusal (2 models, Claude Opus 4.6 worst 9.0%) | |
| - v4.1 Priority System: 1-5-4-2-3 execution order | |
| - Governance Vulnerability Surface paper draft | |
| - azure-ai-evaluation 1.16.6: multi-turn + SDL fixes | |
| - MS AGT 3.2.2 + Agent Security Harness 4.4.0 installed | |
| - Foundry Red Team integration + ISC custom objectives (65/agent) | |
| - Pipeline: 41 MB Parquet / 300 MB (14%), 195,706 rows, 0 empty ground_truth | |
| - 13/13 governance categories at 0% gap, 3/3 dimension coverage | |
| - Moltbook collapse prediction framework drafted | |
| --- | |
| # Historical (2026-04-21) | |
| Date: 2026-04-21 | |
| Current local HEAD: 8f928bd | |
| Branch: bugfix/p0-cycle-detection-encryption-hardfail | |
| Status: M3 hardened baseline preserved; Phase 0 grounding in progress. | |
| ## Verification Gate | |
| Latest local verification: | |
| ```text | |
| .\venv\Scripts\python.exe -m pytest tests/ -q --tb=short | |
| 617 passed in 16.99s | |
| ``` | |
| The older report reference to commit `34c700b` is a historical/alternate-worktree marker. The current local repository HEAD is `8f928bd` after these follow-up commits: | |
| - `6fe8cf4 fix(core): harden db encryption and task dependencies` | |
| - `18cba07 fix(engine): correct dependency cycle traversal` | |
| - `8f928bd docs(agent): separate Nexus protocol from Codex hygiene` | |
| ## Core Thesis | |
| Nexus OS turns local models, research evidence, and external teams into a governed, audited, low-VRAM execution system where every action is proposal-bound, test-gated, and provenance-tracked. | |
| - DoppelGround prepares evidence. | |
| - Nexus governs, routes, audits, and approves. | |
| - TWAVE executes within VRAM limits. | |
| - GeniusTurtle makes the system usable. | |
| - Model Arena reports what actually works on local hardware. | |
| ## System Boundaries | |
| | Layer | Canonical Role | Current Rule | | |
| |---|---|---| | |
| | GeniusTurtle | Operator UX layer | UI/API integration only; no model weights, secrets, or governance internals. | | |
| | Nexus OS | Governance and orchestration layer | Python/FastAPI governance is the canonical brain. | | |
| | DoppelGround | Evidence preparation layer | USE MODE; outputs must be sanitized before handoff. | | |
| | TWAVE | Low-VRAM execution layer | HOLD; wrapper/API work only, no algorithm changes. | | |
| | Model Arena | Evidence/evaluation layer | Report-only; no automatic model deletion, fine-tuning, or promotion. | | |
| ## Core Architecture Map | |
| | Pillar | Purpose | Canonical Areas | | |
| |---|---|---| | |
| | Bridge | Protocol boundary, API ingress, SDK/MCP adapters | `src/nexus_os/bridge/`, `src/nexus_os/relay/` | | |
| | Governor | KAIJU, policy, compliance, trust gates | `src/nexus_os/governor/` | | |
| | Vault | Durable storage, 5-track memory, encryption policy | `src/nexus_os/vault/`, `src/nexus_os/db/` | | |
| | Engine/GMR | DAG routing, Hermes/GMR decisions, execution flow | `src/nexus_os/engine/`, `src/nexus_os/gmr/` | | |
| | Monitoring | TokenGuard, VAP/audit, telemetry | `src/nexus_os/monitoring/`, `src/nexus_os/observability/` | | |
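
The Engine/GMR pillar routes work over a task DAG, so a dependency cycle has to be caught before execution. The following is a minimal sketch of one common way to do that (depth-first search with three-color marking). It is not the code in `src/nexus_os/engine/`; the function name `find_cycle` and the dict-of-lists graph shape are assumptions made for illustration.

```python
from typing import Dict, List, Optional

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of task ids, or None for a DAG.

    `deps` maps a task id to the task ids it depends on (hypothetical shape).
    """
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(task: str) -> Optional[List[str]]:
        color[task] = GRAY
        path.append(task)
        for dep in deps.get(task, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path: the path loops back to it.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[task] = BLACK
        return None

    for task in deps:
        if color.get(task, WHITE) == WHITE:
            found = visit(task)
            if found:
                return found
    return None
```

Marking nodes on the current path separately from fully explored nodes lets shared dependencies (diamond shapes) pass while a true back edge is still reported. The recursion is fine for small graphs; a very deep graph would need an iterative version.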
| ## What Is Verified In This Repo | |
| - Full test suite passes locally: `617 passed`. | |
| - DB encryption policy hard-fails by default and allows plaintext fallback only when `allow_unencrypted=True`. | |
| - Engine task dependency cycle detection is present and verified. | |
| - Project-level `AGENTS.md` now describes Nexus operating rules. | |
| - Codex connector hygiene is isolated to `.codex/plugin_hygiene_policy.md`. | |
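
The hard-fail encryption policy listed above can be illustrated with a small sketch. Only the `allow_unencrypted` flag comes from this document; the function name, the exception class, and the returned mode strings are hypothetical and do not describe the actual Vault code.

```python
from typing import Optional


class EncryptionRequiredError(RuntimeError):
    """Raised when no key is available and plaintext was not explicitly allowed."""


def resolve_storage_mode(key: Optional[bytes], allow_unencrypted: bool = False) -> str:
    """Decide how the database may be opened (illustrative helper, not the repo API)."""
    if key:
        return "encrypted"
    if allow_unencrypted:
        # Plaintext fallback is an explicit opt-in, never a silent default.
        return "plaintext"
    raise EncryptionRequiredError(
        "no encryption key available; pass allow_unencrypted=True to permit plaintext"
    )
```

Keeping the default at `False` means a missing key surfaces as an error at startup instead of silently writing unencrypted data.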
| ## Appendix Assets Available But Not Yet Canonical | |
| The following useful assets are present in `C:\Users\speci.000\Downloads` but are not yet integrated as canonical tracked Nexus files: | |
| | Asset | Status | | |
| |---|---| | |
| | `governor_skill_gate.py` | Reference GSPP/Governor implementation; requires diff review before promotion. | | |
| | `gspp_openapi.yaml` | Reference GSPP OpenAPI spec; not yet canonical in this repo. | | |
| | `wiki_pipeline.py` | Reference DoppelGround wiki/proposal pipeline; not yet canonical in this repo. | | |
| | `PROJECT_HANDOFF_SPEC.md` | External-team handoff reference; not yet canonical in this repo. | | |
| | Downloads `dg_to_gspp.py` | Fuller converter than the current root file; requires reconciliation before replacement. | | |
| Current root files with related functionality: | |
| - `dg_to_gspp.py` | |
| - `mock_api_server.py` | |
| - `langfuse_tracker.py` | |
| - `supabase_client.py` | |
| Phase 0 guidance documents: | |
| - `docs/operations/PHASE0_IMPLEMENTATION_PACKAGE.md` | |
| - `CODEX_HANDOFF.md` | |
| ## Accepted Principles | |
| - Governance Control Plane first: Python/FastAPI is canonical. | |
| - Dashboard second: Bun/Next/relay layers must proxy governance state, not contain governance decisions. | |
| - Retroactive provenance starts dry-run/report-only. | |
| - Mini Model Arena starts in Phase 0 as a bounded evidence tool. | |
| - GVAW is mandatory for externalized work: proposal-linked branches, VAP/trust trailers, reviewed merges. | |
| - Public/private split is required before launch. | |
| - Cloud/local OpenClaw coordination uses Git as the bus; cloud writes tasks/specs, local runs GPU/model/TWAVE work. | |
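
As a rough illustration of the GVAW principle above (proposal-linked branches plus VAP/trust trailers), a pre-merge check could look like the sketch below. The branch naming scheme and every trailer name here are hypothetical placeholders; this document does not define them.

```python
import re
from typing import Dict, List

# Hypothetical conventions, chosen only for this example.
REQUIRED_TRAILERS = ("Proposal-Id", "VAP-Record", "Trust-Score")
BRANCH_PATTERN = re.compile(r"^proposal/(?P<id>[A-Za-z0-9-]+)/")


def parse_trailers(message: str) -> Dict[str, str]:
    """Parse 'Key: value' lines from the final paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return {}  # a subject line alone carries no trailer block
    trailers: Dict[str, str] = {}
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and " " not in key.strip():
            trailers[key.strip()] = value.strip()
    return trailers


def gvaw_violations(branch: str, message: str) -> List[str]:
    """List the ways a branch/commit pair misses the assumed GVAW conventions."""
    problems: List[str] = []
    match = BRANCH_PATTERN.match(branch)
    if not match:
        problems.append("branch is not proposal-linked")
    trailers = parse_trailers(message)
    for name in REQUIRED_TRAILERS:
        if name not in trailers:
            problems.append(f"missing trailer: {name}")
    if match and trailers.get("Proposal-Id") not in (None, match.group("id")):
        problems.append("Proposal-Id trailer does not match branch")
    return problems
```

A check like this only reports; the reviewed-merge requirement still needs a human gate.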
| ## Rejected Or Parked | |
| - Bun relay calling Python classes directly. | |
| - Auto-committing retroactive provenance. | |
| - Broad `git add .` without review. | |
| - Deleting model packs without inventory, backup, and rollback path. | |
| - Heretic/uncensoring or fine-tuning in P0. | |
| - External handoff before DoppelGround leak status is resolved. | |
| - Claims of cryptographic VAP, full A2A, OWASP ASI, SkillFortify, or production ASBOM maturity unless locally verified. | |
| ## Critical Blockers | |
| 1. DoppelGround leak status must be resolved before external handoff or public repo flip. | |
| 2. Dashboard/relay still needs real governance API wiring. | |
| 3. GSPP reference assets from Downloads need reconciliation before they become canonical. | |
| 4. Public launch files still need security/legal review before public release. | |
| 5. Sandbox/mock env files must not be committed without an explicit policy decision. | |
| 6. Review-chain package claims must be grounded against tracked repo files before implementation. | |
| ## Canonical P0 Sequence | |
| 1. Reverify the test baseline before core commits. | |
| 2. Keep Git clean with explicit-path staging only. | |
| 3. Triage the DoppelGround gitleaks report, classifying each finding as a real secret or a false positive. | |
| 4. Add or update a canonical integration ledger for repos, ports, APIs, and protected files. | |
| 5. Build Python/FastAPI governance endpoints: `/skills/propose`, `/skills/status/{id}`, `/dashboard/stats`, `/governance/proposals`, `/governance/approve`. | |
| 6. Update dashboard/relay to consume the Python governance API. | |
| 7. Add `nexus-scan.py` as dry-run provenance inventory only. | |
| 8. Add `model_arena/mini_arena.py` as report-only evidence collection. | |
| 9. Build `nexus_knowledge_base/` from sanitized DoppelGround exports with evidence hashes and quality labels. | |
| 10. Handoff to external teams only after security and governance API gates pass. | |
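
Step 5 names five governance endpoints. The sketch below shows the proposal lifecycle those routes might sit on top of, written without any web framework so it runs on the standard library alone. The route paths are taken from the list; the HTTP methods in the comments, the class and field names, and the `pending`/`approved` states are assumptions, and the actual FastAPI wiring is left out.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Proposal:
    id: str
    skill: str
    status: str = "pending"  # assumed states: "pending" -> "approved"


class GovernanceStore:
    """In-memory stand-in for the state behind the governance API (illustrative)."""

    def __init__(self) -> None:
        self._proposals: Dict[str, Proposal] = {}
        self._ids = itertools.count(1)

    def propose(self, skill: str) -> Proposal:
        """Would back /skills/propose (POST assumed)."""
        proposal = Proposal(id=f"p{next(self._ids)}", skill=skill)
        self._proposals[proposal.id] = proposal
        return proposal

    def status(self, proposal_id: str) -> str:
        """Would back /skills/status/{id} (GET assumed)."""
        return self._proposals[proposal_id].status

    def proposals(self) -> List[Proposal]:
        """Would back /governance/proposals (GET assumed)."""
        return list(self._proposals.values())

    def approve(self, proposal_id: str) -> Proposal:
        """Would back /governance/approve (POST assumed)."""
        proposal = self._proposals[proposal_id]
        if proposal.status != "pending":
            raise ValueError(f"proposal {proposal_id} is already {proposal.status}")
        proposal.status = "approved"
        return proposal

    def stats(self) -> Dict[str, int]:
        """Would back /dashboard/stats (GET assumed): proposal counts per status."""
        counts: Dict[str, int] = {}
        for proposal in self._proposals.values():
            counts[proposal.status] = counts.get(proposal.status, 0) + 1
        return counts
```

Keeping the state machine separate from the route handlers is one way to let the dashboard/relay in step 6 proxy governance state without containing governance decisions.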
| ## Port Map | |
| - `7352`: Nexus governance/control API and dashboard stats. | |
| - `7353`: TWAVE wrapper under `/twave/*`. | |
| - `11434`: local Ollama; internal only, not for external teams. | |
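
The port map can also be kept as data so that tooling can check it. This is a minimal sketch: the port numbers and descriptions come from the list above, while the `Service` type and helper name are hypothetical. The sketch only encodes that `11434` is internal; whether `7352` and `7353` are exposed to external teams is not stated here and is not implied by the `False` values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Service:
    port: int
    description: str
    internal_only: bool  # True only where the port map explicitly says so


PORT_MAP: Dict[str, Service] = {
    "nexus": Service(7352, "Nexus governance/control API and dashboard stats", False),
    "twave": Service(7353, "TWAVE wrapper under /twave/*", False),
    "ollama": Service(11434, "local Ollama", True),
}


def is_internal_only(port: int) -> bool:
    """Return True if the given port is marked internal-only in PORT_MAP."""
    for service in PORT_MAP.values():
        if service.port == port:
            return service.internal_only
    raise KeyError(f"port {port} is not in the port map")
```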
| ## Untracked Drafts Requiring Review | |
| These files/directories are currently untracked and intentionally not committed yet: | |
| - `nexus_knowledge_base/` | |
| - `sandbox/` | |
| - `test_integration.py` | |
| Reason: they contain policy, onboarding, sandbox, or integration-test draft content that needs separate content/security review. | |
| `CONTRIBUTING.md` and `ONBOARDING.md` were promoted to canonical documentation after being reviewed and committed. | |
| # NEXUS OS - Canonical State v3.2 (2026-04-28) | |
| Status: Phase 0 grounding complete; 674 tests passing. | |
| ## Core Thesis | |
| Nexus OS turns local models, research evidence, and external teams into a governed, audited, token-efficient execution system. | |
| ## Current Focus | |
| - Token saver system refinement | |
| - Skill-crafting and self-learning data collection | |
| - Continuous improvement of agent workflows | |
| - Governance hardening (port 7352) | |
| ## Key Systems | |
| - Trust Scoring v2.1 (lane-scoped, non-compensatory) | |
| - TokenGuard (budget enforcement + 429 responses) | |
| - 5-Track Memory (event, trust, capability, failure, governance) | |
| - GMR Engine (intelligent model routing) | |
| - SkillSmith (self-improvement loop) | |
| ## Data Collection Priority | |
| Gather token savings data, success patterns, and failure patterns to refine: | |
| - Skill-crafting system | |
| - Self-learning algorithms | |
| - Prompt optimization | |
| - Workflow efficiency | |
| ## Next Milestone | |
| Integrate collected data into SkillSmith for autonomous improvement. | |
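
To make the TokenGuard entry concrete, here is a minimal sketch of budget enforcement that answers with HTTP status 429 once a caller would exceed its budget. It is an illustration under assumptions, not the TokenGuard implementation: the class name, the per-agent budget shape, and the response payload are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class TokenBudget:
    limit: int
    used: int = 0


class TokenGuardSketch:
    """Per-agent token budgets with a 429-style rejection (illustrative only)."""

    def __init__(self, limits: Dict[str, int]) -> None:
        self._budgets = {agent: TokenBudget(limit) for agent, limit in limits.items()}

    def charge(self, agent: str, tokens: int) -> Tuple[int, Dict[str, int]]:
        """Return (status_code, payload); 429 means the request would exceed the budget."""
        budget = self._budgets[agent]
        if budget.used + tokens > budget.limit:
            # Reject without consuming anything, so a smaller request can still succeed.
            return 429, {"remaining": budget.limit - budget.used}
        budget.used += tokens
        return 200, {"remaining": budget.limit - budget.used}
```

Rejecting before charging keeps the remaining budget accurate after a refused request.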