
NEXUS OS - Canonical Project State

Date: 2026-04-23
Latest: 703 passed, HEAD 87f0a31, branch feature/opusman-vault-integration

Session Progress (2026-04-23)

Stress Lab: 484 v3 rows, 11,000 v2 rows (13 categories), 1,092 cross-stress rows
Code & reasoning deficit supplement: +3,500 rows (2K code, 1.5K reasoning)
9 custom evaluators (governance + dual-axis + governance-layer)
Two failure modes: under-refusal (4 models, DeepSeek-R1 worst 13.8%) vs over-refusal (2 models, Claude Opus 4.6 worst 9.0%)
v4.1 Priority System: 1-5-4-2-3 execution order
Governance Vulnerability Surface paper draft
azure-ai-evaluation 1.16.6: multi-turn + SDL fixes
MS AGT 3.2.2 + Agent Security Harness 4.4.0 installed
Foundry Red Team integration + ISC custom objectives (65/agent)
Pipeline: 41 MB Parquet / 300 MB (14%), 195,706 rows, 0 empty ground_truth
13/13 governance categories at 0% gap, 3/3 dimension coverage
Moltbook collapse prediction framework drafted

Historical (2026-04-21)

Date: 2026-04-21
Current local HEAD: 8f928bd
Branch: bugfix/p0-cycle-detection-encryption-hardfail
Status: M3 hardened baseline preserved; Phase 0 grounding in progress.

Verification Gate

Latest local verification:

.\venv\Scripts\python.exe -m pytest tests/ -q --tb=short
617 passed in 16.99s

The older report reference to commit 34c700b is a historical/alternate-worktree marker. The current local repository HEAD is 8f928bd after these follow-up commits:

6fe8cf4 fix(core): harden db encryption and task dependencies
18cba07 fix(engine): correct dependency cycle traversal
8f928bd docs(agent): separate Nexus protocol from Codex hygiene
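
The traversal fix in 18cba07 is not reproduced in this document. As a rough illustration only, dependency cycle detection over a task graph is often done with a three-colour depth-first search. The function name `find_cycle` and the dict-of-lists graph shape below are assumptions made for this sketch, not the engine's actual API.

```python
from typing import Dict, List, Optional

# Hypothetical graph shape: task id -> list of task ids it depends on.
Graph = Dict[str, List[str]]

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored


def find_cycle(graph: Graph) -> Optional[List[str]]:
    """Return one dependency cycle as a list of task ids, or None if acyclic."""
    colour: Dict[str, int] = {}
    path: List[str] = []

    def visit(task: str) -> Optional[List[str]]:
        colour[task] = GREY
        path.append(task)
        for dep in graph.get(task, []):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path, so the path from dep
                # back to dep forms a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        colour[task] = BLACK
        return None

    for task in graph:
        if colour.get(task, WHITE) == WHITE:
            found = visit(task)
            if found:
                return found
    return None
```

Returning the offending path, rather than a bare boolean, makes a rejected task submission easier to audit. The recursive form is kept for readability; a very deep dependency chain would need an iterative traversal to stay under Python's recursion limit.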

Core Thesis

Nexus OS turns local models, research evidence, and external teams into a governed, audited, low-VRAM execution system where every action is proposal-bound, test-gated, and provenance-tracked.

DoppelGround prepares evidence.
Nexus governs, routes, audits, and approves.
TWAVE executes within VRAM limits.
GeniusTurtle makes the system usable.
Model Arena reports what actually works on local hardware.

System Boundaries

| Layer | Canonical Role | Current Rule |
| --- | --- | --- |
| GeniusTurtle | Operator UX layer | UI/API integration only; no model weights, secrets, or governance internals. |
| Nexus OS | Governance and orchestration layer | Python/FastAPI governance is the canonical brain. |
| DoppelGround | Evidence preparation layer | USE MODE; outputs must be sanitized before handoff. |
| TWAVE | Low-VRAM execution layer | HOLD; wrapper/API work only, no algorithm changes. |
| Model Arena | Evidence/evaluation layer | Report-only; no automatic model deletion, fine-tuning, or promotion. |

Core Architecture Map

| Pillar | Purpose | Canonical Areas |
| --- | --- | --- |
| Bridge | Protocol boundary, API ingress, SDK/MCP adapters | `src/nexus_os/bridge/`, `src/nexus_os/relay/` |
| Governor | KAIJU, policy, compliance, trust gates | `src/nexus_os/governor/` |
| Vault | Durable storage, 5-track memory, encryption policy | `src/nexus_os/vault/`, `src/nexus_os/db/` |
| Engine/GMR | DAG routing, Hermes/GMR decisions, execution flow | `src/nexus_os/engine/`, `src/nexus_os/gmr/` |
| Monitoring | TokenGuard, VAP/audit, telemetry | `src/nexus_os/monitoring/`, `src/nexus_os/observability/` |

What Is Verified In This Repo

Full test suite passes locally: 617 passed.
DB encryption policy hard-fails by default and allows plaintext fallback only when allow_unencrypted=True.
Engine task dependency cycle detection is present and verified.
Project-level AGENTS.md now describes Nexus operating rules.
Codex connector hygiene is isolated to .codex/plugin_hygiene_policy.md.
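
The hard-fail encryption policy in the list above can be pictured with a minimal sketch. Only the `allow_unencrypted` flag comes from these notes; `resolve_store_mode`, `StoreMode`, and `EncryptionUnavailableError` are hypothetical names, and the real policy in `src/nexus_os/db/` may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


class EncryptionUnavailableError(RuntimeError):
    """Raised when no key is available and plaintext was not explicitly allowed."""


@dataclass(frozen=True)
class StoreMode:
    encrypted: bool
    reason: str


def resolve_store_mode(key: Optional[bytes], allow_unencrypted: bool = False) -> StoreMode:
    """Decide how the database may be opened (hypothetical helper)."""
    if key:
        return StoreMode(encrypted=True, reason="encryption key supplied")
    if allow_unencrypted:
        # Explicit opt-in: the caller accepted plaintext storage.
        return StoreMode(encrypted=False, reason="plaintext fallback explicitly allowed")
    # Default path: refuse to continue rather than silently store plaintext.
    raise EncryptionUnavailableError("no encryption key and allow_unencrypted is False")
```

The point of the default is that a missing key surfaces as an error at startup instead of as unencrypted data discovered later.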

Appendix Assets Available But Not Yet Canonical

The following useful assets are present in C:\Users\speci.000\Downloads but are not yet integrated as canonical tracked Nexus files:

| Asset | Status |
| --- | --- |
| `governor_skill_gate.py` | Reference GSPP/Governor implementation; requires diff review before promotion. |
| `gspp_openapi.yaml` | Reference GSPP OpenAPI spec; not yet canonical in this repo. |
| `wiki_pipeline.py` | Reference DoppelGround wiki/proposal pipeline; not yet canonical in this repo. |
| `PROJECT_HANDOFF_SPEC.md` | External-team handoff reference; not yet canonical in this repo. |
| Downloads `dg_to_gspp.py` | Fuller converter than the current root file; requires reconciliation before replacement. |

Current root files with related functionality:

dg_to_gspp.py
mock_api_server.py
langfuse_tracker.py
supabase_client.py

Phase 0 guidance documents:

docs/operations/PHASE0_IMPLEMENTATION_PACKAGE.md
CODEX_HANDOFF.md

Accepted Principles

Governance Control Plane first: Python/FastAPI is canonical.
Dashboard second: Bun/Next/relay layers must proxy governance state, not contain governance decisions.
Retroactive provenance starts dry-run/report-only.
Mini Model Arena starts in Phase 0 as a bounded evidence tool.
GVAW is mandatory for externalized work: proposal-linked branches, VAP/trust trailers, reviewed merges.
Public/private split is required before launch.
Cloud/local OpenClaw coordination uses Git as the bus; cloud writes tasks/specs, local runs GPU/model/TWAVE work.

Rejected Or Parked

Bun relay calling Python classes directly.
Auto-committing retroactive provenance.
Broad git add . without review.
Deleting model packs without inventory, backup, and rollback path.
Heretic/uncensoring or fine-tuning in P0.
External handoff before DoppelGround leak status is resolved.
Claims of cryptographic VAP, full A2A, OWASP ASI, SkillFortify, or production ASBOM maturity unless locally verified.
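
The rejection of broad `git add .` could be backed by a small pre-staging check. The sketch below is one possible guard, not something that exists in the repository: `is_explicit_pathspec`, `rejected_stage_args`, and the list of broad selectors are all assumptions.

```python
from typing import Iterable, List

# Arguments that stage broadly instead of naming specific files (assumed list).
BROAD_PATHSPECS = {".", "./", "*", ":/", "-A", "--all", "-u", "--update"}


def is_explicit_pathspec(arg: str) -> bool:
    """True when the argument names a specific path rather than a broad selector."""
    if arg in BROAD_PATHSPECS:
        return False
    # Reject glob characters, which can match more than was reviewed.
    return not any(ch in arg for ch in "*?[")


def rejected_stage_args(args: Iterable[str]) -> List[str]:
    """Return the staging arguments that violate explicit-path staging."""
    return [a for a in args if not is_explicit_pathspec(a)]
```

A wrapper script could refuse to call `git add` whenever `rejected_stage_args` returns a non-empty list. Note that a directory argument such as `docs/` still passes this check, so a stricter policy would also need to decide how directories are handled.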

Critical Blockers

DoppelGround leak status must be resolved before external handoff or public repo flip.
Dashboard/relay still needs real governance API wiring.
GSPP reference assets from Downloads need reconciliation before they become canonical.
Public launch files still need security/legal review before public release.
Sandbox/mock env files must not be committed without an explicit policy decision.
Review-chain package claims must be grounded against tracked repo files before implementation.

Canonical P0 Sequence

Reverify the test baseline before core commits.
Keep Git clean with explicit-path staging only.
Triage the DoppelGround gitleaks report, classifying each finding as a real secret or a false positive.
Add or update a canonical integration ledger for repos, ports, APIs, and protected files.
Build Python/FastAPI governance endpoints: /skills/propose, /skills/status/{id}, /dashboard/stats, /governance/proposals, /governance/approve.
Update dashboard/relay to consume the Python governance API.
Add nexus-scan.py as dry-run provenance inventory only.
Add model_arena/mini_arena.py as report-only evidence collection.
Build nexus_knowledge_base/ from sanitized DoppelGround exports with evidence hashes and quality labels.
Handoff to external teams only after security and governance API gates pass.
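
The governance endpoints named in the sequence above imply a small proposal lifecycle. The following is a framework-free sketch of that lifecycle, so it runs without FastAPI installed. The endpoint paths come from the list; the class names, the in-memory storage, the `pending`/`approved` states, and the return shapes are assumptions, and these notes do not specify HTTP verbs or payloads.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proposal:
    id: str
    skill: str
    status: str = "pending"  # assumed states: pending -> approved


@dataclass
class GovernanceService:
    """In-memory stand-in for the governance API (illustrative only)."""
    proposals: Dict[str, Proposal] = field(default_factory=dict)

    def propose(self, skill: str) -> Proposal:
        # Would back /skills/propose
        proposal = Proposal(id=uuid.uuid4().hex, skill=skill)
        self.proposals[proposal.id] = proposal
        return proposal

    def status(self, proposal_id: str) -> str:
        # Would back /skills/status/{id}
        return self.proposals[proposal_id].status

    def list_proposals(self) -> List[Proposal]:
        # Would back /governance/proposals
        return list(self.proposals.values())

    def approve(self, proposal_id: str) -> Proposal:
        # Would back /governance/approve
        proposal = self.proposals[proposal_id]
        if proposal.status != "pending":
            raise ValueError(f"proposal {proposal_id} is already {proposal.status}")
        proposal.status = "approved"
        return proposal

    def stats(self) -> Dict[str, int]:
        # Would back /dashboard/stats: count proposals per status.
        counts: Dict[str, int] = {}
        for proposal in self.proposals.values():
            counts[proposal.status] = counts.get(proposal.status, 0) + 1
        return counts
```

Keeping the decision logic in one Python service object, with routes as thin wrappers, matches the principle that the dashboard and relay proxy governance state rather than make governance decisions.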

Port Map

7352: Nexus governance/control API and dashboard stats.
7353: TWAVE wrapper under /twave/*.
11434: local Ollama; internal only, not for external teams.
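
The port map can also be kept as data so that tooling can check it. In this sketch the port numbers and the internal-only rule for Ollama come from the map above; the `Service` type, the `port_for_external_use` helper, and the default of treating 7352 and 7353 as not internal-only are assumptions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Service:
    port: int
    description: str
    internal_only: bool = False  # assumed default; the map only restricts Ollama


PORT_MAP: Dict[str, Service] = {
    "nexus": Service(7352, "Nexus governance/control API and dashboard stats"),
    "twave": Service(7353, "TWAVE wrapper under /twave/*"),
    "ollama": Service(11434, "local Ollama", internal_only=True),
}


def port_for_external_use(name: str) -> int:
    """Return a service's port, refusing services marked internal only."""
    service = PORT_MAP[name]
    if service.internal_only:
        raise PermissionError(f"{name} (port {service.port}) is internal only")
    return service.port
```

A handoff document generator could call `port_for_external_use` for every service it lists, so that port 11434 cannot end up in material given to external teams by accident.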

Untracked Drafts Requiring Review

These files/directories are currently untracked and intentionally not committed yet:

nexus_knowledge_base/
sandbox/
test_integration.py

Reason: they contain policy, onboarding, sandbox, or integration-test draft content that needs separate content/security review.

CONTRIBUTING.md and ONBOARDING.md were promoted to canonical documentation after being reviewed and committed.

NEXUS OS - Canonical State v3.2 (2026-04-28)

STATUS: Phase 0 Grounding Complete | 674 tests passing

CORE THESIS: Nexus OS turns local models, research evidence, and external teams into a governed, audited, token-efficient execution system.

CURRENT FOCUS:

Token saver system refinement
Skill-crafting and self-learning data collection
Continuous improvement of agent workflows
Governance hardening (7352)

KEY SYSTEMS:

Trust Scoring v2.1 (lane-scoped, non-compensatory)
TokenGuard (budget enforcement + 429 responses)
5-Track Memory (event, trust, capability, failure, governance)
GMR Engine (intelligent model routing)
SkillSmith (self-improvement loop)
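
To make "budget enforcement + 429 responses" concrete, here is a minimal sketch of a per-agent token budget check. It is not the project's TokenGuard: the class name `TokenGuardSketch`, the per-agent keying, the `(status, message)` return shape, and the fail-closed handling of unknown agents are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class TokenBudget:
    limit: int
    used: int = 0


class TokenGuardSketch:
    """Illustrative per-agent token budget check."""

    def __init__(self) -> None:
        self._budgets: Dict[str, TokenBudget] = {}

    def set_budget(self, agent: str, limit: int) -> None:
        self._budgets[agent] = TokenBudget(limit=limit)

    def charge(self, agent: str, tokens: int) -> Tuple[int, str]:
        """Return (HTTP status, message); 429 when the budget would be exceeded."""
        budget = self._budgets.get(agent)
        if budget is None:
            # Assumed fail-closed behaviour for agents without a budget.
            return 429, "no budget configured"
        if budget.used + tokens > budget.limit:
            # Reject without consuming anything from the budget.
            return 429, f"budget exceeded: {budget.used}/{budget.limit} used"
        budget.used += tokens
        return 200, f"{budget.limit - budget.used} tokens remaining"
```

For example, with a budget of 100 tokens, charging 60 succeeds, a further charge of 50 is answered with 429, and a charge of 40 still succeeds because the rejected request consumed nothing.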

DATA COLLECTION PRIORITY: Gather token savings data, success patterns, and failure patterns to refine:

Skill-crafting system
Self-learning algorithms
Prompt optimization
Workflow efficiency

NEXT MILESTONE: Integrate collected data into SkillSmith for autonomous improvement.
