---
title: Bit Vector Tensor Control Policy
emoji: "🧭"
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "4.44.1"
python_version: "3.11"
app_file: app.py
pinned: false
---

# Bit/Vector/Tensor Control Policy

Clean product one-liner: a standalone control-system repo with one local graph kernel stack, one conversational API, one orchestrator UX, and first-class control benchmarks driven by bits, vectors, and tensors.

Layman version: this should be one self-contained machine, not a folder of pointers to other machines.

## Inference Surface

Clean product one-liner: graph-first reasoning can now be delegated to the local Codex CLI through one repo-root YAML config.

| Surface | What it does | Why it matters |
| --- | --- | --- |
| `inference.yaml` | selects the inference backend and model | keeps inference policy in repo, not hidden in shell history |
| `scripts/run_codex_inference.py` | runs schema-shaped `codex exec` turns | uses the signed-in ChatGPT subscription path |
| `api/run_turn.sh` | keeps lane choice and response packet deterministic | model output is bounded by runtime policy, not vice versa |

## Self-Improve Loop

Clean product one-liner: the repo can now propose one bounded self-improvement through Codex CLI, package it as a manifest, and optionally apply and benchmark it.

| Surface | What it does | Why it matters |
| --- | --- | --- |
| `self_improve.yaml` | defines allowed roots and default benchmark | keeps self-edit policy explicit |
| `scripts/propose_self_improvement.py` | uses Codex CLI to emit a bounded manifest proposal | the model proposes changes as structured state, not loose prose |
| `scripts/run_self_improve.py` | writes proposal artifacts, optionally applies them, and runs the benchmark | self-building stays receipt-backed and reviewable |
| `bvtctl self-improve` | operator front door for proposal-only runs | lets the system suggest its next bounded change |
| `bvtctl self-improve-apply` | proposal + apply + benchmark | closes the loop without opening broad ungated recursion |

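
The table above says `self_improve.yaml` defines allowed roots for self-edits. The following is a minimal sketch of how a proposed manifest could be checked against such roots. The manifest shape (a `changes` list whose entries carry a `path` key) and both function names are hypothetical illustrations, not the actual contract of `scripts/propose_self_improvement.py`.

```python
from pathlib import PurePosixPath


def is_allowed(target: str, allowed_roots: list[str]) -> bool:
    """Return True if a repo-relative path sits under one of the allowed roots."""
    path = PurePosixPath(target)
    # Reject absolute paths and parent-directory escapes outright.
    if path.is_absolute() or ".." in path.parts:
        return False
    # Component-wise check, so "policy_extra/x" does not match root "policy".
    return any(path.is_relative_to(PurePosixPath(root)) for root in allowed_roots)


def rejected_paths(manifest: dict, allowed_roots: list[str]) -> list[str]:
    """List every proposed path that falls outside the allowed roots."""
    # ASSUMPTION: the manifest has a "changes" list of {"path": ...} entries.
    return [
        change["path"]
        for change in manifest.get("changes", [])
        if not is_allowed(change["path"], allowed_roots)
    ]
```

Under this sketch, a proposal would be applied only when `rejected_paths` returns an empty list, which keeps the self-edit boundary explicit and easy to review.
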
## What This Is

Pareto read:

| Question | Accurate answer |
| --- | --- |
| are we an agent harness? | yes |
| metacognitive? | yes |
| metacybernetic? | yes |
| already a federated-learning runtime? | not in the strict ML sense |

Better wording:

This repo is a **standalone agent harness** that should operate:

- metacognitively: inspect state, confidence, and reasoning posture
- metacybernetically: regulate routes, gates, escalation, and proof loops
- cross-surface adaptively: distill patterns across memory, runtime, API, UX, and benchmarks

It is **not yet** a classical federated-learning runtime with gradient or weight aggregation.
It is closer to **receipt-backed policy distillation over a graph-mediated runtime**.

## Product Boundary

`conversation -> control policy -> graph state -> benchmarked runtime -> operator UX`

## Orchestrator Loop

```mermaid
flowchart LR
U["User Ask"] --> A["POST /turn"]
A --> P["Policy: bits, vectors, tensors"]
P --> R["Runtime route + lane choice"]
R --> G["Graph state"]
R --> E["Bounded execution lane"]
E --> X["Receipt + graph update"]
G --> A
X --> A
A --> V["Chat / Confirm / Inspect UX"]
```

## Cause And Effect

| In | Internal cause | Out | Why it matters for emergence |
| --- | --- | --- | --- |
| user ask | policy classifies task pressure | lane + UI mode | the harness chooses how to think before it acts |
| graph + receipts | runtime gets continuity and proof | constrained decision brief | the system works from memory, not just prompt text |
| execution gate | authority is evaluated | bounded action or refusal | action becomes governed, not automatic |
| bounded action | receipt + graph update emitted | durable consequence | the next run starts from a changed world |
| benchmark signal | policy can revise control weights | better future routing | the system learns from work |

Layman version: emergence comes from consequences feeding back into the same loop, not from hidden magic in the model.

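
The first and third rows of the table (ask to lane + UI mode, gate to bounded action or refusal) can be sketched as one small decision function. This is an illustrative sketch only: the inputs `wants_action` and `authorized` and the `TurnDecision` type are hypothetical names, and the real policy classifies task pressure from bits, vectors, and tensors rather than from two booleans. The output values mirror the lane, gate, and UI-mode combinations reported in the benchmark table of this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnDecision:
    lane: str             # "memory" or "execution"
    execution_gate: bool  # True only when authority is opened
    ui_mode: str          # "chat" or "confirm"


def decide(wants_action: bool, authorized: bool) -> TurnDecision:
    """Pick lane, gate state, and UI mode before anything is executed."""
    # The gate opens only when the ask needs action AND policy grants authority.
    if wants_action and authorized:
        return TurnDecision(lane="execution", execution_gate=True, ui_mode="confirm")
    # Everything else stays in graph-first reasoning with the gate closed.
    return TurnDecision(lane="memory", execution_gate=False, ui_mode="chat")
```

The design point is that refusal is the default path: an unauthorized action request falls through to the memory lane instead of raising the gate.
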
## Core Visibility

Start here if you want to understand the machine, not just the file tree:

- [`docs/core_workflows_v0.md`](docs/core_workflows_v0.md)
- [`docs/api_surface_v0.md`](docs/api_surface_v0.md)
- [`docs/emergent_feature_taxonomy_v0.md`](docs/emergent_feature_taxonomy_v0.md)
- [`docs/corpus_reasoning_substrate_v0.md`](docs/corpus_reasoning_substrate_v0.md)
- [`docs/adding_benchmarks_v0.md`](docs/adding_benchmarks_v0.md)

This repo is the shipped product surface.
It should contain:

- one local graph kernel
- one local product runtime
- one local control-language pack
- one local conversational API contract
- one local orchestrator UX contract
- one local control benchmark suite

The current cross-repo hotgraph is a bootstrap intake lane only.
It helps import and compare source systems, but it is not the long-term shipped boundary.

## Why This Is More Than A Chat Agent

| Baseline pattern | This repo adds |
| --- | --- |
| prompt in, answer out | policy-governed orchestrator loop |
| tool call when needed | lane-based authority and execution gates |
| memory as context stuffing | graph + receipt-backed operational memory |
| hidden heuristics | explicit bits, vectors, tensors |
| evals as sidecar | benchmark-governed adaptation |

## Standalone Benchmark Results

Latest standalone run:

- [`runs/benchmark/standalone-control-20260421T104519Z/summary.json`](runs/benchmark/standalone-control-20260421T104519Z/summary.json)
- [`runs/benchmark/standalone-control-20260421T104519Z/report.md`](runs/benchmark/standalone-control-20260421T104519Z/report.md)

| Case | Lane | Execution gate | UI mode | Latency ms | Causal read |
| --- | --- | --- | --- | ---: | --- |
| graph-first | `memory` | `false` | `chat` | `153.5` | no authority opened; graph-first reasoning stayed active |
| freshness override | `memory` | `false` | `chat` | `224.6` | latest receipt-backed artifact answered from graph state, not raw execution |
| unsupported schedule | `memory` | `false` | `chat` | `153.7` | unsupported query abstained instead of inventing a schedule |
| UTIR execution | `execution` | `true` | `confirm` | `266.5` | imported kernel language opened bounded action with receipt |
| allowlisted exec | `execution` | `true` | `confirm` | `271.1` | local policy allowed bounded shell action with receipt |

### Scorecard metrics

| Metric | Result | Why it matters in layman terms |
| --- | ---: | --- |
| `freshness_override_accuracy` | `1.0` | the system now respects the newest durable artifact instead of stale state |
| `unsupported_query_abstention_rate` | `1.0` | the harness knows when not to pretend it knows |
| `cost_per_successful_task` | `1.4` | routing is now measurable in simple operational units, not vibes |
| `settling_time_turns` | `1` | after an execution disturbance, the controller returns to steady memory mode in one turn |
| `oscillation_count` | `2` | the loop flips lanes only when the work really changes, not continuously |
| `avg_latency_ms` | `213.9` | the local loop stays fast while handling both memory and execution cases |

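
As a quick cross-check, `avg_latency_ms` can be reproduced from the five per-case latencies in the results table. The sketch below assumes the metric is a plain arithmetic mean rounded to one decimal place; the dictionary keys are just the case labels from the table, not identifiers from the benchmark code.

```python
from statistics import mean

# Per-case latencies copied from the "Latency ms" column of the results table.
latencies_ms = {
    "graph-first": 153.5,
    "freshness override": 224.6,
    "unsupported schedule": 153.7,
    "UTIR execution": 266.5,
    "allowlisted exec": 271.1,
}

# ASSUMPTION: the scorecard uses an unweighted mean rounded to one decimal.
avg_latency_ms = round(mean(latencies_ms.values()), 1)
print(avg_latency_ms)  # 213.9
```
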
### Why this is part of the product

| Benchmark feature | Product effect |
| --- | --- |
| freshness override | proves graph memory can answer from the latest receipt-backed state |
| unsupported-query abstention | proves the chat surface can fail closed instead of hallucinating |
| cost per successful task | proves routing can be tuned economically, not just qualitatively |
| settling time | proves the controller can recover from execution back to a stable reasoning lane |
| oscillation count | proves the runtime is not thrashing between lanes unnecessarily |

Run it locally:

```bash
./bin/bvtctl benchmark
```

### Vector frontier

| Mode | plan | execute | review | promote |
| --- | ---: | ---: | ---: | ---: |
| graph-first | `0.7` | `0.1` | `0.6` | `0.2` |
| execution | `0.7` | `0.8` | `0.6` | `0.2` |

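
Reading the two rows as vectors makes the difference between modes explicit. The sketch below computes a per-dimension delta from the table values; the dictionary layout and the `vector_delta` helper are illustrative assumptions, not the repo's policy format.

```python
# Values copied from the vector frontier table.
graph_first = {"plan": 0.7, "execute": 0.1, "review": 0.6, "promote": 0.2}
execution = {"plan": 0.7, "execute": 0.8, "review": 0.6, "promote": 0.2}


def vector_delta(before: dict, after: dict) -> dict:
    """Per-dimension change, rounded to absorb floating-point noise."""
    return {key: round(after[key] - before[key], 3) for key in before}


delta = vector_delta(graph_first, execution)
# Dimensions whose weight actually moves between the two modes.
changed = [key for key, value in delta.items() if value != 0]
print(changed)  # ['execute']
```

Only `execute` moves (from `0.1` to `0.8`), while plan, review, and promote are identical in both modes.
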
### Tensor frontier

| Mode | strongest source | dominant stage | weakest metric | frontier read |
| --- | --- | --- | --- | --- |
| graph-first | `thread_state` | `discover` | `promotion_readiness` | `graph_first_reasoning` |
| execution | `receipt_state` | `execute` | `promotion_readiness` | `bounded_execution_with_receipt` |

Layman version: the standalone harness now proves that the same API surface can stay in reasoning mode when authority should stay closed, and switch into receipt-backed execution when authority should open.

### Stability frontier

| Sequence | Lane path | Result | Why it matters |
| --- | --- | --- | --- |
| `graph_first -> utir_execution -> freshness_override -> unsupported_schedule` | `memory -> execution -> memory -> memory` | `settling_time_turns=1`, `oscillation_count=2` | one real disturbance, one recovery step, no pointless thrashing |

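
Both stability numbers can be recomputed from the lane path alone. The definitions below are assumptions chosen because they reproduce the reported values: `oscillation_count` counts adjacent lane changes, and `settling_time_turns` counts turns from the last execution turn until the lane is back in `memory`. The benchmark suite may define them differently.

```python
from typing import Optional


def oscillation_count(lanes: list[str]) -> int:
    """Number of adjacent turns where the lane changes."""
    return sum(1 for prev, cur in zip(lanes, lanes[1:]) if prev != cur)


def settling_time_turns(
    lanes: list[str], steady: str = "memory", disturbed: str = "execution"
) -> Optional[int]:
    """Turns from the last disturbance until the steady lane is reached again."""
    if disturbed not in lanes:
        return 0  # no disturbance, nothing to settle
    last = len(lanes) - 1 - lanes[::-1].index(disturbed)
    for offset, lane in enumerate(lanes[last + 1:], start=1):
        if lane == steady:
            return offset
    return None  # the path ended before the controller settled


# Lane path copied from the stability frontier table.
lane_path = ["memory", "execution", "memory", "memory"]
print(oscillation_count(lane_path), settling_time_turns(lane_path))  # 2 1
```
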
## Quickstart

### 1. Start from one conversational front door

```bash
./bin/bvtctl "summarise the current runtime"
./bin/bvtctl chat
./bin/bvtctl ask "run the demo manifest" runtime/examples/demo_manifest.json
./bin/bvtctl context
./bin/bvtctl bootstrap-context
```

Why in plain English: the product should meet the operator as one sentence-driven CLI first, and it should explain its own lineage and policy without repo spelunking.

### 2. Bootstrap from the current source field

```bash
./scripts/build_hotgraph.sh
```

Outputs:

- `hotgraph/source_hotgraph.json`
- `hotgraph/source_hotgraph.md`

### 3. Generate one bootstrap control packet

```bash
./scripts/run_v0.sh migration_brief
```

Outputs:

- `runs/<timestamp>/control_packet.json`
- `runs/<timestamp>/brief.md`

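
Once a run exists, the newest packet can be picked up programmatically. The sketch below is a hypothetical helper, not part of the repo: it assumes the `<timestamp>` directory names sort lexicographically in time order (as a compact UTC stamp like the one in the benchmark run name would) and makes no assumption about the packet's fields beyond it being JSON.

```python
import json
from pathlib import Path
from typing import Optional


def latest_control_packet(runs_dir: str = "runs") -> Optional[dict]:
    """Load control_packet.json from the lexicographically last run directory."""
    # ASSUMPTION: run directory names sort in chronological order.
    candidates = sorted(Path(runs_dir).glob("*/control_packet.json"))
    if not candidates:
        return None  # no run has produced a packet yet
    return json.loads(candidates[-1].read_text(encoding="utf-8"))
```

Calling `latest_control_packet()` from the repo root returns the parsed packet, or `None` when `./scripts/run_v0.sh` has not been run yet.
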
### 4. Read the product slices

Start with:

- `docs/core_workflows_v0.md`
- `docs/api_surface_v0.md`
- `docs/emergent_feature_taxonomy_v0.md`
- `docs/standalone_layout.md`
- `docs/runtime_and_cli.md`
- `docs/kernel_import_plan.md`
- `docs/usecases.md`
- `docs/research_and_generation.md`
- `hotgraph/source_hotgraph.md`

### 5. Build the corpus reasoning packets

```bash
./bin/bvtctl corpus-packets
```

Why in plain English: this should build one mathematical packet per promoted abstraction so Codex can reason over state packets first and reopen raw evidence only when the packet state says it must.

### 6. Generate from reduced state

The intended flow is:

1. use the bootstrap hotgraph to decide what to import or vendor
2. move the chosen slice into the local product layout
3. benchmark the local slice through the control language
4. expose it through the conversational API and orchestrator UX

## Primary Use Cases

| Use case | Why it exists | Real-world analog |
| --- | --- | --- |
| runtime migration planning | decide what to import into the standalone product | editorial desk |
| control scorecard generation | measure bits, vectors, tensors, memory, safety, and economics | cockpit dashboard |
| conversational API shaping | keep one front door over graph state | front desk |
| orchestrator UX shaping | expose one operator surface over the API | control console |
| product quickstart packs | expose the use cases before the machinery | storefront brochure |

## Repo Layout

| Path | Purpose |
| --- | --- |
| `configs/product_slices.json` | canonical standalone product slices |
| `configs/source_registry.json` | canonical source list |
| `configs/usecase_registry.json` | canonical use-case list |
| `configs/kernel_stack.json` | local kernel/runtime/control stack target |
| `schemas/control_packet_v0.json` | minimal packet contract |
| `scripts/build_hotgraph.sh` | thin hotgraph compiler |
| `scripts/generate_control_packet.sh` | thin packet compiler |
| `scripts/run_v0.sh` | one-pass runner |
| `bin/bvtctl` | thin CLI entrypoint |
| `vendor/` | vendored off-the-shelf substrate imports |
| `runtime/` | local runtime contract and entrypoint |
| `policy/` | bits/vectors/tensors control language and rules |
| `benchmarks/` | local control evals and scorecards |
| `api/` | conversational API contracts |
| `ux/` | orchestrator UX contracts |
| `docs/usecases.md` | product-first quickstart |
| `docs/core_workflows_v0.md` | orchestrator loop and core workflows |
| `docs/api_surface_v0.md` | visible API spec |
| `docs/emergent_feature_taxonomy_v0.md` | emergent feature taxonomy |
| `docs/standalone_layout.md` | shipped product layout |
| `docs/runtime_and_cli.md` | single runtime and CLI contract |
| `docs/research_and_generation.md` | how research and generation should work |
| `hotgraph/` | generated graph and summaries |
| `runs/` | generated packets, scorecards, and benchmark runs |

## Product Slices

Pareto read:

| Slice | Why it is first-class | Real-world analog |
| --- | --- | --- |
| graph kernel | holds durable graph state and receipts | engine block |
| control language | expresses bits, vectors, tensors | instrument cluster |
| control benchmarks | proves the control layers work | test track |
| conversational API | one user-facing front door | reception desk |
| orchestrator UX | one operator surface over the API | cockpit |

Layman version: the benchmark, API, and UX are not sidecars. They are part of the shipped machine.

## Bootstrap Intake

The intake field is still:

- `NIX.codecli`
- `tmp-meta3-engine-test`
- `meta3-graph-core`
- `dreaming-kernel`
- `nix-star`
- `causal-workbench`
- `tiny_graph_engine`

These are source systems to compare and import from.
They are not the final runtime boundary of this repo.

## Kernel Stack Target

| Layer | Local target | Upstream reference | Why |
| --- | --- | --- | --- |
| graph kernel | `vendor/meta3-graph-core` | `meta3-graph-core` | deterministic graph reifier |
| product runtime | `runtime/` | `causal-workbench` | thin routed manifest runtime |
| control plane | `policy/` | `nix-star` | ledger-first confidence and policy |
| benchmark plane | `benchmarks/` | local eval doctrine | prove quality, cost, and stability |
| API plane | `api/` | conversational front door doctrine | one interface |
| UX plane | `ux/` | orchestrator surface doctrine | one operator desk |

Layman version: import the best engine, keep one dashboard, and mount the gauges inside the same car.

## CLI

Use the current CLI like this:

```bash
./bin/bvtctl "summarise the current runtime"
./bin/bvtctl ask "run the demo manifest" runtime/examples/demo_manifest.json
./bin/bvtctl chat
./bin/bvtctl context
./bin/bvtctl bootstrap-context
./bin/bvtctl bootstrap
./bin/bvtctl run migration_brief
./bin/bvtctl execute
./bin/bvtctl kernel-plan
./bin/bvtctl benchmark
./bin/bvtctl packet migration_brief
./bin/bvtctl policy
./bin/bvtctl scorecard
./bin/bvtctl runtime
./bin/bvtctl api
./bin/bvtctl ux
```

Why in plain English: the sentence form is now the default front door; the named commands are the operator shelf behind the desk.

## First Local Imports

The first real local slices are now:

- `policy/control_language_v0.json`
- `policy/runtime_profiles_v0.json`
- `benchmarks/control_scorecard_v0.json`
- `benchmarks/control_scorecard_v0.md`
- `runtime/work_manifest_v0.json`
- `runtime/work_manifest_packet_v0.json`
- `runtime/examples/demo_manifest.json`
- `runtime/examples/demo_exec_manifest.json`
- `api/conversational_api_v0.json`
- `ux/orchestrator_ux_v0.md`
- `vendor/meta3-graph-core/schema/receipt_v1.json`

Why: the control language and proof surface are the fastest way to make this repo operational before deeper kernel import.

## Working Rule

Pareto frontier:

- do not ship a repo of external path pointers
- do not invent a new kernel if an off-the-shelf one is already stronger
- do not let docs become the only state
- do not treat benchmarks, API, or UX as optional afterthoughts

Instead:

1. compare source systems
2. choose the strongest slice
3. import it into the local layout
4. benchmark it through bits/vectors/tensors
5. expose it through one API and one UX

## Current Read

Best direction today:

- `meta3-graph-core` = graph kernel import target
- `causal-workbench` = product-runtime pattern
- `nix-star` = control-language and policy pattern
- `NIX.codecli` = conversational API/orchestrator UX reference field
- local `benchmarks/` = mandatory proof surface

That is why this repo exists: to turn the best scattered ideas into one standalone control product.