# Adding Benchmarks v0

Clean product one-liner: benchmark features belong in the shipped product when they directly tune or prove the control loop.

Layman version: if a benchmark changes how the harness steers, it is on-product, not just lab work.

## Is this on product?

Yes, when the benchmark does one of these:
| Benchmark feature | On product? | Why |
| --- | --- | --- |
| lane-selection check | yes | it tunes runtime control |
| memory / freshness check | yes | it tunes graph-backed memory |
| tensor posture check | yes | it tunes the control language |
| receipt coverage check | yes | it governs trust |
| random research probe with no routing consequence | no | keep it in research |
## How to add a benchmark feature

1. Add a focused runner under `benchmarks/`.
2. Write outputs under `runs/benchmark/<name>-<timestamp>/`.
3. Emit:
   - `summary.json`
   - `report.md`
4. Add the benchmark to the README if it informs product behavior.
5. If it changes steering, connect its result back into:
   - `policy/`
   - `benchmarks/control_scorecard_v0.*`
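Steps 1-3 can be sketched as a minimal shell runner. The name `my_check` and every JSON field are hypothetical illustrations; only the `runs/benchmark/<name>-<timestamp>/` layout and the two output files come from the contract above.

```shell
#!/usr/bin/env bash
# Hypothetical runner sketch: "my_check" and the JSON fields are
# illustrative; only the output layout comes from the contract.
set -euo pipefail

name="my_check"
stamp="$(date +%Y%m%d-%H%M%S)"
out="runs/benchmark/${name}-${stamp}"
mkdir -p "$out"

# summary.json: the machine-readable metric surface
cat > "$out/summary.json" <<EOF
{"benchmark": "$name", "timestamp": "$stamp", "pass": true}
EOF

# report.md: the human-readable benchmark brief
cat > "$out/report.md" <<EOF
# $name benchmark report

Run at $stamp. See summary.json for metrics.
EOF

echo "wrote $out"
```

A real runner would compute metrics before writing `summary.json`; the point here is only the directory layout and the two emitted files.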
## Minimal output contract

| File | Purpose |
| --- | --- |
| `summary.json` | machine-readable metric surface |
| `report.md` | human-readable benchmark brief |
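This contract is easy to check mechanically. A sketch, with `check_run` and the `demo-0` directory both hypothetical names:

```shell
#!/usr/bin/env bash
# Hypothetical contract check: a run directory passes only if it
# contains both files from the minimal output contract.
check_run() {
  local dir="$1"
  test -f "$dir/summary.json" || { echo "missing summary.json in $dir"; return 1; }
  test -f "$dir/report.md"    || { echo "missing report.md in $dir"; return 1; }
  echo "contract ok: $dir"
}

# Usage sketch against a hypothetical run directory.
mkdir -p runs/benchmark/demo-0
echo '{}' > runs/benchmark/demo-0/summary.json
echo '# demo' > runs/benchmark/demo-0/report.md
check_run runs/benchmark/demo-0
```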
## Current shipped benchmark

Use:

```bash
./bin/bvtctl benchmark
```

Why: this proves the standalone repo can benchmark its own graph-first and execution lanes through the local CLI.