# Adding Benchmarks v0

Product one-liner: benchmark features belong in the shipped product when they directly tune or prove the control loop.

In plain terms: if a benchmark changes how the harness steers, it is on-product, not just lab work.

## Is this on product?

Yes, when the benchmark does one of these:

| Benchmark feature | On product? | Why |
| --- | --- | --- |
| lane-selection check | yes | it tunes runtime control |
| memory / freshness check | yes | it tunes graph-backed memory |
| tensor posture check | yes | it tunes the control language |
| receipt coverage check | yes | it governs trust |
| random research probe with no routing consequence | no | keep it in research |

## How to add a benchmark feature

1. Add a focused runner under `benchmarks/`.
2. Write outputs under `runs/benchmark/<name>-<timestamp>/`.
3. Emit:
   - `summary.json`
   - `report.md`
4. Add the benchmark to the README if it informs product behavior.
5. If it changes steering, connect its result back into:
   - `policy/`
   - `benchmarks/control_scorecard_v0.*`
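The steps above can be sketched as a minimal runner. This is a sketch, not the repo's actual runner: the benchmark name `example_check` and the metric fields are illustrative assumptions.

```shell
#!/usr/bin/env sh
# Minimal benchmark-runner sketch. "example_check" and the metric
# fields are illustrative assumptions; adapt them to the real benchmark.
set -eu

NAME="example_check"
TS="$(date +%Y%m%d-%H%M%S)"
OUT="runs/benchmark/${NAME}-${TS}"
mkdir -p "$OUT"

# summary.json: machine-readable metric surface
cat > "$OUT/summary.json" <<EOF
{
  "benchmark": "${NAME}",
  "timestamp": "${TS}",
  "metrics": { "pass_rate": 1.0 }
}
EOF

# report.md: human-readable benchmark brief
cat > "$OUT/report.md" <<EOF
# ${NAME} benchmark

- pass_rate: 1.0
EOF

echo "wrote $OUT"
```

A real runner would compute its metrics before writing `summary.json`; the point here is the directory layout and the two required artifacts.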

## Minimal output contract

| File | Purpose |
| --- | --- |
| `summary.json` | machine-readable metric surface |
| `report.md` | human-readable benchmark brief |

## Current shipped benchmark

Use:

```bash
./bin/bvtctl benchmark
```

Why: this proves the standalone repo can benchmark its own graph-first and execution lanes through the local CLI.