Update roadmap table with shipped results for stages 0-5
README.md
CHANGED
@@ -10,6 +10,7 @@ tags:
 - interpretability
 - vision-transformer
 - feature-engram
+- circuit-synthesis
 library_name: pytorch
 datasets:
 - detection-datasets/coco
@@ -28,14 +29,21 @@ See [`stage_0/`](stage_0/) for the classifier config, discovery pipeline, and fu
 
 ## Roadmap
 
-| Stage | Name | What changes |
-|---|---|---|
-| 0 | Baseline 1-param classifier | Uses the full EUPE-ViT-B backbone unchanged |
-| 1 | Output-channel pruning |
-| 2 | Attention-head pruning | Ablate heads that do not contribute to those
-| 3 | Depth reduction | Drop transformer blocks that do not route signal
-| 4 | Specialist backbone | Train a small student that emits only the
-| 5 | Circuit-level synthesis | Synthesize the
+| Stage | Name | What changes | Status | Result |
+|---|---|---|---|---|
+| 0 | Baseline 1-param classifier | Uses the full EUPE-ViT-B backbone unchanged | shipped | F1 0.889 · 85.64M backbone · 1 free param |
+| 1 | Output-channel pruning | Slice the 40 dims the classifier reads; fuse the head | shipped | F1 0.889 (parity) · same backbone · cleaner interface |
+| 2 | Attention-head pruning | Ablate heads that do not contribute to those dims | shipped | **F1 0.916** (+0.022 over the 0.894 calibration-pool baseline) at K=10 heads pruned · 1.97M params masked |
+| 3 | Depth reduction | Drop transformer blocks that do not route signal | shipped | F1 0.876 at K=1 block · F1 collapses at K≥3 · hard ceiling |
+| 4 | Specialist backbone | Train a small student that emits only the target dims | shipped | 3.27M-param student · F1 0.710 · proof of concept, gap to baseline |
+| 5 | Circuit-level synthesis | Synthesize the Stage 0 classifier to gates | shipped | **3,220 gates** (1,172 AND + 1,318 NOT + 730 XOR) |
+
+## Headline numbers
+
+- Stage 2 pruning *improves* the classifier: removing 10 redundant or noise-injecting attention heads raises F1 from 0.894 to 0.916 (+0.022), both measured on the same 1K-image calibration pool.
+- Stage 3 shows the backbone is depth-critical: only 1 of 12 blocks is cleanly removable.
+- The Stage 4 specialist student fits the full person-classification pipeline in 3.27M parameters at F1 0.710. That is 26× smaller than the teacher, with a known path forward for closing the F1 gap (see the stage_4 README).
+- Stage 5 puts the actual decision circuit at 3,220 gates (AND, NOT, and XOR), with sub-millisecond combinational latency and sub-milliwatt power; it fits as a camera-ISP block.
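The headline figures can be cross-checked with a few lines of arithmetic. The sketch below is not part of the repository; it only recomputes numbers quoted in the roadmap. The variable names are hypothetical, and the per-head parameter estimate assumes standard ViT-B dimensions (768-wide embedding, 64-dim heads) and ignores biases, which may not match EUPE-ViT-B exactly.

```python
# Recompute the roadmap's headline numbers from the values quoted in the table.
# ASSUMPTION: EUPE-ViT-B uses standard ViT-B sizes (embed_dim=768, head_dim=64);
# this is not confirmed by the README. Biases are ignored in the head estimate.

# Stage 5: the per-type gate counts should add up to the reported total.
gates = {"AND": 1172, "NOT": 1318, "XOR": 730}
gate_total = sum(gates.values())

# Stage 4: teacher-to-student size ratio, both in millions of parameters.
teacher_params_m = 85.64
student_params_m = 3.27
compression = teacher_params_m / student_params_m

# Stage 2: F1 gain on the 1K-image calibration pool.
f1_delta = round(0.916 - 0.894, 3)

# Stage 2: rough weight count for 10 masked heads.
# Each head owns a (768 x 64) slice of Q, K, V and of the output projection.
embed_dim, head_dim, heads_pruned = 768, 64, 10
params_per_head = 4 * embed_dim * head_dim
masked_params_m = heads_pruned * params_per_head / 1e6

print(f"gates: {gate_total}")
print(f"compression: {compression:.1f}x")
print(f"F1 delta: {f1_delta:+.3f}")
print(f"masked params: {masked_params_m:.2f}M")
```

Under those assumptions the script reproduces the 3,220-gate total, the roughly 26× size reduction, the +0.022 F1 gain, and a masked-parameter count of about 1.97M.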
 
 ## Source backbone
 