phanerozoic committed
Commit caedda8 · verified · 1 Parent(s): cb69dfe

Update roadmap table with shipped results for stages 0-5

Files changed (1)
  1. README.md +16 -8
README.md CHANGED
@@ -10,6 +10,7 @@ tags:
  - interpretability
  - vision-transformer
  - feature-engram
+ - circuit-synthesis
  library_name: pytorch
  datasets:
  - detection-datasets/coco
@@ -28,14 +29,21 @@ See [`stage_0/`](stage_0/) for the classifier config, discovery pipeline, and fu
 
  ## Roadmap
 
- | Stage | Name | What changes |
- |---|---|---|
- | 0 | Baseline 1-param classifier | Uses the full EUPE-ViT-B backbone unchanged |
- | 1 | Output-channel pruning | Keep only the 100 feature dims the classifier reads |
- | 2 | Attention-head pruning | Ablate heads that do not contribute to those 100 dims |
- | 3 | Depth reduction | Drop transformer blocks that do not route signal to the 100 dims |
- | 4 | Specialist backbone | Train a small student that emits only the 100 target dims |
- | 5 | Circuit-level synthesis | Synthesize the entire fixed-weight pipeline to gates and dead-code eliminate everything that does not reach the classifier output |
+ | Stage | Name | What changes | Status | Result |
+ |---|---|---|---|---|
+ | 0 | Baseline 1-param classifier | Uses the full EUPE-ViT-B backbone unchanged | shipped | F1 0.889 · 85.64M backbone · 1 free param |
+ | 1 | Output-channel pruning | Slice the 40 dims the classifier reads; fuse the head | shipped | F1 0.889 (parity) · same backbone · cleaner interface |
+ | 2 | Attention-head pruning | Ablate heads that do not contribute to those dims | shipped | **F1 0.916** (+0.022) at K=10 heads pruned · 1.97M params masked |
+ | 3 | Depth reduction | Drop transformer blocks that do not route signal | shipped | F1 0.876 at K=1 block · F1 collapses at K≥3 · hard ceiling |
+ | 4 | Specialist backbone | Train a small student that emits only the target dims | shipped | 3.27M-param student · F1 0.710 · proof of concept, gap to baseline |
+ | 5 | Circuit-level synthesis | Synthesize the Stage 0 classifier to gates | shipped | **3,220 gates** (1,172 AND + 1,318 NOT + 730 XOR) |
+
+ ## Headline numbers
+
+ - Stage 2 pruning *improves* the classifier: removing 10 redundant / noise-injecting attention heads raises F1 from 0.894 (1K-image calibration) to 0.916 on the same calibration pool.
+ - Stage 3 shows the backbone is depth-critical: only 1 of 12 blocks is cleanly removable.
+ - Stage 4 specialist student fits the full person-classification pipeline in 3.27M parameters at F1 0.710 — 26× smaller than the teacher, with a known path forward for closing the F1 gap (see stage_4 README).
+ - Stage 5 puts the actual decision circuit at 3,220 universal gates. Sub-millisecond combinational latency; sub-milliwatt power. Fits as a camera-ISP block.
 
  ## Source backbone
 
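The figures in the added table rows and headline bullets can be cross-checked with a few lines of arithmetic. The sketch below is not part of the commit; it only re-derives the gate total, the teacher-to-student size ratio, and the Stage 2 F1 gain from numbers quoted in the diff. The variable names are illustrative. Note that the "+0.022" in the Stage 2 row matches the 0.894 to 0.916 calibration-pool figures from the headline bullet, not the 0.889 reported for Stage 0.

```python
# Figures copied from the updated roadmap table and headline bullets.
# Variable names are hypothetical; nothing here comes from the repo's code.
gate_counts = {"AND": 1172, "NOT": 1318, "XOR": 730}
total_gates = sum(gate_counts.values())

teacher_params_m = 85.64  # Stage 0 backbone, millions of parameters
student_params_m = 3.27   # Stage 4 student, millions of parameters
compression = teacher_params_m / student_params_m

f1_before, f1_after = 0.894, 0.916  # Stage 2, 1K-image calibration pool
f1_gain = round(f1_after - f1_before, 3)

print(f"total gates: {total_gates}")
print(f"teacher/student size ratio: {compression:.1f}x")
print(f"Stage 2 F1 gain: {f1_gain:+.3f}")
```

All three derived values agree with the README's stated 3,220 gates, 26× size reduction, and +0.022 F1.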