phanerozoic committed on
Commit daffe23 · verified · 1 Parent(s): d2aa423

Roadmap: add stage 4b and 5b

  1. README.md +2 -0
README.md CHANGED
@@ -37,7 +37,9 @@ See [`stage_0/`](stage_0/) for the classifier config, discovery pipeline, and fu
  | 2b | Structural head removal | Physically shrink qkv/proj tensors, reduce per-block `num_heads` | shipped | F1 0.9159 preserved · backbone 85.64M → 83.68M (1.97M saved, 2.30 %) |
  | 3 | Depth reduction | Drop transformer blocks that do not route signal | shipped | F1 0.876 at K=1 block · F1 collapses at K≥3 · hard ceiling |
  | 4 | Specialist backbone | Train a small student that emits only the target dims | shipped | 3.27M-param student · F1 0.710 · proof of concept, gap to baseline |
+ | 4b | Bigger specialist, cosine loss | 15.67 M student, cosine similarity on full 768-D pooled teacher | shipped | F1 0.723 (+0.013 over Stage 4) · gap to baseline persists |
  | 5 | Circuit-level synthesis | Synthesize the Stage 0 classifier to gates | shipped | **3,220 gates** (1,172 AND + 1,318 NOT + 730 XOR) |
+ | 5b | Popcount reformulation | Per-dim INT8 threshold → popcount → comparator | shipped | **907 gates** (−71 % vs Stage 5 folded), F1 0.876 (−0.008) |

  ## Headline numbers
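The Stage 5b row describes its pipeline only as "Per-dim INT8 threshold → popcount → comparator". As a rough illustration of that shape, and not the repository's actual implementation, the sketch below turns each INT8 feature dimension into one bit by comparing it against its own threshold, counts the set bits, and compares the count against a cutoff. The function name `popcount_classify`, the strict `>` comparison, the `min_votes` cutoff, and the sample values are all assumptions made for the example.

```python
from typing import Sequence

INT8_MIN, INT8_MAX = -128, 127


def popcount_classify(
    features: Sequence[int],
    thresholds: Sequence[int],
    min_votes: int,
) -> bool:
    """Hypothetical threshold -> popcount -> comparator classifier."""
    if len(features) != len(thresholds):
        raise ValueError("features and thresholds must have the same length")
    for value in (*features, *thresholds):
        if not INT8_MIN <= value <= INT8_MAX:
            raise ValueError(f"{value} is outside the INT8 range")

    # Step 1: per-dimension threshold, one bit per dimension.
    # Assumption: a dimension "fires" when it is strictly above its threshold.
    bits = [int(x > t) for x, t in zip(features, thresholds)]

    # Step 2: popcount, i.e. the number of dimensions that fired.
    votes = sum(bits)

    # Step 3: comparator, a single comparison of the count with the cutoff.
    return votes >= min_votes


# Illustrative values only; none of these come from the repository.
print(popcount_classify([12, 5, 40, 7], [10, 0, 50, 7], min_votes=2))
print(popcount_classify([12, -3, 40, 7], [10, 0, 50, 7], min_votes=2))
```

In this form the only arithmetic left is a bit count and one comparison, which is the kind of structure that tends to map to fewer logic gates than a weighted sum; whether that is the source of the 907-gate figure is not stated in the commit.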