phanerozoic committed on
Commit a4e28a5 · verified · 1 Parent(s): 272cd6a

Update roadmap

Files changed (1)
  1. todo.md +145 -120
todo.md CHANGED
@@ -1,148 +1,173 @@
  # Threshold Logic Neural Turing Machine

- ## Vision

- A verified computational coprocessor embedded in transformer architecture:
- - **Frozen circuits**: Exhaustively tested threshold logic (can't compute wrong)
- - **ACT execution**: Runs until HALT within single forward pass
- - **Dual memory**: Hidden state integration + dedicated 64KB address space
- - **LLM integration**: Router/Extract/Inject learned, computation exact

- ## Architecture Overview

  ```
- ┌────────────────────────────────────────────────────────────┐
- │                     Transformer Layer                      │
- │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
- │  │  Attention   │  │     MLP      │  │   ThresholdCPU   │  │
- │  │              │  │              │  │   (ACT-style)    │  │
- │  │              │  │              │  │                  │  │
- │  │              │  │              │  │ ┌──────────────┐ │  │
- │  │              │  │              │  │ │Router (learn)│ │  │
- │  │              │  │              │  │ ├──────────────┤ │  │
- │  │              │  │              │  │ │Extract (learn│ │  │
- │  │              │  │              │  │ ├──────────────┤ │  │
- │  │              │  │              │  │ │CPU (frozen)  │ │  │
- │  │              │  │              │  │ │ ↻ until HALT │ │  │
- │  │              │  │              │  │ ├──────────────┤ │  │
- │  │              │  │              │  │ │Inject (learn)│ │  │
- │  │              │  │              │  │ └──────────────┘ │  │
- │  └──────┬───────┘  └──────┬───────┘  └────────┬─────────┘  │
- │         └─────────────────┴───────────────────┘            │
- │                           ↓                                │
- │                  Residual + CPU State                      │
- └────────────────────────────────────────────────────────────┘
  ```

  ## Memory Architecture

- ### Hidden State Integration (Hot Memory)
- Reserve dimensions of residual stream for CPU state:
- ```
- dims 0-511: CPU memory (512 bits = 64 bytes hot cache)
- dims 512-543: Registers (32 bits = 4 × 8-bit)
- dims 544-551: PC (8 bits)
- dims 552-555: Flags (4 bits: Z, N, C, V)
- dims 556-559: Control (halt, interrupt, etc.)
- dims 560-959: Normal embeddings (400 dims)
- ```
-
- ### Dedicated Memory Bank (Cold Storage)
- Full 64KB addressable memory via routing circuits:
  ```
- Address space: 0x0000 - 0xFFFF (65,536 bytes)
- Tensors: ~1.6M (routing overhead)
- Access: Via 16-bit address decoder + mux/demux
  ```

  ### Memory Hierarchy
- | Level | Size | Access | Use Case |
- |-------|------|--------|----------|
- | Registers | 4 × 8-bit | Direct | Operands, accumulators |
- | Hot cache | 64 bytes | Embedded in hidden state | Stack, scratch |
- | Cold bank | 64KB | Circuit-routed | Programs, data, heap |

  ## Phase 1: Memory Infrastructure

  | Component | Description | Tensors | Status |
  |-----------|-------------|---------|--------|
  | Address Decoder 16-bit | 16-bit → 65536 one-hot | ~65,600 | Pending |
- | Memory Read MUX | 65536-to-1 × 8 bits | ~524,288 | Pending |
- | Memory Write Demux | Route to addressed byte | ~524,288 | Pending |
- | Memory Cell Logic | Conditional update per byte | ~524,288 | Pending |
- | Bank Controller | Page/bank switching | ~1,000 | Pending |
-
- **Estimated Phase 1 total: ~1.64M tensors**

- ## Phase 2: ACT Execution Engine

  | Component | Description | Status |
  |-----------|-------------|--------|
- | Cycle Block | One fetch/decode/execute iteration | Pending |
- | Halt Detector | HALT instruction → stop signal | Pending |
- | Cycle Counter | Track pondering steps | Pending |
- | State Checkpointing | Save state for gradient flow | Pending |
-
- ## Phase 3: LLM Integration Layers
-
- | Component | Description | Trainable | Status |
- |-----------|-------------|-----------|--------|
- | Router | Detect computation need | Yes | Pending |
- | State Extractor | Embeddings → CPU state | Yes | Pending |
- | State Injector | CPU state → embedding delta | Yes | Pending |
- | KV Cache Binding | CPU state persists with cache | No | Pending |
-
- ## Phase 4: Instruction Set
-
- | Category | Instructions | Status |
- |----------|--------------|--------|
- | Arithmetic | ADD, SUB, MUL, DIV, NEG, ADC, SBC | Done |
- | Logic | AND, OR, XOR, NOT, shifts, rotates | Done |
- | Compare | CMP (sets flags) | Done |
- | Control | JMP, Jcc (conditional), CALL, RET | Partial |
- | Memory | LOAD, STORE (8/16-bit addressing) | Pending |
- | Stack | PUSH, POP | Partial |
- | System | NOP, HALT | Done |
-
- ## Completed Building Blocks
-
- ### Arithmetic Core (2,756 tensors)
- - NEG: 76 tensors, 256/256 tests
- - SUB: 162 tensors, 65536/65536 tests
- - ADC: 144 tensors, 131072/131072 tests
- - SBC: 160 tensors, 131072/131072 tests
- - DIV: 1984 tensors, 65280/65280 tests
- - CMP: 168 tensors, 65536/65536 tests
- - ASR/ROL/ROR: 62 tensors total
-
- ### Control Core (306 tensors)
- - NOP: 24 tensors, 4096/4096 tests
- - HALT: 42 tensors, 24576/24576 tests
- - PC Incrementer: 62 tensors, 256/256 tests
- - PC Load MUX: 50 tensors, 1536/1536 tests
- - Register MUX: 84 tensors, 1036/1036 tests
- - Instruction Decoder: 44 tensors, 16/16 tests
-
- ### Original Model
- - Boolean gates, adders, multiplier, comparators
- - Threshold gates, pattern recognition
- - Modular arithmetic, error detection
- - ~3,100 tensors
-
- **Current total: 6,184 tensors**
- **Projected with 64KB memory: ~1.65M tensors**

- ## Design Principles
-
- 1. **Frozen correctness**: Circuit weights never change, exhaustively verified
- 2. **Learned interface**: Router/Extract/Inject are trainable, CPU is not
- 3. **Functional state**: Memory flows through as data, not mutated weights
- 4. **Halting semantics**: HALT instruction terminates ACT loop
- 5. **Composable**: Each circuit tested in isolation, composed at runtime

- ## Key Insight

- The LLM learns **when** to compute and **how** to format input/output.
- The CPU defines **what** computation means - exactly, verifiably, always.

- This is not a learned calculator. This is a proven calculator with a learned interface.

  # Threshold Logic Neural Turing Machine

+ ## Core Vision

+ A self-contained, autonomous computational machine:
+ - **Pure tensor computation**: State in, state out
+ - **Frozen verified circuits**: Exhaustively tested, can't compute wrong
+ - **ACT execution**: Internal loop until HALT
+ - **No external orchestration**: One forward pass = complete program execution

+ The machine runs. Callers just provide initial state and collect results.
+
+ ## Architecture

  ```
+ ┌─────────────────────────────┐
+ │        Initial State        │
+ │  [PC|Regs|Flags|Memory...]  │
+ └──────────────┬──────────────┘
+                ▼
+ ┌─────────────────────────────┐
+ │                             │
+ │   Threshold Circuit Layer   │
+ │                             │
+ │  ┌───────────────────────┐  │
+ │  │ Fetch: PC → Instr     │  │
+ │  ├───────────────────────┤  │
+ │  │ Decode: Opcode/Ops    │  │
+ │  ├───────────────────────┤  │
+ │  │ Execute: ALU/Mem      │  │
+ │  ├───────────────────────┤  │
+ │  │ Writeback: Results    │  │
+ │  ├───────────────────────┤  │
+ │  │ PC Update             │  │
+ │  └───────────┬───────────┘  │
+ │              │              │
+ │         ┌────▼────┐         │
+ │         │ HALTED? │         │
+ │         └────┬────┘         │
+ │              │              │
+ │         no ──┴── yes        │
+ │         │         │         │
+ │         ▼         ▼         │
+ │      [loop]    [exit]       │
+ │                             │
+ └──────────────┬──────────────┘
+                ▼
+ ┌─────────────────────────────┐
+ │         Final State         │
+ │  [PC|Regs|Flags|Memory...]  │
+ └─────────────────────────────┘
  ```

  ## Memory Architecture

+ ### State Tensor Layout
  ```
+ ┌────────┬──────────┬───────┬────────┬─────────────────────┐
+ │ PC [8] │ Regs[32] │Flags[4│Ctrl[4] │   Memory [N × 8]    │
+ └────────┴──────────┴───────┴────────┴─────────────────────┘
+     8    +    32    +   4   +   4    +     N × 8 bits
  ```
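
The layout above can be sketched as a pair of pack/unpack helpers. This is only an illustration, not the repository's code: the names `pack_state`, `unpack_state`, `to_bits` and `from_bits` are hypothetical, the big-endian bit order within each field is an assumption, and a plain Python list stands in for the state tensor.

```python
HEADER_BITS = 8 + 32 + 4 + 4  # PC + four 8-bit registers + flags + control = 48


def to_bits(value, width):
    """Return `value` as a big-endian list of `width` bits (assumed bit order)."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits: fold a big-endian bit list back into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def pack_state(pc, regs, flags, ctrl, memory):
    """Flatten CPU state into one bit vector: PC | Regs | Flags | Ctrl | Memory."""
    bits = to_bits(pc, 8)
    for reg in regs:  # four 8-bit registers
        bits += to_bits(reg, 8)
    bits += to_bits(flags, 4)
    bits += to_bits(ctrl, 4)
    for byte in memory:  # N bytes, 8 bits each
        bits += to_bits(byte, 8)
    return bits


def unpack_state(bits):
    """Split a packed bit vector back into its fields."""
    pc = from_bits(bits[0:8])
    regs = [from_bits(bits[8 + 8 * i:16 + 8 * i]) for i in range(4)]
    flags = from_bits(bits[40:44])
    ctrl = from_bits(bits[44:48])
    memory = [from_bits(bits[i:i + 8]) for i in range(HEADER_BITS, len(bits), 8)]
    return pc, regs, flags, ctrl, memory
```

With a 65,536-byte memory the packed vector has 48 + 524,288 = 524,336 entries, which matches the per-instance figure given for the full 64KB configuration.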

  ### Memory Hierarchy
+ | Level | Size | Tensors | Access |
+ |-------|------|---------|--------|
+ | Registers | 4 × 8-bit | Direct wiring | Immediate |
+ | Hot cache | 256 bytes | ~6,400 | 8-bit addressed |
+ | Cold bank | 64KB | ~1.6M | 16-bit addressed |
+
+ ### Full 64KB Configuration
+ - Address space: 0x0000 - 0xFFFF
+ - Routing circuits: ~1.64M tensors
+ - State tensor: 48 + 524,288 = 524,336 bits per instance

  ## Phase 1: Memory Infrastructure

  | Component | Description | Tensors | Status |
  |-----------|-------------|---------|--------|
+ | Address Decoder 8-bit | 8-bit → 256 one-hot | ~520 | Pending |
  | Address Decoder 16-bit | 16-bit → 65536 one-hot | ~65,600 | Pending |
+ | Memory Read MUX 256 | 256-to-1 × 8 bits | ~2,048 | Pending |
+ | Memory Read MUX 64K | 65536-to-1 × 8 bits | ~524,288 | Pending |
+ | Memory Write Demux | Route write to address | ~524,288 | Pending |
+ | Memory Cell Logic | Conditional update | ~524,288 | Pending |
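
One way the 8-bit decoder, read MUX and write demux in this table could fit together is sketched below. It is an assumption-laden illustration rather than the planned circuits: each decoder output is modeled as a single threshold gate, the MUX is written as a weighted sum over the one-hot select lines instead of explicit AND/OR gates, and all names (`threshold`, `decode`, `read`, `write`) are hypothetical.

```python
ADDR_BITS = 8
SIZE = 1 << ADDR_BITS  # 256 addressable bytes


def threshold(weights, bias, inputs):
    """Threshold gate: outputs 1 iff the weighted sum plus bias is >= 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias >= 0 else 0


def addr_bits(addr):
    """Big-endian bit list of an address (assumed bit order)."""
    return [(addr >> (ADDR_BITS - 1 - i)) & 1 for i in range(ADDR_BITS)]


# One gate per address k: weight +1 on bits set in k, -1 on clear bits,
# bias = -popcount(k). The sum reaches popcount(k) only when the input equals k.
DECODER = []
for k in range(SIZE):
    pattern = addr_bits(k)
    DECODER.append(([2 * b - 1 for b in pattern], -sum(pattern)))


def decode(addr):
    """8-bit address -> 256-wide one-hot select vector."""
    x = addr_bits(addr)
    return [threshold(w, b, x) for w, b in DECODER]


def read(memory, addr):
    """Read MUX: exactly one select line is 1, so the sum picks one byte."""
    return sum(s * byte for s, byte in zip(decode(addr), memory))


def write(memory, addr, value, write_enable=1):
    """Write demux + cell logic: returns a new memory, leaving the input untouched."""
    select = decode(addr)
    return [value if (s and write_enable) else byte
            for s, byte in zip(select, memory)]
```

Returning a fresh list from `write` keeps the sketch consistent with the "state in, state out" idea: memory is data passed through the circuit, not something mutated in place.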
 
 

+ ## Phase 2: Execution Engine

  | Component | Description | Status |
  |-----------|-------------|--------|
+ | Instruction Fetch | PC → Memory → IR | Pending |
+ | Operand Fetch | Decode → Register/Memory Read | Pending |
+ | ALU Dispatch | Opcode → Operation Select | Pending |
+ | Result Writeback | Route to destination | Pending |
+ | Flag Update | Compute Z/N/C/V | Partial |
+ | PC Advance | Increment or Jump | Done |
+ | Halt Detection | HALT opcode → stop | Done |

+ ## Phase 3: ACT Integration

+ | Component | Description | Status |
+ |-----------|-------------|--------|
+ | Cycle Block | All Phase 2 as single layer | Pending |
+ | Recurrence Wrapper | Loop until halt signal | Pending |
+ | Max Cycles Guard | Prevent infinite loops | Pending |
+ | State I/O | Pack/unpack state tensor | Pending |
+
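
A minimal sketch of how the Recurrence Wrapper and Max Cycles Guard might interact, assuming the Cycle Block is available as a pure function from state to state. The function name `run_until_halt`, the `halt_index` parameter and the toy `countdown_step` cycle are all hypothetical; the real cycle block would be the composed Phase 2 circuits.

```python
def run_until_halt(state, step, halt_index, max_cycles=1024):
    """Apply `step` repeatedly until the halt bit is set or the guard trips.

    Returns (final_state, cycles_executed, halted).
    """
    cycles = 0
    for _ in range(max_cycles):
        if state[halt_index] == 1:  # halt bit already set: stop before stepping
            break
        state = step(state)  # pure: builds a new state, no side effects
        cycles += 1
    return state, cycles, state[halt_index] == 1


def countdown_step(state):
    """Toy stand-in for the cycle block: state = [counter, halt_bit]."""
    counter = state[0] - 1
    return [counter, 1 if counter == 0 else 0]
```

Calling `run_until_halt([5, 0], countdown_step, halt_index=1)` runs five cycles and stops with the halt bit set. With `max_cycles=3` the guard returns after three cycles with `halted` false, so the caller can tell a finished program from one that was cut off.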
+ ## Instruction Set
+
+ | Opcode | Mnemonic | Operation | Status |
+ |--------|----------|-----------|--------|
+ | 0x0 | ADD | R[d] = R[a] + R[b] | Done |
+ | 0x1 | SUB | R[d] = R[a] - R[b] | Done |
+ | 0x2 | AND | R[d] = R[a] & R[b] | Done |
+ | 0x3 | OR | R[d] = R[a] \| R[b] | Done |
+ | 0x4 | XOR | R[d] = R[a] ^ R[b] | Done |
+ | 0x5 | SHL | R[d] = R[a] << 1 | Done |
+ | 0x6 | SHR | R[d] = R[a] >> 1 | Done |
+ | 0x7 | MUL | R[d] = R[a] * R[b] | Done |
+ | 0x8 | DIV | R[d] = R[a] / R[b] | Done |
+ | 0x9 | CMP | flags = R[a] - R[b] | Done |
+ | 0xA | LOAD | R[d] = M[addr] | Pending |
+ | 0xB | STORE | M[addr] = R[s] | Pending |
+ | 0xC | JMP | PC = addr | Partial |
+ | 0xD | JZ/JNZ | PC = addr if flag | Done |
+ | 0xE | CALL | push PC; PC = addr | Partial |
+ | 0xF | HALT | stop execution | Done |
+
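
As a reference model for the register-to-register rows of the table (opcodes 0x0 to 0x8), the intended semantics can be written in a few lines of ordinary Python. Several details here are assumptions that the table does not state: results are truncated to 8 bits, MUL keeps only the low byte, DIV is integer division and rejects a zero divisor, and the operand fields `d`, `a`, `b` arrive already decoded. The names `ALU_OPS` and `execute` are hypothetical.

```python
MASK = 0xFF  # registers are 8 bits wide

# Opcode -> operation on two 8-bit operands (unary ops ignore b).
ALU_OPS = {
    0x0: lambda a, b: a + b,   # ADD
    0x1: lambda a, b: a - b,   # SUB
    0x2: lambda a, b: a & b,   # AND
    0x3: lambda a, b: a | b,   # OR
    0x4: lambda a, b: a ^ b,   # XOR
    0x5: lambda a, b: a << 1,  # SHL
    0x6: lambda a, b: a >> 1,  # SHR
    0x7: lambda a, b: a * b,   # MUL (low byte kept; an assumption)
    0x8: lambda a, b: a // b,  # DIV (integer division; an assumption)
}


def execute(opcode, regs, d, a, b):
    """Return a new register file with R[d] = op(R[a], R[b]), wrapped to 8 bits."""
    if opcode not in ALU_OPS:
        raise ValueError(f"opcode {opcode:#x} is not a register ALU operation")
    if opcode == 0x8 and regs[b] == 0:
        raise ZeroDivisionError("DIV by zero is left undefined in this sketch")
    result = ALU_OPS[opcode](regs[a], regs[b]) & MASK
    new_regs = list(regs)  # copy so the caller's register file is untouched
    new_regs[d] = result
    return new_regs
```

A model like this is mainly useful as an oracle: each frozen circuit can be compared against it over every input combination.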
+ ## Completed Circuits
+
+ ### Arithmetic (2,756 tensors)
+ - ADD, SUB, MUL, DIV, NEG
+ - ADC, SBC (with carry)
+ - CMP (compare, sets flags)
+
+ ### Bit Operations (62 tensors)
+ - ASR (arithmetic shift right)
+ - ROL, ROR (rotate through carry)
+ - SHL, SHR (from original)
+
+ ### Control (306 tensors)
+ - NOP, HALT
+ - PC Increment, PC Load MUX
+ - Register MUX 4-to-1
+ - Instruction Decoder 4-to-16
+
+ ### Original Model (~3,100 tensors)
+ - Boolean gates (AND, OR, XOR, NOT, NAND, NOR)
+ - Ripple carry adders (2/4/8-bit)
+ - 8×8 multiplier
+ - Comparators, threshold gates
+ - Conditional jumps
+
+ **Current: 6,184 tensors**
+ **Projected: ~1.65M tensors (with 64KB memory)**
+
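
To make "exhaustively tested" concrete, here is a small sketch of an 8-bit ripple-carry adder built from threshold gates and checked against every one of the 65,536 input pairs. The weights below are one textbook construction of a full adder, chosen for illustration; they are not taken from the repository, and the function names are hypothetical.

```python
def gate(weights, bias, inputs):
    """Threshold gate: outputs 1 iff the weighted sum plus bias is >= 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias >= 0 else 0


def full_adder(a, b, cin):
    """Two threshold gates: carry is a majority, sum reuses the carry."""
    carry = gate([1, 1, 1], -2, [a, b, cin])              # a + b + cin >= 2
    total = gate([1, 1, 1, -2], -1, [a, b, cin, carry])   # a + b + cin - 2*carry >= 1
    return total, carry


def add8(x, y):
    """8-bit ripple-carry add; returns (sum mod 256, carry out)."""
    carry, out = 0, 0
    for i in range(8):  # least significant bit first
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        out |= s << i
    return out, carry


def verify_add8():
    """Count how many of the 256 * 256 input pairs match integer addition."""
    passed = 0
    for x in range(256):
        for y in range(256):
            if add8(x, y) == ((x + y) & 0xFF, (x + y) >> 8):
                passed += 1
    return passed
```

Because the input space is finite and small, a pass count equal to the number of cases is a complete check rather than a sample, which is what allows the weights to be frozen afterwards.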
+ ## Applications
+
+ The machine is general-purpose. Possible callers:
+
+ 1. **Direct invocation**: Load state, call forward(), read result
+ 2. **LLM coprocessor**: Embedded layer for exact computation
+ 3. **Neuromorphic deployment**: Map to spiking hardware
+ 4. **Verified computation**: Provably correct execution
+ 5. **Educational**: Transparent, inspectable CPU

+ ## Design Principles

+ 1. **Autonomy**: Machine runs without external logic
+ 2. **Purity**: forward(state) → state', no side effects
+ 3. **Verification**: Every circuit exhaustively tested
+ 4. **Transparency**: All weights inspectable, all operations traceable
+ 5. **Universality**: Turing complete, runs arbitrary programs