phanerozoic committed on
Commit dc52122 · verified
1 Parent(s): 204acbf

Update roadmap: self-contained tensor CPU architecture

Files changed (1)
  1. todo.md +101 -42
todo.md CHANGED
@@ -1,42 +1,101 @@
- # Missing CPU Components
-
- ## Core CPU Components
-
- | Component | Description | Status |
- |-------------------------|-----------------------------|--------------------------------|
- | SUB | Subtraction circuit | DONE - 162 tensors, 65536/65536 tests pass |
- | DIV | Division circuit | DONE - 1984 tensors, 65280/65280 tests pass |
- | NEG | Two's complement negate | DONE - 76 tensors, 256/256 tests pass |
- | Program Counter | PC incrementer | DONE - 62 tensors, 256/256 tests pass |
- | PC Load | 2-to-1 MUX for PC/jump | DONE - 50 tensors, 1536/1536 tests pass |
- | Register File MUX | Select 1-of-4 GPRs | DONE - 84 tensors, 1036/1036 tests pass |
- | Register Write Enable | Write back to register | Missing |
- | Memory Address Register | MAR latch | Missing |
- | Memory Data Register | MDR latch | Missing |
- | Memory Read/Write | R/W enable signals | Missing |
- | Instruction Register | IR latch | Missing |
- | Instruction Decoder | 4-bit → 16 one-hot | DONE - 44 tensors, 16/16 tests pass |
- | Sequencer FSM | Fetch/Decode/Execute states | Missing |
-
- ## Extended Operations
-
- | Component | Description | Status |
- |-----------|--------------------------------------|-----------------------------------------------------|
- | ROL | Rotate left through carry | DONE - 18 tensors, 512/512 tests pass |
- | ROR | Rotate right through carry | DONE - 18 tensors, 512/512 tests pass |
- | ASR | Arithmetic shift right (sign-extend) | DONE - 26 tensors, 256/256 tests pass |
- | ADC | Add with carry input | DONE - 144 tensors, 131072/131072 tests pass |
- | SBC | Subtract with borrow | DONE - 160 tensors, 131072/131072 tests pass |
- | CMP | Compare (SUB without writeback) | DONE - 168 tensors, 65536/65536 tests pass |
-
- ## System
-
- | Component | Description | Status |
- |------------------------|------------------|---------|
- | Interrupt Controller | IRQ/NMI handling | Missing |
- | Interrupt Vector Table | Jump addresses | Missing |
- | Interrupt Enable Flag | Global IE bit | Missing |
- | I/O Ports | IN/OUT data path | Missing |
- | HALT | Stop execution | DONE - 42 tensors, 24576/24576 tests pass |
- | NOP | No operation | DONE - 24 tensors, 4096/4096 tests pass |
- | Watchdog Timer | Reset on timeout | Missing |
+ # Self-Contained Tensor CPU Roadmap
+
+ ## Vision
+ A fully self-contained CPU where:
+ - All computation is threshold circuits (frozen weights)
+ - Memory is a tensor partition (data flows through)
+ - Stepper logic is encoded as circuits (no external orchestration)
+ - One forward pass = one clock tick
+
+ ## Architecture
+
+ ```
+ Input State Tensor:
+ ┌────────┬───────────┬───────┬─────────────────┐
+ │ PC [8] │ Regs [32] │ Flags │ Memory [N×8]    │
+ └────────┴───────────┴───────┴─────────────────┘
+                       ↓
+               Threshold Circuits
+             (fetch/decode/execute)
+                       ↓
+ Output State Tensor:
+ ┌─────────┬───────────┬────────┬─────────────────┐
+ │ PC' [8] │ Regs' [32]│ Flags' │ Memory' [N×8]   │
+ └─────────┴───────────┴────────┴─────────────────┘
+ ```
+
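To make the state layout in the diagram concrete, here is a minimal sketch of how a flat bit vector could be split into the four sections and reassembled. It uses plain Python lists rather than tensors. The PC and register widths come from the diagram; `FLAG_BITS`, `MEM_BYTES`, and the function names are assumptions made for illustration, not part of the repository.

```python
# Hypothetical layout. PC and register widths follow the diagram;
# FLAG_BITS and MEM_BYTES are illustrative assumptions.
PC_BITS = 8
REG_BITS = 32          # "Regs [32]" in the diagram
FLAG_BITS = 4          # assumption: the diagram gives no width for Flags
MEM_BYTES = 16         # assumption: a small N for demonstration

SECTIONS = [
    ("pc", PC_BITS),
    ("regs", REG_BITS),
    ("flags", FLAG_BITS),
    ("memory", MEM_BYTES * 8),
]


def section_slices(sections):
    """Map each section name to its slice of the flat state vector."""
    slices, offset = {}, 0
    for name, width in sections:
        slices[name] = slice(offset, offset + width)
        offset += width
    return slices, offset


SLICES, STATE_BITS = section_slices(SECTIONS)


def unpack_state(state):
    """Split a flat list of bits into named sections."""
    assert len(state) == STATE_BITS
    return {name: state[s] for name, s in SLICES.items()}


def pack_state(parts):
    """Concatenate named sections back into one flat list of bits."""
    state = []
    for name, width in SECTIONS:
        assert len(parts[name]) == width
        state.extend(parts[name])
    return state
```

Because the offsets are fixed, both functions are pure wiring: in a tensor implementation they would reduce to slicing and concatenation with no learned or computed weights.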
+ ## Phase 1: Memory Infrastructure
+
+ | Component | Description | Status |
+ |-----------|-------------|--------|
+ | Memory Address Decoder | 8-bit address → 256 one-hot select | Pending |
+ | Memory Read MUX | 256-to-1 mux, select byte by address | Pending |
+ | Memory Write Demux | Route write data to addressed location | Pending |
+ | Memory Cell Logic | Conditional update: new or keep old | Pending |
+
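The four Phase 1 components can be emulated with threshold gates in a few lines. The sketch below is a plain-Python illustration under stated assumptions: the gate convention (fire when the weighted sum plus bias is at least zero), the MSB-first bit order, and every function name are hypothetical and not taken from the repository's circuits.

```python
def gate(weights, bias, inputs):
    """Threshold gate: outputs 1 when the weighted sum plus bias is >= 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias >= 0 else 0


def AND(a, b):
    return gate([1, 1], -2, [a, b])


def OR(*xs):
    return gate([1] * len(xs), -1, xs)


def NOT(a):
    return gate([-1], 0, [a])


def to_bits(value, width=8):
    """Integer to a list of bits, most significant bit first."""
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]


def address_select(addr_bits, n_cells):
    """Address decoder: one gate per cell, firing only for that cell's address."""
    select = []
    for cell in range(n_cells):
        target = to_bits(cell, len(addr_bits))
        weights = [1 if t else -1 for t in target]
        select.append(gate(weights, -sum(target), addr_bits))
    return select


def memory_read(select, memory):
    """Read mux: per bit position, OR together (select AND cell bit)."""
    width = len(memory[0])
    return [OR(*[AND(s, cell[j]) for s, cell in zip(select, memory)])
            for j in range(width)]


def memory_write(select, write_enable, data, memory):
    """Write demux plus cell logic: load the new byte if selected, else keep."""
    new_memory = []
    for s, cell in zip(select, memory):
        load = AND(s, write_enable)
        keep = NOT(load)
        new_memory.append([OR(AND(load, d), AND(keep, old))
                           for d, old in zip(data, cell)])
    return new_memory


# Usage: write 0xA5 to address 37 of a 256-byte memory, then read it back.
memory = [to_bits(0) for _ in range(256)]
sel = address_select(to_bits(37), 256)
memory = memory_write(sel, 1, to_bits(0xA5), memory)
readback = memory_read(sel, memory)
```

Each decoder gate gives weight +1 to address bits that must be 1 and weight -1 to bits that must be 0, with a bias equal to minus the number of required ones, so it fires only on an exact match. `memory_write` returns a new memory rather than mutating the old one, which matches the idea of memory as data flowing through the circuit.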
+ ## Phase 2: Instruction Fetch
+
+ | Component | Description | Status |
+ |-----------|-------------|--------|
+ | PC → Memory Read | Fetch instruction at PC address | Pending |
+ | Instruction Split | Separate opcode from operands | Pending |
+ | Operand Decode | Extract src/dst register indices | Pending |
+
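The fetch and split steps might look like the following sketch. The roadmap does not define an instruction encoding, so the format used here (an 8-bit instruction with a 4-bit opcode, a 2-bit destination index, and a 2-bit source index, MSB first) is purely an assumption, chosen only because it would fit a 4-bit opcode decoder and a 1-of-4 register mux. All names are hypothetical.

```python
def to_bits(value, width):
    """Integer to a list of bits, most significant bit first."""
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]


def to_int(bits):
    """List of bits (MSB first) back to an integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def fetch(pc_bits, memory):
    """Read the instruction byte at the address held in the PC.

    Plain indexing stands in for the address decoder and read mux.
    """
    return memory[to_int(pc_bits)]


def split_instruction(instr_bits):
    """ASSUMED format: [opcode:4][dst:2][src:2], MSB first."""
    assert len(instr_bits) == 8
    return {
        "opcode": instr_bits[0:4],
        "dst": instr_bits[4:6],
        "src": instr_bits[6:8],
    }


# Usage: a 4-byte memory with one made-up instruction at address 2.
memory = [to_bits(0, 8) for _ in range(4)]
memory[2] = to_bits(0b0011_01_10, 8)
fields = split_instruction(fetch(to_bits(2, 8), memory))
```

With a fixed encoding, splitting is only a matter of routing bit positions, so it needs no gates of its own; the operand fields can feed the register mux select lines directly.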
+ ## Phase 3: Execute Cycle
+
+ | Component | Description | Status |
+ |-----------|-------------|--------|
+ | Register Read MUX | Select source register(s) | Done (regmux4to1) |
+ | ALU Dispatch | Route to correct operation circuit | Pending |
+ | Result MUX | Select ALU output | Pending |
+ | Writeback Logic | Route result to register or memory | Pending |
+ | PC Update | Increment or load jump target | Done (pc_inc, pc_load) |
+
+ ## Phase 4: Full Integration
+
+ | Component | Description | Status |
+ |-----------|-------------|--------|
+ | State Packer | Combine all outputs into state tensor | Pending |
+ | State Unpacker | Split input state into components | Pending |
+ | Single-Pass Execute | One forward pass = one instruction | Pending |
+
+ ## Completed Building Blocks
+
+ These circuits are ready to use:
+
+ ### Arithmetic
+ - NEG (76 tensors)
+ - SUB (162 tensors)
+ - ADC (144 tensors)
+ - SBC (160 tensors)
+ - DIV (1984 tensors)
+ - ADD, MUL (from original model)
+
+ ### Comparison & Logic
+ - CMP (168 tensors)
+ - ASR, ROL, ROR (62 tensors total)
+ - All boolean gates (from original model)
+
+ ### Control
+ - NOP (24 tensors)
+ - HALT (42 tensors)
+ - PC Incrementer (62 tensors)
+ - PC Load MUX (50 tensors)
+ - Instruction Decoder (44 tensors)
+ - Register File MUX (84 tensors)
+ - Conditional jumps (from original model)
+
+ ## Memory Size Options
+
+ | Size | Bytes | Bit-Tensors | Use Case |
+ |------|-------|-------------|----------|
+ | Tiny | 256 | ~2K | Proof of concept |
+ | Small | 4KB | ~32K | Simple programs |
+ | Medium | 64KB | ~512K | Full 16-bit address space |
+
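The bit counts in the table follow from one bit per memory bit, and the address width follows from the byte count. A short check of that arithmetic, with hypothetical helper names:

```python
def memory_bits(n_bytes):
    """Number of single-bit state entries needed for n_bytes of memory."""
    return n_bytes * 8


def address_width(n_bytes):
    """Number of address bits needed to select one of n_bytes locations."""
    return (n_bytes - 1).bit_length()


for label, n_bytes in [("Tiny", 256), ("Small", 4 * 1024), ("Medium", 64 * 1024)]:
    print(label, n_bytes, memory_bits(n_bytes), address_width(n_bytes))
```

A 256-byte memory needs 8 address bits, 4 KB needs 12, and 64 KB needs 16, so any option larger than Tiny would also need a program counter and address decoder wider than 8 bits.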
+ ## Notes
+
+ - Memory is DATA flowing through, not stored in weights
+ - Weights remain frozen - only input/output tensors change
+ - "Stepper" = calling forward() repeatedly
+ - No Python logic in the loop - just tensor→forward→tensor
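The stepper idea in the notes can be sketched with a toy machine whose entire state is an 8-bit program counter. This is an illustration under assumptions, not the project's `forward()`: the gate convention, the MSB-first bit order, and the increment-only behaviour are all hypothetical. The point is that the loop body contains nothing except the call to `forward`.

```python
def gate(weights, bias, inputs):
    """Threshold gate: outputs 1 when the weighted sum plus bias is >= 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias >= 0 else 0


def AND(a, b):
    return gate([1, 1], -2, [a, b])


def OR(a, b):
    return gate([1, 1], -1, [a, b])


def NAND(a, b):
    return gate([-1, -1], 1, [a, b])


def XOR(a, b):
    # Two layers of threshold gates: (a OR b) AND NOT (a AND b).
    return AND(OR(a, b), NAND(a, b))


def forward(state):
    """One clock tick of a toy machine: increment an 8-bit PC (MSB first)."""
    carry, out = 1, []
    for bit in reversed(state):          # walk from the least significant bit
        out.append(XOR(bit, carry))
        carry = AND(bit, carry)
    return out[::-1]                     # back to MSB-first order


def run(state, ticks):
    """Stepper: feed each output state back in as the next input state."""
    for _ in range(ticks):
        state = forward(state)
    return state


final_state = run([0] * 8, 5)
```

The weights inside `forward` never change between ticks; only the state passed in and returned does, and the counter wraps to zero after 256 ticks.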