CharlesCNorton committed on
Commit ·
ca99a3e
1
Parent(s): 6c2c63e
Refresh README format + repro details
- fix encoding artifacts and ASCII-only notation
- add float16 architecture overview and step-by-step
- document format_version, signal registry, and .inputs resolution
- add reproduce steps with expected runtimes
README.md
CHANGED
|
@@ -16,7 +16,7 @@ pipeline_tag: other
|
|
| 16 |
|
| 17 |
Digital circuits encoded as neural network weights.
|
| 18 |
|
| 19 |
-
Each gate is a threshold logic unit: `output = step(weights
|
| 20 |
|
| 21 |
## What's Here
|
| 22 |
|
|
@@ -76,27 +76,47 @@ Accuracy/rounding:
|
|
| 76 |
A threshold gate computes:
|
| 77 |
|
| 78 |
```
|
| 79 |
-
output = 1 if (
|
| 80 |
```
|
| 81 |
|
| 82 |
This is a perceptron with Heaviside step activation.
|
| 83 |
|
| 84 |
**AND gate**: weights = [1, 1], bias = -1.5
|
| 85 |
-
- (0,0): 0 + 0 - 1.5 = -1.5 < 0
|
| 86 |
-
- (0,1): 0 + 1 - 1.5 = -0.5 < 0
|
| 87 |
-
- (1,0): 1 + 0 - 1.5 = -0.5 < 0
|
| 88 |
-
- (1,1): 1 + 1 - 1.5 = 0.5
|
| 89 |
|
| 90 |
**XOR** requires two layers (not linearly separable):
|
| 91 |
- Layer 1: OR and NAND in parallel
|
| 92 |
- Layer 2: AND of both outputs
|
| 93 |
|
| 94 |
## Self-Documenting Format
|
| 95 |
|
| 96 |
Each gate has three tensors in `arithmetic.safetensors`:
|
| 97 |
-
- `.weight`
|
| 98 |
-
- `.bias`
|
| 99 |
-
- `.inputs`
|
| 100 |
|
| 101 |
Signal registry in metadata maps IDs to names:
|
| 102 |
|
|
@@ -112,9 +132,39 @@ with safe_open('arithmetic.safetensors', framework='pt') as f:
|
|
| 112 |
```
|
| 113 |
|
| 114 |
Signal naming:
|
| 115 |
-
- `$name`
|
| 116 |
-
- `#0`, `#1`
|
| 117 |
-
- `gate.path`
|
| 118 |
|
| 119 |
## Running Eval
|
| 120 |
|
|
@@ -122,7 +172,7 @@ Signal naming:
|
|
| 122 |
python eval.py
|
| 123 |
```
|
| 124 |
|
| 125 |
-
Tests all circuits
|
| 126 |
Eval runs full + verbose by default; there is no quick/verbose mode. Use --circuit to filter reported circuits.
|
| 127 |
|
| 128 |
For coverage and input-routing validation:
|
|
@@ -135,9 +185,9 @@ python eval.py --coverage --inputs-coverage
|
|
| 135 |
|
| 136 |
## Development History
|
| 137 |
|
| 138 |
-
Started as an 8-bit CPU project. Built boolean gates, then arithmetic (adders
|
| 139 |
|
| 140 |
-
Float16 was added later. The commit history shows the iterative process
|
| 141 |
|
| 142 |
## Project Origin
|
| 143 |
|
|
|
|
| 16 |
|
| 17 |
Digital circuits encoded as neural network weights.
|
| 18 |
|
| 19 |
+
Each gate is a threshold logic unit: `output = step(weights * inputs + bias)`, where `weights * inputs` is the dot product. The step function fires when the weighted sum plus bias is >= 0. This maps digital logic to tensor operations.
|
| 20 |
|
| 21 |
## What's Here
|
| 22 |
|
|
|
|
| 76 |
A threshold gate computes:
|
| 77 |
|
| 78 |
```
|
| 79 |
+
output = 1 if (w1*x1 + w2*x2 + ... + wn*xn + bias) >= 0 else 0
|
| 80 |
```
|
| 81 |
|
| 82 |
This is a perceptron with Heaviside step activation.
|
| 83 |
|
| 84 |
**AND gate**: weights = [1, 1], bias = -1.5
|
| 85 |
+
- (0,0): 0 + 0 - 1.5 = -1.5 < 0 -> 0
|
| 86 |
+
- (0,1): 0 + 1 - 1.5 = -0.5 < 0 -> 0
|
| 87 |
+
- (1,0): 1 + 0 - 1.5 = -0.5 < 0 -> 0
|
| 88 |
+
- (1,1): 1 + 1 - 1.5 = 0.5 >= 0 -> 1
|
| 89 |
|
| 90 |
**XOR** requires two layers (not linearly separable):
|
| 91 |
- Layer 1: OR and NAND in parallel
|
| 92 |
- Layer 2: AND of both outputs
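
The threshold rule and the two-layer XOR described above can be checked with a small sketch. The AND parameters come from the README; the OR and NAND parameters and the `gate` helper are assumptions chosen for illustration, not values read from `arithmetic.safetensors`.

```python
def gate(weights, bias, inputs):
    """Threshold logic unit: output 1 when the weighted sum plus bias is >= 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0

# AND parameters as given in the README.
AND = ([1, 1], -1.5)
# Assumed parameters for illustration; the repository may use different values.
OR = ([1, 1], -0.5)
NAND = ([-1, -1], 1.5)

def xor(a, b):
    # Layer 1: OR and NAND evaluated in parallel on the same inputs.
    h_or = gate(*OR, (a, b))
    h_nand = gate(*NAND, (a, b))
    # Layer 2: AND of both layer-1 outputs.
    return gate(*AND, (h_or, h_nand))

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", gate(*AND, (a, b)), "XOR:", xor(a, b))
```

No single `gate` call can produce XOR because no line separates its 1-outputs from its 0-outputs in the input plane, which is why the second layer is needed.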
|
| 93 |
|
| 94 |
+
## Float16 Architecture (Short)
|
| 95 |
+
|
| 96 |
+
High-level dataflow:
|
| 97 |
+
|
| 98 |
+
```
|
| 99 |
+
float16.<op>
|
| 100 |
+
a,b -> unpack -> classify -> core op -> normalize/round -> pack -> out
|
| 101 |
+
```
|
| 102 |
+
|
| 103 |
+
Step-by-step (condensed):
|
| 104 |
+
1) Unpack sign/exponent/mantissa. Subnormals use an implicit leading 0, normals an implicit leading 1.
|
| 105 |
+
2) Classify inputs: zero, subnormal, normal, inf, NaN.
|
| 106 |
+
3) Core op:
|
| 107 |
+
- add/sub: align exponents, add/sub mantissas, compute sign.
|
| 108 |
+
- mul/div: add/sub exponents (minus bias), multiply/divide mantissas.
|
| 109 |
+
- unary LUT: lookup output for each 16-bit input (torch.float16), with NaN canonicalization.
|
| 110 |
+
- pow: ln(a) -> mul(b, ln(a)) -> exp, rounded at each stage.
|
| 111 |
+
4) Normalize and round-to-nearest-even (CLZ + shifts).
|
| 112 |
+
5) Pack sign/exponent/mantissa and mux special cases (NaN/Inf/zero).
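
Steps 1 and 2 above can be sketched in plain Python using the standard IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits). This is an ordinary software illustration, not the gate-level circuit in the repository; the function names are hypothetical.

```python
import struct

MANT_BITS = 10  # binary16: 1 sign bit, 5 exponent bits, 10 mantissa bits

def float_to_bits(x):
    """Return the 16-bit pattern of x rounded to half precision."""
    return struct.unpack('<H', struct.pack('<e', x))[0]

def unpack(bits):
    """Split a 16-bit pattern into (sign, exponent, mantissa) fields."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> MANT_BITS) & 0x1F
    mantissa = bits & 0x3FF
    return sign, exponent, mantissa

def classify(exponent, mantissa):
    """Classify a value as zero, subnormal, normal, inf, or nan."""
    if exponent == 0:
        return 'zero' if mantissa == 0 else 'subnormal'
    if exponent == 0x1F:
        return 'inf' if mantissa == 0 else 'nan'
    return 'normal'

def significand(exponent, mantissa):
    """11-bit significand: implicit leading 1 for normals, 0 for subnormals.

    Only meaningful for zero, subnormal, and normal inputs.
    """
    implicit = 0 if exponent == 0 else 1
    return (implicit << MANT_BITS) | mantissa

if __name__ == "__main__":
    for value in (1.0, -2.0, 0.0, float('inf')):
        sign, exp, mant = unpack(float_to_bits(value))
        print(value, sign, exp, mant, classify(exp, mant))
```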
|
| 113 |
+
|
| 114 |
## Self-Documenting Format
|
| 115 |
|
| 116 |
Each gate has three tensors in `arithmetic.safetensors`:
|
| 117 |
+
- `.weight` -- input weights
|
| 118 |
+
- `.bias` -- threshold
|
| 119 |
+
- `.inputs` -- int64 tensor of signal IDs (ordered to match `.weight`)
|
| 120 |
|
| 121 |
Signal registry in metadata maps IDs to names:
|
| 122 |
|
|
|
|
| 132 |
```
|
| 133 |
|
| 134 |
Signal naming:
|
| 135 |
+
- `$name` -- circuit input (e.g., `$a`, `$dividend[0]`)
|
| 136 |
+
- `#0`, `#1` -- constants
|
| 137 |
+
- `gate.path` -- output of another gate
|
| 138 |
+
|
| 139 |
+
Format details:
|
| 140 |
+
- Metadata includes `signal_registry` (JSON map from ID to name) and `format_version` (currently `2.0`).
|
| 141 |
+
- `.inputs` stores global signal IDs; these IDs are resolved through `signal_registry`.
|
| 142 |
+
- External inputs are names starting with `$` or containing `.$` (e.g., `float16.add.$a[3]`).
|
| 143 |
+
- All gates include `.inputs`; `build.py` infers them and `--inputs-coverage` fails if resolution is missing.
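
A minimal sketch of how `.inputs` IDs might be resolved through `signal_registry`, assuming the registry is stored as a JSON string whose keys are stringified integer IDs. The helper names and the sample metadata below are hypothetical and only shaped like the description above.

```python
import json

def load_registry(metadata):
    """Parse the `signal_registry` JSON string into an {int ID: name} dict."""
    raw = json.loads(metadata['signal_registry'])
    return {int(sig_id): name for sig_id, name in raw.items()}

def is_external(name):
    """External inputs start with `$` or contain `.$`."""
    return name.startswith('$') or '.$' in name

def resolve_inputs(registry, input_ids):
    """Map a gate's `.inputs` IDs to names, failing loudly on unknown IDs."""
    missing = [i for i in input_ids if i not in registry]
    if missing:
        raise KeyError(f'unresolved signal IDs: {missing}')
    return [registry[i] for i in input_ids]

# Hypothetical metadata for illustration only; real IDs and names will differ.
metadata = {
    'format_version': '2.0',
    'signal_registry': json.dumps({
        '0': '#0',
        '1': '#1',
        '2': 'float16.add.$a[3]',
        '3': 'float16.add.some_gate',
    }),
}

if __name__ == "__main__":
    registry = load_registry(metadata)
    for name in resolve_inputs(registry, [2, 1, 3]):
        print(name, 'external' if is_external(name) else 'internal')
```

With the real file, `metadata` would presumably come from `f.metadata()` inside a `safe_open` block, and `input_ids` from `f.get_tensor(...)` on a gate's `.inputs` tensor converted with `.tolist()`.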
|
| 144 |
+
|
| 145 |
+
## How to Reproduce
|
| 146 |
+
|
| 147 |
+
Rebuild tensors:
|
| 148 |
+
|
| 149 |
+
```bash
|
| 150 |
+
python build.py
|
| 151 |
+
```
|
| 152 |
+
|
| 153 |
+
Run full evaluation (always full + verbose):
|
| 154 |
+
|
| 155 |
+
```bash
|
| 156 |
+
python eval.py
|
| 157 |
+
```
|
| 158 |
+
|
| 159 |
+
Run coverage and input-routing validation:
|
| 160 |
+
|
| 161 |
+
```bash
|
| 162 |
+
python eval.py --coverage --inputs-coverage
|
| 163 |
+
```
|
| 164 |
+
|
| 165 |
+
Expected runtimes (ballpark, CPU dependent):
|
| 166 |
+
- `build.py`: ~1-2 minutes, produces ~247 MB `arithmetic.safetensors`
|
| 167 |
+
- `eval.py --coverage --inputs-coverage`: ~3-4 minutes for 211,581 tests
|
| 168 |
|
| 169 |
## Running Eval
|
| 170 |
|
|
|
|
| 172 |
python eval.py
|
| 173 |
```
|
| 174 |
|
| 175 |
+
Tests all circuits. Small circuits are exhaustive; 16-bit arithmetic is sampled on grids (plus edge cases). Float16 tests cover special cases (NaN, Inf, +/-0, subnormals) plus normal arithmetic.
|
| 176 |
Eval runs full + verbose by default; there is no quick/verbose mode. Use --circuit to filter reported circuits.
|
| 177 |
|
| 178 |
For coverage and input-routing validation:
|
|
|
|
| 185 |
|
| 186 |
## Development History
|
| 187 |
|
| 188 |
+
Started as an 8-bit CPU project. Built boolean gates, then arithmetic (adders -> multipliers -> dividers), then CPU control logic. The CPU worked but the arithmetic core turned out to be the useful part, so it was extracted.
|
| 189 |
|
| 190 |
+
Float16 was added later. The commit history shows the iterative process: float16.add went through multiple rounds of bug fixes for edge cases (zero handling, sign logic, normalization). Mul and div required multi-bit carry infrastructure.
|
| 191 |
|
| 192 |
## Project Origin
|
| 193 |
|