---
license: mit
tags:
- pytorch
- safetensors
- threshold-logic
- neuromorphic
- parity
---

# threshold-parity

Computes 8-bit parity (the XOR of all bits). A canonical hard function for threshold networks: parity requires depth.
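
As a reference point, the target function is just the XOR fold of the input bits. A minimal pure-Python version, independent of the network, useful for checking its outputs:

```python
def parity8(bits):
    """Reference 8-bit parity: XOR of all bits, i.e. Hamming weight mod 2."""
    acc = 0
    for b in bits:
        acc ^= b
    return acc

print(parity8([1, 0, 1, 1, 0, 0, 1, 0]))  # 0: HW = 4, even
print(parity8([1, 1, 1, 0, 0, 0, 0, 0]))  # 1: HW = 3, odd
```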

## Circuit

```
x₀  x₁  x₂  x₃  x₄  x₅  x₆  x₇
│   │   │   │   │   │   │   │
└───┴───┴───┴─┬─┴───┴───┴───┘
              ▼
       ┌─────────────┐
       │   Layer 1   │  11 neurons (pruned)
       │ ±1 weights  │
       └─────────────┘
              │
              ▼
       ┌─────────────┐
       │   Layer 2   │  3 neurons
       └─────────────┘
              │
              ▼
       ┌─────────────┐
       │   Output    │
       └─────────────┘
              │
              ▼
        {0, 1} = HW mod 2
```

## Why Is Parity Hard?

Parity is the Hamming weight (HW) mod 2, the simplest modular function. Yet it is notoriously difficult:

1. **Not linearly separable**: no single hyperplane separates odd from even HW
2. **Flat loss landscape**: gradient descent fails because parity depends on a global property of the input; flipping any one bit flips the label
3. **Requires depth**: a minimum of 3 layers for threshold networks

This network was found via evolutionary search (10,542 generations), not gradient descent.
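
The non-separability claim is easiest to see in the 2-bit case (XOR). The brute-force sketch below searches an integer weight grid for a single threshold unit that computes XOR; the grid is illustrative rather than a proof, but the standard separability argument rules out real-valued weights too:

```python
import itertools

def xor_threshold_exists(limit=5):
    """Search integer weights/bias in [-limit, limit] for a single
    threshold unit (w1*x1 + w2*x2 + b >= 0) that computes XOR."""
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    for w1, w2, b in itertools.product(range(-limit, limit + 1), repeat=3):
        if all(int(w1 * x1 + w2 * x2 + b >= 0) == y for (x1, x2), y in data):
            return True
    return False

print(xor_threshold_exists())  # False: no single threshold unit computes XOR
```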

## Algebraic Insight

For any ±1 weight vector w, the dot product w·x has the same parity as HW(x):

```
(-1)^(w·x) = (-1)^HW(x)
```

This is because each +1 weight contributes its input's value, and each -1 weight contributes -(input) ≡ input (mod 2).

The network uses this to detect whether HW is even or odd.
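
The identity can be checked exhaustively for small n; the sketch below verifies w·x ≡ HW(x) (mod 2), which is equivalent to the (-1)^ form above:

```python
import itertools

def parity_identity_holds(n=4):
    """Check w·x ≡ HW(x) (mod 2) for every ±1 weight vector w
    and every n-bit input x."""
    for w in itertools.product((-1, 1), repeat=n):
        for x in itertools.product((0, 1), repeat=n):
            dot = sum(wi * xi for wi, xi in zip(w, x))
            if dot % 2 != sum(x) % 2:
                return False
    return True

print(parity_identity_holds())   # True for n = 4
print(parity_identity_holds(8))  # True for n = 8
```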

## Architecture

| Variant | Architecture     | Parameters |
|---------|------------------|------------|
| Full    | 8 → 32 → 16 → 1  | 833        |
| Pruned  | 8 → 11 → 3 → 1   | 139        |

The pruned variant keeps only the 14 critical neurons (11 in layer 1, 3 in layer 2) identified through ablation.
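
The parameter counts follow from dense layers with one bias per neuron. A quick sanity check (a hypothetical helper, not part of `model.py`):

```python
def param_count(sizes):
    """Weights (fan_in * fan_out) plus biases (fan_out) per dense layer."""
    return sum(i * o + o for i, o in zip(sizes, sizes[1:]))

print(param_count([8, 32, 16, 1]))  # 833 (full)
print(param_count([8, 11, 3, 1]))   # 139 (pruned)
```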

## How It Works

The pruned network has three L2 neurons:

- L2-N0: always fires (large positive bias)
- L2-N1: fires iff HW is even (the parity discriminator)
- L2-N2: always fires (large positive bias)

Output: `L2-N0 - L2-N1 + L2-N2 - 2 ≥ 0`. With N0 and N2 pinned at 1, this reduces to `-L2-N1 ≥ 0`, i.e. NOT(L2-N1).

Since L2-N1 fires when HW is even, the output fires when HW is odd. That's parity.
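
The output gate's behavior can be tabulated directly; this is a sketch of the logic above, not the actual stored weights:

```python
def output_gate(n0, n1, n2):
    """Output neuron: fires iff n0 - n1 + n2 - 2 >= 0."""
    return int(n0 - n1 + n2 - 2 >= 0)

# With N0 and N2 always 1, the gate is NOT(n1):
print(output_gate(1, 0, 1))  # 1 (L2-N1 silent: HW odd, output fires)
print(output_gate(1, 1, 1))  # 0 (L2-N1 fires: HW even, output silent)
```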

## Usage

```python
from safetensors.torch import load_file
import torch

w = load_file('model.safetensors')

def forward(x):
    # Each layer is a hard threshold unit: output 1.0 iff pre-activation >= 0
    x = x.float()
    x = (x @ w['layer1.weight'].T + w['layer1.bias'] >= 0).float()
    x = (x @ w['layer2.weight'].T + w['layer2.bias'] >= 0).float()
    x = (x @ w['output.weight'].T + w['output.bias'] >= 0).float()
    return x.squeeze(-1)

inp = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]])  # HW = 4
print(int(forward(inp).item()))  # 0 (even)
```

## Files

```
threshold-parity/
├── model.safetensors
├── model.py
├── config.json
├── README.md
└── pruned/
    ├── model.safetensors
    └── ...
```

## License

MIT