---
license: mit
tags:
- pytorch
- safetensors
- threshold-logic
- neuromorphic
- parity
---
# threshold-parity
Computes 8-bit parity (the XOR of all input bits). A canonical hard function for threshold networks: parity requires depth.
## Circuit
```
xβ‚€ x₁ xβ‚‚ x₃ xβ‚„ xβ‚… x₆ x₇
β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
β””β”€β”€β”΄β”€β”€β”΄β”€β”€β”΄β”€β”€β”Όβ”€β”€β”΄β”€β”€β”΄β”€β”€β”΄β”€β”€β”˜
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Layer 1 β”‚ 11 neurons (pruned)
β”‚ Β±1 weights β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Layer 2 β”‚ 3 neurons
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Output β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
{0, 1} = HW mod 2
```
## Why Is Parity Hard?
Parity is the Hamming weight (HW) of the input mod 2, the simplest modular function. Yet it is notoriously difficult:
1. **Not linearly separable**: no single hyperplane separates odd-HW inputs from even-HW inputs
2. **Flat loss landscape**: gradient descent struggles because parity is a global property; flipping any single input bit flips the label
3. **Requires depth**: a single threshold gate cannot compute parity; this network uses three layers of threshold units
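The non-separability claim is easy to check exhaustively in the 2-bit case (XOR, the smallest parity). A minimal brute-force sketch over a coarse weight grid, illustrative rather than a formal proof:

```python
import itertools

# Search for a single threshold unit w1*x1 + w2*x2 + b >= 0 that
# reproduces 2-bit XOR. No setting on this grid works.
grid = [i / 2 for i in range(-8, 9)]  # weights/bias in {-4.0, ..., 4.0}
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor = [0, 1, 1, 0]

separable = any(
    all(int(w1 * x1 + w2 * x2 + b >= 0) == t
        for (x1, x2), t in zip(points, xor))
    for w1, w2, b in itertools.product(grid, repeat=3)
)
print(separable)  # False
```

The grid result matches the general argument: w1 + b >= 0 and w2 + b >= 0 with b < 0 force w1 + w2 + b > 0, contradicting the (1, 1) case.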
This network was found via evolutionary search (10,542 generations), not gradient descent.
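For contrast with the evolved weights, here is a sketch of the textbook unary-counting construction for 8-bit parity (my own illustration, not this repository's network): layer 1 has eight threshold neurons, the k-th firing iff HW(x) >= k, and an alternating ±1 sum over them telescopes to HW mod 2.

```python
import itertools

def step(z):
    """Hard threshold: a unit fires iff its pre-activation >= 0."""
    return [int(v >= 0) for v in z]

def parity_net(x):
    # Layer 1: unary counter; neuron k fires iff HW(x) >= k, k = 1..8.
    h = step([sum(x) - k for k in range(1, 9)])
    # Output: h1 - h2 + h3 - ... - h8 equals HW(x) mod 2;
    # the output neuron fires iff that sum >= 1.
    s = sum((1 if k % 2 == 0 else -1) * h[k] for k in range(8))
    return int(s - 1 >= 0)

ok = all(parity_net(x) == sum(x) % 2
         for x in itertools.product([0, 1], repeat=8))
print(ok)  # True
```

This hand-built network uses two threshold layers; the evolved network found a different, smaller three-layer organization.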
## Algebraic Insight
For any Β±1 weight vector w, the dot product wΒ·x has the same parity as HW(x):
```
(-1)^(wΒ·x) = (-1)^HW(x)
```
This is because each +1 weight contributes its input's value, and each -1 weight contributes -(input) ≑ input (mod 2).
The network uses this to detect whether HW is even or odd.
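The identity can be verified exhaustively for a random ±1 weight vector; a small sketch:

```python
import itertools
import random

random.seed(0)
w = [random.choice([-1, 1]) for _ in range(8)]  # arbitrary ±1 weights

# For every 8-bit input, w·x and HW(x) agree mod 2, since -1 ≡ 1 (mod 2);
# hence (-1)^(w·x) = (-1)^HW(x).
ok = all(
    sum(wi * xi for wi, xi in zip(w, x)) % 2 == sum(x) % 2
    for x in itertools.product([0, 1], repeat=8)
)
print(ok)  # True
```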
## Architecture
| Variant | Architecture | Parameters |
|---------|--------------|------------|
| Full | 8 β†’ 32 β†’ 16 β†’ 1 | 833 |
| Pruned | 8 β†’ 11 β†’ 3 β†’ 1 | 139 |
The pruned variant keeps only the 14 critical neurons identified through ablation.
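The parameter counts in the table follow from weights plus biases per fully connected layer; a quick check (the helper name is my own):

```python
def threshold_net_params(sizes):
    """Total weights plus biases for a fully connected feed-forward net."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

full = threshold_net_params([8, 32, 16, 1])
pruned = threshold_net_params([8, 11, 3, 1])
print(full, pruned)  # 833 139
```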
## How It Works
The pruned network has three L2 neurons:
- L2-N0: Always fires (large positive bias)
- L2-N1: Fires iff HW is even (parity discriminator)
- L2-N2: Always fires (large positive bias)
Output: `L2-N0 - L2-N1 + L2-N2 - 2 ≥ 0`, which reduces to NOT(L2-N1)
Since L2-N1 fires when even, output fires when odd. That's parity.
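The output gate's arithmetic can be checked directly: with L2-N0 and L2-N2 pinned to 1, the pre-activation is 0 when L2-N1 is silent and -1 when it fires:

```python
# Output gate: fires iff N0 - N1 + N2 - 2 >= 0, with N0 = N2 = 1 always on.
outputs = {n1: int(1 - n1 + 1 - 2 >= 0) for n1 in (0, 1)}
print(outputs)  # {0: 1, 1: 0} -- the gate computes NOT(N1)
```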
## Usage
```python
from safetensors.torch import load_file
import torch

w = load_file('model.safetensors')

def forward(x):
    # Hard-threshold forward pass: each layer fires iff pre-activation >= 0.
    x = x.float()
    x = (x @ w['layer1.weight'].T + w['layer1.bias'] >= 0).float()
    x = (x @ w['layer2.weight'].T + w['layer2.bias'] >= 0).float()
    x = (x @ w['output.weight'].T + w['output.bias'] >= 0).float()
    return x.squeeze(-1)

inp = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]])  # HW = 4
print(int(forward(inp).item()))  # 0 (even)
```
## Files
```
threshold-parity/
β”œβ”€β”€ model.safetensors
β”œβ”€β”€ model.py
β”œβ”€β”€ config.json
β”œβ”€β”€ README.md
└── pruned/
β”œβ”€β”€ model.safetensors
└── ...
```
## License
MIT