---
license: mit
tags:
- pytorch
- safetensors
- threshold-logic
- neuromorphic
- parity
---

# threshold-parity

Computes 8-bit parity (the XOR of all bits). A canonical hard function for threshold networks: parity requires depth.
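
As a reference point, the target function is just the XOR fold of the input bits. A minimal pure-Python version, independent of the network, useful for checking its outputs:

```python
def parity8(bits):
    """Reference 8-bit parity: XOR of all bits, i.e. Hamming weight mod 2."""
    acc = 0
    for b in bits:
        acc ^= b
    return acc

print(parity8([1, 0, 1, 1, 0, 0, 1, 0]))  # 0: HW = 4, even
print(parity8([1, 1, 1, 0, 0, 0, 0, 0]))  # 1: HW = 3, odd
```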

## Circuit

```
x₀  x₁  x₂  x₃  x₄  x₅  x₆  x₇
│   │   │   │   │   │   │   │
└───┴───┴───┴─┬─┴───┴───┴───┘
              ▼
       ┌─────────────┐
       │   Layer 1   │  11 neurons (pruned)
       │ ±1 weights  │
       └─────────────┘
              │
              ▼
       ┌─────────────┐
       │   Layer 2   │  3 neurons
       └─────────────┘
              │
              ▼
       ┌─────────────┐
       │   Output    │
       └─────────────┘
              │
              ▼
        {0, 1} = HW mod 2
```

## Why Is Parity Hard?

Parity is the Hamming weight (HW) mod 2, the simplest modular function. Yet it is notoriously difficult:

1. **Not linearly separable**: no single hyperplane separates odd from even HW
2. **Flat loss landscape**: gradient descent fails because parity depends on a global property of the input; flipping any one bit flips the label
3. **Requires depth**: a minimum of 3 layers for threshold networks

This network was found via evolutionary search (10,542 generations), not gradient descent.
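
The non-separability claim is easiest to see in the 2-bit case (XOR). The brute-force sketch below searches an integer weight grid for a single threshold unit that computes XOR; the grid is illustrative rather than a proof, but the standard separability argument rules out real-valued weights too:

```python
import itertools

def xor_threshold_exists(limit=5):
    """Search integer weights/bias in [-limit, limit] for a single
    threshold unit (w1*x1 + w2*x2 + b >= 0) that computes XOR."""
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    for w1, w2, b in itertools.product(range(-limit, limit + 1), repeat=3):
        if all(int(w1 * x1 + w2 * x2 + b >= 0) == y for (x1, x2), y in data):
            return True
    return False

print(xor_threshold_exists())  # False: no single threshold unit computes XOR
```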

## Algebraic Insight

For any ±1 weight vector w, the dot product w·x has the same parity as HW(x):

```
(-1)^(w·x) = (-1)^HW(x)
```

This is because each +1 weight contributes its input's value, and each -1 weight contributes -(input) ≡ input (mod 2).

The network uses this to detect whether HW is even or odd.
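
The identity can be checked exhaustively for small n; the sketch below verifies w·x ≡ HW(x) (mod 2), which is equivalent to the (-1)^ form above:

```python
import itertools

def parity_identity_holds(n=4):
    """Check w·x ≡ HW(x) (mod 2) for every ±1 weight vector w
    and every n-bit input x."""
    for w in itertools.product((-1, 1), repeat=n):
        for x in itertools.product((0, 1), repeat=n):
            dot = sum(wi * xi for wi, xi in zip(w, x))
            if dot % 2 != sum(x) % 2:
                return False
    return True

print(parity_identity_holds())   # True for n = 4
print(parity_identity_holds(8))  # True for n = 8
```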

## Architecture

| Variant | Architecture     | Parameters |
|---------|------------------|------------|
| Full    | 8 → 32 → 16 → 1  | 833        |
| Pruned  | 8 → 11 → 3 → 1   | 139        |

The pruned variant keeps only the 14 critical neurons (11 in layer 1, 3 in layer 2) identified through ablation.
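
The parameter counts follow from dense layers with one bias per neuron. A quick sanity check (a hypothetical helper, not part of `model.py`):

```python
def param_count(sizes):
    """Weights (fan_in * fan_out) plus biases (fan_out) per dense layer."""
    return sum(i * o + o for i, o in zip(sizes, sizes[1:]))

print(param_count([8, 32, 16, 1]))  # 833 (full)
print(param_count([8, 11, 3, 1]))   # 139 (pruned)
```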

## How It Works

The pruned network has three L2 neurons:

- L2-N0: always fires (large positive bias)
- L2-N1: fires iff HW is even (the parity discriminator)
- L2-N2: always fires (large positive bias)

Output: `L2-N0 - L2-N1 + L2-N2 - 2 ≥ 0`. With N0 and N2 pinned at 1, this reduces to `-L2-N1 ≥ 0`, i.e. NOT(L2-N1).

Since L2-N1 fires when HW is even, the output fires when HW is odd. That's parity.
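
The output gate's behavior can be tabulated directly; this is a sketch of the logic above, not the actual stored weights:

```python
def output_gate(n0, n1, n2):
    """Output neuron: fires iff n0 - n1 + n2 - 2 >= 0."""
    return int(n0 - n1 + n2 - 2 >= 0)

# With N0 and N2 always 1, the gate is NOT(n1):
print(output_gate(1, 0, 1))  # 1 (L2-N1 silent: HW odd, output fires)
print(output_gate(1, 1, 1))  # 0 (L2-N1 fires: HW even, output silent)
```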

## Usage

```python
from safetensors.torch import load_file
import torch

w = load_file('model.safetensors')

def forward(x):
    # Each layer is a hard threshold unit: output 1.0 iff pre-activation >= 0
    x = x.float()
    x = (x @ w['layer1.weight'].T + w['layer1.bias'] >= 0).float()
    x = (x @ w['layer2.weight'].T + w['layer2.bias'] >= 0).float()
    x = (x @ w['output.weight'].T + w['output.bias'] >= 0).float()
    return x.squeeze(-1)

inp = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]])  # HW = 4
print(int(forward(inp).item()))  # 0 (even)
```

## Files

```
threshold-parity/
├── model.safetensors
├── model.py
├── config.json
├── README.md
└── pruned/
    ├── model.safetensors
    └── ...
```

## License

MIT