---
license: mit
tags:
- pytorch
- safetensors
- formal-verification
- coq
- parity
- threshold-network
- ternary-weights
- binary-classification
- neuromorphic
---
# tiny-parity-prover
Trained weights for a formally verified neural network that computes 8-bit parity. This repository contains the model artifacts; for proof development and Coq source code, see [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified).
## Overview
This is a ternary threshold network that computes `parity(x) = x₀ ⊕ x₁ ⊕ … ⊕ x₇` for 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.
**Key properties:**
- 100% accuracy on all 256 possible inputs
- Correctness proven in Coq via constructive algebraic proof
- Weights constrained to {-1, 0, +1} (ternary quantization)
- Heaviside step activation (1 if x ≥ 0, else 0)
- Suitable for neuromorphic hardware deployment
## Model Variants
| Variant | Architecture | Parameters | Weights | Verification |
|---------|--------------|------------|---------|--------------|
| Full | 8 → 32 → 16 → 1 | 833 | `model.safetensors` | Coq-proven |
| Pruned | 8 → 11 → 3 → 1 | 139 | `pruned/model.safetensors` | Coq-proven |
The pruned variant retains only the 14 neurons (11 in L1, 3 in L2) identified as critical through systematic ablation analysis. Both variants compute identical functions and are independently verified.
## Quick Start
```python
import torch
from safetensors.torch import load_file
# Load weights
weights = load_file('model.safetensors') # or 'pruned/model.safetensors'
# Manual forward pass (Heaviside activation)
def forward(x, weights):
    x = x.float()
    x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
    x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
    x = (x @ weights['output.weight'].T + weights['output.bias'] >= 0).float()
    return x.squeeze(-1)
# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"Parity of [1,0,1,1,0,0,1,0]: {int(output.item())}") # 0 (4 bits set, even)
```
Or use the provided model classes:
```python
from model import ThresholdNetwork
model = ThresholdNetwork.from_safetensors('model.safetensors')
from pruned.model import PrunedThresholdNetwork
pruned = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')
```
## Weight Structure
### Full Network (833 parameters)
| Tensor | Shape | Values | Description |
|--------|-------|--------|-------------|
| `layer1.weight` | [32, 8] | {-1, 0, +1} | 256 ternary weights |
| `layer1.bias` | [32] | [-8, +8] | Bounded integers |
| `layer2.weight` | [16, 32] | {-1, 0, +1} | 512 ternary weights |
| `layer2.bias` | [16] | [-32, +32] | Bounded integers |
| `output.weight` | [1, 16] | {-1, 0, +1} | 16 ternary weights |
| `output.bias` | [1] | [-16, +16] | Bounded integer |
### Pruned Network (139 parameters)
| Tensor | Shape | Values |
|--------|-------|--------|
| `layer1.weight` | [11, 8] | {-1, 0, +1} |
| `layer1.bias` | [11] | Bounded integers |
| `layer2.weight` | [3, 11] | {-1, 0, +1} |
| `layer2.bias` | [3] | Bounded integers |
| `output.weight` | [1, 3] | {-1, 0, +1} |
| `output.bias` | [1] | Bounded integer |
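As a sanity check, the constraints in these tables can be verified directly from the safetensors file. A small sketch (assuming the tensors are stored as floats, as in the Quick Start above):

```python
import torch
from safetensors.torch import load_file

weights = load_file('model.safetensors')

# Weight matrices must be strictly ternary
for name in ('layer1.weight', 'layer2.weight', 'output.weight'):
    w = weights[name]
    assert ((w == -1) | (w == 0) | (w == 1)).all(), f"{name} is not ternary"

# Biases must be integer-valued
for name in ('layer1.bias', 'layer2.bias', 'output.bias'):
    b = weights[name]
    assert (b == b.round()).all(), f"{name} is not integer-valued"

print("ternary/integer constraints hold")
```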
## Training
The network was trained using evolutionary search (genetic algorithm), not gradient descent.
**Why not backpropagation?** Parity is a canonical hard function for neural networks. The loss landscape is essentially flat because parity depends on a global property (bit count mod 2) rather than local features. Small weight perturbations produce negligible gradient signal until the network is already near a solution.
**Evolutionary search parameters:**
- Population size: 500
- Generations to solution: 10,542
- Selection: Top 20% elite
- Mutation rates: 0.05, 0.15, 0.30 (mixed)
- Fitness: Classification accuracy on all 256 inputs
The search finds a working configuration by directly optimizing discrete accuracy rather than relying on gradient information.
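For concreteness, here is a hypothetical sketch of such a search loop, not the project's actual trainer (that lives in the proof repository); biases are simplified here to the same {-1, 0, +1} alphabet as the weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Exhaustive truth table: all 256 inputs and their parities
X = np.array([[(i >> b) & 1 for b in range(8)] for i in range(256)], dtype=float)
Y = X.sum(axis=1) % 2

SHAPES = [(32, 8), (32,), (16, 32), (16,), (1, 16), (1,)]
SIZES = [int(np.prod(s)) for s in SHAPES]

def unpack(genome):
    """Slice a flat genome into weight/bias arrays."""
    parts, i = [], 0
    for shape, size in zip(SHAPES, SIZES):
        parts.append(genome[i:i + size].reshape(shape))
        i += size
    return parts

def accuracy(genome):
    """Fitness: correct outputs over all 256 inputs."""
    w1, b1, w2, b2, w3, b3 = unpack(genome)
    h = (X @ w1.T + b1 >= 0).astype(float)
    h = (h @ w2.T + b2 >= 0).astype(float)
    out = (h @ w3.T + b3 >= 0).astype(float).ravel()
    return int((out == Y).sum())

def mutate(genome, rate):
    """Resample each gene uniformly from {-1, 0, +1} with probability `rate`."""
    mask = rng.random(genome.shape) < rate
    return np.where(mask, rng.integers(-1, 2, genome.shape), genome)

pop = [rng.integers(-1, 2, sum(SIZES)) for _ in range(500)]
for gen in range(20000):
    pop.sort(key=accuracy, reverse=True)
    if accuracy(pop[0]) == 256:
        print(f"Solved at generation {gen}")
        break
    elite = pop[:100]  # top 20% survive unchanged
    pop = elite + [mutate(elite[rng.integers(100)], rng.choice([0.05, 0.15, 0.30]))
                   for _ in range(400)]
```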
## Formal Verification
Both network variants are proven correct in the Coq proof assistant. The verification evolved through 12 proof versions, culminating in a fully constructive algebraic proof.
**Core theorem (proven for all 256 inputs):**
```coq
Theorem network_correct : forall x0 x1 x2 x3 x4 x5 x6 x7,
  network x0 x1 x2 x3 x4 x5 x6 x7 = parity [x0;x1;x2;x3;x4;x5;x6;x7].
```
**Proof strategy (V9_FullyConstructive.v):**
The proof decomposes the critical Layer-2 neuron's pre-activation into three components:
- **A**: Contribution from h-function neurons (depends on a specific linear form h(x))
- **B**: Contribution from Hamming-weight-dependent neurons
- **C**: Contribution from g-function neurons (where g(x) ≡ HW(x) (mod 2))
Key algebraic insight: for any ±1 weight vector, the dot product has the same parity as the Hamming weight. This is proven constructively:
```coq
Theorem g_parity : forall x0 x1 x2 x3 x4 x5 x6 x7,
  Z.even (g_func x0 x1 x2 x3 x4 x5 x6 x7) =
  Nat.even (hamming_weight [x0;x1;x2;x3;x4;x5;x6;x7]).
```
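The same fact is easy to spot-check numerically. A small Python check over 4-bit vectors (illustrative only; the Coq theorem covers the general case):

```python
import itertools

# Since -1 ≡ +1 (mod 2), a signed dot product with any ±1 weight vector has
# the same parity as the Hamming weight of the binary input.
for ws in itertools.product((-1, 1), repeat=4):
    for xs in itertools.product((0, 1), repeat=4):
        dot = sum(w * x for w, x in zip(ws, xs))
        assert dot % 2 == sum(xs) % 2
print("dot-product parity matches Hamming-weight parity in all 256 cases")
```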
The network implements parity by:
1. Two "always-on" neurons (L2-N0, L2-N2) with large biases
2. One parity discriminator (L2-N1) that fires iff Hamming weight is even
3. Output wiring that negates the discriminator
**Parametric generalization (V10-V11):**
The algebraic foundation extends to arbitrary input size n:
```coq
Theorem dot_parity_equals_hw_parity : forall ws xs,
  length ws = length xs ->
  Z.even (signed_dot ws xs) = Nat.even (hamming_weight xs).
```
A verified n-bit parity network construction is provided using n+3 neurons total.
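For intuition, here is the classic (n+1)-neuron threshold construction for n-bit parity, sketched in Python; note this is an illustration, not necessarily the repo's n+3-neuron construction:

```python
import numpy as np

# Hidden unit k fires iff HW(x) >= k+1; the output takes the alternating sum
# of the hidden units, which is 1 exactly when HW(x) is odd.
def parity_net(n):
    W1 = np.ones((n, n))                          # all-ones ternary weights
    b1 = -np.arange(1, n + 1)                     # unit k fires iff HW >= k+1
    w2 = np.array([(-1) ** k for k in range(n)])  # +1, -1, +1, ...
    def net(x):
        h = (W1 @ x + b1 >= 0).astype(int)
        return int(w2 @ h - 1 >= 0)               # fires iff alt. sum >= 1
    return net

net = parity_net(8)
assert all(net(np.array([(i >> b) & 1 for b in range(8)])) == bin(i).count('1') % 2
           for i in range(256))
print("classic construction verified for n = 8")
```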
Full proof development: [threshold-logic-verified/coq](https://github.com/CharlesCNorton/threshold-logic-verified/tree/main/coq)
## Test Results
### Functional Validation
| Test | Result |
|------|--------|
| Exhaustive (256 inputs) | 256/256 |
| Batch inference consistency | Pass |
| Determinism (1000 runs) | Pass |
| Throughput | 18,161 inf/sec (CPU) |
### Analytical Properties
| Property | Finding |
|----------|---------|
| Weight sparsity | 32.9% zeros (258/784) |
| Critical L1 neurons | 11 of 32 |
| Critical L2 neurons | 3 of 16 |
| Noise sensitivity (σ=0.05) | Accuracy drops to 51% |
| Minimum depth | 3 layers required |
**Neuron ablation results:**
| Neuron Removed | Accuracy Impact |
|----------------|-----------------|
| L2-N8 | 256 → 128 (chance) |
| L1-N11 | 256 → 162 |
| Any redundant neuron | 256 → 256 (no change) |
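A sketch of this ablation protocol (the exact silencing method is an assumption): zero a layer-2 neuron's incoming weights and force its bias negative so it can never fire, then re-run the exhaustive test:

```python
import torch
from safetensors.torch import load_file

weights = load_file('model.safetensors')
X = torch.tensor([[(i >> b) & 1 for b in range(8)] for i in range(256)],
                 dtype=torch.float32)
Y = X.sum(dim=1) % 2

def correct(w):
    h = (X @ w['layer1.weight'].T + w['layer1.bias'] >= 0).float()
    h = (h @ w['layer2.weight'].T + w['layer2.bias'] >= 0).float()
    out = (h @ w['output.weight'].T + w['output.bias'] >= 0).float().squeeze(-1)
    return int((out == Y).sum())

for n in range(16):
    w = {k: v.clone() for k, v in weights.items()}
    w['layer2.weight'][n] = 0    # cut all inputs to neuron n
    w['layer2.bias'][n] = -1     # pre-activation is now -1, so it never fires
    print(f"L2-N{n} ablated: {correct(w)}/256")
```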
**Random pruning sensitivity:**
| Weights Pruned | Accuracy |
|----------------|----------|
| 0% | 256/256 |
| 10% | 138-191/256 |
| 20%+ | ~128/256 (chance) |
### Noise Robustness
The network is extremely fragile to input noise due to the hard Heaviside threshold:
| Gaussian Noise σ | Accuracy |
|------------------|----------|
| 0.00 | 100.0% |
| 0.05 | 51.2% |
| 0.10 | 51.2% |
| 0.20 | 48.8% |
This is inherent to threshold networks, not a defect.
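The table can be reproduced along these lines (the exact protocol, one noise sample per input, is an assumption):

```python
import torch
from safetensors.torch import load_file

torch.manual_seed(0)
weights = load_file('model.safetensors')
X = torch.tensor([[(i >> b) & 1 for b in range(8)] for i in range(256)],
                 dtype=torch.float32)
Y = X.sum(dim=1) % 2

def forward(x, w):
    x = (x @ w['layer1.weight'].T + w['layer1.bias'] >= 0).float()
    x = (x @ w['layer2.weight'].T + w['layer2.bias'] >= 0).float()
    x = (x @ w['output.weight'].T + w['output.bias'] >= 0).float()
    return x.squeeze(-1)

for sigma in (0.00, 0.05, 0.10, 0.20):
    acc = (forward(X + sigma * torch.randn_like(X), weights) == Y).float().mean().item()
    print(f"sigma = {sigma:.2f}: accuracy {acc:.1%}")
```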
### Bias Sensitivity
| Layer | Sensitive to Β±1 Perturbation |
|-------|------------------------------|
| Layer 1 | 9 of 32 biases |
| Layer 2 | 1 of 16 biases |
| Output | 1 of 1 bias |
## Practical Applications
The network has been validated on standard parity-based protocols:
| Application | Test Cases | Result |
|-------------|------------|--------|
| Memory ECC parity | 100 bytes | 100% |
| RAID-5 reconstruction | 50 stripes | 100% |
| UART frame validation | 100 frames | 100% |
| XOR checksum | 100 bytes | Pass |
| LFSR keystream | 32 bits | Deterministic |
| Barcode validation | 50 codes | 100% |
| Packet checksum | 20 packets | 100% |
| Hamming (7,4) syndrome | 20 codewords | 100% |
| PS/2 keyboard codes | 20 scancodes | 100% |
| DNA codon parity | 8 codons | 100% |
## Neuromorphic Deployment
The primary use case is neuromorphic hardware where threshold networks are the native compute primitive and XOR gates cannot be directly instantiated:
| Platform | Status |
|----------|--------|
| Intel Loihi | Compatible (untested) |
| IBM TrueNorth | Compatible (untested) |
| BrainChip Akida | Compatible (untested) |
**Resource comparison:**
| Implementation | Gates/Neurons |
|----------------|---------------|
| Threshold network (pruned) | 14 neurons |
| Threshold network (full) | 49 neurons |
| XOR tree (conventional) | 7 gates |
The 70-140x overhead vs XOR gates is acceptable only when XOR is unavailable.
## Pruned Network Details
The pruned network was extracted by identifying critical neurons through systematic ablation:
**Retained neurons:**
- Layer 1: indices [3, 7, 9, 11, 12, 16, 17, 20, 24, 27, 28]
- Layer 2: indices [8, 11, 14]
**Interpretation:**
The pruned network reveals the minimal structure for parity computation:
- L2-N0 (originally N8): Always fires (bias=30 dominates)
- L2-N1 (originally N11): Parity discriminator (fires iff HW even)
- L2-N2 (originally N14): Always fires (bias=20 dominates)
- Output: Computes `L2-N0 - L2-N1 + L2-N2 - 2 >= 0`, which equals `NOT(L2-N1)`
Since L2-N1 fires when the Hamming weight is even, the output fires when the Hamming weight is odd, which is exactly parity.
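Plugging the two always-on neurons into the output expression reduces this to a two-case truth table:

```python
# With L2-N0 = L2-N2 = 1 always, the output collapses to NOT of the
# discriminator d = L2-N1.
for d in (0, 1):
    out = int(1 - d + 1 - 2 >= 0)    # L2-N0 - L2-N1 + L2-N2 - 2 >= 0
    print(f"d={d} -> output={out}")  # prints d=0 -> 1, d=1 -> 0
```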
## Limitations
- **Fixed input size**: 8 bits only (parametric construction in Coq extends to any n)
- **Binary inputs**: Expects {0, 1}, not continuous values
- **No noise margin**: Heaviside threshold at exactly 0
- **Not differentiable**: Cannot be fine-tuned with gradient descent
## Files
```
tiny-parity-prover/
├── model.safetensors      # Full network weights (833 params)
├── model.py               # Full network inference code
├── config.json            # Full network metadata
├── results.md             # Comprehensive test results
├── todo.md                # Future work notes
├── README.md              # This file
└── pruned/
    ├── model.safetensors  # Pruned network weights (139 params)
    ├── model.py           # Pruned network inference code
    ├── config.json        # Pruned network metadata
    └── README.md          # Pruned variant description
```
## Citation
```bibtex
@software{tiny_parity_prover_2025,
title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
author={Norton, Charles},
url={https://huggingface.co/phanerozoic/tiny-parity-prover},
year={2025},
note={Weights and inference code. Proofs at github.com/CharlesCNorton/threshold-logic-verified}
}
```
## Related
- **Proof repository**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) β Coq proofs, training code, full documentation
- **Coq proof files**: V0_Core.v through V11_ParametricNetwork.v
- **Key proof**: V9_FullyConstructive.v (complete algebraic proof without vm_compute)
## License
MIT