---

license: mit
tags:
- pytorch
- safetensors
- formal-verification
- coq
- parity
- threshold-network
- ternary-weights
- binary-classification
- neuromorphic
---


# tiny-parity-prover

Trained weights for a formally verified neural network that computes 8-bit parity. This repository contains the model artifacts; for proof development and Coq source code, see [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified).

## Overview

This is a ternary threshold network that computes `parity(x) = xβ‚€ βŠ• x₁ βŠ• ... βŠ• x₇` for 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.

**Key properties:**
- 100% accuracy on all 256 possible inputs
- Correctness proven in Coq via constructive algebraic proof
- Weights constrained to {-1, 0, +1} (ternary quantization)
- Heaviside step activation (x β‰₯ 0 β†’ 1, else 0)
- Suitable for neuromorphic hardware deployment
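The target function is simple to state in plain Python. The following reference implementation (the helper name `parity8` is illustrative, not part of the repository) is what the exhaustive tests later in this card compare against:

```python
def parity8(bits):
    """Reference 8-bit parity: 1 if an odd number of bits are set, else 0."""
    assert len(bits) == 8 and all(b in (0, 1) for b in bits)
    return sum(bits) % 2

# Enumerate all 256 possible inputs (bit order does not affect parity).
table = {i: parity8([(i >> k) & 1 for k in range(8)]) for i in range(256)}
print(sum(table.values()))  # 128 of the 256 inputs have odd parity
```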

## Model Variants

| Variant | Architecture | Parameters | Weights | Verification |
|---------|--------------|------------|---------|--------------|
| Full | 8 β†’ 32 β†’ 16 β†’ 1 | 833 | `model.safetensors` | Coq-proven |
| Pruned | 8 β†’ 11 β†’ 3 β†’ 1 | 139 | `pruned/model.safetensors` | Coq-proven |

The pruned variant retains only the 14 neurons (11 in L1, 3 in L2) identified as critical through systematic ablation analysis. Both variants compute identical functions and are independently verified.

## Quick Start

```python
import torch
from safetensors.torch import load_file

# Load weights
weights = load_file('model.safetensors')  # or 'pruned/model.safetensors'

# Manual forward pass (Heaviside activation)
def forward(x, weights):
    x = x.float()
    x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
    x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
    x = (x @ weights['output.weight'].T + weights['output.bias'] >= 0).float()
    return x.squeeze(-1)

# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"Parity of [1,0,1,1,0,0,1,0]: {int(output.item())}")  # 0 (4 bits set, even)
```

Or use the provided model classes:

```python
from model import ThresholdNetwork
model = ThresholdNetwork.from_safetensors('model.safetensors')

from pruned.model import PrunedThresholdNetwork
pruned = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')
```

## Weight Structure

### Full Network (833 parameters)

| Tensor | Shape | Values | Description |
|--------|-------|--------|-------------|
| `layer1.weight` | [32, 8] | {-1, 0, +1} | 256 ternary weights |
| `layer1.bias` | [32] | [-8, +8] | Bounded integers |
| `layer2.weight` | [16, 32] | {-1, 0, +1} | 512 ternary weights |
| `layer2.bias` | [16] | [-32, +32] | Bounded integers |
| `output.weight` | [1, 16] | {-1, 0, +1} | 16 ternary weights |
| `output.bias` | [1] | [-16, +16] | Bounded integer |

### Pruned Network (139 parameters)

| Tensor | Shape | Values |
|--------|-------|--------|
| `layer1.weight` | [11, 8] | {-1, 0, +1} |
| `layer1.bias` | [11] | Bounded integers |
| `layer2.weight` | [3, 11] | {-1, 0, +1} |
| `layer2.bias` | [3] | Bounded integers |
| `output.weight` | [1, 3] | {-1, 0, +1} |
| `output.bias` | [1] | Bounded integer |

## Training

The network was trained using evolutionary search (genetic algorithm), not gradient descent.

**Why not backpropagation?** Parity is a canonical hard function for neural networks. The loss landscape is essentially flat because parity depends on a global property (bit count mod 2) rather than local features. Small weight perturbations produce negligible gradient signal until the network is already near a solution.

**Evolutionary search parameters:**
- Population size: 500
- Generations to solution: 10,542
- Selection: Top 20% elite
- Mutation rates: 0.05, 0.15, 0.30 (mixed)
- Fitness: Classification accuracy on all 256 inputs

The search finds a working configuration by directly optimizing discrete accuracy rather than relying on gradient information.
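A toy sketch of this setup, using the elite fraction and mixed mutation rates quoted above, is given below. It evolves a single ternary threshold neuron for 3-bit majority so it runs in milliseconds; this is illustrative only and is not the repository's training code (parity itself requires the multi-layer search described in the text):

```python
import random

random.seed(0)

# Toy evolutionary search over ternary threshold neurons.
# Target: 3-bit majority, a linearly separable stand-in for the real task.
N_BITS, POP, ELITE_FRAC = 3, 50, 0.2
INPUTS = [[(i >> k) & 1 for k in range(N_BITS)] for i in range(2 ** N_BITS)]
TARGET = [int(sum(x) >= 2) for x in INPUTS]

def fitness(ind):
    """Discrete accuracy over all inputs, as in the description above."""
    w, b = ind
    return sum(int(sum(wi * xi for wi, xi in zip(w, x)) + b >= 0) == t
               for x, t in zip(INPUTS, TARGET))

def mutate(ind, rate):
    """Resample each ternary weight (and the bounded bias) with probability `rate`."""
    w, b = ind
    w = [random.choice([-1, 0, 1]) if random.random() < rate else wi for wi in w]
    if random.random() < rate:
        b = random.randint(-N_BITS, N_BITS)
    return (w, b)

pop = [([random.choice([-1, 0, 1]) for _ in range(N_BITS)],
        random.randint(-N_BITS, N_BITS)) for _ in range(POP)]
best_history = []
for gen in range(100):
    pop.sort(key=fitness, reverse=True)
    best_history.append(fitness(pop[0]))
    elite = pop[: int(POP * ELITE_FRAC)]  # top 20% survive unchanged
    # Refill by mutating elites at mixed rates (0.05 / 0.15 / 0.30).
    pop = elite + [mutate(random.choice(elite), random.choice([0.05, 0.15, 0.30]))
                   for _ in range(POP - len(elite))]

print(best_history[0], best_history[-1])
```

Because elites are carried over unchanged, the best fitness is monotone non-decreasing across generations, mirroring how the real search climbed to 256/256 accuracy.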

## Formal Verification

Both network variants are proven correct in the Coq proof assistant. The verification evolved through 12 proof versions, culminating in a fully constructive algebraic proof.

**Core theorem (proven for all 256 inputs):**
```coq
Theorem network_correct : forall x0 x1 x2 x3 x4 x5 x6 x7,
  network x0 x1 x2 x3 x4 x5 x6 x7 = parity [x0;x1;x2;x3;x4;x5;x6;x7].
```

**Proof strategy (V9_FullyConstructive.v):**

The proof decomposes the critical Layer-2 neuron's pre-activation into three components:

- **A**: Contribution from h-function neurons (depends on a specific linear form h(x))
- **B**: Contribution from Hamming-weight-dependent neurons
- **C**: Contribution from g-function neurons (where g(x) ≡ HW(x) mod 2)

Key algebraic insight: For any ±1 weight vector, the dot product has the same parity as the Hamming weight. This is proven constructively:

```coq
Theorem g_parity : forall x0 x1 x2 x3 x4 x5 x6 x7,
  Z.even (g_func x0 x1 x2 x3 x4 x5 x6 x7) =
  Nat.even (hamming_weight [x0;x1;x2;x3;x4;x5;x6;x7]).
```
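The same algebraic insight can be sanity-checked numerically (an informal check in Python, separate from the Coq development): every set bit contributes an odd term (+1 or -1) to the dot product, so the dot product's parity must equal the Hamming weight's parity.

```python
from itertools import product
import random

random.seed(1)

# For random +/-1 weight vectors w and all 256 binary inputs x, check that
# (w . x) mod 2 == HW(x) mod 2. Python's % returns a non-negative result
# even for negative dot products, so the comparison is direct.
for _ in range(10):
    w = [random.choice([-1, 1]) for _ in range(8)]
    for x in product([0, 1], repeat=8):
        dot = sum(wi * xi for wi, xi in zip(w, x))
        assert dot % 2 == sum(x) % 2, (w, x)
print("dot-product parity matches Hamming-weight parity on all 2560 cases")
```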



The network implements parity by:

1. Two "always-on" neurons (L2-N0, L2-N2) with large biases
2. One parity discriminator (L2-N1) that fires iff Hamming weight is even
3. Output wiring that negates the discriminator

**Parametric generalization (V10-V11):**

The algebraic foundation extends to arbitrary input size n:

```coq
Theorem dot_parity_equals_hw_parity : forall ws xs,
  length ws = length xs ->
  Z.even (signed_dot ws xs) = Nat.even (hamming_weight xs).
```

A verified n-bit parity network construction is provided using n+3 neurons total.
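The repository's exact Coq construction is not reproduced here, but the flavor of such parametric constructions can be illustrated with the classic Hamming-weight-threshold network for n-bit parity (n hidden units plus one output, so slightly smaller than the n+3 count quoted above; this is a standard textbook construction, not necessarily the verified one):

```python
from itertools import product

def parity_threshold_net(n):
    """Illustrative threshold construction for n-bit parity.

    Hidden unit k (k = 1..n) has all-ones weights and bias -k, so it fires
    iff HW(x) >= k. The output sums the hidden units with alternating +/-1
    weights and bias -1: for HW(x) = m the pre-activation is
    (1 if m odd else 0) - 1, which crosses the Heaviside threshold iff m is odd.
    """
    hidden = [(tuple([1] * n), -k) for k in range(1, n + 1)]  # (weights, bias)
    out_w = [(-1) ** (k + 1) for k in range(1, n + 1)]        # +1, -1, +1, ...
    out_b = -1

    def net(x):
        h = [int(sum(wi * xi for wi, xi in zip(w, x)) + b >= 0) for w, b in hidden]
        return int(sum(wi * hi for wi, hi in zip(out_w, h)) + out_b >= 0)
    return net

net8 = parity_threshold_net(8)
assert all(net8(x) == sum(x) % 2 for x in product([0, 1], repeat=8))
print("n=8: all 256 inputs correct using", 8 + 1, "threshold units")
```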

Full proof development: [threshold-logic-verified/coq](https://github.com/CharlesCNorton/threshold-logic-verified/tree/main/coq)

## Test Results

### Functional Validation

| Test | Result |
|------|--------|
| Exhaustive (256 inputs) | 256/256 |
| Batch inference consistency | Pass |
| Determinism (1000 runs) | Pass |
| Throughput | 18,161 inf/sec (CPU) |

### Analytical Properties

| Property | Finding |
|----------|---------|
| Weight sparsity | 32.9% zeros (258/784) |
| Critical L1 neurons | 11 of 32 |
| Critical L2 neurons | 3 of 16 |
| Noise sensitivity (Οƒ=0.05) | Accuracy drops to 51% |
| Minimum depth | 3 layers required |

**Neuron ablation results:**

| Neuron Removed | Accuracy Impact |
|----------------|-----------------|
| L2-N8 | 256 β†’ 128 (chance) |
| L1-N11 | 256 β†’ 162 |
| Any redundant neuron | 256 β†’ 256 (no change) |

**Random pruning sensitivity:**

| Weights Pruned | Accuracy |
|----------------|----------|
| 0% | 256/256 |
| 10% | 138-191/256 |
| 20%+ | ~128/256 (chance) |

### Noise Robustness

The network is extremely fragile to input noise due to the hard Heaviside threshold:

| Gaussian Noise Οƒ | Accuracy |
|------------------|----------|
| 0.00 | 100.0% |
| 0.05 | 51.2% |
| 0.10 | 51.2% |
| 0.20 | 48.8% |

This is inherent to threshold networks, not a defect.

### Bias Sensitivity

| Layer | Sensitive to Β±1 Perturbation |
|-------|------------------------------|
| Layer 1 | 9 of 32 biases |
| Layer 2 | 1 of 16 biases |
| Output | 1 of 1 bias |

## Practical Applications

The network has been validated on standard parity-based protocols:

| Application | Test Cases | Result |
|-------------|------------|--------|
| Memory ECC parity | 100 bytes | 100% |
| RAID-5 reconstruction | 50 stripes | 100% |
| UART frame validation | 100 frames | 100% |
| XOR checksum | 100 bytes | Pass |
| LFSR keystream | 32 bits | Deterministic |
| Barcode validation | 50 codes | 100% |
| Packet checksum | 20 packets | 100% |
| Hamming (7,4) syndrome | 20 codewords | 100% |
| PS/2 keyboard codes | 20 scancodes | 100% |
| DNA codon parity | 8 codons | 100% |

## Neuromorphic Deployment

The primary use case is neuromorphic hardware where threshold networks are the native compute primitive and XOR gates cannot be directly instantiated:

| Platform | Status |
|----------|--------|
| Intel Loihi | Compatible (untested) |
| IBM TrueNorth | Compatible (untested) |
| BrainChip Akida | Compatible (untested) |

**Resource comparison:**

| Implementation | Gates/Neurons |
|----------------|---------------|
| Threshold network (pruned) | 14 neurons |
| Threshold network (full) | 49 neurons |
| XOR tree (conventional) | 7 gates |

The overhead relative to a conventional 7-gate XOR tree (2x in unit count for the pruned network, 7x for the full one, and considerably more in synaptic weights) is acceptable only when XOR gates are unavailable.

## Pruned Network Details

The pruned network was extracted by identifying critical neurons through systematic ablation:

**Retained neurons:**
- Layer 1: indices [3, 7, 9, 11, 12, 16, 17, 20, 24, 27, 28]
- Layer 2: indices [8, 11, 14]

**Interpretation:**

The pruned network reveals the minimal structure for parity computation:
- L2-N0 (originally N8): Always fires (bias=30 dominates)
- L2-N1 (originally N11): Parity discriminator (fires iff HW even)
- L2-N2 (originally N14): Always fires (bias=20 dominates)
- Output: Computes `L2-N0 - L2-N1 + L2-N2 - 2 >= 0`, which equals `NOT(L2-N1)`

Since L2-N1 fires when Hamming weight is even, the output fires when Hamming weight is oddβ€”exactly parity.
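The output identity can be checked over the only reachable cases (a two-line sketch using the neuron values from the interpretation above):

```python
# With the two always-on neurons fixed at 1, the output condition
# a - b + c - 2 >= 0 reduces to -b >= 0, i.e. NOT(b).
for b in (0, 1):        # b = parity discriminator L2-N1
    a, c = 1, 1         # L2-N0 and L2-N2 always fire
    out = int(a - b + c - 2 >= 0)
    assert out == 1 - b
print("output = NOT(L2-N1) in both reachable cases")
```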

## Limitations

- **Fixed input size**: 8 bits only (parametric construction in Coq extends to any n)
- **Binary inputs**: Expects {0, 1}, not continuous values
- **No noise margin**: Heaviside threshold at exactly 0
- **Not differentiable**: Cannot be fine-tuned with gradient descent

## Files

```
tiny-parity-prover/
├── model.safetensors      # Full network weights (833 params)
├── model.py               # Full network inference code
├── config.json            # Full network metadata
├── results.md             # Comprehensive test results
├── todo.md                # Future work notes
├── README.md              # This file
└── pruned/
    ├── model.safetensors  # Pruned network weights (139 params)
    ├── model.py           # Pruned network inference code
    ├── config.json        # Pruned network metadata
    └── README.md          # Pruned variant description
```

## Citation

```bibtex
@software{tiny_parity_prover_2025,
  title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
  author={Norton, Charles},
  url={https://huggingface.co/phanerozoic/tiny-parity-prover},
  year={2025},
  note={Weights and inference code. Proofs at github.com/CharlesCNorton/threshold-logic-verified}
}
```

## Related

- **Proof repository**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) β€” Coq proofs, training code, full documentation
- **Coq proof files**: V0_Core.v through V11_ParametricNetwork.v
- **Key proof**: V9_FullyConstructive.v (complete algebraic proof without vm_compute)

## License

MIT