phanerozoic committed on
Commit 914753d · verified · 1 Parent(s): f06d212

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +251 -58
README.md CHANGED
@@ -9,111 +9,304 @@ tags:
  - threshold-network
  - ternary-weights
  - binary-classification
  ---

  # tiny-parity-prover

- A formally verified neural network that computes 8-bit parity (XOR of all input bits).

- ## Model Description

- This is a ternary threshold network trained to compute the parity function on 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.

- **Key feature**: Correctness is mathematically proven in Coq for all 256 possible inputs.

- ## Variants

- Two variants are available:

- | Variant | Architecture | Parameters | Location |
- |---------|--------------|------------|----------|
- | Full | 8 → 32 → 16 → 1 | 833 | `model.safetensors` |
- | Pruned | 8 → 11 → 3 → 1 | 139 | `pruned/model.safetensors` |

- The pruned variant was extracted via neuron ablation analysis, which identified that only 11 of 32 Layer-1 neurons and 3 of 16 Layer-2 neurons are critical for correct operation. Both variants achieve 100% accuracy and are formally verified in Coq.

- ## Architecture (Full Network)

- | Layer | Neurons | Weights | Activation |
- |-------|---------|---------|------------|
- | Input | 8 | - | - |
- | Hidden 1 | 32 | Ternary {-1, 0, +1} | Heaviside |
- | Hidden 2 | 16 | Ternary {-1, 0, +1} | Heaviside |
- | Output | 1 | Ternary {-1, 0, +1} | Heaviside |

- - **Total parameters**: 833 (full), 139 (pruned)
- - **Weight quantization**: Ternary (-1, 0, +1)
- - **Bias quantization**: Bounded integers

  ## Training
- - **Method**: Evolutionary search (genetic algorithm)
- - **Generations**: 10,542
- - **Population size**: 500
- - **Selection**: Top 20% elite, mutation rates 0.05-0.30

- The parity function presents well-known difficulties for gradient-based optimization. Because parity depends on a global property of the input (the count of set bits modulo 2), small perturbations to weights produce negligible gradient signal until the network is already close to a correct solution. This results in a loss landscape that is essentially flat across most of the parameter space.

- Evolutionary search circumvents this problem by evaluating candidate solutions directly on classification accuracy rather than relying on gradient information. The discrete fitness signal (the number of correctly classified inputs) provides sufficient selection pressure to guide the search toward a working configuration.

- ## Verification

- Correctness is proven in the Coq proof assistant for both variants:
  ```coq
- Theorem network_verified :
-   forall xs, In xs (all_inputs 8) -> network xs = parity xs.
  ```

- The proof uses exhaustive enumeration via `vm_compute` over all 256 inputs.

- ## Usage

- ```python
- from model import ThresholdNetwork
- import torch
-
- # Full network
- model = ThresholdNetwork.from_safetensors('model.safetensors')
-
- # Or pruned network
- from pruned.model import PrunedThresholdNetwork
- model = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')
-
- input_bits = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
- output = model(input_bits)
- print(f"Parity: {int(output.item())}")  # 0 (four 1-bits, even count)
  ```
- ## Performance

- | Metric | Full | Pruned |
- |--------|------|--------|
- | Accuracy | 100% (256/256) | 100% (256/256) |
- | Parameters | 833 | 139 |
- | Verification | Coq-proven | Coq-proven |

  ## Limitations

- - Fixed 8-bit input size
- - Threshold activation (not differentiable)
- - Fragile to input noise (Heaviside has zero margin)

- ## Repository

- Full source code, training scripts, and Coq proofs available at:
- [https://github.com/CharlesCNorton/threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified)

  ## Citation

  ```bibtex
  @software{tiny_parity_prover_2025,
    title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
-   url={https://github.com/CharlesCNorton/threshold-logic-verified},
-   year={2025}
  }
  ```

  ## License

  MIT
  - threshold-network
  - ternary-weights
  - binary-classification
+ - neuromorphic
  ---

  # tiny-parity-prover

+ Trained weights for a formally verified neural network that computes 8-bit parity. This repository contains the model artifacts; for the proof development and Coq source code, see [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified).

+ ## Overview

+ This is a ternary threshold network that computes `parity(x) = x₀ ⊕ x₁ ⊕ ... ⊕ x₇` for 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.

+ **Key properties:**
+ - 100% accuracy on all 256 possible inputs
+ - Correctness proven in Coq via a constructive algebraic proof
+ - Weights constrained to {-1, 0, +1} (ternary quantization)
+ - Heaviside step activation (x ≥ 0 → 1, else 0)
+ - Suitable for neuromorphic hardware deployment

+ ## Model Variants

+ | Variant | Architecture | Parameters | Weights | Verification |
+ |---------|--------------|------------|---------|--------------|
+ | Full | 8 → 32 → 16 → 1 | 833 | `model.safetensors` | Coq-proven |
+ | Pruned | 8 → 11 → 3 → 1 | 139 | `pruned/model.safetensors` | Coq-proven |

+ The pruned variant retains only the 14 neurons (11 in Layer 1, 3 in Layer 2) identified as critical through systematic ablation analysis. Both variants compute identical functions and are independently verified.
 
 
 
+ ## Quick Start

+ ```python
+ import torch
+ from safetensors.torch import load_file
+
+ # Load weights
+ weights = load_file('model.safetensors')  # or 'pruned/model.safetensors'
+
+ # Manual forward pass (Heaviside activation)
+ def forward(x, weights):
+     x = x.float()
+     x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
+     x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
+     x = (x @ weights['output.weight'].T + weights['output.bias'] >= 0).float()
+     return x.squeeze(-1)
+
+ # Test
+ inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
+ output = forward(inputs, weights)
+ print(f"Parity of [1,0,1,1,0,0,1,0]: {int(output.item())}")  # 0 (4 bits set, even)
+ ```

+ Or use the provided model classes:

+ ```python
+ from model import ThresholdNetwork
+ model = ThresholdNetwork.from_safetensors('model.safetensors')
+
+ from pruned.model import PrunedThresholdNetwork
+ pruned = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')
+ ```
+ ## Weight Structure

+ ### Full Network (833 parameters)

+ | Tensor | Shape | Values | Description |
+ |--------|-------|--------|-------------|
+ | `layer1.weight` | [32, 8] | {-1, 0, +1} | 256 ternary weights |
+ | `layer1.bias` | [32] | [-8, +8] | Bounded integers |
+ | `layer2.weight` | [16, 32] | {-1, 0, +1} | 512 ternary weights |
+ | `layer2.bias` | [16] | [-32, +32] | Bounded integers |
+ | `output.weight` | [1, 16] | {-1, 0, +1} | 16 ternary weights |
+ | `output.bias` | [1] | [-16, +16] | Bounded integer |

+ ### Pruned Network (139 parameters)

+ | Tensor | Shape | Values |
+ |--------|-------|--------|
+ | `layer1.weight` | [11, 8] | {-1, 0, +1} |
+ | `layer1.bias` | [11] | Bounded integers |
+ | `layer2.weight` | [3, 11] | {-1, 0, +1} |
+ | `layer2.bias` | [3] | Bounded integers |
+ | `output.weight` | [1, 3] | {-1, 0, +1} |
+ | `output.bias` | [1] | Bounded integer |

  ## Training
+ The network was trained using evolutionary search (a genetic algorithm), not gradient descent.

+ **Why not backpropagation?** Parity is a canonical hard function for neural networks. The loss landscape is essentially flat because parity depends on a global property (bit count mod 2) rather than on local features. Small weight perturbations produce negligible gradient signal until the network is already near a solution.

+ **Evolutionary search parameters:**
+ - Population size: 500
+ - Generations to solution: 10,542
+ - Selection: Top 20% elite
+ - Mutation rates: 0.05, 0.15, 0.30 (mixed)
+ - Fitness: Classification accuracy on all 256 inputs

+ The search finds a working configuration by directly optimizing discrete accuracy rather than relying on gradient information.
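To make the procedure concrete, here is a minimal, self-contained sketch of such a search (illustrative only, not the repository's training script): it evolves single-hidden-layer ternary threshold networks for 4-bit parity with a scaled-down population and generation budget, while reusing the elite fraction and mutation rates listed above.

```python
import random

N_BITS, HIDDEN, POP, GENS = 4, 6, 100, 150   # scaled down from the real run
INPUTS = [[(i >> b) & 1 for b in range(N_BITS)] for i in range(2 ** N_BITS)]
TARGETS = [sum(x) % 2 for x in INPUTS]

def random_genome(rng):
    # Ternary weights {-1, 0, +1} and bounded integer biases.
    return {
        'w1': [[rng.choice((-1, 0, 1)) for _ in range(N_BITS)] for _ in range(HIDDEN)],
        'b1': [rng.randint(-N_BITS, N_BITS) for _ in range(HIDDEN)],
        'w2': [rng.choice((-1, 0, 1)) for _ in range(HIDDEN)],
        'b2': rng.randint(-HIDDEN, HIDDEN),
    }

def forward(g, x):
    # Heaviside activation: a unit fires when its pre-activation is >= 0.
    h = [1 if sum(w * v for w, v in zip(row, x)) + b >= 0 else 0
         for row, b in zip(g['w1'], g['b1'])]
    return 1 if sum(w * v for w, v in zip(g['w2'], h)) + g['b2'] >= 0 else 0

def fitness(g):
    # Discrete fitness: fraction of correctly classified inputs.
    return sum(forward(g, x) == t for x, t in zip(INPUTS, TARGETS)) / len(INPUTS)

def mutate(g, rng, rate):
    fresh = random_genome(rng)  # donor of fresh random genes
    return {
        'w1': [[f if rng.random() < rate else v for v, f in zip(r, fr)]
               for r, fr in zip(g['w1'], fresh['w1'])],
        'b1': [f if rng.random() < rate else v for v, f in zip(g['b1'], fresh['b1'])],
        'w2': [f if rng.random() < rate else v for v, f in zip(g['w2'], fresh['w2'])],
        'b2': fresh['b2'] if rng.random() < rate else g['b2'],
    }

rng = random.Random(0)
pop = sorted((random_genome(rng) for _ in range(POP)), key=fitness, reverse=True)
for gen in range(GENS):
    if fitness(pop[0]) == 1.0:
        break                      # a fully correct network was found
    elite = pop[:POP // 5]         # top 20% survive unchanged
    pop = sorted(elite + [mutate(rng.choice(elite), rng,
                                 rng.choice((0.05, 0.15, 0.30)))
                          for _ in range(POP - len(elite))],
                 key=fitness, reverse=True)
print(f"best fitness after search: {fitness(pop[0]):.3f}")
```

Scaling the bit width, depth, population, and generation budget back up corresponds to the configuration described above.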
+ ## Formal Verification

+ Both network variants are proven correct in the Coq proof assistant. The verification evolved through 12 proof versions, culminating in a fully constructive algebraic proof.

+ **Core theorem (proven for all 256 inputs):**
  ```coq
+ Theorem network_correct : forall x0 x1 x2 x3 x4 x5 x6 x7,
+   network x0 x1 x2 x3 x4 x5 x6 x7 = parity [x0;x1;x2;x3;x4;x5;x6;x7].
  ```

+ **Proof strategy (V9_FullyConstructive.v):**

+ The proof decomposes the critical Layer-2 neuron's pre-activation into three components:
+ - **A**: Contribution from h-function neurons (depends on a specific linear form h(x))
+ - **B**: Contribution from Hamming-weight-dependent neurons
+ - **C**: Contribution from g-function neurons (where g(x) ≡ HW(x) mod 2)

+ Key algebraic insight: for any ±1 weight vector, the dot product has the same parity as the Hamming weight. This is proven constructively:

+ ```coq
+ Theorem g_parity : forall x0 x1 x2 x3 x4 x5 x6 x7,
+   Z.even (g_func x0 x1 x2 x3 x4 x5 x6 x7) =
+   Nat.even (hamming_weight [x0;x1;x2;x3;x4;x5;x6;x7]).
+ ```
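The lemma is easy to sanity-check outside Coq. This illustrative Python sketch (independent of the proof development) enumerates every ±1 weight vector and every 0/1 input for n = 8 and asserts the parity claim:

```python
from itertools import product

def check_dot_parity(n=8):
    # For every ±1 weight vector ws and every 0/1 input xs, the signed dot
    # product has the same parity as the Hamming weight of xs: each set bit
    # contributes ±1, and both +1 and -1 flip parity.
    for ws in product((-1, 1), repeat=n):
        for xs in product((0, 1), repeat=n):
            dot = sum(w * x for w, x in zip(ws, xs))
            hw = sum(xs)
            assert dot % 2 == hw % 2
    return True

print(check_dot_parity())  # True
```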
+ The network implements parity by:
+ 1. Two "always-on" neurons (L2-N0, L2-N2) with large biases
+ 2. One parity discriminator (L2-N1) that fires iff the Hamming weight is even
+ 3. Output wiring that negates the discriminator

+ **Parametric generalization (V10-V11):**

+ The algebraic foundation extends to arbitrary input size n:

+ ```coq
+ Theorem dot_parity_equals_hw_parity : forall ws xs,
+   length ws = length xs ->
+   Z.even (signed_dot ws xs) = Nat.even (hamming_weight xs).
  ```

+ A verified n-bit parity network construction is provided, using n+3 neurons in total.

+ Full proof development: [threshold-logic-verified/coq](https://github.com/CharlesCNorton/threshold-logic-verified/tree/main/coq)
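For intuition about why a linear number of threshold units suffices, the classic textbook construction (hidden unit h_k fires iff HW(x) ≥ k, with alternating ±1 output weights) computes n-bit parity and can be checked exhaustively in Python. This is an illustrative sketch, not the repository's verified n+3 construction:

```python
def heaviside(z):
    return 1 if z >= 0 else 0

def parity_threshold_net(bits):
    """Classic construction: n hidden threshold units plus one output unit.
    Hidden unit k (k = 1..n) fires iff the Hamming weight is >= k; the output
    sums them with alternating +1/-1 weights and thresholds at 1."""
    n = len(bits)
    hw_partial = [heaviside(sum(bits) - k) for k in range(1, n + 1)]  # [HW >= k]
    score = sum((+1 if k % 2 == 0 else -1) * h for k, h in enumerate(hw_partial))
    # score = 1 when HW is odd, 0 when HW is even
    return heaviside(score - 1)

# Exhaustive check for n = 8 (all 256 inputs)
assert all(
    parity_threshold_net([(i >> b) & 1 for b in range(8)]) == bin(i).count('1') % 2
    for i in range(256)
)
print("8-bit parity verified over all 256 inputs")
```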
+ ## Test Results

+ ### Functional Validation

+ | Test | Result |
+ |------|--------|
+ | Exhaustive (256 inputs) | 256/256 |
+ | Batch inference consistency | Pass |
+ | Determinism (1000 runs) | Pass |
+ | Throughput | 18,161 inf/sec (CPU) |

+ ### Analytical Properties

+ | Property | Finding |
+ |----------|---------|
+ | Weight sparsity | 32.9% zeros (258/784) |
+ | Critical L1 neurons | 11 of 32 |
+ | Critical L2 neurons | 3 of 16 |
+ | Noise sensitivity (σ=0.05) | Accuracy drops to 51% |
+ | Minimum depth | 3 layers required |

+ **Neuron ablation results:**

+ | Neuron Removed | Accuracy Impact |
+ |----------------|-----------------|
+ | L2-N8 | 256 → 128 (chance) |
+ | L1-N11 | 256 → 162 |
+ | Any redundant neuron | 256 → 256 (no change) |

+ **Random pruning sensitivity:**

+ | Weights Pruned | Accuracy |
+ |----------------|----------|
+ | 0% | 256/256 |
+ | 10% | 138-191/256 |
+ | 20%+ | ~128/256 (chance) |

+ ### Noise Robustness

+ The network is extremely fragile to input noise due to the hard Heaviside threshold:

+ | Gaussian Noise σ | Accuracy |
+ |------------------|----------|
+ | 0.00 | 100.0% |
+ | 0.05 | 51.2% |
+ | 0.10 | 51.2% |
+ | 0.20 | 48.8% |

+ This is inherent to threshold networks, not a defect.

+ ### Bias Sensitivity

+ | Layer | Sensitive to ±1 Perturbation |
+ |-------|------------------------------|
+ | Layer 1 | 9 of 32 biases |
+ | Layer 2 | 1 of 16 biases |
+ | Output | 1 of 1 bias |

+ ## Practical Applications

+ The network has been validated on standard parity-based protocols:

+ | Application | Test Cases | Result |
+ |-------------|------------|--------|
+ | Memory ECC parity | 100 bytes | 100% |
+ | RAID-5 reconstruction | 50 stripes | 100% |
+ | UART frame validation | 100 frames | 100% |
+ | XOR checksum | 100 bytes | Pass |
+ | LFSR keystream | 32 bits | Deterministic |
+ | Barcode validation | 50 codes | 100% |
+ | Packet checksum | 20 packets | 100% |
+ | Hamming (7,4) syndrome | 20 codewords | 100% |
+ | PS/2 keyboard codes | 20 scancodes | 100% |
+ | DNA codon parity | 8 codons | 100% |
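All of these protocols reduce to the same parity primitive. As an illustration (using a plain popcount as a stand-in for the network, not the shipped inference code), UART-style even-parity framing can be modeled in a few lines:

```python
def parity(bits):
    # Stand-in for the verified network: XOR of all bits.
    return sum(bits) % 2

def uart_frame(data_byte):
    """Build an 8E1-style payload: 8 data bits (LSB first) plus an even-parity
    bit, chosen so the total number of 1s in the frame is even."""
    bits = [(data_byte >> i) & 1 for i in range(8)]
    return bits + [parity(bits)]

def frame_ok(frame):
    # A received frame is valid iff the overall parity of all 9 bits is even.
    return parity(frame) == 0

for byte in range(256):
    f = uart_frame(byte)
    assert frame_ok(f)                    # clean frames pass
    corrupted = f[:]
    corrupted[3] ^= 1
    assert not frame_ok(corrupted)        # any single-bit flip is detected
print("even-parity framing validated for all 256 bytes")
```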
+ ## Neuromorphic Deployment

+ The primary use case is neuromorphic hardware, where threshold networks are the native compute primitive and XOR gates cannot be directly instantiated:

+ | Platform | Status |
+ |----------|--------|
+ | Intel Loihi | Compatible (untested) |
+ | IBM TrueNorth | Compatible (untested) |
+ | BrainChip Akida | Compatible (untested) |

+ **Resource comparison:**

+ | Implementation | Gates/Neurons |
+ |----------------|---------------|
+ | Threshold network (pruned) | 14 neurons |
+ | Threshold network (full) | 49 neurons |
+ | XOR tree (conventional) | 7 gates |

+ The overhead relative to a conventional XOR tree is acceptable only when XOR gates are unavailable.

+ ## Pruned Network Details

+ The pruned network was extracted by identifying critical neurons through systematic ablation:

+ **Retained neurons:**
+ - Layer 1: indices [3, 7, 9, 11, 12, 16, 17, 20, 24, 27, 28]
+ - Layer 2: indices [8, 11, 14]

+ **Interpretation:**

+ The pruned network reveals the minimal structure for parity computation:
+ - L2-N0 (originally N8): always fires (bias=30 dominates)
+ - L2-N1 (originally N11): parity discriminator (fires iff HW even)
+ - L2-N2 (originally N14): always fires (bias=20 dominates)
+ - Output: computes `L2-N0 - L2-N1 + L2-N2 - 2 >= 0`, which equals `NOT(L2-N1)`

+ Since L2-N1 fires when the Hamming weight is even, the output fires when the Hamming weight is odd, which is exactly parity.
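The claimed output wiring can be checked exhaustively in a few lines; this illustrative sketch models the step activation with a plain `heaviside` function:

```python
from itertools import product

def heaviside(z):
    return 1 if z >= 0 else 0

# The output neuron computes H(L2-N0 - L2-N1 + L2-N2 - 2).
# Whenever the two always-on neurons fire (n0 = n2 = 1), the pre-activation
# reduces to -n1, so the output is exactly NOT(n1).
for n0, n1, n2 in product((0, 1), repeat=3):
    out = heaviside(n0 - n1 + n2 - 2)
    if n0 == 1 and n2 == 1:
        assert out == 1 - n1   # negated discriminator, i.e. parity
print("output wiring equals NOT(L2-N1) whenever both always-on neurons fire")
```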
  ## Limitations

+ - **Fixed input size**: 8 bits only (the parametric construction in Coq extends to any n)
+ - **Binary inputs**: expects {0, 1}, not continuous values
+ - **No noise margin**: Heaviside threshold at exactly 0
+ - **Not differentiable**: cannot be fine-tuned with gradient descent

+ ## Files

+ ```
+ tiny-parity-prover/
+ ├── model.safetensors      # Full network weights (833 params)
+ ├── model.py               # Full network inference code
+ ├── config.json            # Full network metadata
+ ├── results.md             # Comprehensive test results
+ ├── todo.md                # Future work notes
+ ├── README.md              # This file
+ └── pruned/
+     ├── model.safetensors  # Pruned network weights (139 params)
+     ├── model.py           # Pruned network inference code
+     ├── config.json        # Pruned network metadata
+     └── README.md          # Pruned variant description
+ ```

  ## Citation

  ```bibtex
  @software{tiny_parity_prover_2025,
    title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
+   author={Norton, Charles},
+   url={https://huggingface.co/phanerozoic/tiny-parity-prover},
+   year={2025},
+   note={Weights and inference code. Proofs at github.com/CharlesCNorton/threshold-logic-verified}
  }
  ```

+ ## Related

+ - **Proof repository**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) (Coq proofs, training code, full documentation)
+ - **Coq proof files**: V0_Core.v through V11_ParametricNetwork.v
+ - **Key proof**: V9_FullyConstructive.v (complete algebraic proof without vm_compute)

  ## License

  MIT