phanerozoic committed on
Commit 914753d · verified · 1 Parent(s): f06d212

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +251 -58
README.md CHANGED
@@ -9,111 +9,304 @@ tags:
  - threshold-network
  - ternary-weights
  - binary-classification
  ---

  # tiny-parity-prover

- A formally verified neural network that computes 8-bit parity (XOR of all input bits).

- ## Model Description

- This is a ternary threshold network trained to compute the parity function on 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.

- **Key feature**: Correctness is mathematically proven in Coq for all 256 possible inputs.

- ## Variants

- Two variants are available:

- | Variant | Architecture | Parameters | Location |
- |---------|--------------|------------|----------|
- | Full | 8 → 32 → 16 → 1 | 833 | `model.safetensors` |
- | Pruned | 8 → 11 → 3 → 1 | 139 | `pruned/model.safetensors` |

- The pruned variant was extracted via neuron ablation analysis, which identified that only 11 of 32 Layer-1 neurons and 3 of 16 Layer-2 neurons are critical for correct operation. Both variants achieve 100% accuracy and are formally verified in Coq.

- ## Architecture (Full Network)

- | Layer | Neurons | Weights | Activation |
- |-------|---------|---------|------------|
- | Input | 8 | - | - |
- | Hidden 1 | 32 | Ternary {-1, 0, +1} | Heaviside |
- | Hidden 2 | 16 | Ternary {-1, 0, +1} | Heaviside |
- | Output | 1 | Ternary {-1, 0, +1} | Heaviside |

- - **Total parameters**: 833 (full), 139 (pruned)
- - **Weight quantization**: Ternary (-1, 0, +1)
- - **Bias quantization**: Bounded integers

  ## Training
- - **Method**: Evolutionary search (genetic algorithm)
- - **Generations**: 10,542
- - **Population size**: 500
- - **Selection**: Top 20% elite, mutation rates 0.05-0.30

- The parity function presents well-known difficulties for gradient-based optimization. Because parity depends on a global property of the input (the count of set bits modulo 2), small perturbations to weights produce negligible gradient signal until the network is already close to a correct solution. This results in a loss landscape that is essentially flat across most of the parameter space.

- Evolutionary search circumvents this problem by evaluating candidate solutions directly on classification accuracy rather than relying on gradient information. The discrete fitness signal (the number of correctly classified inputs) provides sufficient selection pressure to guide the search toward a working configuration.

- ## Verification

- Correctness is proven in the Coq proof assistant for both variants:
  ```coq
- Theorem network_verified :
-   forall xs, In xs (all_inputs 8) -> network xs = parity xs.
  ```

- The proof uses exhaustive enumeration via `vm_compute` over all 256 inputs.

- ## Usage

- ```python
- from model import ThresholdNetwork
- import torch
-
- # Full network
- model = ThresholdNetwork.from_safetensors('model.safetensors')
-
- # Or pruned network
- from pruned.model import PrunedThresholdNetwork
- model = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')
-
- input_bits = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
- output = model(input_bits)
- print(f"Parity: {int(output.item())}")  # 0 (four 1-bits, even count)
  ```
- ## Performance

- | Metric | Full | Pruned |
- |--------|------|--------|
- | Accuracy | 100% (256/256) | 100% (256/256) |
- | Parameters | 833 | 139 |
- | Verification | Coq-proven | Coq-proven |

  ## Limitations

- - Fixed 8-bit input size
- - Threshold activation (not differentiable)
- - Fragile to input noise (Heaviside has zero margin)

- ## Repository

- Full source code, training scripts, and Coq proofs available at:
- [https://github.com/CharlesCNorton/threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified)

  ## Citation

  ```bibtex
  @software{tiny_parity_prover_2025,
    title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
-   url={https://github.com/CharlesCNorton/threshold-logic-verified},
-   year={2025}
  }
  ```

  ## License

  MIT
  - threshold-network
  - ternary-weights
  - binary-classification
+ - neuromorphic
  ---

  # tiny-parity-prover

+ Trained weights for a formally verified neural network that computes 8-bit parity. This repository contains the model artifacts; for the proof development and Coq source code, see [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified).

+ ## Overview

+ This is a ternary threshold network that computes `parity(x) = x₀ ⊕ x₁ ⊕ ... ⊕ x₇` for 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.

+ **Key properties:**
+ - 100% accuracy on all 256 possible inputs
+ - Correctness proven in Coq via a constructive algebraic proof
+ - Weights constrained to {-1, 0, +1} (ternary quantization)
+ - Heaviside step activation (x ≥ 0 → 1, else 0)
+ - Suitable for neuromorphic hardware deployment

+ ## Model Variants

+ | Variant | Architecture | Parameters | Weights | Verification |
+ |---------|--------------|------------|---------|--------------|
+ | Full | 8 → 32 → 16 → 1 | 833 | `model.safetensors` | Coq-proven |
+ | Pruned | 8 → 11 → 3 → 1 | 139 | `pruned/model.safetensors` | Coq-proven |

+ The pruned variant retains only the 14 neurons (11 in Layer 1, 3 in Layer 2) identified as critical through systematic ablation analysis. Both variants compute identical functions and are independently verified.
 
 
 
+ ## Quick Start

+ ```python
+ import torch
+ from safetensors.torch import load_file
+
+ # Load weights
+ weights = load_file('model.safetensors')  # or 'pruned/model.safetensors'
+
+ # Manual forward pass (Heaviside activation)
+ def forward(x, weights):
+     x = x.float()
+     x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
+     x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
+     x = (x @ weights['output.weight'].T + weights['output.bias'] >= 0).float()
+     return x.squeeze(-1)
+
+ # Test
+ inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
+ output = forward(inputs, weights)
+ print(f"Parity of [1,0,1,1,0,0,1,0]: {int(output.item())}")  # 0 (4 bits set, even)
+ ```

+ Or use the provided model classes:

+ ```python
+ from model import ThresholdNetwork
+ model = ThresholdNetwork.from_safetensors('model.safetensors')
+
+ from pruned.model import PrunedThresholdNetwork
+ pruned = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')
+ ```
+ ## Weight Structure

+ ### Full Network (833 parameters)

+ | Tensor | Shape | Values | Description |
+ |--------|-------|--------|-------------|
+ | `layer1.weight` | [32, 8] | {-1, 0, +1} | 256 ternary weights |
+ | `layer1.bias` | [32] | [-8, +8] | Bounded integers |
+ | `layer2.weight` | [16, 32] | {-1, 0, +1} | 512 ternary weights |
+ | `layer2.bias` | [16] | [-32, +32] | Bounded integers |
+ | `output.weight` | [1, 16] | {-1, 0, +1} | 16 ternary weights |
+ | `output.bias` | [1] | [-16, +16] | Bounded integer |

+ ### Pruned Network (139 parameters)

+ | Tensor | Shape | Values |
+ |--------|-------|--------|
+ | `layer1.weight` | [11, 8] | {-1, 0, +1} |
+ | `layer1.bias` | [11] | Bounded integers |
+ | `layer2.weight` | [3, 11] | {-1, 0, +1} |
+ | `layer2.bias` | [3] | Bounded integers |
+ | `output.weight` | [1, 3] | {-1, 0, +1} |
+ | `output.bias` | [1] | Bounded integer |

  ## Training
+ The network was trained using evolutionary search (a genetic algorithm), not gradient descent.

+ **Why not backpropagation?** Parity is a canonical hard function for neural networks. The loss landscape is essentially flat because parity depends on a global property (bit count mod 2) rather than on local features. Small weight perturbations produce negligible gradient signal until the network is already near a solution.

+ **Evolutionary search parameters:**
+ - Population size: 500
+ - Generations to solution: 10,542
+ - Selection: Top 20% elite
+ - Mutation rates: 0.05, 0.15, 0.30 (mixed)
+ - Fitness: Classification accuracy on all 256 inputs

+ The search finds a working configuration by directly optimizing discrete accuracy rather than relying on gradient information.
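To make the procedure concrete, here is a minimal, self-contained sketch of such a search (illustrative only, not the repository's training script): it evolves single-hidden-layer ternary threshold networks for 4-bit parity with a scaled-down population and generation budget, while reusing the elite fraction and mutation rates listed above.

```python
import random

N_BITS, HIDDEN, POP, GENS = 4, 6, 100, 150   # scaled down from the real run
INPUTS = [[(i >> b) & 1 for b in range(N_BITS)] for i in range(2 ** N_BITS)]
TARGETS = [sum(x) % 2 for x in INPUTS]

def random_genome(rng):
    # Ternary weights {-1, 0, +1} and bounded integer biases.
    return {
        'w1': [[rng.choice((-1, 0, 1)) for _ in range(N_BITS)] for _ in range(HIDDEN)],
        'b1': [rng.randint(-N_BITS, N_BITS) for _ in range(HIDDEN)],
        'w2': [rng.choice((-1, 0, 1)) for _ in range(HIDDEN)],
        'b2': rng.randint(-HIDDEN, HIDDEN),
    }

def forward(g, x):
    # Heaviside activation: a unit fires when its pre-activation is >= 0.
    h = [1 if sum(w * v for w, v in zip(row, x)) + b >= 0 else 0
         for row, b in zip(g['w1'], g['b1'])]
    return 1 if sum(w * v for w, v in zip(g['w2'], h)) + g['b2'] >= 0 else 0

def fitness(g):
    # Discrete fitness: fraction of correctly classified inputs.
    return sum(forward(g, x) == t for x, t in zip(INPUTS, TARGETS)) / len(INPUTS)

def mutate(g, rng, rate):
    fresh = random_genome(rng)  # donor of fresh random genes
    return {
        'w1': [[f if rng.random() < rate else v for v, f in zip(r, fr)]
               for r, fr in zip(g['w1'], fresh['w1'])],
        'b1': [f if rng.random() < rate else v for v, f in zip(g['b1'], fresh['b1'])],
        'w2': [f if rng.random() < rate else v for v, f in zip(g['w2'], fresh['w2'])],
        'b2': fresh['b2'] if rng.random() < rate else g['b2'],
    }

rng = random.Random(0)
pop = sorted((random_genome(rng) for _ in range(POP)), key=fitness, reverse=True)
for gen in range(GENS):
    if fitness(pop[0]) == 1.0:
        break                      # a fully correct network was found
    elite = pop[:POP // 5]         # top 20% survive unchanged
    pop = sorted(elite + [mutate(rng.choice(elite), rng,
                                 rng.choice((0.05, 0.15, 0.30)))
                          for _ in range(POP - len(elite))],
                 key=fitness, reverse=True)
print(f"best fitness after search: {fitness(pop[0]):.3f}")
```

Scaling the bit width, depth, population, and generation budget back up corresponds to the configuration described above.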
+ ## Formal Verification

+ Both network variants are proven correct in the Coq proof assistant. The verification evolved through 12 proof versions, culminating in a fully constructive algebraic proof.

+ **Core theorem (proven for all 256 inputs):**
  ```coq
+ Theorem network_correct : forall x0 x1 x2 x3 x4 x5 x6 x7,
+   network x0 x1 x2 x3 x4 x5 x6 x7 = parity [x0;x1;x2;x3;x4;x5;x6;x7].
  ```

+ **Proof strategy (V9_FullyConstructive.v):**

+ The proof decomposes the critical Layer-2 neuron's pre-activation into three components:
+ - **A**: Contribution from h-function neurons (depends on a specific linear form h(x))
+ - **B**: Contribution from Hamming-weight-dependent neurons
+ - **C**: Contribution from g-function neurons (where g(x) ≡ HW(x) mod 2)

+ Key algebraic insight: for any ±1 weight vector, the dot product has the same parity as the Hamming weight. This is proven constructively:

+ ```coq
+ Theorem g_parity : forall x0 x1 x2 x3 x4 x5 x6 x7,
+   Z.even (g_func x0 x1 x2 x3 x4 x5 x6 x7) =
+   Nat.even (hamming_weight [x0;x1;x2;x3;x4;x5;x6;x7]).
+ ```
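The lemma is easy to sanity-check outside Coq. This illustrative Python sketch (independent of the proof development) enumerates every ±1 weight vector and every 0/1 input for n = 8 and asserts the parity claim:

```python
from itertools import product

def check_dot_parity(n=8):
    # For every ±1 weight vector ws and every 0/1 input xs, the signed dot
    # product has the same parity as the Hamming weight of xs: each set bit
    # contributes ±1, and both +1 and -1 flip parity.
    for ws in product((-1, 1), repeat=n):
        for xs in product((0, 1), repeat=n):
            dot = sum(w * x for w, x in zip(ws, xs))
            hw = sum(xs)
            assert dot % 2 == hw % 2
    return True

print(check_dot_parity())  # True
```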
+ The network implements parity by:
+ 1. Two "always-on" neurons (L2-N0, L2-N2) with large biases
+ 2. One parity discriminator (L2-N1) that fires iff the Hamming weight is even
+ 3. Output wiring that negates the discriminator

+ **Parametric generalization (V10-V11):**

+ The algebraic foundation extends to arbitrary input size n:

+ ```coq
+ Theorem dot_parity_equals_hw_parity : forall ws xs,
+   length ws = length xs ->
+   Z.even (signed_dot ws xs) = Nat.even (hamming_weight xs).
  ```

+ A verified n-bit parity network construction is provided, using n+3 neurons in total.

+ Full proof development: [threshold-logic-verified/coq](https://github.com/CharlesCNorton/threshold-logic-verified/tree/main/coq)
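For intuition about why a linear number of threshold units suffices, the classic textbook construction (hidden unit h_k fires iff HW(x) ≥ k, with alternating ±1 output weights) computes n-bit parity and can be checked exhaustively in Python. This is an illustrative sketch, not the repository's verified n+3 construction:

```python
def heaviside(z):
    return 1 if z >= 0 else 0

def parity_threshold_net(bits):
    """Classic construction: n hidden threshold units plus one output unit.
    Hidden unit k (k = 1..n) fires iff the Hamming weight is >= k; the output
    sums them with alternating +1/-1 weights and thresholds at 1."""
    n = len(bits)
    hw_partial = [heaviside(sum(bits) - k) for k in range(1, n + 1)]  # [HW >= k]
    score = sum((+1 if k % 2 == 0 else -1) * h for k, h in enumerate(hw_partial))
    # score = 1 when HW is odd, 0 when HW is even
    return heaviside(score - 1)

# Exhaustive check for n = 8 (all 256 inputs)
assert all(
    parity_threshold_net([(i >> b) & 1 for b in range(8)]) == bin(i).count('1') % 2
    for i in range(256)
)
print("8-bit parity verified over all 256 inputs")
```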
+ ## Test Results

+ ### Functional Validation

+ | Test | Result |
+ |------|--------|
+ | Exhaustive (256 inputs) | 256/256 |
+ | Batch inference consistency | Pass |
+ | Determinism (1000 runs) | Pass |
+ | Throughput | 18,161 inf/sec (CPU) |

+ ### Analytical Properties

+ | Property | Finding |
+ |----------|---------|
+ | Weight sparsity | 32.9% zeros (258/784) |
+ | Critical L1 neurons | 11 of 32 |
+ | Critical L2 neurons | 3 of 16 |
+ | Noise sensitivity (σ=0.05) | Accuracy drops to 51% |
+ | Minimum depth | 3 layers required |

+ **Neuron ablation results:**

+ | Neuron Removed | Accuracy Impact |
+ |----------------|-----------------|
+ | L2-N8 | 256 → 128 (chance) |
+ | L1-N11 | 256 → 162 |
+ | Any redundant neuron | 256 → 256 (no change) |

+ **Random pruning sensitivity:**

+ | Weights Pruned | Accuracy |
+ |----------------|----------|
+ | 0% | 256/256 |
+ | 10% | 138-191/256 |
+ | 20%+ | ~128/256 (chance) |

+ ### Noise Robustness

+ The network is extremely fragile to input noise due to the hard Heaviside threshold:

+ | Gaussian Noise σ | Accuracy |
+ |------------------|----------|
+ | 0.00 | 100.0% |
+ | 0.05 | 51.2% |
+ | 0.10 | 51.2% |
+ | 0.20 | 48.8% |

+ This is inherent to threshold networks, not a defect.

+ ### Bias Sensitivity

+ | Layer | Sensitive to ±1 Perturbation |
+ |-------|------------------------------|
+ | Layer 1 | 9 of 32 biases |
+ | Layer 2 | 1 of 16 biases |
+ | Output | 1 of 1 bias |

+ ## Practical Applications

+ The network has been validated on standard parity-based protocols:

+ | Application | Test Cases | Result |
+ |-------------|------------|--------|
+ | Memory ECC parity | 100 bytes | 100% |
+ | RAID-5 reconstruction | 50 stripes | 100% |
+ | UART frame validation | 100 frames | 100% |
+ | XOR checksum | 100 bytes | Pass |
+ | LFSR keystream | 32 bits | Deterministic |
+ | Barcode validation | 50 codes | 100% |
+ | Packet checksum | 20 packets | 100% |
+ | Hamming (7,4) syndrome | 20 codewords | 100% |
+ | PS/2 keyboard codes | 20 scancodes | 100% |
+ | DNA codon parity | 8 codons | 100% |
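All of these protocols reduce to the same parity primitive. As an illustration (using a plain popcount as a stand-in for the network, not the shipped inference code), UART-style even-parity framing can be modeled in a few lines:

```python
def parity(bits):
    # Stand-in for the verified network: XOR of all bits.
    return sum(bits) % 2

def uart_frame(data_byte):
    """Build an 8E1-style payload: 8 data bits (LSB first) plus an even-parity
    bit, chosen so the total number of 1s in the frame is even."""
    bits = [(data_byte >> i) & 1 for i in range(8)]
    return bits + [parity(bits)]

def frame_ok(frame):
    # A received frame is valid iff the overall parity of all 9 bits is even.
    return parity(frame) == 0

for byte in range(256):
    f = uart_frame(byte)
    assert frame_ok(f)                    # clean frames pass
    corrupted = f[:]
    corrupted[3] ^= 1
    assert not frame_ok(corrupted)        # any single-bit flip is detected
print("even-parity framing validated for all 256 bytes")
```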
+ ## Neuromorphic Deployment

+ The primary use case is neuromorphic hardware, where threshold networks are the native compute primitive and XOR gates cannot be directly instantiated:

+ | Platform | Status |
+ |----------|--------|
+ | Intel Loihi | Compatible (untested) |
+ | IBM TrueNorth | Compatible (untested) |
+ | BrainChip Akida | Compatible (untested) |

+ **Resource comparison:**

+ | Implementation | Gates/Neurons |
+ |----------------|---------------|
+ | Threshold network (pruned) | 14 neurons |
+ | Threshold network (full) | 49 neurons |
+ | XOR tree (conventional) | 7 gates |

+ The overhead relative to a conventional XOR tree is acceptable only when XOR gates are unavailable.

+ ## Pruned Network Details

+ The pruned network was extracted by identifying critical neurons through systematic ablation:

+ **Retained neurons:**
+ - Layer 1: indices [3, 7, 9, 11, 12, 16, 17, 20, 24, 27, 28]
+ - Layer 2: indices [8, 11, 14]

+ **Interpretation:**

+ The pruned network reveals the minimal structure for parity computation:
+ - L2-N0 (originally N8): always fires (bias=30 dominates)
+ - L2-N1 (originally N11): parity discriminator (fires iff HW even)
+ - L2-N2 (originally N14): always fires (bias=20 dominates)
+ - Output: computes `L2-N0 - L2-N1 + L2-N2 - 2 >= 0`, which equals `NOT(L2-N1)`

+ Since L2-N1 fires when the Hamming weight is even, the output fires when the Hamming weight is odd, which is exactly parity.
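The claimed output wiring can be checked exhaustively in a few lines; this illustrative sketch models the step activation with a plain `heaviside` function:

```python
from itertools import product

def heaviside(z):
    return 1 if z >= 0 else 0

# The output neuron computes H(L2-N0 - L2-N1 + L2-N2 - 2).
# Whenever the two always-on neurons fire (n0 = n2 = 1), the pre-activation
# reduces to -n1, so the output is exactly NOT(n1).
for n0, n1, n2 in product((0, 1), repeat=3):
    out = heaviside(n0 - n1 + n2 - 2)
    if n0 == 1 and n2 == 1:
        assert out == 1 - n1   # negated discriminator, i.e. parity
print("output wiring equals NOT(L2-N1) whenever both always-on neurons fire")
```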
  ## Limitations

+ - **Fixed input size**: 8 bits only (the parametric construction in Coq extends to any n)
+ - **Binary inputs**: expects {0, 1}, not continuous values
+ - **No noise margin**: Heaviside threshold at exactly 0
+ - **Not differentiable**: cannot be fine-tuned with gradient descent

+ ## Files

+ ```
+ tiny-parity-prover/
+ ├── model.safetensors      # Full network weights (833 params)
+ ├── model.py               # Full network inference code
+ ├── config.json            # Full network metadata
+ ├── results.md             # Comprehensive test results
+ ├── todo.md                # Future work notes
+ ├── README.md              # This file
+ └── pruned/
+     ├── model.safetensors  # Pruned network weights (139 params)
+     ├── model.py           # Pruned network inference code
+     ├── config.json        # Pruned network metadata
+     └── README.md          # Pruned variant description
+ ```

  ## Citation

  ```bibtex
  @software{tiny_parity_prover_2025,
    title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
+   author={Norton, Charles},
+   url={https://huggingface.co/phanerozoic/tiny-parity-prover},
+   year={2025},
+   note={Weights and inference code. Proofs at github.com/CharlesCNorton/threshold-logic-verified}
  }
  ```

+ ## Related

+ - **Proof repository**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) (Coq proofs, training code, full documentation)
+ - **Coq proof files**: V0_Core.v through V11_ParametricNetwork.v
+ - **Key proof**: V9_FullyConstructive.v (complete algebraic proof without vm_compute)

  ## License

  MIT