---
license: mit
tags:
- pytorch
- safetensors
- formal-verification
- coq
- parity
- threshold-network
- ternary-weights
- binary-classification
- neuromorphic
---
# tiny-parity-prover
Trained weights for a formally verified neural network that computes 8-bit parity. This repository contains the model artifacts; for proof development and Coq source code, see [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified).
## Overview
This is a ternary threshold network that computes `parity(x) = xβ‚€ βŠ• x₁ βŠ• ... βŠ• x₇` for 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.
**Key properties:**
- 100% accuracy on all 256 possible inputs
- Correctness proven in Coq via constructive algebraic proof
- Weights constrained to {-1, 0, +1} (ternary quantization)
- Heaviside step activation (x β‰₯ 0 β†’ 1, else 0)
- Suitable for neuromorphic hardware deployment
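For reference, the target function is just an XOR fold over the input bits; a minimal pure-Python sketch (illustrative only, not part of the repository):

```python
def parity(bits):
    """Reference parity: 1 if an odd number of bits are set, else 0."""
    acc = 0
    for b in bits:
        acc ^= b
    return acc

print(parity([1, 0, 1, 1, 0, 0, 1, 0]))  # 4 bits set -> 0
```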
## Model Variants
| Variant | Architecture | Parameters | Weights | Verification |
|---------|--------------|------------|---------|--------------|
| Full | 8 β†’ 32 β†’ 16 β†’ 1 | 833 | `model.safetensors` | Coq-proven |
| Pruned | 8 β†’ 11 β†’ 3 β†’ 1 | 139 | `pruned/model.safetensors` | Coq-proven |
The pruned variant retains only the 14 neurons (11 in L1, 3 in L2) identified as critical through systematic ablation analysis. Both variants compute identical functions and are independently verified.
## Quick Start
```python
import torch
from safetensors.torch import load_file
# Load weights
weights = load_file('model.safetensors') # or 'pruned/model.safetensors'
# Manual forward pass (Heaviside activation)
def forward(x, weights):
    x = x.float()
    x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
    x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
    x = (x @ weights['output.weight'].T + weights['output.bias'] >= 0).float()
    return x.squeeze(-1)
# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"Parity of [1,0,1,1,0,0,1,0]: {int(output.item())}") # 0 (4 bits set, even)
```
Or use the provided model classes:
```python
from model import ThresholdNetwork
model = ThresholdNetwork.from_safetensors('model.safetensors')
from pruned.model import PrunedThresholdNetwork
pruned = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')
```
## Weight Structure
### Full Network (833 parameters)
| Tensor | Shape | Values | Description |
|--------|-------|--------|-------------|
| `layer1.weight` | [32, 8] | {-1, 0, +1} | 256 ternary weights |
| `layer1.bias` | [32] | [-8, +8] | Bounded integers |
| `layer2.weight` | [16, 32] | {-1, 0, +1} | 512 ternary weights |
| `layer2.bias` | [16] | [-32, +32] | Bounded integers |
| `output.weight` | [1, 16] | {-1, 0, +1} | 16 ternary weights |
| `output.bias` | [1] | [-16, +16] | Bounded integer |
### Pruned Network (139 parameters)
| Tensor | Shape | Values |
|--------|-------|--------|
| `layer1.weight` | [11, 8] | {-1, 0, +1} |
| `layer1.bias` | [11] | Bounded integers |
| `layer2.weight` | [3, 11] | {-1, 0, +1} |
| `layer2.bias` | [3] | Bounded integers |
| `output.weight` | [1, 3] | {-1, 0, +1} |
| `output.bias` | [1] | Bounded integer |
## Training
The network was trained using evolutionary search (genetic algorithm), not gradient descent.
**Why not backpropagation?** Parity is a canonical hard function for neural networks. The loss landscape is essentially flat because parity depends on a global property (bit count mod 2) rather than local features. Small weight perturbations produce negligible gradient signal until the network is already near a solution.
**Evolutionary search parameters:**
- Population size: 500
- Generations to solution: 10,542
- Selection: Top 20% elite
- Mutation rates: 0.05, 0.15, 0.30 (mixed)
- Fitness: Classification accuracy on all 256 inputs
The search finds a working configuration by directly optimizing discrete accuracy rather than relying on gradient information.
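The loop described above can be sketched as follows. This is a hypothetical reconstruction from the listed hyperparameters; the `mutate` and `evolve` helpers are illustrative names, and the actual training code lives in the proof repository:

```python
import random

POP, ELITE_FRAC = 500, 0.20
MUTATION_RATES = [0.05, 0.15, 0.30]

def mutate(weights, rate):
    # Resample each ternary weight from {-1, 0, +1} with probability `rate`.
    return [random.choice([-1, 0, 1]) if random.random() < rate else w
            for w in weights]

def evolve(population, fitness):
    # Keep the top 20% elite, refill the rest with mutated copies of elites.
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[:int(POP * ELITE_FRAC)]
    children = [mutate(random.choice(elite), random.choice(MUTATION_RATES))
                for _ in range(POP - len(elite))]
    return elite + children
```

In the actual run, `fitness` would be classification accuracy over all 256 inputs, and the loop repeats until an individual reaches 256/256.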
## Formal Verification
Both network variants are proven correct in the Coq proof assistant. The verification evolved through 12 proof versions, culminating in a fully constructive algebraic proof.
**Core theorem (proven for all 256 inputs):**
```coq
Theorem network_correct : forall x0 x1 x2 x3 x4 x5 x6 x7,
  network x0 x1 x2 x3 x4 x5 x6 x7 = parity [x0;x1;x2;x3;x4;x5;x6;x7].
```
**Proof strategy (V9_FullyConstructive.v):**
The proof decomposes the critical Layer-2 neuron's pre-activation into three components:
- **A**: Contribution from h-function neurons (depends on a specific linear form h(x))
- **B**: Contribution from Hamming-weight-dependent neurons
- **C**: Contribution from g-function neurons (where g(x) ≑ HW(x) mod 2)
Key algebraic insight: For any Β±1 weight vector, the dot product has the same parity as the Hamming weight. This is proven constructively:
```coq
Theorem g_parity : forall x0 x1 x2 x3 x4 x5 x6 x7,
  Z.even (g_func x0 x1 x2 x3 x4 x5 x6 x7) =
  Nat.even (hamming_weight [x0;x1;x2;x3;x4;x5;x6;x7]).
```
The network implements parity by:
1. Two "always-on" neurons (L2-N0, L2-N2) with large biases
2. One parity discriminator (L2-N1) that fires iff Hamming weight is even
3. Output wiring that negates the discriminator
**Parametric generalization (V10-V11):**
The algebraic foundation extends to arbitrary input size n:
```coq
Theorem dot_parity_equals_hw_parity : forall ws xs,
  length ws = length xs ->
  Z.even (signed_dot ws xs) = Nat.even (hamming_weight xs).
```
A verified n-bit parity network construction is provided using n+3 neurons total.
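The same invariant is easy to spot-check numerically. A Python sketch mirroring the Coq statement (written for this README, not extracted from the proof):

```python
import random

def signed_dot(ws, xs):
    # Dot product of a +/-1 weight vector with a 0/1 input vector.
    return sum(w * x for w, x in zip(ws, xs))

random.seed(0)
for _ in range(10_000):
    n = random.randint(1, 16)
    ws = [random.choice([-1, 1]) for _ in range(n)]
    xs = [random.choice([0, 1]) for _ in range(n)]
    # Each set bit contributes +/-1 (an odd number), so the dot product
    # has the same parity as the Hamming weight.
    assert signed_dot(ws, xs) % 2 == sum(xs) % 2
print("10,000 random cases pass")
```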
Full proof development: [threshold-logic-verified/coq](https://github.com/CharlesCNorton/threshold-logic-verified/tree/main/coq)
## Test Results
### Functional Validation
| Test | Result |
|------|--------|
| Exhaustive (256 inputs) | 256/256 |
| Batch inference consistency | Pass |
| Determinism (1000 runs) | Pass |
| Throughput | 18,161 inf/sec (CPU) |
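The exhaustive check in the first row can be reproduced with a short helper. A sketch assuming the `forward` function and loaded weights from the Quick Start:

```python
import itertools
import torch

def validate_exhaustively(weights, forward):
    """Count how many of the 256 possible 8-bit inputs are classified correctly."""
    inputs = torch.tensor(list(itertools.product([0, 1], repeat=8)),
                          dtype=torch.float32)
    expected = inputs.sum(dim=1) % 2  # ground-truth parity
    return int((forward(inputs, weights) == expected).sum().item())
```

With the repository weights loaded via `load_file`, this should return 256 for both the full and pruned variants.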
### Analytical Properties
| Property | Finding |
|----------|---------|
| Weight sparsity | 32.9% zeros (258/784) |
| Critical L1 neurons | 11 of 32 |
| Critical L2 neurons | 3 of 16 |
| Noise sensitivity (Οƒ=0.05) | Accuracy drops to 51% |
| Minimum depth | 3 layers required |
**Neuron ablation results:**
| Neuron Removed | Accuracy Impact |
|----------------|-----------------|
| L2-N8 | 256 β†’ 128 (chance) |
| L1-N11 | 256 β†’ 162 |
| Any redundant neuron | 256 β†’ 256 (no change) |
**Random pruning sensitivity:**
| Weights Pruned | Accuracy |
|----------------|----------|
| 0% | 256/256 |
| 10% | 138-191/256 |
| 20%+ | ~128/256 (chance) |
### Noise Robustness
The network is extremely fragile to input noise due to the hard Heaviside threshold:
| Gaussian Noise Οƒ | Accuracy |
|------------------|----------|
| 0.00 | 100.0% |
| 0.05 | 51.2% |
| 0.10 | 51.2% |
| 0.20 | 48.8% |
This is inherent to threshold networks, not a defect.
### Bias Sensitivity
| Layer | Sensitive to Β±1 Perturbation |
|-------|------------------------------|
| Layer 1 | 9 of 32 biases |
| Layer 2 | 1 of 16 biases |
| Output | 1 of 1 bias |
## Practical Applications
The network has been validated on standard parity-based protocols:
| Application | Test Cases | Result |
|-------------|------------|--------|
| Memory ECC parity | 100 bytes | 100% |
| RAID-5 reconstruction | 50 stripes | 100% |
| UART frame validation | 100 frames | 100% |
| XOR checksum | 100 bytes | Pass |
| LFSR keystream | 32 bits | Deterministic |
| Barcode validation | 50 codes | 100% |
| Packet checksum | 20 packets | 100% |
| Hamming (7,4) syndrome | 20 codewords | 100% |
| PS/2 keyboard codes | 20 scancodes | 100% |
| DNA codon parity | 8 codons | 100% |
## Neuromorphic Deployment
The primary use case is neuromorphic hardware where threshold networks are the native compute primitive and XOR gates cannot be directly instantiated:
| Platform | Status |
|----------|--------|
| Intel Loihi | Compatible (untested) |
| IBM TrueNorth | Compatible (untested) |
| BrainChip Akida | Compatible (untested) |
**Resource comparison:**
| Implementation | Gates/Neurons |
|----------------|---------------|
| Threshold network (pruned) | 14 neurons |
| Threshold network (full) | 49 neurons |
| XOR tree (conventional) | 7 gates |
The 70-140x overhead vs XOR gates is acceptable only when XOR is unavailable.
## Pruned Network Details
The pruned network was extracted by identifying critical neurons through systematic ablation:
**Retained neurons:**
- Layer 1: indices [3, 7, 9, 11, 12, 16, 17, 20, 24, 27, 28]
- Layer 2: indices [8, 11, 14]
**Interpretation:**
The pruned network reveals the minimal structure for parity computation:
- L2-N0 (originally N8): Always fires (bias=30 dominates)
- L2-N1 (originally N11): Parity discriminator (fires iff HW even)
- L2-N2 (originally N14): Always fires (bias=20 dominates)
- Output: Computes `L2-N0 - L2-N1 + L2-N2 - 2 >= 0`, which equals `NOT(L2-N1)`
Since L2-N1 fires when Hamming weight is even, the output fires when Hamming weight is oddβ€”exactly parity.
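The wiring claim can be verified by enumerating the discriminator's two states; a minimal sketch using the neuron names above:

```python
# With the two always-on neurons fixed at 1, the output rule
# L2-N0 - L2-N1 + L2-N2 - 2 >= 0 reduces to -h1 >= 0, i.e. NOT h1.
for h1 in (0, 1):
    pre_activation = 1 - h1 + 1 - 2        # L2-N0 = 1, L2-N2 = 1
    out = 1 if pre_activation >= 0 else 0  # Heaviside threshold at 0
    assert out == 1 - h1                   # output = NOT(discriminator)
print("output wiring = NOT(L2-N1)")
```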
## Limitations
- **Fixed input size**: 8 bits only (parametric construction in Coq extends to any n)
- **Binary inputs**: Expects {0, 1}, not continuous values
- **No noise margin**: Heaviside threshold at exactly 0
- **Not differentiable**: Cannot be fine-tuned with gradient descent
## Files
```
tiny-parity-prover/
β”œβ”€β”€ model.safetensors # Full network weights (833 params)
β”œβ”€β”€ model.py # Full network inference code
β”œβ”€β”€ config.json # Full network metadata
β”œβ”€β”€ results.md # Comprehensive test results
β”œβ”€β”€ todo.md # Future work notes
β”œβ”€β”€ README.md # This file
└── pruned/
    ├── model.safetensors  # Pruned network weights (139 params)
    ├── model.py           # Pruned network inference code
    ├── config.json        # Pruned network metadata
    └── README.md          # Pruned variant description
```
## Citation
```bibtex
@software{tiny_parity_prover_2025,
  title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
  author={Norton, Charles},
  url={https://huggingface.co/phanerozoic/tiny-parity-prover},
  year={2025},
  note={Weights and inference code. Proofs at github.com/CharlesCNorton/threshold-logic-verified}
}
```
## Related
- **Proof repository**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) β€” Coq proofs, training code, full documentation
- **Coq proof files**: V0_Core.v through V11_ParametricNetwork.v
- **Key proof**: V9_FullyConstructive.v (complete algebraic proof without vm_compute)
## License
MIT