---
license: mit
tags:
- pytorch
- safetensors
- formal-verification
- coq
- parity
- threshold-network
- ternary-weights
- binary-classification
- neuromorphic
---

# tiny-parity-prover

Trained weights for a formally verified neural network that computes 8-bit parity. This repository contains the model artifacts; for the proof development and Coq source code, see [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified).

## Overview

This is a ternary threshold network that computes `parity(x) = x₀ ⊕ x₁ ⊕ ... ⊕ x₇` for 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.

**Key properties:**

- 100% accuracy on all 256 possible inputs
- Correctness proven in Coq via a constructive algebraic proof
- Weights constrained to {-1, 0, +1} (ternary quantization)
- Heaviside step activation (x ≥ 0 → 1, else 0)
- Suitable for neuromorphic hardware deployment

## Model Variants

| Variant | Architecture | Parameters | Weights | Verification |
|---------|--------------|------------|---------|--------------|
| Full | 8 → 32 → 16 → 1 | 833 | `model.safetensors` | Coq-proven |
| Pruned | 8 → 11 → 3 → 1 | 139 | `pruned/model.safetensors` | Coq-proven |

The pruned variant retains only the 14 neurons (11 in layer 1, 3 in layer 2) identified as critical through systematic ablation analysis. Both variants compute identical functions and are independently verified.
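The parity specification above can be pinned down with a tiny reference function in plain Python (no dependencies); this is a sketch of the mathematical spec, not code shipped in this repository:

```python
def parity(bits):
    """Reference specification: 1 if an odd number of bits are set, else 0."""
    return sum(bits) % 2

# Exhaustively enumerate all 256 possible 8-bit inputs (LSB first)
cases = [[(n >> i) & 1 for i in range(8)] for n in range(256)]

assert parity([1, 0, 1, 1, 0, 0, 1, 0]) == 0   # 4 bits set -> even
assert sum(parity(c) for c in cases) == 128    # exactly half of all inputs are odd
```

Any candidate network can be checked against this function over all 256 cases, which is exactly what the exhaustive tests below do.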
## Quick Start

```python
import torch
from safetensors.torch import load_file

# Load weights
weights = load_file('model.safetensors')  # or 'pruned/model.safetensors'

# Manual forward pass (Heaviside activation)
def forward(x, weights):
    x = x.float()
    x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
    x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
    x = (x @ weights['output.weight'].T + weights['output.bias'] >= 0).float()
    return x.squeeze(-1)

# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"Parity of [1,0,1,1,0,0,1,0]: {int(output.item())}")  # 0 (4 bits set, even)
```

Or use the provided model classes:

```python
from model import ThresholdNetwork
model = ThresholdNetwork.from_safetensors('model.safetensors')

from pruned.model import PrunedThresholdNetwork
pruned = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')
```

## Weight Structure

### Full Network (833 parameters)

| Tensor | Shape | Values | Description |
|--------|-------|--------|-------------|
| `layer1.weight` | [32, 8] | {-1, 0, +1} | 256 ternary weights |
| `layer1.bias` | [32] | [-8, +8] | Bounded integers |
| `layer2.weight` | [16, 32] | {-1, 0, +1} | 512 ternary weights |
| `layer2.bias` | [16] | [-32, +32] | Bounded integers |
| `output.weight` | [1, 16] | {-1, 0, +1} | 16 ternary weights |
| `output.bias` | [1] | [-16, +16] | Bounded integer |

### Pruned Network (139 parameters)

| Tensor | Shape | Values |
|--------|-------|--------|
| `layer1.weight` | [11, 8] | {-1, 0, +1} |
| `layer1.bias` | [11] | Bounded integers |
| `layer2.weight` | [3, 11] | {-1, 0, +1} |
| `layer2.bias` | [3] | Bounded integers |
| `output.weight` | [1, 3] | {-1, 0, +1} |
| `output.bias` | [1] | Bounded integer |

## Training

The network was trained using evolutionary search (a genetic algorithm), not gradient descent.
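The flavor of this approach can be illustrated with a toy mutation-plus-elitism loop that searches ternary weights for a 2-bit XOR threshold network by optimizing discrete accuracy directly. Everything here (population size, mutation rate, bias ranges) is illustrative, not the actual training configuration:

```python
import random

def step(z):
    # Heaviside activation: z >= 0 -> 1, else 0
    return 1 if z >= 0 else 0

def accuracy(g):
    # g = (w1, b1, w2, b2) for a 2 -> 2 -> 1 ternary threshold net
    w1, b1, w2, b2 = g
    correct = 0
    for x0 in (0, 1):
        for x1 in (0, 1):
            h = [step(w1[j][0] * x0 + w1[j][1] * x1 + b1[j]) for j in range(2)]
            y = step(w2[0] * h[0] + w2[1] * h[1] + b2)
            correct += int(y == (x0 ^ x1))
    return correct  # out of 4 input patterns

def random_genome():
    t = lambda: random.choice([-1, 0, 1])
    return ([[t(), t()] for _ in range(2)],
            [random.randint(-2, 2) for _ in range(2)],
            [t(), t()], random.randint(-2, 2))

def mutate(g, rate=0.3):
    w1, b1, w2, b2 = g
    mw = lambda v: random.choice([-1, 0, 1]) if random.random() < rate else v
    mb = lambda v: random.randint(-2, 2) if random.random() < rate else v
    return ([[mw(v) for v in row] for row in w1], [mb(v) for v in b1],
            [mw(v) for v in w2], mb(b2))

random.seed(0)
pop = [random_genome() for _ in range(50)]
for _ in range(2000):
    pop.sort(key=accuracy, reverse=True)
    if accuracy(pop[0]) == 4:
        break
    elite = pop[:10]  # keep the top 20%
    pop = elite + [mutate(random.choice(elite)) for _ in range(40)]

print(f"best accuracy: {accuracy(pop[0])}/4")
```

The fitness function evaluates exact classification accuracy over the whole (tiny) input space, just as the real search scored accuracy over all 256 8-bit inputs.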
**Why not backpropagation?** Parity is a canonical hard function for neural networks: the loss landscape is essentially flat, because parity depends on a global property (bit count mod 2) rather than on local features. Small weight perturbations produce negligible gradient signal until the network is already near a solution.

**Evolutionary search parameters:**

- Population size: 500
- Generations to solution: 10,542
- Selection: top 20% elite
- Mutation rates: 0.05, 0.15, 0.30 (mixed)
- Fitness: classification accuracy on all 256 inputs

The search finds a working configuration by directly optimizing discrete accuracy rather than relying on gradient information.

## Formal Verification

Both network variants are proven correct in the Coq proof assistant. The verification evolved through 12 proof versions, culminating in a fully constructive algebraic proof.

**Core theorem (proven for all 256 inputs):**

```coq
Theorem network_correct : forall x0 x1 x2 x3 x4 x5 x6 x7,
  network x0 x1 x2 x3 x4 x5 x6 x7 = parity [x0;x1;x2;x3;x4;x5;x6;x7].
```

**Proof strategy (V9_FullyConstructive.v):**

The proof decomposes the critical layer-2 neuron's pre-activation into three components:

- **A**: contribution from h-function neurons (depends on a specific linear form h(x))
- **B**: contribution from Hamming-weight-dependent neurons
- **C**: contribution from g-function neurons (where g(x) ≡ HW(x) mod 2)

Key algebraic insight: for any ±1 weight vector, the dot product with a binary input has the same parity as the input's Hamming weight. This is proven constructively:

```coq
Theorem g_parity : forall x0 x1 x2 x3 x4 x5 x6 x7,
  Z.even (g_func x0 x1 x2 x3 x4 x5 x6 x7) =
  Nat.even (hamming_weight [x0;x1;x2;x3;x4;x5;x6;x7]).
```

The network implements parity by:

1. Two "always-on" neurons (L2-N0, L2-N2) with large biases
2. One parity discriminator (L2-N1) that fires iff the Hamming weight is even
3. Output wiring that negates the discriminator

**Parametric generalization (V10-V11):**

The algebraic foundation extends to arbitrary input size n:

```coq
Theorem dot_parity_equals_hw_parity : forall ws xs,
  length ws = length xs ->
  Z.even (signed_dot ws xs) = Nat.even (hamming_weight xs).
```

A verified n-bit parity network construction is provided, using n + 3 neurons in total.

Full proof development: [threshold-logic-verified/coq](https://github.com/CharlesCNorton/threshold-logic-verified/tree/main/coq)

## Test Results

### Functional Validation

| Test | Result |
|------|--------|
| Exhaustive (256 inputs) | 256/256 |
| Batch inference consistency | Pass |
| Determinism (1000 runs) | Pass |
| Throughput | 18,161 inf/sec (CPU) |

### Analytical Properties

| Property | Finding |
|----------|---------|
| Weight sparsity | 32.9% zeros (258/784) |
| Critical L1 neurons | 11 of 32 |
| Critical L2 neurons | 3 of 16 |
| Noise sensitivity (σ = 0.05) | Accuracy drops to 51% |
| Minimum depth | 3 layers required |

**Neuron ablation results:**

| Neuron Removed | Accuracy Impact |
|----------------|-----------------|
| L2-N8 | 256 → 128 (chance) |
| L1-N11 | 256 → 162 |
| Any redundant neuron | 256 → 256 (no change) |

**Random pruning sensitivity:**

| Weights Pruned | Accuracy |
|----------------|----------|
| 0% | 256/256 |
| 10% | 138-191/256 |
| 20%+ | ~128/256 (chance) |

### Noise Robustness

The network is extremely fragile to input noise because of the hard Heaviside threshold:

| Gaussian noise σ | Accuracy |
|------------------|----------|
| 0.00 | 100.0% |
| 0.05 | 51.2% |
| 0.10 | 51.2% |
| 0.20 | 48.8% |

This fragility is inherent to threshold networks, not a defect of this model.
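The fragility is easy to see in isolation: a unit whose pre-activation sits exactly at the Heaviside boundary (z = 0) flips under arbitrarily small negative perturbations. A minimal sketch, independent of the actual checkpoint (the weight row here is a hand-picked example, not taken from the model):

```python
import torch

def heaviside(z):
    # Step activation: z >= 0 -> 1, else 0
    return (z >= 0).float()

# A unit whose pre-activation lands exactly on the threshold: step(1 - 1) = step(0) = 1
w = torch.tensor([1.0, -1.0])  # hypothetical ternary weight row
x = torch.tensor([1.0, 1.0])
clean = heaviside(w @ x)       # pre-activation is exactly 0, so the unit fires

torch.manual_seed(0)
trials, flips = 1000, 0
for _ in range(trials):
    noisy = x + 0.05 * torch.randn(2)  # Gaussian input noise, sigma = 0.05
    flips += int(heaviside(w @ noisy) != clean)
print(f"{flips}/{trials} outputs flipped at sigma=0.05")  # roughly half flip
```

Because the noisy pre-activation `(1 + n0) - (1 + n1) = n0 - n1` is symmetric around zero, about half of all noise draws flip the unit, which is consistent with the ~51% (chance-level) accuracy in the table above.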
### Bias Sensitivity

| Layer | Sensitive to ±1 Perturbation |
|-------|------------------------------|
| Layer 1 | 9 of 32 biases |
| Layer 2 | 1 of 16 biases |
| Output | 1 of 1 bias |

## Practical Applications

The network has been validated on standard parity-based protocols:

| Application | Test Cases | Result |
|-------------|------------|--------|
| Memory ECC parity | 100 bytes | 100% |
| RAID-5 reconstruction | 50 stripes | 100% |
| UART frame validation | 100 frames | 100% |
| XOR checksum | 100 bytes | Pass |
| LFSR keystream | 32 bits | Deterministic |
| Barcode validation | 50 codes | 100% |
| Packet checksum | 20 packets | 100% |
| Hamming (7,4) syndrome | 20 codewords | 100% |
| PS/2 keyboard codes | 20 scancodes | 100% |
| DNA codon parity | 8 codons | 100% |

## Neuromorphic Deployment

The primary use case is neuromorphic hardware, where threshold networks are the native compute primitive and XOR gates cannot be directly instantiated:

| Platform | Status |
|----------|--------|
| Intel Loihi | Compatible (untested) |
| IBM TrueNorth | Compatible (untested) |
| BrainChip Akida | Compatible (untested) |

**Resource comparison:**

| Implementation | Gates/Neurons |
|----------------|---------------|
| Threshold network (pruned) | 14 neurons |
| Threshold network (full) | 49 neurons |
| XOR tree (conventional) | 7 gates |

The 2-7x unit-count overhead versus a conventional XOR tree is acceptable only when XOR gates are unavailable.
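As a concrete instance of the protocol tests above, UART-style even-parity framing reduces to exactly the 8-bit parity function this network computes. A plain-Python sketch (the byte value and helper names are illustrative, not from the test suite):

```python
def parity(bits):
    # 1 if an odd number of bits are set, else 0
    return sum(bits) % 2

def uart_frame(byte):
    # 8 data bits, LSB first, plus an even-parity bit
    # (the parity bit makes the total number of set bits even)
    bits = [(byte >> i) & 1 for i in range(8)]
    return bits + [parity(bits)]

def frame_valid(frame):
    # A received frame is valid when data bits + parity bit sum to even
    return sum(frame) % 2 == 0

frame = uart_frame(0b10110010)
assert frame_valid(frame)
frame[3] ^= 1                  # single-bit transmission error
assert not frame_valid(frame)  # parity check catches it
```

The network replaces the `parity` call: feed it the 8 data bits (or all 9 received bits) and it plays the role of the parity generator or checker.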
## Pruned Network Details

The pruned network was extracted by identifying critical neurons through systematic ablation.

**Retained neurons:**

- Layer 1: indices [3, 7, 9, 11, 12, 16, 17, 20, 24, 27, 28]
- Layer 2: indices [8, 11, 14]

**Interpretation:**

The pruned network reveals the minimal structure for parity computation:

- L2-N0 (originally N8): always fires (bias = 30 dominates)
- L2-N1 (originally N11): parity discriminator (fires iff HW even)
- L2-N2 (originally N14): always fires (bias = 20 dominates)
- Output: computes `L2-N0 - L2-N1 + L2-N2 - 2 >= 0`, which equals `NOT(L2-N1)`

Since L2-N1 fires when the Hamming weight is even, the output fires when the Hamming weight is odd, which is exactly parity.

## Limitations

- **Fixed input size**: 8 bits only (the parametric Coq construction extends to any n)
- **Binary inputs**: expects {0, 1}, not continuous values
- **No noise margin**: Heaviside threshold at exactly 0
- **Not differentiable**: cannot be fine-tuned with gradient descent

## Files

```
tiny-parity-prover/
├── model.safetensors        # Full network weights (833 params)
├── model.py                 # Full network inference code
├── config.json              # Full network metadata
├── results.md               # Comprehensive test results
├── todo.md                  # Future work notes
├── README.md                # This file
└── pruned/
    ├── model.safetensors    # Pruned network weights (139 params)
    ├── model.py             # Pruned network inference code
    ├── config.json          # Pruned network metadata
    └── README.md            # Pruned variant description
```

## Citation

```bibtex
@software{tiny_parity_prover_2025,
  title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
  author={Norton, Charles},
  url={https://huggingface.co/phanerozoic/tiny-parity-prover},
  year={2025},
  note={Weights and inference code. Proofs at github.com/CharlesCNorton/threshold-logic-verified}
}
```

## Related

- **Proof repository**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) — Coq proofs, training code, full documentation
- **Coq proof files**: V0_Core.v through V11_ParametricNetwork.v
- **Key proof**: V9_FullyConstructive.v (complete algebraic proof without vm_compute)

## License

MIT