tiny-parity-prover

Trained weights for a formally verified neural network that computes 8-bit parity. This repository contains the model artifacts; for proof development and Coq source code, see threshold-logic-verified.

Overview

This is a ternary threshold network that computes parity(x) = xβ‚€ βŠ• x₁ βŠ• ... βŠ• x₇ for 8-bit binary inputs. The network outputs 1 if the number of set bits is odd, 0 if even.

Key properties:

  • 100% accuracy on all 256 possible inputs
  • Correctness proven in Coq via constructive algebraic proof
  • Weights constrained to {-1, 0, +1} (ternary quantization)
  • Heaviside step activation (x β‰₯ 0 β†’ 1, else 0)
  • Suitable for neuromorphic hardware deployment

Model Variants

| Variant | Architecture | Parameters | Weights | Verification |
|---------|--------------|------------|---------|--------------|
| Full | 8 → 32 → 16 → 1 | 833 | model.safetensors | Coq-proven |
| Pruned | 8 → 11 → 3 → 1 | 139 | pruned/model.safetensors | Coq-proven |

The pruned variant retains only the 14 neurons (11 in L1, 3 in L2) identified as critical through systematic ablation analysis. Both variants compute identical functions and are independently verified.

Quick Start

import torch
from safetensors.torch import load_file

# Load weights
weights = load_file('model.safetensors')  # or 'pruned/model.safetensors'

# Manual forward pass (Heaviside activation)
def forward(x, weights):
    x = x.float()
    x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
    x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
    x = (x @ weights['output.weight'].T + weights['output.bias'] >= 0).float()
    return x.squeeze(-1)

# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"Parity of [1,0,1,1,0,0,1,0]: {int(output.item())}")  # 0 (4 bits set, even)
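The forward pass above can be sanity-checked against a pure-Python reference over all 256 inputs. This is an illustrative sketch (`ref_parity`, `all_inputs`, and `exhaustive_check` are helper names introduced here, not part of the repository); to check the network itself, wrap `forward` and `weights` from the Quick Start into the `predict` callable.

```python
def ref_parity(bits):
    # XOR-reduce a list of 0/1 bits: 1 if an odd number are set, else 0
    p = 0
    for b in bits:
        p ^= b
    return p

def all_inputs(n=8):
    # Enumerate all 2**n binary vectors, least-significant bit first
    return [[(i >> k) & 1 for k in range(n)] for i in range(1 << n)]

def exhaustive_check(predict, n=8):
    # Count how many of the 2**n inputs `predict` classifies correctly
    return sum(int(predict(x)) == ref_parity(x) for x in all_inputs(n))

# With predict = lambda x: int(forward(torch.tensor([x], dtype=torch.float32), weights).item()),
# a correct network scores 256 out of 256.
```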

Or use the provided model classes:

from model import ThresholdNetwork
model = ThresholdNetwork.from_safetensors('model.safetensors')

from pruned.model import PrunedThresholdNetwork
pruned = PrunedThresholdNetwork.from_safetensors('pruned/model.safetensors')

Weight Structure

Full Network (833 parameters)

| Tensor | Shape | Values | Description |
|--------|-------|--------|-------------|
| layer1.weight | [32, 8] | {-1, 0, +1} | 256 ternary weights |
| layer1.bias | [32] | [-8, +8] | Bounded integers |
| layer2.weight | [16, 32] | {-1, 0, +1} | 512 ternary weights |
| layer2.bias | [16] | [-32, +32] | Bounded integers |
| output.weight | [1, 16] | {-1, 0, +1} | 16 ternary weights |
| output.bias | [1] | [-16, +16] | Bounded integer |

Pruned Network (139 parameters)

| Tensor | Shape | Values |
|--------|-------|--------|
| layer1.weight | [11, 8] | {-1, 0, +1} |
| layer1.bias | [11] | Bounded integers |
| layer2.weight | [3, 11] | {-1, 0, +1} |
| layer2.bias | [3] | Bounded integers |
| output.weight | [1, 3] | {-1, 0, +1} |
| output.bias | [1] | Bounded integer |
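The constraints in these tables are easy to check mechanically. The sketch below validates the ternary constraint on plain nested lists (helper names are illustrative; to check the real files, first load them with `safetensors` and convert the tensors to lists):

```python
def is_ternary(matrix):
    # True iff every entry lies in {-1, 0, +1}
    return all(v in (-1, 0, 1) for row in matrix for v in row)

def check_weights(weights):
    # Validate the ternary constraint on every weight matrix
    for key in ("layer1.weight", "layer2.weight", "output.weight"):
        if not is_ternary(weights[key]):
            raise ValueError(f"{key} violates the ternary constraint")
    return True

# Toy state dict with the same key layout as the tables above
toy = {
    "layer1.weight": [[1, -1, 0], [0, 0, 1]],
    "layer2.weight": [[1, 1]],
    "output.weight": [[-1]],
}
print(check_weights(toy))  # True
```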

Training

The network was trained using evolutionary search (genetic algorithm), not gradient descent.

Why not backpropagation? The Heaviside activation has zero gradient almost everywhere, so backpropagation provides no usable learning signal. Parity is also a canonical hard function for neural networks: it depends on a global property (bit count mod 2) rather than local features, so the loss landscape is essentially flat and small weight perturbations leave accuracy unchanged until the network is already near a solution.

Evolutionary search parameters:

  • Population size: 500
  • Generations to solution: 10,542
  • Selection: Top 20% elite
  • Mutation rates: 0.05, 0.15, 0.30 (mixed)
  • Fitness: Classification accuracy on all 256 inputs

The search finds a working configuration by directly optimizing discrete accuracy rather than relying on gradient information.
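The loop described above can be sketched as follows. This is a minimal illustration of the elite-selection scheme on a toy fitness function, not the actual training code; `mutate` and `next_generation` are hypothetical names.

```python
import random

def mutate(genome, rate, rng):
    # Resample each ternary weight to a random value in {-1, 0, +1} with probability `rate`
    return [rng.choice((-1, 0, 1)) if rng.random() < rate else w for w in genome]

def next_generation(population, fitness, rng, elite_frac=0.2, rates=(0.05, 0.15, 0.30)):
    # Keep the top `elite_frac` by fitness; refill with mutated elites at mixed rates
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[: max(1, int(len(ranked) * elite_frac))]
    children = [mutate(rng.choice(elite), rng.choice(rates), rng)
                for _ in range(len(population) - len(elite))]
    return elite + children

# Toy demo: evolve 16-gene ternary genomes toward all-ones
# (a stand-in for the discrete accuracy-on-256-inputs fitness)
rng = random.Random(0)
fitness = lambda g: sum(v == 1 for v in g)
pop = [[rng.choice((-1, 0, 1)) for _ in range(16)] for _ in range(50)]
for _ in range(100):
    pop = next_generation(pop, fitness, rng)
best = max(fitness(g) for g in pop)
```

Because the elite is carried over unchanged, the best fitness is monotone non-decreasing across generations, which is what lets the real search grind through 10,542 generations without regressing.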

Formal Verification

Both network variants are proven correct in the Coq proof assistant. The verification evolved through 12 proof versions, culminating in a fully constructive algebraic proof.

Core theorem (proven for all 256 inputs):

Theorem network_correct : forall x0 x1 x2 x3 x4 x5 x6 x7,
  network x0 x1 x2 x3 x4 x5 x6 x7 = parity [x0;x1;x2;x3;x4;x5;x6;x7].

Proof strategy (V9_FullyConstructive.v):

The proof decomposes the critical Layer-2 neuron's pre-activation into three components:

  • A: Contribution from h-function neurons (depends on a specific linear form h(x))
  • B: Contribution from Hamming-weight-dependent neurons
  • C: Contribution from g-function neurons (where g(x) ≑ HW(x) mod 2)

Key algebraic insight: For any Β±1 weight vector, the dot product has the same parity as the Hamming weight. This is proven constructively:

Theorem g_parity : forall x0 x1 x2 x3 x4 x5 x6 x7,
  Z.even (g_func x0 x1 x2 x3 x4 x5 x6 x7) =
  Nat.even (hamming_weight [x0;x1;x2;x3;x4;x5;x6;x7]).
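The Coq statement can be spot-checked numerically: since w·x ≡ x (mod 2) whenever w = ±1, the dot product and the Hamming weight always share a parity. This brute-force sketch (helper names are illustrative) verifies the claim for every ±1 weight vector and every binary input at n = 4:

```python
from itertools import product

def signed_dot(ws, xs):
    # Dot product of a ±1 weight vector with a 0/1 input vector
    return sum(w * x for w, x in zip(ws, xs))

def hamming_weight(xs):
    # Number of set bits
    return sum(xs)

# Z.even (signed_dot ws xs) = Nat.even (hamming_weight xs), exhaustively at n = 4
ok = all(
    signed_dot(ws, xs) % 2 == hamming_weight(xs) % 2
    for ws in product((-1, 1), repeat=4)
    for xs in product((0, 1), repeat=4)
)
print(ok)  # True
```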

The network implements parity by:

  1. Two "always-on" neurons (L2-N0, L2-N2) with large biases
  2. One parity discriminator (L2-N1) that fires iff Hamming weight is even
  3. Output wiring that negates the discriminator

Parametric generalization (V10-V11):

The algebraic foundation extends to arbitrary input size n:

Theorem dot_parity_equals_hw_parity : forall ws xs,
  length ws = length xs ->
  Z.even (signed_dot ws xs) = Nat.even (hamming_weight xs).

A verified n-bit parity network construction is provided using n+3 neurons total.
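For intuition, a standard threshold-logic parity construction in the same spirit can be sketched in a few lines. This uses n hidden units `h_k = [HW(x) >= k]` with alternating ±1 output weights; it is an illustrative textbook-style construction, not the repository's verified n+3-neuron network.

```python
def heaviside(z):
    # Hard threshold: 1 if z >= 0, else 0
    return 1 if z >= 0 else 0

def threshold_parity(bits):
    n = len(bits)
    hw = sum(bits)
    # Hidden unit k fires iff the Hamming weight is at least k (k = 1..n)
    hidden = [heaviside(hw - k) for k in range(1, n + 1)]
    # Alternating weights +1, -1, +1, ... make the sum telescope to HW mod 2
    pre = sum(((-1) ** k) * h for k, h in enumerate(hidden))
    return heaviside(pre - 1)
```

With Hamming weight m, exactly the first m hidden units fire, so the alternating sum is 1 when m is odd and 0 when m is even; thresholding at 1 yields parity.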

Full proof development: threshold-logic-verified/coq

Test Results

Functional Validation

| Test | Result |
|------|--------|
| Exhaustive (256 inputs) | 256/256 |
| Batch inference consistency | Pass |
| Determinism (1000 runs) | Pass |
| Throughput | 18,161 inf/sec (CPU) |

Analytical Properties

| Property | Finding |
|----------|---------|
| Weight sparsity | 32.9% zeros (258/784) |
| Critical L1 neurons | 11 of 32 |
| Critical L2 neurons | 3 of 16 |
| Noise sensitivity (σ=0.05) | Accuracy drops to 51% |
| Minimum depth | 3 layers required |

Neuron ablation results:

| Neuron removed | Accuracy impact |
|----------------|-----------------|
| L2-N8 | 256 → 128 (chance) |
| L1-N11 | 256 → 162 |
| Any redundant neuron | 256 → 256 (no change) |

Random pruning sensitivity:

| Weights pruned | Accuracy |
|----------------|----------|
| 0% | 256/256 |
| 10% | 138-191/256 |
| 20%+ | ~128/256 (chance) |

Noise Robustness

The network is extremely fragile to input noise due to the hard Heaviside threshold:

| Gaussian noise σ | Accuracy |
|------------------|----------|
| 0.00 | 100.0% |
| 0.05 | 51.2% |
| 0.10 | 51.2% |
| 0.20 | 48.8% |

This is inherent to threshold networks, not a defect.

Bias Sensitivity

| Layer | Biases sensitive to ±1 perturbation |
|-------|-------------------------------------|
| Layer 1 | 9 of 32 |
| Layer 2 | 1 of 16 |
| Output | 1 of 1 |

Practical Applications

The network has been validated on standard parity-based protocols:

| Application | Test cases | Result |
|-------------|------------|--------|
| Memory ECC parity | 100 bytes | 100% |
| RAID-5 reconstruction | 50 stripes | 100% |
| UART frame validation | 100 frames | 100% |
| XOR checksum | 100 bytes | Pass |
| LFSR keystream | 32 bits | Deterministic |
| Barcode validation | 50 codes | 100% |
| Packet checksum | 20 packets | 100% |
| Hamming (7,4) syndrome | 20 codewords | 100% |
| PS/2 keyboard codes | 20 scancodes | 100% |
| DNA codon parity | 8 codons | 100% |

Neuromorphic Deployment

The primary use case is neuromorphic hardware where threshold networks are the native compute primitive and XOR gates cannot be directly instantiated:

| Platform | Status |
|----------|--------|
| Intel Loihi | Compatible (untested) |
| IBM TrueNorth | Compatible (untested) |
| BrainChip Akida | Compatible (untested) |

Resource comparison:

| Implementation | Gates/Neurons |
|----------------|---------------|
| Threshold network (pruned) | 14 neurons |
| Threshold network (full) | 49 neurons |
| XOR tree (conventional) | 7 gates |

The 2-7x neuron-for-gate overhead versus an XOR tree (14 or 49 neurons against 7 gates) is acceptable only when XOR is unavailable as a native primitive.
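For reference, the 7-gate XOR tree the table compares against reduces to three shift-and-XOR folds on a byte. This is a standard bit trick, sketched here for comparison; it is not part of this repository.

```python
def xor_tree_parity(byte):
    # Fold the 8 bits down with XOR: the 4 + 2 + 1 folds
    # correspond to 7 two-input XOR gates in hardware
    byte ^= byte >> 4
    byte ^= byte >> 2
    byte ^= byte >> 1
    return byte & 1

print(xor_tree_parity(0b01001101))  # 0 (4 bits set, even)
```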

Pruned Network Details

The pruned network was extracted by identifying critical neurons through systematic ablation:

Retained neurons:

  • Layer 1: indices [3, 7, 9, 11, 12, 16, 17, 20, 24, 27, 28]
  • Layer 2: indices [8, 11, 14]

Interpretation:

The pruned network reveals the minimal structure for parity computation:

  • L2-N0 (originally N8): Always fires (bias=30 dominates)
  • L2-N1 (originally N11): Parity discriminator (fires iff HW even)
  • L2-N2 (originally N14): Always fires (bias=20 dominates)
  • Output: Computes L2-N0 - L2-N1 + L2-N2 - 2 >= 0, which equals NOT(L2-N1)

Since L2-N1 fires when Hamming weight is even, the output fires when Hamming weight is oddβ€”exactly parity.
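The wiring above can be traced in a few lines. This sketch models the three L2 activations as booleans to show the output arithmetic; it is not the repository's inference code.

```python
def pruned_output(hw_even):
    # L2-N0 and L2-N2 always fire; L2-N1 fires iff the Hamming weight is even
    n0, n1, n2 = 1, int(hw_even), 1
    # Pre-activation n0 - n1 + n2 - 2 is 0 when HW is odd, -1 when even,
    # so the Heaviside output equals NOT(n1), i.e. parity
    return int(n0 - n1 + n2 - 2 >= 0)

print(pruned_output(hw_even=True))   # 0: even Hamming weight, parity 0
print(pruned_output(hw_even=False))  # 1: odd Hamming weight, parity 1
```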

Limitations

  • Fixed input size: 8 bits only (parametric construction in Coq extends to any n)
  • Binary inputs: Expects {0, 1}, not continuous values
  • No noise margin: Heaviside threshold at exactly 0
  • Not differentiable: Cannot be fine-tuned with gradient descent

Files

tiny-parity-prover/
β”œβ”€β”€ model.safetensors      # Full network weights (833 params)
β”œβ”€β”€ model.py               # Full network inference code
β”œβ”€β”€ config.json            # Full network metadata
β”œβ”€β”€ results.md             # Comprehensive test results
β”œβ”€β”€ todo.md                # Future work notes
β”œβ”€β”€ README.md              # This file
└── pruned/
    β”œβ”€β”€ model.safetensors  # Pruned network weights (139 params)
    β”œβ”€β”€ model.py           # Pruned network inference code
    β”œβ”€β”€ config.json        # Pruned network metadata
    └── README.md          # Pruned variant description

Citation

@software{tiny_parity_prover_2025,
  title={tiny-parity-prover: Formally Verified Neural Network for 8-bit Parity},
  author={Norton, Charles},
  url={https://huggingface.co/phanerozoic/tiny-parity-prover},
  year={2025},
  note={Weights and inference code. Proofs at github.com/CharlesCNorton/threshold-logic-verified}
}

Related

  • Proof repository: threshold-logic-verified β€” Coq proofs, training code, full documentation
  • Coq proof files: V0_Core.v through V11_ParametricNetwork.v
  • Key proof: V9_FullyConstructive.v (complete algebraic proof without vm_compute)

License

MIT
