tiny-mod3-prover

Formally verified neural network that computes the MOD-3 function (Hamming weight mod 3) on 8-bit inputs. This repository contains the model artifacts; for proof development and Coq source code, see mod3-verified.

Overview

This is a threshold network that computes mod3(x) = HW(x) mod 3 for 8-bit binary inputs, where HW denotes Hamming weight (number of set bits). The network outputs 0, 1, or 2 corresponding to the three residue classes.

Key properties:

100% accuracy on all 256 possible inputs
Correctness proven in Coq via constructive algebraic proof
Weights constrained to integers (many ternary)
Heaviside step activation (x ≥ 0 → 1, else 0)
First formally verified threshold circuit for MOD-m where m > 2

Architecture

Layer	Neurons	Function
Input	8	Binary input bits
Hidden 1	9	Thermometer encoding (HW ≥ k)
Hidden 2	2	MOD-3 detection
Output	3	Classification (one-hot)

Total: 14 neurons, 110 parameters

Quick Start

import torch
from safetensors.torch import load_file

# Load weights
weights = load_file('model.safetensors')

# Manual forward pass (Heaviside activation)
def forward(x, weights):
    x = x.float()
    x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
    x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
    out = x @ weights['output.weight'].T + weights['output.bias']
    return out.argmax(dim=-1)

# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"MOD-3 of [1,0,1,1,0,0,1,0]: {output.item()}")  # 1 (4 bits set, 4 mod 3 = 1)

Weight Structure

Tensor	Shape	Values	Description
`layer1.weight`	[9, 8]	All 1s	Thermometer encoding
`layer1.bias`	[9]	[0, -1, ..., -8]	Threshold at HW ≥ k
`layer2.weight`	[2, 9]	[0,1,1,-2,1,1,-2,1,1]	MOD-3 detection
`layer2.bias`	[2]	[-1, -2]	Class thresholds
`output.weight`	[3, 2]	Various	Classification
`output.bias`	[3]	[0, -1, -1]	Output thresholds

Algebraic Insight

For parity (MOD-2), the key insight was that ±1 dot products preserve Hamming weight parity because (-1) ≡ 1 (mod 2).

For MOD-3, the insight is different. Using weights (1, 1, -2) repeated on the thermometer encoding produces partial sums that cycle through (0, 1, 2, 0, 1, 2, ...):

HW=0: cumsum = 0  →  0 mod 3
HW=1: cumsum = 1  →  1 mod 3
HW=2: cumsum = 2  →  2 mod 3
HW=3: cumsum = 0  →  0 mod 3  (reset: 1+1-2=0)
HW=4: cumsum = 1  →  1 mod 3
HW=5: cumsum = 2  →  2 mod 3
HW=6: cumsum = 0  →  0 mod 3  (reset)
HW=7: cumsum = 1  →  1 mod 3
HW=8: cumsum = 2  →  2 mod 3

The pattern 1 + 1 + (-2) = 0 causes the cumulative sum to reset every 3 steps, tracking HW mod 3.

This generalizes to MOD-m: use weights (1, 1, ..., 1, 1-m) with m-1 ones before the 1-m term.

Formal Verification

The network is proven correct in the Coq proof assistant with three independent proofs:

1. Exhaustive verification:

Theorem network_correct_exhaustive : verify_all = true.
Proof. vm_compute. reflexivity. Qed.

2. Constructive verification (case analysis):

Theorem network_correct_constructive : forall x0 x1 x2 x3 x4 x5 x6 x7,
  predict [x0; x1; x2; x3; x4; x5; x6; x7] =
  mod3 [x0; x1; x2; x3; x4; x5; x6; x7].

3. Algebraic verification:

Theorem cumsum_eq_mod3 : forall k,
  (k <= 8)%nat -> cumsum k = Z.of_nat (Nat.modulo k 3).

Theorem network_algebraic_correct : forall h,
  (h <= 8)%nat ->
  classify (Z.geb (cumsum h - 1) 0) (Z.geb (cumsum h - 2) 0) = Nat.modulo h 3.

All proofs are axiom-free ("Closed under the global context").

MOD-3 Distribution

For 8-bit inputs (256 total):

Class	Count	Hamming Weights
0	85	0, 3, 6
1	86	1, 4, 7
2	85	2, 5, 8

Training

The parametric construction was derived algebraically, not discovered through training.

Evolutionary search was attempted (as with parity) but consistently plateaued at 247/256 (96.5%) accuracy across multiple seeds. We were unable to discover the (1,1,-2) weight pattern through training, so we used algebraic construction instead.

Comparison to Parity

Property	Parity (MOD-2)	MOD-3
Output classes	2	3
Key weights	±1 (any)	(1,1,-2) specific
Training	Evolutionary (10K gen)	Algebraic construction
Neurons	14 (pruned)	14
Parameters	139 (pruned)	110
Algebraic insight	(-1) ≡ 1 (mod 2)	1+1-2 = 0 (reset)

Limitations

Fixed input size: 8 bits only (algebraic construction extends to any n)
Binary inputs: Expects {0, 1}, not continuous values
No noise margin: Heaviside threshold at exactly 0
Not differentiable: Cannot be fine-tuned with gradient descent
Training gap: Our evolutionary search achieved only 96.5%; we used algebraic construction instead

Files

mod3-verified/
├── model.safetensors    # Network weights (110 params)
├── model.py             # Inference code
├── config.json          # Model metadata
└── README.md            # This file

Citation

@software{tiny_mod3_prover_2025,
  title={tiny-mod3-prover: Formally Verified Threshold Network for MOD-3},
  author={Norton, Charles},
  url={https://huggingface.co/phanerozoic/tiny-mod3-prover},
  year={2025},
  note={First verified threshold circuit for modular counting beyond parity}
}

Proof repository: mod3-verified — Coq proofs, training attempts, full documentation
Parity network: tiny-parity-prover — Verified MOD-2 (parity) network
Parity proofs: threshold-logic-verified — Original parity verification project

License

MIT

Downloads last month: 12

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including phanerozoic/tiny-mod3-verified

Tiny Verified Logic Circuits

Collection

Formally verified threshold logic circuits. Compatible with neuromorphic hardware. • 33 items • Updated about 1 hour ago

phanerozoic
/

tiny-mod3-verified