tiny-mod3-verified / README.md

Update training section wording

0ff8da5 verified 2 days ago

6.52 kB

	---
	license: mit
	tags:
	- pytorch
	- safetensors
	- formal-verification
	- coq
	- mod3
	- modular-arithmetic
	- threshold-network
	- neuromorphic
	---

	# tiny-mod3-prover

	Formally verified neural network that computes the MOD-3 function (Hamming weight mod 3) on 8-bit inputs. This repository contains the model artifacts; for proof development and Coq source code, see [mod3-verified](https://github.com/CharlesCNorton/mod3-verified).

	## Overview

	This is a threshold network that computes `mod3(x) = HW(x) mod 3` for 8-bit binary inputs, where HW denotes Hamming weight (number of set bits). The network outputs 0, 1, or 2 corresponding to the three residue classes.

	Key properties:
	- 100% accuracy on all 256 possible inputs
	- Correctness proven in Coq via constructive algebraic proof
	- Weights constrained to integers (many ternary)
	- Heaviside step activation (x ≥ 0 → 1, else 0)
	- First formally verified threshold circuit for MOD-m where m > 2

	## Architecture

	\| Layer \| Neurons \| Function \|
	\|-------\|---------\|----------\|
	\| Input \| 8 \| Binary input bits \|
	\| Hidden 1 \| 9 \| Thermometer encoding (HW ≥ k) \|
	\| Hidden 2 \| 2 \| MOD-3 detection \|
	\| Output \| 3 \| Classification (one-hot) \|

	Total: 14 neurons, 110 parameters

	## Quick Start

	```python
	import torch
	from safetensors.torch import load_file

	# Load weights
	weights = load_file('model.safetensors')

	# Manual forward pass (Heaviside activation)
	def forward(x, weights):
	x = x.float()
	x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
	x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
	out = x @ weights['output.weight'].T + weights['output.bias']
	return out.argmax(dim=-1)

	# Test
	inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
	output = forward(inputs, weights)
	print(f"MOD-3 of [1,0,1,1,0,0,1,0]: {output.item()}") # 1 (4 bits set, 4 mod 3 = 1)
	```

	## Weight Structure

	\| Tensor \| Shape \| Values \| Description \|
	\|--------\|-------\|--------\|-------------\|
	\| `layer1.weight` \| [9, 8] \| All 1s \| Thermometer encoding \|
	\| `layer1.bias` \| [9] \| [0, -1, ..., -8] \| Threshold at HW ≥ k \|
	\| `layer2.weight` \| [2, 9] \| [0,1,1,-2,1,1,-2,1,1] \| MOD-3 detection \|
	\| `layer2.bias` \| [2] \| [-1, -2] \| Class thresholds \|
	\| `output.weight` \| [3, 2] \| Various \| Classification \|
	\| `output.bias` \| [3] \| [0, -1, -1] \| Output thresholds \|

	## Algebraic Insight

	For parity (MOD-2), the key insight was that ±1 dot products preserve Hamming weight parity because (-1) ≡ 1 (mod 2).

	For MOD-3, the insight is different. Using weights `(1, 1, -2)` repeated on the thermometer encoding produces partial sums that cycle through `(0, 1, 2, 0, 1, 2, ...)`:

	```
	HW=0: cumsum = 0 → 0 mod 3
	HW=1: cumsum = 1 → 1 mod 3
	HW=2: cumsum = 2 → 2 mod 3
	HW=3: cumsum = 0 → 0 mod 3 (reset: 1+1-2=0)
	HW=4: cumsum = 1 → 1 mod 3
	HW=5: cumsum = 2 → 2 mod 3
	HW=6: cumsum = 0 → 0 mod 3 (reset)
	HW=7: cumsum = 1 → 1 mod 3
	HW=8: cumsum = 2 → 2 mod 3
	```

	The pattern `1 + 1 + (-2) = 0` causes the cumulative sum to reset every 3 steps, tracking HW mod 3.

	This generalizes to MOD-m: use weights `(1, 1, ..., 1, 1-m)` with `m-1` ones before the `1-m` term.

	## Formal Verification

	The network is proven correct in the Coq proof assistant with three independent proofs:

	1. Exhaustive verification:
	```coq
	Theorem network_correct_exhaustive : verify_all = true.
	Proof. vm_compute. reflexivity. Qed.
	```

	2. Constructive verification (case analysis):
	```coq
	Theorem network_correct_constructive : forall x0 x1 x2 x3 x4 x5 x6 x7,
	predict [x0; x1; x2; x3; x4; x5; x6; x7] =
	mod3 [x0; x1; x2; x3; x4; x5; x6; x7].
	```

	3. Algebraic verification:
	```coq
	Theorem cumsum_eq_mod3 : forall k,
	(k <= 8)%nat -> cumsum k = Z.of_nat (Nat.modulo k 3).

	Theorem network_algebraic_correct : forall h,
	(h <= 8)%nat ->
	classify (Z.geb (cumsum h - 1) 0) (Z.geb (cumsum h - 2) 0) = Nat.modulo h 3.
	```

	All proofs are axiom-free ("Closed under the global context").

	## MOD-3 Distribution

	For 8-bit inputs (256 total):

	\| Class \| Count \| Hamming Weights \|
	\|-------\|-------\|-----------------\|
	\| 0 \| 85 \| 0, 3, 6 \|
	\| 1 \| 86 \| 1, 4, 7 \|
	\| 2 \| 85 \| 2, 5, 8 \|

	## Training

	The parametric construction was derived algebraically, not discovered through training.

	Evolutionary search was attempted (as with parity) but consistently plateaued at 247/256 (96.5%) accuracy across multiple seeds. We were unable to discover the (1,1,-2) weight pattern through training, so we used algebraic construction instead.

	## Comparison to Parity

	\| Property \| Parity (MOD-2) \| MOD-3 \|
	\|----------\|----------------\|-------\|
	\| Output classes \| 2 \| 3 \|
	\| Key weights \| ±1 (any) \| (1,1,-2) specific \|
	\| Training \| Evolutionary (10K gen) \| Algebraic construction \|
	\| Neurons \| 14 (pruned) \| 14 \|
	\| Parameters \| 139 (pruned) \| 110 \|
	\| Algebraic insight \| (-1) ≡ 1 (mod 2) \| 1+1-2 = 0 (reset) \|

	## Limitations

	- Fixed input size: 8 bits only (algebraic construction extends to any n)
	- Binary inputs: Expects {0, 1}, not continuous values
	- No noise margin: Heaviside threshold at exactly 0
	- Not differentiable: Cannot be fine-tuned with gradient descent
	- Training gap: Our evolutionary search achieved only 96.5%; we used algebraic construction instead

	## Files

	```
	mod3-verified/
	├── model.safetensors # Network weights (110 params)
	├── model.py # Inference code
	├── config.json # Model metadata
	└── README.md # This file
	```

	## Citation

	```bibtex
	@software{tiny_mod3_prover_2025,
	title={tiny-mod3-prover: Formally Verified Threshold Network for MOD-3},
	author={Norton, Charles},
	url={https://huggingface.co/phanerozoic/tiny-mod3-prover},
	year={2025},
	note={First verified threshold circuit for modular counting beyond parity}
	}
	```

	## Related

	- Proof repository: [mod3-verified](https://github.com/CharlesCNorton/mod3-verified) — Coq proofs, training attempts, full documentation
	- Parity network: [tiny-parity-prover](https://huggingface.co/phanerozoic/tiny-parity-prover) — Verified MOD-2 (parity) network
	- Parity proofs: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) — Original parity verification project

	## License

	MIT