File size: 6,523 Bytes
7cfe40b 4498680 7cfe40b 0ff8da5 7cfe40b 0ff8da5 7cfe40b 4498680 7cfe40b 4498680 7cfe40b |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 |
---
license: mit
tags:
- pytorch
- safetensors
- formal-verification
- coq
- mod3
- modular-arithmetic
- threshold-network
- neuromorphic
---
# tiny-mod3-prover
Formally verified neural network that computes the MOD-3 function (Hamming weight mod 3) on 8-bit inputs. This repository contains the model artifacts; for proof development and Coq source code, see [mod3-verified](https://github.com/CharlesCNorton/mod3-verified).
## Overview
This is a threshold network that computes `mod3(x) = HW(x) mod 3` for 8-bit binary inputs, where HW denotes Hamming weight (number of set bits). The network outputs 0, 1, or 2 corresponding to the three residue classes.
**Key properties:**
- 100% accuracy on all 256 possible inputs
- Correctness proven in Coq via constructive algebraic proof
- Weights constrained to integers (many ternary)
- Heaviside step activation (x β₯ 0 β 1, else 0)
- First formally verified threshold circuit for MOD-m where m > 2
## Architecture
| Layer | Neurons | Function |
|-------|---------|----------|
| Input | 8 | Binary input bits |
| Hidden 1 | 9 | Thermometer encoding (HW β₯ k) |
| Hidden 2 | 2 | MOD-3 detection |
| Output | 3 | Classification (one-hot) |
**Total: 14 neurons, 110 parameters**
## Quick Start
```python
import torch
from safetensors.torch import load_file
# Load weights
weights = load_file('model.safetensors')
# Manual forward pass (Heaviside activation)
def forward(x, weights):
x = x.float()
x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
out = x @ weights['output.weight'].T + weights['output.bias']
return out.argmax(dim=-1)
# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"MOD-3 of [1,0,1,1,0,0,1,0]: {output.item()}") # 1 (4 bits set, 4 mod 3 = 1)
```
## Weight Structure
| Tensor | Shape | Values | Description |
|--------|-------|--------|-------------|
| `layer1.weight` | [9, 8] | All 1s | Thermometer encoding |
| `layer1.bias` | [9] | [0, -1, ..., -8] | Threshold at HW β₯ k |
| `layer2.weight` | [2, 9] | [0,1,1,-2,1,1,-2,1,1] | MOD-3 detection |
| `layer2.bias` | [2] | [-1, -2] | Class thresholds |
| `output.weight` | [3, 2] | Various | Classification |
| `output.bias` | [3] | [0, -1, -1] | Output thresholds |
## Algebraic Insight
For parity (MOD-2), the key insight was that Β±1 dot products preserve Hamming weight parity because (-1) β‘ 1 (mod 2).
For MOD-3, the insight is different. Using weights `(1, 1, -2)` repeated on the thermometer encoding produces partial sums that cycle through `(0, 1, 2, 0, 1, 2, ...)`:
```
HW=0: cumsum = 0 β 0 mod 3
HW=1: cumsum = 1 β 1 mod 3
HW=2: cumsum = 2 β 2 mod 3
HW=3: cumsum = 0 β 0 mod 3 (reset: 1+1-2=0)
HW=4: cumsum = 1 β 1 mod 3
HW=5: cumsum = 2 β 2 mod 3
HW=6: cumsum = 0 β 0 mod 3 (reset)
HW=7: cumsum = 1 β 1 mod 3
HW=8: cumsum = 2 β 2 mod 3
```
The pattern `1 + 1 + (-2) = 0` causes the cumulative sum to reset every 3 steps, tracking HW mod 3.
This generalizes to MOD-m: use weights `(1, 1, ..., 1, 1-m)` with `m-1` ones before the `1-m` term.
## Formal Verification
The network is proven correct in the Coq proof assistant with three independent proofs:
**1. Exhaustive verification:**
```coq
Theorem network_correct_exhaustive : verify_all = true.
Proof. vm_compute. reflexivity. Qed.
```
**2. Constructive verification (case analysis):**
```coq
Theorem network_correct_constructive : forall x0 x1 x2 x3 x4 x5 x6 x7,
predict [x0; x1; x2; x3; x4; x5; x6; x7] =
mod3 [x0; x1; x2; x3; x4; x5; x6; x7].
```
**3. Algebraic verification:**
```coq
Theorem cumsum_eq_mod3 : forall k,
(k <= 8)%nat -> cumsum k = Z.of_nat (Nat.modulo k 3).
Theorem network_algebraic_correct : forall h,
(h <= 8)%nat ->
classify (Z.geb (cumsum h - 1) 0) (Z.geb (cumsum h - 2) 0) = Nat.modulo h 3.
```
All proofs are axiom-free ("Closed under the global context").
## MOD-3 Distribution
For 8-bit inputs (256 total):
| Class | Count | Hamming Weights |
|-------|-------|-----------------|
| 0 | 85 | 0, 3, 6 |
| 1 | 86 | 1, 4, 7 |
| 2 | 85 | 2, 5, 8 |
## Training
The parametric construction was derived algebraically, not discovered through training.
Evolutionary search was attempted (as with parity) but consistently plateaued at 247/256 (96.5%) accuracy across multiple seeds. We were unable to discover the (1,1,-2) weight pattern through training, so we used algebraic construction instead.
## Comparison to Parity
| Property | Parity (MOD-2) | MOD-3 |
|----------|----------------|-------|
| Output classes | 2 | 3 |
| Key weights | Β±1 (any) | (1,1,-2) specific |
| Training | Evolutionary (10K gen) | Algebraic construction |
| Neurons | 14 (pruned) | 14 |
| Parameters | 139 (pruned) | 110 |
| Algebraic insight | (-1) β‘ 1 (mod 2) | 1+1-2 = 0 (reset) |
## Limitations
- **Fixed input size**: 8 bits only (algebraic construction extends to any n)
- **Binary inputs**: Expects {0, 1}, not continuous values
- **No noise margin**: Heaviside threshold at exactly 0
- **Not differentiable**: Cannot be fine-tuned with gradient descent
- **Training gap**: Our evolutionary search achieved only 96.5%; we used algebraic construction instead
## Files
```
mod3-verified/
βββ model.safetensors # Network weights (110 params)
βββ model.py # Inference code
βββ config.json # Model metadata
βββ README.md # This file
```
## Citation
```bibtex
@software{tiny_mod3_prover_2025,
title={tiny-mod3-prover: Formally Verified Threshold Network for MOD-3},
author={Norton, Charles},
url={https://huggingface.co/phanerozoic/tiny-mod3-prover},
year={2025},
note={First verified threshold circuit for modular counting beyond parity}
}
```
## Related
- **Proof repository**: [mod3-verified](https://github.com/CharlesCNorton/mod3-verified) β Coq proofs, training attempts, full documentation
- **Parity network**: [tiny-parity-prover](https://huggingface.co/phanerozoic/tiny-parity-prover) β Verified MOD-2 (parity) network
- **Parity proofs**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) β Original parity verification project
## License
MIT
|