phanerozoic committed
Commit 7cfe40b · verified · 1 Parent(s): defb5fe

Upload folder using huggingface_hub

Files changed (5):
  1. README.md +192 -0
  2. config.json +30 -0
  3. model.py +119 -0
  4. model.safetensors +3 -0
  5. tmpclaude-9499-cwd +1 -0
README.md ADDED
@@ -0,0 +1,192 @@
---
license: mit
tags:
- pytorch
- safetensors
- formal-verification
- coq
- mod3
- modular-arithmetic
- threshold-network
- neuromorphic
---

# mod3-verified

Formally verified neural network that computes the MOD-3 function (Hamming weight mod 3) on 8-bit inputs. This repository contains the model artifacts; for proof development and Coq source code, see [mod3-verified](https://github.com/CharlesCNorton/mod3-verified).

## Overview

This is a threshold network that computes `mod3(x) = HW(x) mod 3` for 8-bit binary inputs, where HW denotes Hamming weight (number of set bits). The network outputs 0, 1, or 2 corresponding to the three residue classes.

**Key properties:**
- 100% accuracy on all 256 possible inputs
- Correctness proven in Coq via constructive algebraic proof
- Weights constrained to integers (many ternary)
- Heaviside step activation (x ≥ 0 → 1, else 0)
- First formally verified threshold circuit for MOD-m where m > 2
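
For reference, the target function itself is just a popcount followed by a modulus. A minimal sketch, operating on an 8-bit integer rather than a bit vector (`mod3_of_int` is illustrative and not part of the released files):

```python
def mod3_of_int(x: int) -> int:
    """Hamming weight of the low 8 bits, reduced mod 3."""
    return bin(x & 0xFF).count("1") % 3

assert mod3_of_int(0b10110010) == 1  # 4 set bits -> 4 mod 3 = 1
```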

## Architecture

| Layer | Neurons | Function |
|-------|---------|----------|
| Input | 8 | Binary input bits |
| Hidden 1 | 9 | Thermometer encoding (HW ≥ k) |
| Hidden 2 | 2 | MOD-3 detection |
| Output | 3 | Classification (one-hot) |

**Total: 14 neurons, 110 parameters**
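
A short sketch of how the first hidden layer realizes the thermometer encoding, using the all-ones weights and 0 … -8 biases listed in the Weight Structure table below (values taken from that table, not re-derived here):

```python
import torch

W1 = torch.ones(9, 8)                        # every hidden-1 unit sums all 8 input bits
b1 = -torch.arange(9, dtype=torch.float32)   # unit k has bias -k

x = torch.tensor([1., 0., 1., 1., 0., 0., 1., 0.])  # Hamming weight 4
t = (x @ W1.T + b1 >= 0).float()             # Heaviside activation
print(t)  # tensor([1., 1., 1., 1., 1., 0., 0., 0., 0.]) -> unit k fires iff HW >= k
```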

## Quick Start

```python
import torch
from safetensors.torch import load_file

# Load weights
weights = load_file('model.safetensors')

# Manual forward pass (Heaviside activation)
def forward(x, weights):
    x = x.float()
    x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
    x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
    out = x @ weights['output.weight'].T + weights['output.bias']
    return out.argmax(dim=-1)

# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"MOD-3 of [1,0,1,1,0,0,1,0]: {output.item()}")  # 1 (4 bits set, 4 mod 3 = 1)
```
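
The 100%-accuracy claim can also be checked locally in a few lines. This sketch reuses the tensor names from the snippet above and enumerates all 256 bit patterns (assuming the same file layout):

```python
import torch
from safetensors.torch import load_file

weights = load_file('model.safetensors')

def step(z):
    return (z >= 0).float()   # Heaviside activation

# All 256 inputs: bit j of integer i goes to input column j.
bits = torch.tensor([[(i >> j) & 1 for j in range(8)] for i in range(256)],
                    dtype=torch.float32)

h1 = step(bits @ weights['layer1.weight'].T + weights['layer1.bias'])
h2 = step(h1 @ weights['layer2.weight'].T + weights['layer2.bias'])
preds = (h2 @ weights['output.weight'].T + weights['output.bias']).argmax(dim=-1)

targets = bits.sum(dim=-1).long() % 3
print(f"{(preds == targets).sum().item()} / 256 correct")
```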

## Weight Structure

| Tensor | Shape | Values | Description |
|--------|-------|--------|-------------|
| `layer1.weight` | [9, 8] | All 1s | Thermometer encoding |
| `layer1.bias` | [9] | [0, -1, ..., -8] | Threshold at HW ≥ k |
| `layer2.weight` | [2, 9] | [0,1,1,-2,1,1,-2,1,1] | MOD-3 detection |
| `layer2.bias` | [2] | [-1, -2] | Class thresholds |
| `output.weight` | [3, 2] | Various | Classification |
| `output.bias` | [3] | [0, -1, -1] | Output thresholds |
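
The table is enough to trace how a Hamming weight becomes a class label. The sketch below assumes both rows of `layer2.weight` use the listed [0,1,1,-2,1,1,-2,1,1] pattern, which is consistent with the biases [-1, -2] and the algebraic proof below: the cumulative sum over the thermometer code equals HW mod 3, and the two thresholds turn it into the 2-bit code (0,0) → class 0, (1,0) → class 1, (1,1) → class 2.

```python
import torch

w2 = torch.tensor([0., 1., 1., -2., 1., 1., -2., 1., 1.])  # per-row layer2 weights (assumed)
b2 = torch.tensor([-1., -2.])                               # layer2 biases from the table

for hw in range(9):
    t = (torch.arange(9) <= hw).float()         # thermometer code for HW = hw
    cumsum = (t * w2).sum()
    a, b = ((cumsum + b2) >= 0).int().tolist()  # the two hidden-2 units
    print(f"HW={hw}: cumsum={int(cumsum)}, units=({a},{b}), HW mod 3 = {hw % 3}")
```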
73
+
74
+ ## Algebraic Insight
75
+
76
+ For parity (MOD-2), the key insight was that ±1 dot products preserve Hamming weight parity because (-1) ≡ 1 (mod 2).
77
+
78
+ For MOD-3, the insight is different. Using weights `(1, 1, -2)` repeated on the thermometer encoding produces partial sums that cycle through `(0, 1, 2, 0, 1, 2, ...)`:
79
+
80
+ ```
81
+ HW=0: cumsum = 0 → 0 mod 3
82
+ HW=1: cumsum = 1 → 1 mod 3
83
+ HW=2: cumsum = 2 → 2 mod 3
84
+ HW=3: cumsum = 0 → 0 mod 3 (reset: 1+1-2=0)
85
+ HW=4: cumsum = 1 → 1 mod 3
86
+ HW=5: cumsum = 2 → 2 mod 3
87
+ HW=6: cumsum = 0 → 0 mod 3 (reset)
88
+ HW=7: cumsum = 1 → 1 mod 3
89
+ HW=8: cumsum = 2 → 2 mod 3
90
+ ```
91
+
92
+ The pattern `1 + 1 + (-2) = 0` causes the cumulative sum to reset every 3 steps, tracking HW mod 3.
93
+
94
+ This generalizes to MOD-m: use weights `(1, 1, ..., 1, 1-m)` with `m-1` ones before the `1-m` term.
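
A quick check of that generalization (the helper below is illustrative, not part of the released model):

```python
def modm_weights(n_bits: int, m: int) -> list:
    """Layer-2 weight pattern for MOD-m over the thermometer code:
    weight 0 at position 0, then 1, 1, ..., 1, (1 - m) repeating."""
    return [0] + [(1 - m) if k % m == 0 else 1 for k in range(1, n_bits + 1)]

# The running sum over the first h+1 weights tracks h mod m.
for m in (2, 3, 4):
    w = modm_weights(8, m)
    assert all(sum(w[:h + 1]) == h % m for h in range(9)), (m, w)

print(modm_weights(8, 3))  # [0, 1, 1, -2, 1, 1, -2, 1, 1] -- matches the table above
```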

## Formal Verification

The network is proven correct in the Coq proof assistant with three independent proofs:

**1. Exhaustive verification:**
```coq
Theorem network_correct_exhaustive : verify_all = true.
Proof. vm_compute. reflexivity. Qed.
```

**2. Constructive verification (case analysis):**
```coq
Theorem network_correct_constructive : forall x0 x1 x2 x3 x4 x5 x6 x7,
  predict [x0; x1; x2; x3; x4; x5; x6; x7] =
  mod3 [x0; x1; x2; x3; x4; x5; x6; x7].
```

**3. Algebraic verification:**
```coq
Theorem cumsum_eq_mod3 : forall k,
  (k <= 8)%nat -> cumsum k = Z.of_nat (Nat.modulo k 3).

Theorem network_algebraic_correct : forall h,
  (h <= 8)%nat ->
  classify (Z.geb (cumsum h - 1) 0) (Z.geb (cumsum h - 2) 0) = Nat.modulo h 3.
```

All proofs are axiom-free ("Closed under the global context").

## MOD-3 Distribution

For 8-bit inputs (256 total):

| Class | Count | Hamming Weights |
|-------|-------|-----------------|
| 0 | 85 | 0, 3, 6 |
| 1 | 86 | 1, 4, 7 |
| 2 | 85 | 2, 5, 8 |
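
These counts follow directly from binomial coefficients (sum C(8, h) over the Hamming weights h in each class):

```python
from math import comb

counts = {c: sum(comb(8, h) for h in range(9) if h % 3 == c) for c in range(3)}
print(counts)  # {0: 85, 1: 86, 2: 85}
```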

## Training

The parametric construction was derived algebraically, not discovered through training.

Evolutionary search was attempted (as with parity) but consistently plateaued at 247/256 (96.5%) accuracy across multiple seeds. The (1,1,-2) weight pattern is sufficiently specific that random mutation cannot reliably discover it.

This finding reinforces that the algebraic insight is essential: in these experiments, naive search alone never produced a fully correct MOD-3 network.

## Comparison to Parity

| Property | Parity (MOD-2) | MOD-3 |
|----------|----------------|-------|
| Output classes | 2 | 3 |
| Key weights | ±1 (any) | (1,1,-2) specific |
| Training | Evolutionary (10K gen) | Algebraic construction |
| Neurons | 14 (pruned) | 14 |
| Parameters | 139 (pruned) | 110 |
| Algebraic insight | (-1) ≡ 1 (mod 2) | 1+1-2 = 0 (reset) |

## Limitations

- **Fixed input size**: 8 bits only (algebraic construction extends to any n)
- **Binary inputs**: Expects {0, 1}, not continuous values
- **No noise margin**: Heaviside threshold at exactly 0
- **Not differentiable**: Cannot be fine-tuned with gradient descent
- **Training gap**: Evolutionary search achieves only 96.5%; algebraic construction required for 100%

## Files

```
mod3-verified/
├── model.safetensors    # Network weights (110 params)
├── model.py             # Inference code
├── config.json          # Model metadata
└── README.md            # This file
```

## Citation

```bibtex
@software{mod3_verified_2025,
  title={mod3-verified: Formally Verified Threshold Network for MOD-3},
  author={Norton, Charles},
  url={https://huggingface.co/phanerozoic/mod3-verified},
  year={2025},
  note={First verified threshold circuit for modular counting beyond parity}
}
```

## Related

- **Proof repository**: [mod3-verified](https://github.com/CharlesCNorton/mod3-verified) — Coq proofs, training attempts, full documentation
- **Parity network**: [tiny-parity-prover](https://huggingface.co/phanerozoic/tiny-parity-prover) — Verified MOD-2 (parity) network
- **Parity proofs**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) — Original parity verification project

## License

MIT
config.json ADDED
@@ -0,0 +1,30 @@
{
  "model_type": "threshold_network",
  "task": "mod3_classification",
  "architecture": "8 -> 9 -> 2 -> 3",
  "input_size": 8,
  "hidden1_size": 9,
  "hidden2_size": 2,
  "output_size": 3,
  "num_parameters": 110,
  "num_neurons": 14,
  "activation": "heaviside",
  "weight_constraints": "integer",
  "verification": {
    "method": "coq_proof",
    "exhaustive": true,
    "constructive": true,
    "algebraic": true,
    "axiom_free": true
  },
  "accuracy": {
    "all_inputs": "256/256",
    "percentage": 100.0
  },
  "algebraic_insight": "Weights (1,1,-2) on thermometer encoding produce cumsum = HW mod 3",
  "github": "https://github.com/CharlesCNorton/mod3-verified",
  "related": {
    "parity_model": "https://huggingface.co/phanerozoic/tiny-parity-prover",
    "parity_proofs": "https://github.com/CharlesCNorton/threshold-logic-verified"
  }
}
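
A minimal way to cross-check this metadata against the shipped weights (file names as in this repository):

```python
import json
from safetensors.torch import load_file

with open("config.json") as f:
    config = json.load(f)

weights = load_file("model.safetensors")
n_params = sum(t.numel() for t in weights.values())
assert n_params == config["num_parameters"]  # 110
```
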
model.py ADDED
@@ -0,0 +1,119 @@
"""
Inference code for mod3-verified threshold network.

This network computes MOD-3 (Hamming weight mod 3) on 8-bit binary inputs.
"""

import torch
import torch.nn as nn
from safetensors.torch import load_file


def heaviside(x):
    """Heaviside step function: 1 if x >= 0, else 0."""
    return (x >= 0).float()


class Mod3Network(nn.Module):
    """
    Verified threshold network for MOD-3 computation.

    Architecture: 8 -> 9 -> 2 -> 3
    - Layer 1: Thermometer encoding (9 neurons detect HW >= k)
    - Layer 2: MOD-3 detection using (1,1,-2) weight pattern
    - Output: 3-class classification
    """

    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(8, 9)
        self.layer2 = nn.Linear(9, 2)
        self.output = nn.Linear(2, 3)

    def forward(self, x):
        """Forward pass with Heaviside activation."""
        x = x.float()
        x = heaviside(self.layer1(x))
        x = heaviside(self.layer2(x))
        x = self.output(x)
        return x

    def predict(self, x):
        """Get predicted class (0, 1, or 2)."""
        return self.forward(x).argmax(dim=-1)

    @classmethod
    def from_safetensors(cls, path):
        """Load model from safetensors file."""
        model = cls()
        weights = load_file(path)

        model.layer1.weight.data = weights['layer1.weight']
        model.layer1.bias.data = weights['layer1.bias']
        model.layer2.weight.data = weights['layer2.weight']
        model.layer2.bias.data = weights['layer2.bias']
        model.output.weight.data = weights['output.weight']
        model.output.bias.data = weights['output.bias']

        return model


def mod3_reference(x):
    """Reference implementation: Hamming weight mod 3."""
    return (x.sum(dim=-1) % 3).long()


def verify(model, verbose=True):
    """Verify model on all 256 inputs."""
    inputs = torch.zeros(256, 8)
    for i in range(256):
        for j in range(8):
            inputs[i, j] = (i >> j) & 1

    targets = mod3_reference(inputs)
    predictions = model.predict(inputs)

    correct = (predictions == targets).sum().item()

    if verbose:
        print(f"Verification: {correct}/256 ({100*correct/256:.1f}%)")

    if correct < 256:
        errors = (predictions != targets).nonzero(as_tuple=True)[0]
        print(f"Errors at indices: {errors[:10].tolist()}")

    return correct == 256


def demo():
    """Demonstration of MOD-3 computation."""
    print("Loading mod3-verified model...")
    model = Mod3Network.from_safetensors('model.safetensors')

    print("\nVerifying on all 256 inputs...")
    verify(model)

    print("\nExample predictions:")
    test_cases = [
        [0, 0, 0, 0, 0, 0, 0, 0],  # HW=0, 0 mod 3 = 0
        [1, 0, 0, 0, 0, 0, 0, 0],  # HW=1, 1 mod 3 = 1
        [1, 1, 0, 0, 0, 0, 0, 0],  # HW=2, 2 mod 3 = 2
        [1, 1, 1, 0, 0, 0, 0, 0],  # HW=3, 3 mod 3 = 0
        [1, 1, 1, 1, 0, 0, 0, 0],  # HW=4, 4 mod 3 = 1
        [1, 1, 1, 1, 1, 0, 0, 0],  # HW=5, 5 mod 3 = 2
        [1, 1, 1, 1, 1, 1, 0, 0],  # HW=6, 6 mod 3 = 0
        [1, 1, 1, 1, 1, 1, 1, 0],  # HW=7, 7 mod 3 = 1
        [1, 1, 1, 1, 1, 1, 1, 1],  # HW=8, 8 mod 3 = 2
    ]

    for bits in test_cases:
        x = torch.tensor([bits], dtype=torch.float32)
        hw = sum(bits)
        pred = model.predict(x).item()
        expected = hw % 3
        status = "OK" if pred == expected else "ERROR"
        print(f"  {bits} -> HW={hw}, pred={pred}, expected={expected} [{status}]")


if __name__ == '__main__':
    demo()
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a4dfa3053aa3ab1ec347bd4099c326bb03a2fe05da16a6beeed66cfc35bd1b57
size 864
tmpclaude-9499-cwd ADDED
@@ -0,0 +1 @@
/d/mod3-verified/hf