phanerozoic committed
Commit 7cfe40b · verified · 1 Parent(s): defb5fe

Upload folder using huggingface_hub

Files changed (5):
  1. README.md +192 -0
  2. config.json +30 -0
  3. model.py +119 -0
  4. model.safetensors +3 -0
  5. tmpclaude-9499-cwd +1 -0
README.md ADDED
@@ -0,0 +1,192 @@
---
license: mit
tags:
- pytorch
- safetensors
- formal-verification
- coq
- mod3
- modular-arithmetic
- threshold-network
- neuromorphic
---

# mod3-verified

Formally verified neural network that computes the MOD-3 function (Hamming weight mod 3) on 8-bit inputs. This repository contains the model artifacts; for proof development and Coq source code, see [mod3-verified](https://github.com/CharlesCNorton/mod3-verified).

## Overview

This is a threshold network that computes `mod3(x) = HW(x) mod 3` for 8-bit binary inputs, where HW denotes Hamming weight (number of set bits). The network outputs 0, 1, or 2 corresponding to the three residue classes.

**Key properties:**
- 100% accuracy on all 256 possible inputs
- Correctness proven in Coq via constructive algebraic proof
- Weights constrained to integers (many ternary)
- Heaviside step activation (x ≥ 0 → 1, else 0)
- First formally verified threshold circuit for MOD-m where m > 2
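
For reference, the target function itself is just a popcount followed by a modulus. A minimal sketch, operating on an 8-bit integer rather than a bit vector (`mod3_of_int` is illustrative and not part of the released files):

```python
def mod3_of_int(x: int) -> int:
    """Hamming weight of the low 8 bits, reduced mod 3."""
    return bin(x & 0xFF).count("1") % 3

assert mod3_of_int(0b10110010) == 1  # 4 set bits -> 4 mod 3 = 1
```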

## Architecture

| Layer | Neurons | Function |
|-------|---------|----------|
| Input | 8 | Binary input bits |
| Hidden 1 | 9 | Thermometer encoding (HW ≥ k) |
| Hidden 2 | 2 | MOD-3 detection |
| Output | 3 | Classification (one-hot) |

**Total: 14 neurons, 110 parameters**
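
A short sketch of how the first hidden layer realizes the thermometer encoding, using the all-ones weights and 0 … -8 biases listed in the Weight Structure table below (values taken from that table, not re-derived here):

```python
import torch

W1 = torch.ones(9, 8)                        # every hidden-1 unit sums all 8 input bits
b1 = -torch.arange(9, dtype=torch.float32)   # unit k has bias -k

x = torch.tensor([1., 0., 1., 1., 0., 0., 1., 0.])  # Hamming weight 4
t = (x @ W1.T + b1 >= 0).float()             # Heaviside activation
print(t)  # tensor([1., 1., 1., 1., 1., 0., 0., 0., 0.]) -> unit k fires iff HW >= k
```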

## Quick Start

```python
import torch
from safetensors.torch import load_file

# Load weights
weights = load_file('model.safetensors')

# Manual forward pass (Heaviside activation)
def forward(x, weights):
    x = x.float()
    x = (x @ weights['layer1.weight'].T + weights['layer1.bias'] >= 0).float()
    x = (x @ weights['layer2.weight'].T + weights['layer2.bias'] >= 0).float()
    out = x @ weights['output.weight'].T + weights['output.bias']
    return out.argmax(dim=-1)

# Test
inputs = torch.tensor([[1, 0, 1, 1, 0, 0, 1, 0]], dtype=torch.float32)
output = forward(inputs, weights)
print(f"MOD-3 of [1,0,1,1,0,0,1,0]: {output.item()}")  # 1 (4 bits set, 4 mod 3 = 1)
```
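
The 100%-accuracy claim can also be checked locally in a few lines. This sketch reuses the tensor names from the snippet above and enumerates all 256 bit patterns (assuming the same file layout):

```python
import torch
from safetensors.torch import load_file

weights = load_file('model.safetensors')

def step(z):
    return (z >= 0).float()   # Heaviside activation

# All 256 inputs: bit j of integer i goes to input column j.
bits = torch.tensor([[(i >> j) & 1 for j in range(8)] for i in range(256)],
                    dtype=torch.float32)

h1 = step(bits @ weights['layer1.weight'].T + weights['layer1.bias'])
h2 = step(h1 @ weights['layer2.weight'].T + weights['layer2.bias'])
preds = (h2 @ weights['output.weight'].T + weights['output.bias']).argmax(dim=-1)

targets = bits.sum(dim=-1).long() % 3
print(f"{(preds == targets).sum().item()} / 256 correct")
```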

## Weight Structure

| Tensor | Shape | Values | Description |
|--------|-------|--------|-------------|
| `layer1.weight` | [9, 8] | All 1s | Thermometer encoding |
| `layer1.bias` | [9] | [0, -1, ..., -8] | Threshold at HW ≥ k |
| `layer2.weight` | [2, 9] | [0,1,1,-2,1,1,-2,1,1] | MOD-3 detection |
| `layer2.bias` | [2] | [-1, -2] | Class thresholds |
| `output.weight` | [3, 2] | Various | Classification |
| `output.bias` | [3] | [0, -1, -1] | Output thresholds |
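
The table is enough to trace how a Hamming weight becomes a class label. The sketch below assumes both rows of `layer2.weight` use the listed [0,1,1,-2,1,1,-2,1,1] pattern, which is consistent with the biases [-1, -2] and the algebraic proof below: the cumulative sum over the thermometer code equals HW mod 3, and the two thresholds turn it into the 2-bit code (0,0) → class 0, (1,0) → class 1, (1,1) → class 2.

```python
import torch

w2 = torch.tensor([0., 1., 1., -2., 1., 1., -2., 1., 1.])  # per-row layer2 weights (assumed)
b2 = torch.tensor([-1., -2.])                               # layer2 biases from the table

for hw in range(9):
    t = (torch.arange(9) <= hw).float()         # thermometer code for HW = hw
    cumsum = (t * w2).sum()
    a, b = ((cumsum + b2) >= 0).int().tolist()  # the two hidden-2 units
    print(f"HW={hw}: cumsum={int(cumsum)}, units=({a},{b}), HW mod 3 = {hw % 3}")
```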
73
+
74
+ ## Algebraic Insight
75
+
76
+ For parity (MOD-2), the key insight was that ±1 dot products preserve Hamming weight parity because (-1) ≡ 1 (mod 2).
77
+
78
+ For MOD-3, the insight is different. Using weights `(1, 1, -2)` repeated on the thermometer encoding produces partial sums that cycle through `(0, 1, 2, 0, 1, 2, ...)`:
79
+
80
+ ```
81
+ HW=0: cumsum = 0 → 0 mod 3
82
+ HW=1: cumsum = 1 → 1 mod 3
83
+ HW=2: cumsum = 2 → 2 mod 3
84
+ HW=3: cumsum = 0 → 0 mod 3 (reset: 1+1-2=0)
85
+ HW=4: cumsum = 1 → 1 mod 3
86
+ HW=5: cumsum = 2 → 2 mod 3
87
+ HW=6: cumsum = 0 → 0 mod 3 (reset)
88
+ HW=7: cumsum = 1 → 1 mod 3
89
+ HW=8: cumsum = 2 → 2 mod 3
90
+ ```
91
+
92
+ The pattern `1 + 1 + (-2) = 0` causes the cumulative sum to reset every 3 steps, tracking HW mod 3.
93
+
94
+ This generalizes to MOD-m: use weights `(1, 1, ..., 1, 1-m)` with `m-1` ones before the `1-m` term.
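
A quick check of that generalization (the helper below is illustrative, not part of the released model):

```python
def modm_weights(n_bits: int, m: int) -> list:
    """Layer-2 weight pattern for MOD-m over the thermometer code:
    weight 0 at position 0, then 1, 1, ..., 1, (1 - m) repeating."""
    return [0] + [(1 - m) if k % m == 0 else 1 for k in range(1, n_bits + 1)]

# The running sum over the first h+1 weights tracks h mod m.
for m in (2, 3, 4):
    w = modm_weights(8, m)
    assert all(sum(w[:h + 1]) == h % m for h in range(9)), (m, w)

print(modm_weights(8, 3))  # [0, 1, 1, -2, 1, 1, -2, 1, 1] -- matches the table above
```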

## Formal Verification

The network is proven correct in the Coq proof assistant with three independent proofs:

**1. Exhaustive verification:**
```coq
Theorem network_correct_exhaustive : verify_all = true.
Proof. vm_compute. reflexivity. Qed.
```

**2. Constructive verification (case analysis):**
```coq
Theorem network_correct_constructive : forall x0 x1 x2 x3 x4 x5 x6 x7,
  predict [x0; x1; x2; x3; x4; x5; x6; x7] =
  mod3 [x0; x1; x2; x3; x4; x5; x6; x7].
```

**3. Algebraic verification:**
```coq
Theorem cumsum_eq_mod3 : forall k,
  (k <= 8)%nat -> cumsum k = Z.of_nat (Nat.modulo k 3).

Theorem network_algebraic_correct : forall h,
  (h <= 8)%nat ->
  classify (Z.geb (cumsum h - 1) 0) (Z.geb (cumsum h - 2) 0) = Nat.modulo h 3.
```

All proofs are axiom-free ("Closed under the global context").

## MOD-3 Distribution

For 8-bit inputs (256 total):

| Class | Count | Hamming Weights |
|-------|-------|-----------------|
| 0 | 85 | 0, 3, 6 |
| 1 | 86 | 1, 4, 7 |
| 2 | 85 | 2, 5, 8 |
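
These counts follow directly from binomial coefficients (sum C(8, h) over the Hamming weights h in each class):

```python
from math import comb

counts = {c: sum(comb(8, h) for h in range(9) if h % 3 == c) for c in range(3)}
print(counts)  # {0: 85, 1: 86, 2: 85}
```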

## Training

The parametric construction was derived algebraically, not discovered through training.

Evolutionary search was attempted (as with parity) but consistently plateaued at 247/256 (96.5%) accuracy across multiple seeds. The (1,1,-2) weight pattern is sufficiently specific that random mutation cannot reliably discover it.

This finding reinforces that the algebraic insight is essential: in these experiments, naive search alone never produced a fully correct MOD-3 network.

## Comparison to Parity

| Property | Parity (MOD-2) | MOD-3 |
|----------|----------------|-------|
| Output classes | 2 | 3 |
| Key weights | ±1 (any) | (1,1,-2) specific |
| Training | Evolutionary (10K gen) | Algebraic construction |
| Neurons | 14 (pruned) | 14 |
| Parameters | 139 (pruned) | 110 |
| Algebraic insight | (-1) ≡ 1 (mod 2) | 1+1-2 = 0 (reset) |

## Limitations

- **Fixed input size**: 8 bits only (algebraic construction extends to any n)
- **Binary inputs**: Expects {0, 1}, not continuous values
- **No noise margin**: Heaviside threshold at exactly 0
- **Not differentiable**: Cannot be fine-tuned with gradient descent
- **Training gap**: Evolutionary search achieves only 96.5%; algebraic construction required for 100%

## Files

```
mod3-verified/
├── model.safetensors    # Network weights (110 params)
├── model.py             # Inference code
├── config.json          # Model metadata
└── README.md            # This file
```

## Citation

```bibtex
@software{mod3_verified_2025,
  title={mod3-verified: Formally Verified Threshold Network for MOD-3},
  author={Norton, Charles},
  url={https://huggingface.co/phanerozoic/mod3-verified},
  year={2025},
  note={First verified threshold circuit for modular counting beyond parity}
}
```

## Related

- **Proof repository**: [mod3-verified](https://github.com/CharlesCNorton/mod3-verified) — Coq proofs, training attempts, full documentation
- **Parity network**: [tiny-parity-prover](https://huggingface.co/phanerozoic/tiny-parity-prover) — Verified MOD-2 (parity) network
- **Parity proofs**: [threshold-logic-verified](https://github.com/CharlesCNorton/threshold-logic-verified) — Original parity verification project

## License

MIT
config.json ADDED
@@ -0,0 +1,30 @@
{
  "model_type": "threshold_network",
  "task": "mod3_classification",
  "architecture": "8 -> 9 -> 2 -> 3",
  "input_size": 8,
  "hidden1_size": 9,
  "hidden2_size": 2,
  "output_size": 3,
  "num_parameters": 110,
  "num_neurons": 14,
  "activation": "heaviside",
  "weight_constraints": "integer",
  "verification": {
    "method": "coq_proof",
    "exhaustive": true,
    "constructive": true,
    "algebraic": true,
    "axiom_free": true
  },
  "accuracy": {
    "all_inputs": "256/256",
    "percentage": 100.0
  },
  "algebraic_insight": "Weights (1,1,-2) on thermometer encoding produce cumsum = HW mod 3",
  "github": "https://github.com/CharlesCNorton/mod3-verified",
  "related": {
    "parity_model": "https://huggingface.co/phanerozoic/tiny-parity-prover",
    "parity_proofs": "https://github.com/CharlesCNorton/threshold-logic-verified"
  }
}
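
A minimal way to cross-check this metadata against the shipped weights (file names as in this repository):

```python
import json
from safetensors.torch import load_file

with open("config.json") as f:
    config = json.load(f)

weights = load_file("model.safetensors")
n_params = sum(t.numel() for t in weights.values())
assert n_params == config["num_parameters"]  # 110
```
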
model.py ADDED
@@ -0,0 +1,119 @@
"""
Inference code for mod3-verified threshold network.

This network computes MOD-3 (Hamming weight mod 3) on 8-bit binary inputs.
"""

import torch
import torch.nn as nn
from safetensors.torch import load_file


def heaviside(x):
    """Heaviside step function: 1 if x >= 0, else 0."""
    return (x >= 0).float()


class Mod3Network(nn.Module):
    """
    Verified threshold network for MOD-3 computation.

    Architecture: 8 -> 9 -> 2 -> 3
    - Layer 1: Thermometer encoding (9 neurons detect HW >= k)
    - Layer 2: MOD-3 detection using (1,1,-2) weight pattern
    - Output: 3-class classification
    """

    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(8, 9)
        self.layer2 = nn.Linear(9, 2)
        self.output = nn.Linear(2, 3)

    def forward(self, x):
        """Forward pass with Heaviside activation."""
        x = x.float()
        x = heaviside(self.layer1(x))
        x = heaviside(self.layer2(x))
        x = self.output(x)
        return x

    def predict(self, x):
        """Get predicted class (0, 1, or 2)."""
        return self.forward(x).argmax(dim=-1)

    @classmethod
    def from_safetensors(cls, path):
        """Load model from safetensors file."""
        model = cls()
        weights = load_file(path)

        model.layer1.weight.data = weights['layer1.weight']
        model.layer1.bias.data = weights['layer1.bias']
        model.layer2.weight.data = weights['layer2.weight']
        model.layer2.bias.data = weights['layer2.bias']
        model.output.weight.data = weights['output.weight']
        model.output.bias.data = weights['output.bias']

        return model


def mod3_reference(x):
    """Reference implementation: Hamming weight mod 3."""
    return (x.sum(dim=-1) % 3).long()


def verify(model, verbose=True):
    """Verify model on all 256 inputs."""
    inputs = torch.zeros(256, 8)
    for i in range(256):
        for j in range(8):
            inputs[i, j] = (i >> j) & 1

    targets = mod3_reference(inputs)
    predictions = model.predict(inputs)

    correct = (predictions == targets).sum().item()

    if verbose:
        print(f"Verification: {correct}/256 ({100*correct/256:.1f}%)")

    if correct < 256:
        errors = (predictions != targets).nonzero(as_tuple=True)[0]
        print(f"Errors at indices: {errors[:10].tolist()}")

    return correct == 256


def demo():
    """Demonstration of MOD-3 computation."""
    print("Loading mod3-verified model...")
    model = Mod3Network.from_safetensors('model.safetensors')

    print("\nVerifying on all 256 inputs...")
    verify(model)

    print("\nExample predictions:")
    test_cases = [
        [0, 0, 0, 0, 0, 0, 0, 0],  # HW=0, 0 mod 3 = 0
        [1, 0, 0, 0, 0, 0, 0, 0],  # HW=1, 1 mod 3 = 1
        [1, 1, 0, 0, 0, 0, 0, 0],  # HW=2, 2 mod 3 = 2
        [1, 1, 1, 0, 0, 0, 0, 0],  # HW=3, 3 mod 3 = 0
        [1, 1, 1, 1, 0, 0, 0, 0],  # HW=4, 4 mod 3 = 1
        [1, 1, 1, 1, 1, 0, 0, 0],  # HW=5, 5 mod 3 = 2
        [1, 1, 1, 1, 1, 1, 0, 0],  # HW=6, 6 mod 3 = 0
        [1, 1, 1, 1, 1, 1, 1, 0],  # HW=7, 7 mod 3 = 1
        [1, 1, 1, 1, 1, 1, 1, 1],  # HW=8, 8 mod 3 = 2
    ]

    for bits in test_cases:
        x = torch.tensor([bits], dtype=torch.float32)
        hw = sum(bits)
        pred = model.predict(x).item()
        expected = hw % 3
        status = "OK" if pred == expected else "ERROR"
        print(f"  {bits} -> HW={hw}, pred={pred}, expected={expected} [{status}]")


if __name__ == '__main__':
    demo()
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a4dfa3053aa3ab1ec347bd4099c326bb03a2fe05da16a6beeed66cfc35bd1b57
size 864
tmpclaude-9499-cwd ADDED
@@ -0,0 +1 @@
/d/mod3-verified/hf