phanerozoic/threshold-calculus

@@ -14,9 +14,13 @@ pipeline_tag: other
 # Threshold Calculus
-**Verified arithmetic circuits as frozen neural network weights.**
-This repository contains an arithmetic core implemented as threshold logic gates stored in safetensors format. Every tensor represents a neural network weight or bias that, when combined with a Heaviside step activation function, computes exact arithmetic operations. All circuits are exhaustively tested across all possible inputs (100% pass rate).
 ---
@@ -576,31 +580,176 @@ While we have tested exhaustively where feasible, the 8x8 multiplier and 8-bit d
 ---
-## Future Work
-### Immediate Priorities
-1. **Floating-Point Circuits**: Implement IEEE 754 half-precision (16-bit) floating-point addition, subtraction, multiplication, and division. This addresses the most significant gap for LLM integration.
-2. **Pruning Experiments**: Systematically explore weight pruning, quantization, and structural compression while maintaining correctness.
-3. **Integration Prototype**: Build a proof-of-concept integration with a small language model to validate the architecture.
-### Medium-Term Goals
-1. **16-bit Arithmetic**: Extend integer operations to 16 bits for greater precision.
-2. **Square Root**: Implement integer square root using Newton-Raphson iteration built from existing primitives.
-3. **Transcendental Approximations**: Build CORDIC or polynomial approximations for sin, cos, exp, log using the arithmetic core.
-### Long-Term Vision
-1. **Resume CPU Development**: The 8-bit CPU project (phanerozoic/8bit-threshold-computer) will continue. Once the arithmetic core is mature, we will reintegrate it with CPU control logic.
-2. **Hardware Synthesis**: Generate Verilog or other HDL from the threshold logic representation for FPGA or ASIC implementation.
-3. **Formal Verification**: Prove correctness formally using theorem provers rather than exhaustive testing.
 ---

 # Threshold Calculus
+**Arithmetic coprocessor for LLMs, implemented as threshold logic gates.**
+This is a runtime component, not a proof artifact. The circuits embed directly into transformer MLP layers as a reusable arithmetic unit. The model learns when to route through the circuits versus the standard MLP paths. Inference runs in PyTorch.
+Early training runs embedding these circuits into SmolLM2-360M show significant accuracy improvements on arithmetic tasks.
+The repository contains an arithmetic core implemented as threshold logic gates stored in safetensors format. Every tensor represents a neural network weight or bias that, when combined with a Heaviside step activation function, computes exact arithmetic operations. All circuits pass exhaustive testing (7,177 tests, 100% pass rate).
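As a minimal sketch of what "weight or bias plus a Heaviside step" means in practice, the Python below models a single threshold gate and composes a few Boolean functions from it. The helper names, the specific weight and bias values, and the convention that the step returns 1 at exactly zero are assumptions chosen for illustration; they are not read from the repository's safetensors file.

```python
def heaviside(x):
    """Step activation. Returning 1 at x == 0 is an assumed convention."""
    return 1 if x >= 0 else 0


def gate(weights, bias, inputs):
    """One threshold gate: step(w . x + b)."""
    return heaviside(sum(w * x for w, x in zip(weights, inputs)) + bias)


def and_gate(a, b):
    return gate([1, 1], -2, [a, b])   # fires only when a + b >= 2


def or_gate(a, b):
    return gate([1, 1], -1, [a, b])   # fires when a + b >= 1


def not_gate(a):
    return gate([-1], 0, [a])         # fires when -a >= 0, i.e. a == 0


def xor_gate(a, b):
    # XOR is not linearly separable, so it needs two layers:
    # XOR(a, b) = AND(OR(a, b), NAND(a, b)).
    h_or = gate([1, 1], -1, [a, b])
    h_nand = gate([-1, -1], 1, [a, b])
    return gate([1, 1], -2, [h_or, h_nand])
```

Because every gate has integer weights and a hard step, the outputs are exact bits rather than approximations, which is what makes exhaustive testing meaningful.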
 ---
 ---
+## Roadmap
+Goal: Complete arithmetic coprocessor for LLM mathematical reasoning.
+### Completed
+#### Float16 Core Arithmetic
+- [x] `float16.add` — IEEE 754 addition (~998 gates)
+- [x] `float16.sub` — IEEE 754 subtraction
+- [x] `float16.mul` — IEEE 754 multiplication (~1302 gates)
+- [x] `float16.div` — IEEE 754 division (~1854 gates)
+- [x] `float16.neg` — sign flip
+- [x] `float16.abs` — absolute value
+- [x] `float16.cmp` — comparison
+#### Float16 Utilities
+- [x] `float16.unpack` — extract sign, exponent, mantissa
+- [x] `float16.pack` — assemble components
+- [x] `float16.normalize` — CLZ-based normalization
+- [x] `float16.toint` — convert to int16
+- [x] `float16.fromint` — convert from int16
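For orientation, the field layout that `float16.unpack` and `float16.pack` operate on can be sketched in software. This is a reference model of the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits), not the circuit itself; the function names are hypothetical, and Python's `struct` format `e` is used only to obtain the bit pattern of a half-precision value.

```python
import struct


def float_to_bits(x):
    """Return the 16-bit pattern of x rounded to IEEE 754 half precision."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def bits_to_float(bits):
    """Interpret a 16-bit pattern as a half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def unpack(bits):
    """Split a 16-bit pattern into (sign, biased exponent, mantissa)."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    return sign, exponent, mantissa


def pack(sign, exponent, mantissa):
    """Assemble the three fields back into a 16-bit pattern."""
    return ((sign & 0x1) << 15) | ((exponent & 0x1F) << 10) | (mantissa & 0x3FF)
```

A model like this can serve as the oracle when comparing circuit outputs bit for bit.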
+#### Integer Arithmetic (8-bit)
+- [x] Adders (half, full, ripple carry 2/4/8 bit)
+- [x] Subtraction, negation
+- [x] Multiplication (2x2, 4x4, 8x8)
+- [x] Division (8-bit with remainder)
+- [x] Comparators (all relations)
+- [x] CLZ (8-bit and 16-bit)
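The adder entries above can be illustrated with a well-known two-gate threshold construction of a full adder, chained into an 8-bit ripple-carry adder. This is a sketch under assumptions: the repository's actual gate decomposition and weights may differ, and the function names are hypothetical.

```python
def heaviside(x):
    """Step activation. Returning 1 at x == 0 is an assumed convention."""
    return 1 if x >= 0 else 0


def full_adder(a, b, cin):
    """Full adder from two threshold gates.

    carry = majority(a, b, cin)
    sum   = step(a + b + cin - 2*carry - 1)
    """
    carry = heaviside(a + b + cin - 2)
    total = heaviside(a + b + cin - 2 * carry - 1)
    return total, carry


def ripple_add8(x, y):
    """Add two 8-bit integers bit by bit; returns (8-bit sum, carry out)."""
    carry = 0
    result = 0
    for i in range(8):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry
```

With only 65,536 input pairs, the 8-bit adder is small enough to check exhaustively against ordinary integer addition.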
+#### Logic and Patterns
+- [x] Boolean gates (AND, OR, NOT, NAND, NOR, XOR, XNOR, IMPLIES, BIIMPLIES)
+- [x] Threshold gates (k-of-n for k=1..8)
+- [x] Modular arithmetic (mod 2-12)
+- [x] Pattern recognition (popcount, one-hot, symmetry)
+- [x] Combinational (mux, demux, encoder, decoder, barrel shifter)
+- [x] Shifts and rotates
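A k-of-n threshold gate is the simplest case of the representation: a single neuron with all weights equal to 1 and bias -k. The sketch below shows that, and how summing the k-of-n outputs for k = 1..n yields a population count. The names are hypothetical and the construction is one possible realization, not necessarily the one stored in the repository.

```python
def heaviside(x):
    """Step activation. Returning 1 at x == 0 is an assumed convention."""
    return 1 if x >= 0 else 0


def k_of_n(bits, k):
    """Fire when at least k of the input bits are set (weights 1, bias -k)."""
    return heaviside(sum(bits) - k)


def popcount(bits):
    """Count set bits by adding up the 'at least k' gates for k = 1..n."""
    return sum(k_of_n(bits, k) for k in range(1, len(bits) + 1))
```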
+#### Infrastructure
+- [x] Self-documenting .inputs tensors
+- [x] Signal registry in safetensors metadata
+- [x] Full circuit evaluation with topological sort
+- [x] Comprehensive test suite (7,177 tests, 100% pass)
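Evaluating a full circuit in topological order can be sketched as follows. The circuit description format here (gate name mapped to input names, weights, and bias) is a hypothetical stand-in for the `.inputs` tensors and signal registry, whose real layout is not shown in this diff. `graphlib.TopologicalSorter` is in the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter


def heaviside(x):
    """Step activation. Returning 1 at x == 0 is an assumed convention."""
    return 1 if x >= 0 else 0


# Hypothetical circuit description: gate -> (input signals, weights, bias).
XOR_CIRCUIT = {
    "or": (["a", "b"], [1, 1], -1),
    "nand": (["a", "b"], [-1, -1], 1),
    "xor": (["or", "nand"], [1, 1], -2),
}


def evaluate(circuit, inputs):
    """Evaluate every gate once, after all of its input signals are known."""
    dependencies = {name: set(srcs) for name, (srcs, _, _) in circuit.items()}
    values = dict(inputs)
    for name in TopologicalSorter(dependencies).static_order():
        if name in values:
            continue  # primary input, already has a value
        srcs, weights, bias = circuit[name]
        pre_activation = sum(w * values[s] for w, s in zip(weights, srcs)) + bias
        values[name] = heaviside(pre_activation)
    return values
```

For example, `evaluate(XOR_CIRCUIT, {"a": 1, "b": 0})["xor"]` evaluates the two hidden gates first and then the output gate.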
+---
+### High Priority — Core Mathematical Functions
+#### Powers and Roots (float16)
+- [ ] `float16.sqrt` — square root via Newton-Raphson or digit-by-digit
+- [ ] `float16.rsqrt` — reciprocal square root (useful for normalization)
+- [ ] `float16.pow` — x^y for arbitrary y (via exp/ln)
+- [ ] `float16.sq` — x² (optimized special case)
+- [ ] `float16.cube` — x³ (optimized special case)
+- [ ] `float16.cbrt` — cube root
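The Newton-Raphson option for `float16.sqrt` uses only addition, division, and a multiplication by 0.5, all of which map onto operations already listed as completed. The sketch below is a software model of that iteration with every intermediate rounded to half precision; the starting guess, the iteration count, and the handling of zero and negative inputs are assumptions, and infinities and NaN are not treated specially.

```python
import math
import struct


def to_f16(x):
    """Round a Python float to the nearest half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def sqrt_newton(a, iterations=16):
    """Approximate sqrt(a) with x <- (x + a / x) / 2 in float16 steps."""
    a = to_f16(a)
    if a < 0:
        return math.nan
    if a == 0:
        return 0.0
    x = a if a >= 1.0 else 1.0  # assumed starting guess, always >= sqrt(a)
    for _ in range(iterations):
        quotient = to_f16(a / x)
        x = to_f16(to_f16(x + quotient) * 0.5)
    return x
```

A circuit version would likely pick a better initial guess from the exponent field so that far fewer iterations are needed.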
+#### Exponentials and Logarithms (float16)
+- [ ] `float16.exp` — e^x via range reduction + polynomial
+- [ ] `float16.exp2` — 2^x (simpler, useful for pow)
+- [ ] `float16.ln` — natural logarithm
+- [ ] `float16.log2` — base-2 logarithm (extract exponent + correction)
+- [ ] `float16.log10` — base-10 logarithm
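The "extract exponent + correction" note for `float16.log2` rests on the identity log2(x) = (E - 15) + log2(1 + M/1024) for a positive normal half-precision value with biased exponent E and mantissa M. The sketch below demonstrates that decomposition; `math.log2` stands in for the correction term, which a circuit would presumably replace with a polynomial or table. The function name and the restriction to positive normal inputs are assumptions.

```python
import math
import struct


def log2_f16(x):
    """log2 of a positive normal float16 via exponent extraction."""
    bits = struct.unpack("<H", struct.pack("<e", x))[0]
    sign = bits >> 15
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    if sign or exponent == 0 or exponent == 0x1F:
        raise ValueError("sketch handles positive normal float16 values only")
    integer_part = exponent - 15                      # unbiased exponent
    correction = math.log2(1.0 + mantissa / 1024.0)   # stand-in, in [0, 1)
    return integer_part + correction
```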
+#### Trigonometry (float16, CORDIC)
+- [ ] `float16.sin` — sine
+- [ ] `float16.cos` — cosine
+- [ ] `float16.tan` — tangent (sin/cos)
+- [ ] `float16.sincos` — both sin and cos (CORDIC gives both)
+- [ ] `float16.asin` — arc sine
+- [ ] `float16.acos` — arc cosine
+- [ ] `float16.atan` — arc tangent
+- [ ] `float16.atan2` — two-argument arc tangent (quadrant-aware)
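The remark that CORDIC gives both sine and cosine can be seen in a rotation-mode sketch: the vector (x, y) is rotated by a fixed sequence of angles atan(2^-i), and the final coordinates are cos and sin of the input angle. The iteration count, the use of Python floats instead of fixed-point or float16 arithmetic, and the input range of [-pi/2, pi/2] are assumptions made for illustration.

```python
import math

ITERATIONS = 12  # assumed; roughly matches float16's 11-bit significand
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector; pre-scale by the inverse gain.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sincos(theta):
    """Return (sin(theta), cos(theta)) for theta in [-pi/2, pi/2]."""
    x, y, z = GAIN, 0.0, theta
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0 else -1.0   # rotate toward z == 0
        scale = 2.0 ** -i             # a shift in fixed-point hardware
        x, y = x - d * y * scale, y + d * x * scale
        z -= d * ANGLES[i]
    return y, x
```

Each iteration needs only shifts, additions, and a sign test, which is why the method suits gate-level implementations.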
+#### Hyperbolic Functions (float16)
+- [ ] `float16.sinh` — hyperbolic sine
+- [ ] `float16.cosh` — hyperbolic cosine
+- [ ] `float16.tanh` — hyperbolic tangent (critical for ML activations)
+---
+### Medium Priority — Extended Operations
+#### Rounding and Truncation (float16)
+- [ ] `float16.floor` — round toward -∞
+- [ ] `float16.ceil` — round toward +∞
+- [ ] `float16.trunc` — round toward zero
+- [ ] `float16.round` — round to nearest
+- [ ] `float16.frac` — fractional part
+- [ ] `float16.fmod` — floating-point modulo
+#### Comparisons and Selection (float16)
+- [ ] `float16.min` — minimum of two values
+- [ ] `float16.max` — maximum of two values
+- [ ] `float16.clamp` — clamp to range [lo, hi]
+- [ ] `float16.sign` — sign function (-1, 0, +1)
+- [ ] `float16.copysign` — copy sign from y to x
+- [ ] `float16.isnan` — NaN test
+- [ ] `float16.isinf` — infinity test
+- [ ] `float16.isfinite` — finite test
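The three classification tests at the end of this list reduce to comparisons on the exponent and mantissa fields. As a reference model (hypothetical function name, not the planned circuit), the IEEE 754 binary16 rules are: exponent all ones with nonzero mantissa is NaN, exponent all ones with zero mantissa is infinity, and anything else is finite.

```python
import math
import struct


def classify(bits):
    """Return (is_nan, is_inf, is_finite) for a 16-bit float16 pattern."""
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    is_nan = exponent == 0x1F and mantissa != 0
    is_inf = exponent == 0x1F and mantissa == 0
    is_finite = exponent != 0x1F
    return is_nan, is_inf, is_finite
```

Since there are only 65,536 bit patterns, a model like this can be compared against `math.isnan`, `math.isinf`, and `math.isfinite` for every possible input.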
+#### Integer Arithmetic (16-bit)
+- [ ] `arithmetic.add16` — 16-bit addition
+- [ ] `arithmetic.sub16` — 16-bit subtraction
+- [ ] `arithmetic.mul16` — 16-bit multiplication
+- [ ] `arithmetic.div16` — 16-bit division with remainder
+- [ ] `arithmetic.sqrt16` — 16-bit integer square root
+- [ ] `arithmetic.abs16` — 16-bit absolute value
+#### Number Theory
+- [ ] `arithmetic.gcd` — greatest common divisor (Euclidean)
+- [ ] `arithmetic.lcm` — least common multiple
+- [ ] `arithmetic.isprime8` — primality test (8-bit)
+- [ ] `arithmetic.factorial8` — factorial (8! = 40320 fits in unsigned 16-bit)
+- [ ] `arithmetic.comb` — binomial coefficient nCr
+- [ ] `arithmetic.perm` — permutation nPr
+---
+### Lower Priority — Specialized Functions
+#### ML Activation Functions (float16)
+- [ ] `float16.relu` — max(0, x)
+- [ ] `float16.leaky_relu` — x if x > 0 else αx
+- [ ] `float16.sigmoid` — 1/(1+e^(-x))
+- [ ] `float16.softplus` — ln(1+e^x)
+- [ ] `float16.gelu` — Gaussian error linear unit
+- [ ] `float16.silu` — x * sigmoid(x)
+#### Constants (float16 encoded)
+- [ ] `const.pi` — π = 3.14159...
+- [ ] `const.e` — e = 2.71828...
+- [ ] `const.phi` — φ = 1.61803... (golden ratio)
+- [ ] `const.sqrt2` — √2 = 1.41421...
+- [ ] `const.ln2` — ln(2) = 0.69314...
+- [ ] `const.log2e` — log₂(e) = 1.44269...
+#### Statistics (float16, multi-input)
+- [ ] `stats.sum` — sum of array
+- [ ] `stats.mean` — arithmetic mean
+- [ ] `stats.min_array` — minimum of array
+- [ ] `stats.max_array` — maximum of array
+- [ ] `stats.variance` — population variance
+- [ ] `stats.stddev` — standard deviation
+#### Bit Manipulation (16-bit)
+- [ ] `bits.popcnt16` — population count
+- [x] `bits.clz16` — count leading zeros (done)
+- [ ] `bits.ctz16` — count trailing zeros
+- [ ] `bits.reverse16` — bit reversal
+- [ ] `bits.bswap16` — byte swap
+---
+### Infrastructure TODO
+#### Testing
+- [ ] Exhaustive float16 tests for new operations
+- [ ] Edge case coverage (±0, ±inf, NaN, subnormals)
+- [ ] Accuracy tests against reference implementations
+#### Documentation
+- [ ] Circuit diagrams for CORDIC, Newton-Raphson
+- [ ] Tutorial: implementing new circuits
+- [ ] Tutorial: LLM integration patterns
+- [ ] API reference for all operations
+#### Optimization
+- [ ] Gate count reduction analysis
+- [ ] Critical path optimization
+- [ ] Weight quantization study (int8/int4)
 ---

TODO.md DELETED

@@ -1,172 +0,0 @@
-# Threshold Calculus TODO
-Goal: Complete arithmetic coprocessor for LLM mathematical reasoning.
----
-## High Priority -- Core Mathematical Functions
-### Powers and Roots (float16)
-- [ ] `float16.sqrt` -- square root via Newton-Raphson or digit-by-digit
-- [ ] `float16.rsqrt` -- reciprocal square root (useful for normalization)
-- [ ] `float16.pow` -- x^y for arbitrary y (via exp/ln)
-- [ ] `float16.sq` -- x² (optimized special case)
-- [ ] `float16.cube` -- x³ (optimized special case)
-- [ ] `float16.cbrt` -- cube root
-### Exponentials and Logarithms (float16)
-- [ ] `float16.exp` -- e^x via range reduction + polynomial
-- [ ] `float16.exp2` -- 2^x (simpler, useful for pow)
-- [ ] `float16.ln` -- natural logarithm
-- [ ] `float16.log2` -- base-2 logarithm (extract exponent + correction)
-- [ ] `float16.log10` -- base-10 logarithm
-### Trigonometry (float16, CORDIC)
-- [ ] `float16.sin` -- sine
-- [ ] `float16.cos` -- cosine
-- [ ] `float16.tan` -- tangent (sin/cos)
-- [ ] `float16.sincos` -- both sin and cos (CORDIC gives both)
-- [ ] `float16.asin` -- arc sine
-- [ ] `float16.acos` -- arc cosine
-- [ ] `float16.atan` -- arc tangent
-- [ ] `float16.atan2` -- two-argument arc tangent (quadrant-aware)
-### Hyperbolic Functions (float16)
-- [ ] `float16.sinh` -- hyperbolic sine
-- [ ] `float16.cosh` -- hyperbolic cosine
-- [ ] `float16.tanh` -- hyperbolic tangent (critical for ML activations)
----
-## Medium Priority -- Extended Operations
-### Rounding and Truncation (float16)
-- [ ] `float16.floor` -- round toward -∞
-- [ ] `float16.ceil` -- round toward +∞
-- [ ] `float16.trunc` -- round toward zero
-- [ ] `float16.round` -- round to nearest
-- [ ] `float16.frac` -- fractional part
-- [ ] `float16.fmod` -- floating-point modulo
-### Comparisons and Selection (float16)
-- [ ] `float16.min` -- minimum of two values
-- [ ] `float16.max` -- maximum of two values
-- [ ] `float16.clamp` -- clamp to range [lo, hi]
-- [ ] `float16.sign` -- sign function (-1, 0, +1)
-- [ ] `float16.copysign` -- copy sign from y to x
-- [ ] `float16.isnan` -- NaN test
-- [ ] `float16.isinf` -- infinity test
-- [ ] `float16.isfinite` -- finite test
-### Integer Arithmetic (16-bit)
-- [ ] `arithmetic.add16` -- 16-bit addition
-- [ ] `arithmetic.sub16` -- 16-bit subtraction
-- [ ] `arithmetic.mul16` -- 16-bit multiplication
-- [ ] `arithmetic.div16` -- 16-bit division with remainder
-- [ ] `arithmetic.sqrt16` -- 16-bit integer square root
-- [ ] `arithmetic.abs16` -- 16-bit absolute value
-### Number Theory
-- [ ] `arithmetic.gcd` -- greatest common divisor (Euclidean)
-- [ ] `arithmetic.lcm` -- least common multiple
-- [ ] `arithmetic.isprime8` -- primality test (8-bit)
-- [ ] `arithmetic.factorial8` -- factorial (8! = 40320 fits in 16-bit)
-- [ ] `arithmetic.comb` -- binomial coefficient nCr
-- [ ] `arithmetic.perm` -- permutation nPr
----
-## Lower Priority -- Specialized Functions
-### ML Activation Functions (float16)
-- [ ] `float16.relu` -- max(0, x)
-- [ ] `float16.leaky_relu` -- x if x > 0 else αx
-- [ ] `float16.sigmoid` -- 1/(1+e^(-x))
-- [ ] `float16.softplus` -- ln(1+e^x)
-- [ ] `float16.gelu` -- Gaussian error linear unit
-- [ ] `float16.silu` -- x * sigmoid(x)
-### Constants (float16 encoded)
-- [ ] `const.pi` -- π = 3.14159...
-- [ ] `const.e` -- e = 2.71828...
-- [ ] `const.phi` -- φ = 1.61803... (golden ratio)
-- [ ] `const.sqrt2` -- √2 = 1.41421...
-- [ ] `const.ln2` -- ln(2) = 0.69314...
-- [ ] `const.log2e` -- log₂(e) = 1.44269...
-### Statistics (float16, multi-input)
-- [ ] `stats.sum` -- sum of array
-- [ ] `stats.mean` -- arithmetic mean
-- [ ] `stats.min_array` -- minimum of array
-- [ ] `stats.max_array` -- maximum of array
-- [ ] `stats.variance` -- population variance
-- [ ] `stats.stddev` -- standard deviation
-### Bit Manipulation (16-bit)
-- [ ] `bits.popcnt16` -- population count
-- [ ] `bits.clz16` -- count leading zeros (done)
-- [ ] `bits.ctz16` -- count trailing zeros
-- [ ] `bits.reverse16` -- bit reversal
-- [ ] `bits.bswap16` -- byte swap
----
-## Infrastructure
-### Testing
-- [ ] Exhaustive float16 tests for new operations
-- [ ] Edge case coverage (±0, ±inf, NaN, subnormals)
-- [ ] Accuracy tests against reference implementations
-### Documentation
-- [ ] Circuit diagrams for CORDIC, Newton-Raphson
-- [ ] Tutorial: implementing new circuits
-- [ ] Tutorial: LLM integration patterns
-- [ ] API reference for all operations
-### Optimization
-- [ ] Gate count reduction analysis
-- [ ] Critical path optimization
-- [ ] Weight quantization study (int8/int4)
----
-## Completed
-### Float16 Core Arithmetic
-- [x] `float16.add` -- IEEE 754 addition (~998 gates)
-- [x] `float16.sub` -- IEEE 754 subtraction
-- [x] `float16.mul` -- IEEE 754 multiplication (~1302 gates)
-- [x] `float16.div` -- IEEE 754 division (~1854 gates)
-- [x] `float16.neg` -- sign flip
-- [x] `float16.abs` -- absolute value
-- [x] `float16.cmp` -- comparison
-### Float16 Utilities
-- [x] `float16.unpack` -- extract sign, exponent, mantissa
-- [x] `float16.pack` -- assemble components
-- [x] `float16.normalize` -- CLZ-based normalization
-- [x] `float16.toint` -- convert to int16
-- [x] `float16.fromint` -- convert from int16
-### Integer Arithmetic (8-bit)
-- [x] Adders (half, full, ripple carry 2/4/8 bit)
-- [x] Subtraction, negation
-- [x] Multiplication (2x2, 4x4, 8x8)
-- [x] Division (8-bit with remainder)
-- [x] Comparators (all relations)
-- [x] CLZ (8-bit and 16-bit)
-### Logic and Patterns
-- [x] Boolean gates (AND, OR, NOT, NAND, NOR, XOR, XNOR, IMPLIES, BIIMPLIES)
-- [x] Threshold gates (k-of-n for k=1..8)
-- [x] Modular arithmetic (mod 2-12)
-- [x] Pattern recognition (popcount, one-hot, symmetry)
-- [x] Combinational (mux, demux, encoder, decoder, barrel shifter)
-- [x] Shifts and rotates
-### Infrastructure
-- [x] Self-documenting .inputs tensors
-- [x] Signal registry in safetensors metadata
-- [x] Full circuit evaluation with topological sort
-- [x] Comprehensive test suite (7177 tests, 100% pass)