--- license: mit tags: - pytorch - safetensors - threshold-logic - neuromorphic - arithmetic - multiplier - compressor --- # threshold-4to2-compressor 4:2 compressor for high-speed multiplier trees. Reduces 4 input bits plus carry-in to 2 output bits plus carry-out while preserving arithmetic value. ## Circuit ``` x y z w cin │ │ │ │ │ └──┬───┴──┬───┴──┬───┘ │ │ │ │ │ ▼ │ │ │ ┌─────┐ │ │ │ │XOR │ │ │ │ │(x,y)│ │ │ │ └──┬──┘ │ │ │ │ │ │ │ ▼ ▼ │ │ ┌─────────────┐ │ │ │ XOR(xy,z) │ │ │ └──────┬──────┘ │ │ │ │ │ ▼ ▼ │ ┌──────────────┐ │ │ XOR(xyz,w) │ │ └──────┬───────┘ │ │ │ ▼ ▼ ┌─────────────────────┐ │ XOR(xyzw, cin) │───► Sum └─────────────────────┘ cout = MAJ(x,y,z) (independent of w, cin) carry = MAJ(XOR(x,y,z), w, cin) ``` ## Function ``` compress_4to2(x, y, z, w, cin) -> (sum, carry, cout) Invariant: x + y + z + w + cin = sum + 2*carry + 2*cout ``` ## Truth Table (partial - 32 combinations) | x | y | z | w | cin | sum | carry | cout | verify | |---|---|---|---|-----|-----|-------|------|--------| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0=0 | | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1=1 | | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 2=2 | | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 3=3 | | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 4=4 | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 5=5 | Input sum range: 0 to 5 Output encoding: sum + 2*carry + 2*cout (range 0-5) ## Mechanism The 4:2 compressor is built from two cascaded 3:2 compressors with a twist: **Stage 1: Compress (x, y, z)** - sum1 = x XOR y XOR z - cout = MAJ(x, y, z) ← This goes to next column **Stage 2: Compress (sum1, w, cin)** - sum = sum1 XOR w XOR cin - carry = MAJ(sum1, w, cin) ← This goes to next column Key insight: The cout is computed early and can propagate horizontally while sum/carry are still being computed. ## Architecture | Component | Function | Neurons | Layers | |-----------|----------|---------|--------| | XOR(x,y) | First pair | 3 | 2 | | XOR(xy,z) | Add third | 3 | 2 | | MAJ(x,y,z) | cout | 1 | 1 | | XOR(xyz,w) | Add fourth | 3 | 2 | | XOR(xyzw,cin) | sum | 3 | 2 | | MAJ(xyz,w,cin) | carry | 1 | 1 | **Total: 14 neurons** ## Parameters | | | |---|---| | Inputs | 5 (x, y, z, w, cin) | | Outputs | 3 (sum, carry, cout) | | Neurons | 14 | | Layers | 8 | | Parameters | 44 | | Magnitude | 46 | ## Delay Analysis Critical path for sum: 4 XOR stages = 8 layers Critical path for carry: 4 XOR stages + 1 MAJ = 9 layers Critical path for cout: 1 MAJ = 1 layer (very fast!) The early cout enables fast horizontal carry propagation in multiplier arrays. ## Usage ```python from safetensors.torch import load_file import torch w = load_file('model.safetensors') def compress_4to2(x, y, z, w_in, cin): # Implementation details in model.py pass # Example: sum of 5 bits s, carry, cout = compress_4to2(1, 1, 1, 1, 1) print(f"1+1+1+1+1 = {s} + 2*{carry} + 2*{cout} = {s + 2*carry + 2*cout}") # Output: 1+1+1+1+1 = 1 + 2*1 + 2*1 = 5 ``` ## Applications - Booth multipliers (radix-4) - Wallace/Dadda tree reduction - FMA (fused multiply-add) units - High-performance DSP ## Comparison with 3:2 Compressor | Property | 3:2 | 4:2 | |----------|-----|-----| | Inputs | 3 | 5 (4 + cin) | | Outputs | 2 | 3 (2 + cout) | | Reduction ratio | 3→2 | 4→2 per column | | Neurons | 7 | 14 | | Tree depth for n bits | O(log₁.₅ n) | O(log₂ n) | 4:2 compressors provide faster reduction in multiplier trees. ## Files ``` threshold-4to2-compressor/ ├── model.safetensors ├── model.py ├── create_safetensors.py ├── config.json └── README.md ``` ## License MIT