phanerozoic's picture
Upload folder using huggingface_hub
5648530 verified
---
license: mit
tags:
- pytorch
- safetensors
- threshold-logic
- neuromorphic
- arithmetic
- multiplier
- compressor
---
# threshold-4to2-compressor
4:2 compressor for high-speed multiplier trees. Reduces 4 input bits plus carry-in to 2 output bits plus carry-out while preserving arithmetic value.
## Circuit
```
x y z w cin
β”‚ β”‚ β”‚ β”‚ β”‚
β””β”€β”€β”¬β”€β”€β”€β”΄β”€β”€β”¬β”€β”€β”€β”΄β”€β”€β”¬β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚ β”‚
β–Ό β”‚ β”‚ β”‚
β”Œβ”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚
β”‚XOR β”‚ β”‚ β”‚ β”‚
β”‚(x,y)β”‚ β”‚ β”‚ β”‚
β””β”€β”€β”¬β”€β”€β”˜ β”‚ β”‚ β”‚
β”‚ β”‚ β”‚ β”‚
β–Ό β–Ό β”‚ β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚
β”‚ XOR(xy,z) β”‚ β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚
β”‚ β”‚ β”‚
β–Ό β–Ό β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ XOR(xyz,w) β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚
β–Ό β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ XOR(xyzw, cin) │───► Sum
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
cout = MAJ(x,y,z) (independent of w, cin)
carry = MAJ(XOR(x,y,z), w, cin)
```
## Function
```
compress_4to2(x, y, z, w, cin) -> (sum, carry, cout)
Invariant: x + y + z + w + cin = sum + 2*carry + 2*cout
```
## Truth Table (partial - 32 combinations)
| x | y | z | w | cin | sum | carry | cout | verify |
|---|---|---|---|-----|-----|-------|------|--------|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0=0 |
| 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1=1 |
| 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 2=2 |
| 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 3=3 |
| 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 4=4 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 5=5 |
Input sum range: 0 to 5
Output encoding: sum + 2*carry + 2*cout (range 0-5)
## Mechanism
The 4:2 compressor is built from two cascaded 3:2 compressors with a twist:
**Stage 1: Compress (x, y, z)**
- sum1 = x XOR y XOR z
- cout = MAJ(x, y, z) ← This goes to next column
**Stage 2: Compress (sum1, w, cin)**
- sum = sum1 XOR w XOR cin
- carry = MAJ(sum1, w, cin) ← This goes to next column
Key insight: The cout is computed early and can propagate horizontally while sum/carry are still being computed.
## Architecture
| Component | Function | Neurons | Layers |
|-----------|----------|---------|--------|
| XOR(x,y) | First pair | 3 | 2 |
| XOR(xy,z) | Add third | 3 | 2 |
| MAJ(x,y,z) | cout | 1 | 1 |
| XOR(xyz,w) | Add fourth | 3 | 2 |
| XOR(xyzw,cin) | sum | 3 | 2 |
| MAJ(xyz,w,cin) | carry | 1 | 1 |
**Total: 14 neurons**
## Parameters
| | |
|---|---|
| Inputs | 5 (x, y, z, w, cin) |
| Outputs | 3 (sum, carry, cout) |
| Neurons | 14 |
| Layers | 8 |
| Parameters | 44 |
| Magnitude | 46 |
## Delay Analysis
Critical path for sum: 4 XOR stages = 8 layers
Critical path for carry: 4 XOR stages + 1 MAJ = 9 layers
Critical path for cout: 1 MAJ = 1 layer (very fast!)
The early cout enables fast horizontal carry propagation in multiplier arrays.
## Usage
```python
from safetensors.torch import load_file
import torch
w = load_file('model.safetensors')
def compress_4to2(x, y, z, w_in, cin):
# Implementation details in model.py
pass
# Example: sum of 5 bits
s, carry, cout = compress_4to2(1, 1, 1, 1, 1)
print(f"1+1+1+1+1 = {s} + 2*{carry} + 2*{cout} = {s + 2*carry + 2*cout}")
# Output: 1+1+1+1+1 = 1 + 2*1 + 2*1 = 5
```
## Applications
- Booth multipliers (radix-4)
- Wallace/Dadda tree reduction
- FMA (fused multiply-add) units
- High-performance DSP
## Comparison with 3:2 Compressor
| Property | 3:2 | 4:2 |
|----------|-----|-----|
| Inputs | 3 | 5 (4 + cin) |
| Outputs | 2 | 3 (2 + cout) |
| Reduction ratio | 3β†’2 | 4β†’2 per column |
| Neurons | 7 | 14 |
| Tree depth for n bits | O(log₁.β‚… n) | O(logβ‚‚ n) |
4:2 compressors provide faster reduction in multiplier trees.
## Files
```
threshold-4to2-compressor/
β”œβ”€β”€ model.safetensors
β”œβ”€β”€ model.py
β”œβ”€β”€ create_safetensors.py
β”œβ”€β”€ config.json
└── README.md
```
## License
MIT