Expand README with full documentation

- Add detailed circuit diagrams
- Add threshold implementation details with weights/biases
- Add complete usage code examples
- Add truth tables with examples
- Add academic references
- Add verification and files sections

Files changed (1)

README.md +131 -17

@@ -7,45 +7,159 @@ tags:
 - neuromorphic
 - arithmetic
 - parallel-prefix
 ---
 # threshold-han-carlson
-4-bit Han-Carlson parallel prefix adder. Hybrid combining Kogge-Stone speed with Brent-Kung area efficiency.
-## Circuit
 ```
-Inputs: A[3:0], B[3:0], Cin (9 inputs)
-Outputs: S[3:0], Cout (5 outputs)
 ```
-## Han-Carlson Structure
-Combines characteristics of both Kogge-Stone and Brent-Kung:
-- Uses Kogge-Stone parallel structure for odd bit positions
-- Uses Brent-Kung back-propagation for even positions
-- Results in moderate wiring complexity and good speed
-## Comparison
-| Adder | Depth | Cells | Wiring |
-|-------|-------|-------|--------|
-| Kogge-Stone | log n | Max | High |
-| Brent-Kung | 2 log n - 2 | Min | Low |
-| Han-Carlson | log n + 1 | Medium | Medium |
 ## Parameters
 | | |
 |---|---|
-| Inputs | 9 |
-| Outputs | 5 |
 | Neurons | 32 |
 | Layers | 5 |
 | Parameters | 132 |
 | Magnitude | 56 |
 ## License
 MIT

 - neuromorphic
 - arithmetic
 - parallel-prefix
+- adder
 ---
 # threshold-han-carlson
+4-bit Han-Carlson parallel prefix adder. Hybrid design combining Kogge-Stone's logarithmic depth with Brent-Kung's reduced wiring complexity.
+## Function
 ```
+S[3:0], Cout = A[3:0] + B[3:0] + Cin
 ```
+Computes 4-bit addition with carry-in/carry-out using parallel prefix computation.
+## Han-Carlson Structure (4-bit)
+```
+       G3,P3    G2,P2    G1,P1    G0,P0    Cin
+         │        │        │        │       │
+         │        │        │        │       │
+Level 1  ●────────┘        ●────────┘       │   ← Odd-position prefix (like Brent-Kung)
+         │                 │                │
+         │                 │                │
+Level 2  ●─────────────────┘                │   ← Kogge-Stone style merge
+         │                                  │
+         │                                  │
+Level 3  └────────●                         │   ← Back-propagate to even positions
+                  │                         │
+                  │                         │
+       G3:0     G2:0     G1:0     G0       │
+         │        │        │        │       │
+         ▼        ▼        ▼        ▼       │
+    ┌────┴────┬───┴───┬────┴────┬───┴───────┘
+    │         │       │         │
+   XOR       XOR     XOR       XOR
+    │         │       │         │
+    ▼         ▼       ▼         ▼
+  Cout       S3      S2        S1        S0
+```
+## Design Philosophy
+Han-Carlson is a **hybrid parallel prefix adder** that interpolates between two extremes:
+| Property | Kogge-Stone | Han-Carlson | Brent-Kung |
+|----------|-------------|-------------|------------|
+| Depth | log₂(n) | log₂(n) + 1 | 2·log₂(n) - 2 |
+| Prefix cells | n·log₂(n) - n + 1 | (n/2)·log₂(n) | 2n - 2 - log₂(n) |
+| Wiring tracks | High | Medium | Low |
+| Fanout | Uniform | Mixed | Low |
+For n=4, Han-Carlson has depth log₂(4) + 1 = 3 and (4/2)·log₂(4) = 4 prefix cells.
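The formulas in the comparison table can be evaluated directly for a few widths. The following is a minimal sketch, assuming n is a power of two; the helper name `prefix_adder_costs` is hypothetical and not part of this repository.

```python
import math

def prefix_adder_costs(n):
    """Depth and prefix-cell counts from the comparison table (n a power of two)."""
    lg = int(math.log2(n))
    return {
        "Kogge-Stone": {"depth": lg, "cells": n * lg - n + 1},
        "Han-Carlson": {"depth": lg + 1, "cells": (n // 2) * lg},
        "Brent-Kung": {"depth": 2 * lg - 2, "cells": 2 * n - 2 - lg},
    }

# Print the table rows for a few adder widths
for n in (4, 8, 16, 32):
    print(n, prefix_adder_costs(n))
```

For n=4 this reproduces the depth of 3 and the 4 prefix cells quoted for Han-Carlson.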
+## Parallel Prefix Operation
+The fundamental operation combines (Generate, Propagate) pairs:
+```
+(G_high, P_high) ○ (G_low, P_low) = (G_high + P_high·G_low, P_high·P_low)
+```
+Here + denotes OR and · denotes AND, with:
+- G_i = A_i AND B_i (generate carry)
+- P_i = A_i XOR B_i (propagate carry)
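The prefix operator above can be written as a small function on (G, P) tuples. This is an illustrative sketch only: the names `combine` and `gp` are hypothetical, and bits are stored least-significant first.

```python
def gp(a_bit, b_bit):
    """Per-bit (generate, propagate) pair: G = A AND B, P = A XOR B."""
    return (a_bit & b_bit, a_bit ^ b_bit)

def combine(high, low):
    """Prefix operator: (G_high, P_high) o (G_low, P_low)."""
    g_high, p_high = high
    g_low, p_low = low
    return (g_high | (p_high & g_low), p_high & p_low)

# A = 5 (0101), B = 3 (0011), bits listed LSB first
a_bits = [1, 0, 1, 0]
b_bits = [1, 1, 0, 0]
pairs = [gp(a, b) for a, b in zip(a_bits, b_bits)]

g10, p10 = combine(pairs[1], pairs[0])      # group over bits 1..0
g32, p32 = combine(pairs[3], pairs[2])      # group over bits 3..2
g30, p30 = combine((g32, p32), (g10, p10))  # group over bits 3..0
print(g10, p10, g30, p30)
```

Because the operator is associative, the groups can be merged in any bracketing, which is what lets the different prefix topologies compute the same carries.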
+## Truth Table (Examples)
+| A (dec) | B (dec) | Cin | S (dec) | Cout | Equation |
+|---------|---------|-----|---------|------|----------|
+| 0 | 0 | 0 | 0 | 0 | 0+0+0=0 |
+| 5 | 3 | 0 | 8 | 0 | 5+3+0=8 |
+| 7 | 7 | 0 | 14 | 0 | 7+7+0=14 |
+| 15 | 1 | 0 | 0 | 1 | 15+1+0=16 |
+| 15 | 15 | 1 | 15 | 1 | 15+15+1=31 |
+## Layer 0 Weights (Generate/Propagate)
+Each bit position computes G and P from inputs:
+```
+G_i: weights[A_i]=1, weights[B_i]=1, bias=-2  (AND gate)
+P_i: XOR decomposition using OR/NAND/AND
+```
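A threshold neuron of this kind can be sketched in a few lines. The AND weights and bias come from the listing above; the firing rule (output 1 when the weighted sum plus bias is at least 0) and the OR and NAND weights are assumptions chosen to be consistent with it, not values read from `model.safetensors`.

```python
def neuron(inputs, weights, bias):
    """Threshold neuron; assumed to output 1 when sum(w*x) + bias >= 0."""
    return int(sum(w * x for w, x in zip(weights, inputs)) + bias >= 0)

def generate(a, b):
    # G_i: weights 1, 1 and bias -2 (AND gate), as listed above
    return neuron([a, b], [1, 1], -2)

def propagate(a, b):
    # P_i: XOR built as AND(OR(a, b), NAND(a, b)); these weights are assumed
    or_out = neuron([a, b], [1, 1], -1)
    nand_out = neuron([a, b], [-1, -1], 1)
    return neuron([or_out, nand_out], [1, 1], -2)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, generate(a, b), propagate(a, b))
```

Under these assumptions the XOR needs two layers of neurons, since a single threshold unit cannot compute XOR.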
 ## Parameters
 | | |
 |---|---|
+| Inputs | 9 (A[3:0], B[3:0], Cin) |
+| Outputs | 5 (S[3:0], Cout) |
 | Neurons | 32 |
 | Layers | 5 |
 | Parameters | 132 |
 | Magnitude | 56 |
+## Usage
+```python
+from safetensors.torch import load_file
+import torch
+w = load_file('model.safetensors')  # threshold weights; not used by the Boolean reference below
+def han_carlson_add(a3, a2, a1, a0, b3, b2, b1, b0, cin):
+    # Generate and Propagate
+    g = [a0 & b0, a1 & b1, a2 & b2, a3 & b3]
+    p = [a0 ^ b0, a1 ^ b1, a2 ^ b2, a3 ^ b3]
+    # Level 1: odd positions
+    g10 = g[1] | (p[1] & g[0])
+    p10 = p[1] & p[0]
+    g32 = g[3] | (p[3] & g[2])
+    p32 = p[3] & p[2]
+    # Level 2: top merge
+    g30 = g32 | (p32 & g10)
+    # Level 3: back-propagate
+    g20 = g[2] | (p[2] & g10)
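+    # Carry out of each bit position (c3 is Cout)
+    c0 = g[0] | (p[0] & cin)
+    c1 = g10 | (p10 & cin)
+    c2 = g20 | (p[2] & p10 & cin)
+    c3 = g30 | (p32 & p10 & cin)
+    # Sum bit i is p[i] XOR the carry into bit i; returns (s3, s2, s1, s0, cout)
+    return p[3]^c2, p[2]^c1, p[1]^c0, p[0]^cin, c3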
+# Example: 5 + 3 = 8
+s3, s2, s1, s0, cout = han_carlson_add(0,1,0,1, 0,0,1,1, 0)
+print(f"Result: {cout*16 + s3*8 + s2*4 + s1*2 + s0}")  # 8
+```
+## Verification
+All 512 input combinations (16 × 16 × 2) verified correct.
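An exhaustive check of this kind might look like the sketch below. It restates the Boolean prefix steps from the Usage section on integers and compares them against ordinary addition; it does not load or exercise the threshold weights, and the helper name `count_mismatches` is hypothetical.

```python
from itertools import product

def han_carlson_add(a, b, cin):
    """4-bit add on integers using the same prefix steps as the Usage example."""
    ab = [(a >> i) & 1 for i in range(4)]
    bb = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(ab, bb)]
    p = [x ^ y for x, y in zip(ab, bb)]
    g10, p10 = g[1] | (p[1] & g[0]), p[1] & p[0]
    g32, p32 = g[3] | (p[3] & g[2]), p[3] & p[2]
    g30 = g32 | (p32 & g10)
    g20 = g[2] | (p[2] & g10)
    # carries[i] is the carry into bit i
    carries = [cin,
               g[0] | (p[0] & cin),
               g10 | (p10 & cin),
               g20 | (p[2] & p10 & cin)]
    cout = g30 | (p32 & p10 & cin)
    s = sum((p[i] ^ carries[i]) << i for i in range(4))
    return s, cout

def count_mismatches():
    """Count inputs where the adder disagrees with integer addition."""
    bad = 0
    for a, b, cin in product(range(16), range(16), range(2)):
        s, cout = han_carlson_add(a, b, cin)
        if ((cout << 4) | s) != a + b + cin:
            bad += 1
    return bad

print(count_mismatches())  # 0 when all 512 cases match
```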
+## Files
+```
+threshold-han-carlson/
+├── model.safetensors    # Threshold network weights
+├── create_safetensors.py # Weight generation + verification
+├── config.json          # Circuit metadata
+└── README.md
+```
+## References
+- Han, T., & Carlson, D. A. (1987). "Fast area-efficient VLSI adders." Proceedings of the 8th IEEE Symposium on Computer Arithmetic (ARITH), pp. 49-56.
+- Parallel prefix adders are fundamental to high-performance ALU design
 ## License
 MIT