CharlesCNorton committed
Commit 107e84b · 1 Parent(s): 66ce5d6

Expand README with full documentation

- Add detailed circuit diagrams
- Add threshold implementation details with weights/biases
- Add complete usage code examples
- Add truth tables with examples
- Add academic references
- Add verification and files sections

Files changed (1)
  1. README.md +131 -17
README.md CHANGED
@@ -7,45 +7,159 @@ tags:
  - neuromorphic
  - arithmetic
  - parallel-prefix
  ---

  # threshold-han-carlson

- 4-bit Han-Carlson parallel prefix adder. Hybrid combining Kogge-Stone speed with Brent-Kung area efficiency.

- ## Circuit

  ```
- Inputs: A[3:0], B[3:0], Cin (9 inputs)
- Outputs: S[3:0], Cout (5 outputs)
  ```

- ## Han-Carlson Structure

- Combines characteristics of both Kogge-Stone and Brent-Kung:
- - Uses Kogge-Stone parallel structure for odd bit positions
- - Uses Brent-Kung back-propagation for even positions
- - Results in moderate wiring complexity and good speed

- ## Comparison

- | Adder | Depth | Cells | Wiring |
- |-------|-------|-------|--------|
- | Kogge-Stone | log n | Max | High |
- | Brent-Kung | 2 log n - 2 | Min | Low |
- | Han-Carlson | log n + 1 | Medium | Medium |

  ## Parameters

  | | |
  |---|---|
- | Inputs | 9 |
- | Outputs | 5 |
  | Neurons | 32 |
  | Layers | 5 |
  | Parameters | 132 |
  | Magnitude | 56 |

  ## License

  MIT
 
  - neuromorphic
  - arithmetic
  - parallel-prefix
+ - adder
  ---

  # threshold-han-carlson

+ 4-bit Han-Carlson parallel prefix adder. Hybrid design combining Kogge-Stone's logarithmic depth with Brent-Kung's reduced wiring complexity.

+ ## Function

  ```
+ S[3:0], Cout = A[3:0] + B[3:0] + Cin
  ```

+ Computes 4-bit addition with carry-in/carry-out using parallel prefix computation.

+ ## Han-Carlson Structure (4-bit)

+ ```
+           G3,P3    G2,P2    G1,P1    G0,P0    Cin
+             │        │        │        │       │
+             │        │        │        │       │
+ Level 1     ●────────┘        ●────────┘       │  ← Odd-position prefix (like Brent-Kung)
+             │                 │                │
+             │                 │                │
+ Level 2     ●─────────────────┘                │  ← Kogge-Stone style merge
+             │                                  │
+             │                                  │
+ Level 3     └────────●                         │  ← Back-propagate to even positions
+                      │                         │
+                      │                         │
+           G3:0     G2:0     G1:0      G0       │
+             │        │        │        │       │
+             ▼        ▼        ▼        ▼       │
+        ┌────┴────┬───┴───┬────┴────┬───┴───────┘
+        │         │       │         │
+       XOR       XOR     XOR       XOR
+        │         │       │         │
+        ▼         ▼       ▼         ▼
+ Cout  S3        S2      S1        S0
+ ```
+
+ ## Design Philosophy
+
+ Han-Carlson is a **hybrid parallel prefix adder** that interpolates between two extremes:
+
+ | Property | Kogge-Stone | Han-Carlson | Brent-Kung |
+ |----------|-------------|-------------|------------|
+ | Depth | log₂(n) | log₂(n) + 1 | 2·log₂(n) - 2 |
+ | Prefix cells | n·log₂(n) - n + 1 | (n/2)·log₂(n) | 2n - 2 - log₂(n) |
+ | Wiring tracks | High | Medium | Low |
+ | Fanout | Uniform | Mixed | Low |

+ For n=4: Depth=3, Cells=4, offering the best balance for small adders.
+
+ ## Parallel Prefix Operation
+
+ The fundamental operation combines (Generate, Propagate) pairs:
+
+ ```
+ (G_high, P_high) ○ (G_low, P_low) = (G_high + P_high·G_low, P_high·P_low)
+ ```
+
+ Where:
+ - G_i = A_i AND B_i (generate carry)
+ - P_i = A_i XOR B_i (propagate carry)
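
The prefix operator defined above can be sketched in Python as follows. This is an illustrative sketch, not code from the repository: the names `bit_gp` and `prefix_combine` and the tuple representation of (G, P) pairs are assumptions, and bits are plain Python ints (0 or 1).

```python
def bit_gp(a, b):
    """Per-bit generate and propagate: G = A AND B, P = A XOR B."""
    return (a & b, a ^ b)

def prefix_combine(high, low):
    """Combine two (G, P) pairs: (G_high + P_high·G_low, P_high·P_low)."""
    g_high, p_high = high
    g_low, p_low = low
    return (g_high | (p_high & g_low), p_high & p_low)

# Hypothetical example: group (G, P) over bits 1..0 for A = 0b11, B = 0b01.
gp1 = bit_gp(1, 0)   # bit 1: A=1, B=0, so it propagates but does not generate
gp0 = bit_gp(1, 1)   # bit 0: A=1, B=1, so it generates
g10, p10 = prefix_combine(gp1, gp0)
print(g10, p10)      # the group generates a carry out of bit 1
```

The operator is associative, which is what allows the (G, P) pairs to be merged in a tree rather than strictly bit by bit.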
+
+ ## Truth Table (Examples)
+
+ | A (dec) | B (dec) | Cin | S (dec) | Cout | Equation |
+ |---------|---------|-----|---------|------|----------|
+ | 0 | 0 | 0 | 0 | 0 | 0+0+0=0 |
+ | 5 | 3 | 0 | 8 | 0 | 5+3+0=8 |
+ | 7 | 7 | 0 | 14 | 0 | 7+7+0=14 |
+ | 15 | 1 | 0 | 0 | 1 | 15+1+0=16 |
+ | 15 | 15 | 1 | 15 | 1 | 15+15+1=31 |
+
+ ## Layer 0 Weights (Generate/Propagate)
+
+ Each bit position computes G and P from inputs:
+
+ ```
+ G_i: weights[A_i]=1, weights[B_i]=1, bias=-2 (AND gate)
+ P_i: XOR decomposition using OR/NAND/AND
+ ```
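
As a rough illustration of the entries above, each gate can be modeled as a single threshold neuron. The sketch assumes a step activation that fires when the weighted sum plus bias is at least zero. Only the AND weights (1, 1, bias -2) come from the README; the OR and NAND weights and all function names are assumptions chosen so that the XOR decomposition works, not values read from `model.safetensors`.

```python
def neuron(inputs, weights, bias):
    """Threshold unit: 1 when weighted sum + bias >= 0 (assumed convention)."""
    return int(sum(w * x for w, x in zip(weights, inputs)) + bias >= 0)

def and_gate(a, b):
    return neuron((a, b), (1, 1), -2)     # weights and bias as listed for G_i

def or_gate(a, b):
    return neuron((a, b), (1, 1), -1)     # assumed weights

def nand_gate(a, b):
    return neuron((a, b), (-1, -1), 1)    # assumed weights

def generate(a, b):
    """G_i = A_i AND B_i, a single threshold unit."""
    return and_gate(a, b)

def propagate(a, b):
    """P_i = A_i XOR B_i, built as AND(OR(a, b), NAND(a, b)) over two layers."""
    return and_gate(or_gate(a, b), nand_gate(a, b))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, generate(a, b), propagate(a, b))
```

XOR is not linearly separable, so under this model P_i needs two layers of threshold units while G_i needs only one.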

  ## Parameters

  | | |
  |---|---|
+ | Inputs | 9 (A[3:0], B[3:0], Cin) |
+ | Outputs | 5 (S[3:0], Cout) |
  | Neurons | 32 |
  | Layers | 5 |
  | Parameters | 132 |
  | Magnitude | 56 |

+ ## Usage
+
+ ```python
+ from safetensors.torch import load_file
+ import torch
+
+ w = load_file('model.safetensors')
+
+ def han_carlson_add(a3, a2, a1, a0, b3, b2, b1, b0, cin):
+     # Generate and Propagate
+     g = [a0 & b0, a1 & b1, a2 & b2, a3 & b3]
+     p = [a0 ^ b0, a1 ^ b1, a2 ^ b2, a3 ^ b3]
+
+     # Level 1: odd positions
+     g10 = g[1] | (p[1] & g[0])
+     p10 = p[1] & p[0]
+     g32 = g[3] | (p[3] & g[2])
+     p32 = p[3] & p[2]
+
+     # Level 2: top merge
+     g30 = g32 | (p32 & g10)
+
+     # Level 3: back-propagate
+     g20 = g[2] | (p[2] & g10)
+
+     # Carries and sums
+     c0 = g[0] | (p[0] & cin)
+     c1 = g10 | (p10 & cin)
+     c2 = g20 | (p[2] & p10 & cin)
+     c3 = g30 | (p32 & p10 & cin)
+
+     return p[3]^c2, p[2]^c1, p[1]^c0, p[0]^cin, c3
+
+ # Example: 5 + 3 = 8
+ s3, s2, s1, s0, cout = han_carlson_add(0,1,0,1, 0,0,1,1, 0)
+ print(f"Result: {cout*16 + s3*8 + s2*4 + s1*2 + s0}")  # 8
+ ```
+
+ ## Verification
+
+ All 512 input combinations (16 × 16 × 2) verified correct.
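
A check along these lines can be sketched as below. It is not the repository's `create_safetensors.py`: it re-implements the prefix logic from the Usage section on plain integers rather than on the threshold network weights, and the helper names `han_carlson_add4` and `verify_all` are hypothetical. Every input combination is compared against ordinary integer addition.

```python
from itertools import product

def han_carlson_add4(a, b, cin):
    """Add two 4-bit integers using the prefix structure from the Usage section."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]
    # Level 1: pairwise groups
    g10, p10 = g[1] | (p[1] & g[0]), p[1] & p[0]
    g32, p32 = g[3] | (p[3] & g[2]), p[3] & p[2]
    # Level 2: merge the two pairs; Level 3: fill in bit 2
    g30 = g32 | (p32 & g10)
    g20 = g[2] | (p[2] & g10)
    # Carry into each bit position, folding in Cin
    c = [cin,
         g[0] | (p[0] & cin),
         g10 | (p10 & cin),
         g20 | (p[2] & p10 & cin)]
    cout = g30 | (p32 & p10 & cin)
    s = sum((p[i] ^ c[i]) << i for i in range(4))
    return s, cout

def verify_all():
    """Return the number of (a, b, cin) cases checked; raise on any mismatch."""
    checked = 0
    for a, b, cin in product(range(16), range(16), range(2)):
        s, cout = han_carlson_add4(a, b, cin)
        assert (cout << 4) | s == a + b + cin, (a, b, cin)
        checked += 1
    return checked

print(verify_all())  # number of cases checked: 16 * 16 * 2
```

Checking the actual threshold network would additionally require evaluating the weights stored in `model.safetensors`, whose tensor names are not given here.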
+
+ ## Files
+
+ ```
+ threshold-han-carlson/
+ ├── model.safetensors       # Threshold network weights
+ ├── create_safetensors.py   # Weight generation + verification
+ ├── config.json             # Circuit metadata
+ └── README.md
+ ```
+
+ ## References
+
+ - Han, T., & Carlson, D. A. (1987). "Fast area-efficient VLSI adders"
+ - Parallel prefix adders are fundamental to high-performance ALU design
+
  ## License

  MIT