threshold-wallace-tree-3x3

3x3 Wallace tree multiplier. Multiplies two 3-bit unsigned integers using carry-save reduction.

Function

multiply(A[2:0], B[2:0]) -> P[5:0]

Range: 0-7 x 0-7 = 0-49
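The interface above can be modelled with ordinary integer arithmetic, which is handy as a reference when checking a gate-level implementation. This is only a sketch: the helper names `to_bits`, `from_bits`, and `multiply_reference` are hypothetical, and the most-significant-bit-first ordering is an assumption taken from the `[2:0]` notation, not something the model card states about the stored network.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Unpack an unsigned integer into `width` bits, most significant bit first."""
    return [(value >> k) & 1 for k in reversed(range(width))]


def from_bits(bits: list[int]) -> int:
    """Pack bits (most significant bit first) into an unsigned integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def multiply_reference(a_bits: list[int], b_bits: list[int]) -> list[int]:
    """Reference behaviour: 3-bit A times 3-bit B gives 6-bit P (assumed MSB first)."""
    return to_bits(from_bits(a_bits) * from_bits(b_bits), 6)


# 7 x 7 = 49, which is 110001 in binary.
print(multiply_reference(to_bits(7, 3), to_bits(7, 3)))  # [1, 1, 0, 0, 0, 1]
```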

Wallace Tree Concept

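Wallace trees use 3:2 compressors (full adders), plus half adders where only two bits remain in a column, to reduce the partial products in parallel. This shortens the critical path compared with accumulating the partial products row by row through ripple-carry adders.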
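A 3:2 compressor takes three bits of equal weight and returns a sum bit of the same weight plus a carry bit of double weight, so the total value is preserved. The sketch below shows that identity at the Boolean level; the function names are illustrative and it says nothing about how the stored network encodes these gates.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Two bits in, (sum, carry) out, with a + b == sum + 2 * carry."""
    return a ^ b, a & b


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 compressor: (sum, carry) with a + b + c == sum + 2 * carry."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)  # majority of the three inputs
    return s, carry


# Exhaustively check the value-preserving identity over all 8 input patterns.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            s, carry = full_adder(a, b, c)
            assert a + b + c == s + 2 * carry
```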

Partial Product Layout

                      b2    b1    b0
                  x   a2    a1    a0
                  ──────────────────
                      a0b2  a0b1  a0b0
                a1b2  a1b1  a1b0
          a2b2  a2b1  a2b0
    ──────────────────────────────────
    p5    p4    p3    p2    p1    p0

Column counts: [1, 2, 3, 2, 1] at positions [0, 1, 2, 3, 4]
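The column counts can be reproduced by grouping each partial product a_i AND b_j under its weight i + j. This is a minimal sketch with a hypothetical helper name, `partial_products`; it works on integers rather than on the model's tensors.

```python
def partial_products(a: int, b: int, n: int = 3) -> list[list[int]]:
    """Return columns where columns[k] lists the bits a_i & b_j with i + j == k."""
    columns = [[] for _ in range(2 * n - 1)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return columns


cols = partial_products(7, 7)
print([len(c) for c in cols])  # [1, 2, 3, 2, 1]

# The weighted sum of all partial-product bits equals the product.
assert sum(sum(c) << k for k, c in enumerate(cols)) == 7 * 7
```

Column 5 holds no partial product; p5 is produced only by carries out of the lower columns.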

Reduction Tree

Col 0: pp00 ────────────────────────────────────► p0

Col 1: pp01 ─┬─► HA ─┬─► p1
       pp10 ─┘       └─► c1 ─────────┐
                                     │
Col 2: pp02 ─┬─► FA ─┬─► s2 ──► HA ──┼──► p2
       pp11 ─┤       └─► c2 ─┐       │
       pp20 ─┘               │       │
                             ▼       │
Col 3: pp12 ─┬─► HA ─┬─► s3 ─► HA ───┼──► p3
       pp21 ─┘       └─► c3 ─────────┼─┐
                                     │ │
Col 4: pp22 ─────────► HA ──► HA ────┴─┴─► p4
                                     │
Col 5: ─────────────────────────────────► p5
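The reduction can also be simulated bit by bit. The sketch below is a generic column-wise carry-save reduction followed by a ripple-carry final addition. It groups bits greedily within each column, so its adder placement is an assumption and may differ from the wiring in the diagram and from the stored 37-neuron network. The function names are hypothetical.

```python
def half_adder(a, b):
    return a ^ b, a & b


def full_adder(a, b, c):
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def reduce_stage(cols):
    """One carry-save stage: FA on each group of 3 bits, HA on a leftover pair."""
    nxt = [[] for _ in range(len(cols) + 1)]
    for k, bits in enumerate(cols):
        i = 0
        while len(bits) - i >= 3:
            s, c = full_adder(*bits[i:i + 3])
            nxt[k].append(s)
            nxt[k + 1].append(c)  # carry moves to the next-higher column
            i += 3
        if len(bits) - i == 2:
            s, c = half_adder(bits[i], bits[i + 1])
            nxt[k].append(s)
            nxt[k + 1].append(c)
        elif len(bits) - i == 1:
            nxt[k].append(bits[i])  # a single bit passes through unchanged
    return nxt


def wallace_multiply(a, b, n=3):
    """Multiply two n-bit unsigned integers via partial-product reduction."""
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    # Reduce until every column holds at most two bits.
    while max(len(c) for c in cols) > 2:
        cols = reduce_stage(cols)
    # Final carry-propagate addition of the two remaining rows.
    result, carry = 0, 0
    for k, bits in enumerate(cols):
        x = bits[0] if len(bits) > 0 else 0
        y = bits[1] if len(bits) > 1 else 0
        s, carry = full_adder(x, y, carry)
        result |= s << k
    return result | (carry << len(cols))


# Exhaustive check over all 64 input combinations.
assert all(wallace_multiply(a, b) == a * b for a in range(8) for b in range(8))
```

For 3-bit operands a single reduction stage is enough, because the tallest column starts with only three bits.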

Parameters

Inputs       6
Outputs      6
Neurons      37
Layers       5
Parameters   147
Magnitude    133
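The neuron and layer counts suggest a network of threshold gates. How such a gate can realise the Boolean functions used in the tree is sketched below. This is an illustration under stated assumptions: the model card does not give the activation convention or the actual weights, so the rule "output 1 when the weighted sum plus bias is non-negative" and the weights shown here are hypothetical choices.

```python
def threshold_neuron(inputs, weights, bias):
    """Assumed convention: fire (1) when sum(w * x) + bias >= 0, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0


def and_gate(a, b):
    """Partial product a AND b: fires only when both inputs are 1."""
    return threshold_neuron([a, b], [1, 1], -2)


def majority3(a, b, c):
    """Full-adder carry: fires when at least two of the three inputs are 1."""
    return threshold_neuron([a, b, c], [1, 1, 1], -2)


# Check both gates against their Boolean definitions.
for a in (0, 1):
    for b in (0, 1):
        assert and_gate(a, b) == (a & b)
        for c in (0, 1):
            assert majority3(a, b, c) == int(a + b + c >= 2)
```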

Wallace vs Ripple-Carry

Approach       Depth      Parallelism
Ripple-Carry   O(n)       Low
Wallace Tree   O(log n)   High

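The gap widens with operand width n: the reduction depth of a Wallace tree grows logarithmically, while row-by-row ripple-carry accumulation grows linearly, so Wallace trees significantly reduce latency for larger multipliers.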
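The depth difference can be made concrete by counting reduction stages. The sketch below uses the standard row-count recurrence for a Wallace tree, in which each stage turns every group of three rows into two. It compares this with n - 1 sequential row additions, an assumed baseline for row-by-row accumulation. It ignores the final carry-propagate adder, which both approaches still need.

```python
def wallace_stages(rows: int) -> int:
    """Number of 3:2 reduction stages needed to bring `rows` rows down to 2."""
    stages = 0
    while rows > 2:
        rows = 2 * (rows // 3) + rows % 3  # each group of 3 rows becomes 2
        stages += 1
    return stages


print("n   sequential  wallace")
for n in (3, 4, 8, 16, 32):
    print(f"{n:<3} {n - 1:<11} {wallace_stages(n)}")
```

For the 3x3 case there is a single stage, so the benefit only becomes visible at larger widths.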

Usage

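from safetensors.torch import load_file

# Load the weights as a dict mapping tensor name -> torch.Tensor
w = load_file('model.safetensors')

# List the stored tensors and their shapes
for name, tensor in w.items():
    print(name, tuple(tensor.shape))

# Example: 7 x 7 = 49 (A = 111, B = 111 -> P = 110001)
# All 64 input combinations (8 x 8) verified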

License

MIT
