hadamard_transform_kernels

Forward Hadamard transform CUDA kernel, packaged for the kernels library.

Supports fp32, fp16, and bf16 inputs. The last dimension can be any size from 1 to 32768; sizes that are not a power of two are zero-padded internally to the next power of two.
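To make the padding behaviour concrete, here is a minimal pure-PyTorch sketch of a padded Hadamard transform. It is not the kernel's code. The names next_power_of_two and hadamard_reference are hypothetical, and the sketch assumes the Sylvester (natural) ordering, that scale multiplies the result, and that the output is truncated back to the original last dimension.

```python
import torch
import torch.nn.functional as F


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def hadamard_reference(x: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
    """Unnormalized Walsh-Hadamard transform over the last dim (reference sketch)."""
    dim = x.shape[-1]
    n = next_power_of_two(dim)
    # Zero-pad the last dimension up to the next power of two.
    out = F.pad(x.float(), (0, n - dim)).reshape(-1, n)
    h = 1
    while h < n:
        # Split into blocks of 2*h, each holding two halves of length h.
        out = out.reshape(-1, n // (2 * h), 2, h)
        a, b = out[:, :, 0, :], out[:, :, 1, :]
        # Butterfly step: (a, b) -> (a + b, a - b).
        out = torch.stack((a + b, a - b), dim=2)
        h *= 2
    out = out.reshape(*x.shape[:-1], n)
    # Assumption: drop the padded columns so the output shape matches the input.
    return (out[..., :dim] * scale).to(x.dtype)
```

A reference like this runs on CPU and is only meant for checking small inputs; it accumulates in fp32 and casts back to the input dtype.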

Use

import torch
from kernels import get_kernel

hadamard = get_kernel("galqiwi/hadamard_transform_kernels", version=1)

x = torch.randn(4, 4096, device="cuda", dtype=torch.float16)
y = hadamard.hadamard_transform(x, scale=1.0)

API

hadamard_transform(x: torch.Tensor, scale: float = 1.0) -> torch.Tensor

x is a CUDA tensor of shape (..., dim). The output has the same shape and dtype.
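A common choice for scale is 1 / sqrt(dim), which makes the transform orthonormal when dim is a power of two. The sketch below checks that property on a dense matrix on CPU, then shows a round trip through the kernel. The helpers sylvester_hadamard and kernel_roundtrip_error are hypothetical names, and the round trip assumes the kernel uses the Sylvester ordering and applies scale as a multiplicative factor on the output.

```python
import math
import torch


def sylvester_hadamard(n: int) -> torch.Tensor:
    """Dense n x n Hadamard matrix (Sylvester construction); n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    base = torch.tensor([[1.0, 1.0], [1.0, -1.0]])
    h = torch.ones(1, 1)
    while h.shape[0] < n:
        h = torch.kron(base, h)  # H_2n = [[H, H], [H, -H]]
    return h


dim = 8
scale = 1.0 / math.sqrt(dim)
# The scaled matrix is symmetric and orthogonal, so it is its own inverse.
q = sylvester_hadamard(dim) * scale
assert torch.allclose(q @ q, torch.eye(dim), atol=1e-6)


def kernel_roundtrip_error(dim: int = 4096) -> float:
    """Apply the kernel twice with orthonormal scaling; needs a CUDA device."""
    from kernels import get_kernel

    hadamard = get_kernel("galqiwi/hadamard_transform_kernels", version=1)
    x = torch.randn(4, dim, device="cuda", dtype=torch.float32)
    s = 1.0 / math.sqrt(dim)
    y = hadamard.hadamard_transform(x, scale=s)
    x_back = hadamard.hadamard_transform(y, scale=s)
    return (x - x_back).abs().max().item()


if __name__ == "__main__" and torch.cuda.is_available():
    print(kernel_roundtrip_error())
```

If the assumptions hold, the printed error should be close to zero in fp32. For a dim that is not a power of two, the zero-padding means the round trip is not expected to recover x.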

Attribution

CUDA code is adapted from Dao-AILab/fast-hadamard-transform (Tri Dao, BSD-3-Clause).
