--- license: bsd-3-clause library_name: kernels tags: - kernels - cuda - hadamard --- # hadamard_transform_kernels Forward Hadamard transform CUDA kernel, packaged for the [`kernels`](https://huggingface.co/docs/kernels/index) library. `fp32` / `fp16` / `bf16`, last dim from `1` up to `32768` (zero-padded to the next power of two internally). ## Use ```python import torch from kernels import get_kernel hadamard = get_kernel("galqiwi/hadamard_transform_kernels", version=1) x = torch.randn(4, 4096, device="cuda", dtype=torch.float16) y = hadamard.hadamard_transform(x, scale=1.0) ``` ## API ```python hadamard_transform(x: torch.Tensor, scale: float = 1.0) -> torch.Tensor ``` `x` is a CUDA tensor of shape `(..., dim)`. The output has the same shape and dtype. ## Attribution CUDA code is adapted from [Dao-AILab/fast-hadamard-transform](https://github.com/Dao-AILab/fast-hadamard-transform) (Tri Dao, BSD-3-Clause).