
# higgs-kernels

CUDA kernels for HIGGS quantization, packaged for the Hugging Face Kernel Hub.

Extracted from galqiwi/higgs-kernels.

## Kernels

- `higgs_dequantize_2_256` - codebook lookup: uint8 indices -> 2D fp16/bf16 vectors
- `higgs_quantize_2_256_f16` - nearest codebook entry search (fp16)
- `higgs_quantize_2_256_bf16` - nearest codebook entry search (bf16)
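Semantically, the two operations are a nearest-neighbor search over a 256-entry codebook of 2D vectors and the inverse index lookup. The sketch below is an illustrative CPU reference in plain PyTorch, not the CUDA implementation; the function and variable names here are made up for the example.

```python
import torch

def quantize_2_256(x, grid):
    # x: (N, 2) fp16/bf16 vectors; grid: (256, 2) codebook.
    # Find the nearest codebook entry per row (squared Euclidean
    # distance) and return its index as uint8.
    d = torch.cdist(x.float(), grid.float())  # (N, 256) distances
    return d.argmin(dim=1).to(torch.uint8)

def dequantize_2_256(indices, grid):
    # uint8 indices -> (N, 2) codebook rows.
    return grid[indices.long()]
```

The CUDA kernels above specialize this for the fixed 256x2 grid (and, for quantization, take precomputed grid norms to speed up the distance computation).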

## Usage

```python
from kernels import get_kernel

higgs = get_kernel("galqiwi/higgs-kernels")

# Dequantize: uint8 indices + (256, 2) codebook -> fp16/bf16 vectors
out = higgs.higgs_dequantize_2_256_kernel(x_uint8, grid_256x2)

# Quantize: (N, 2) fp16 vectors -> uint8 indices of nearest codebook entries
indices = higgs.higgs_quantize_2_256_kernel(x_fp16_Nx2, grid_256x2, grid_norms_256)
```

## Grid data

A pre-trained codebook is included in `grids.safetensors` (shape 256x2, stored under the key `"2_256"`).

```python
import torch
from kernels import get_kernel

higgs = get_kernel("galqiwi/higgs-kernels")
grid = higgs.load_optimal_grid_2_256(device="cuda", dtype=torch.float16)
```

## License

Apache-2.0
