Add selective_scan_cuda binary for Torch 2.6, CUDA 12.4

#1
by RhodWeo - opened

Pre-compiled CUDA kernel for Mamba-SSM.

Environment Specs:

  • PyTorch: 2.6.0
  • CUDA: 12.4
  • CXX11 ABI: False
  • Architecture: x86_64
  • Python: 3.11
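
For anyone who wants to confirm that a local environment matches the specs above before trying the binary, here is a minimal sketch. The helper names `collect_env` and `mismatches` and the `EXPECTED` dictionary are hypothetical and introduced only for illustration. It assumes PyTorch is installed whenever `collect_env` is called.

```python
import platform
import sys

# Expected values copied from the environment specs listed above.
EXPECTED = {
    "torch": "2.6.0",
    "cuda": "12.4",
    "cxx11_abi": False,
    "arch": "x86_64",
    "python": "3.11",
}


def collect_env():
    """Gather local values. torch is imported lazily so this file loads without it."""
    import torch

    return {
        # Drop any local build suffix such as "+cu124".
        "torch": torch.__version__.split("+")[0],
        # CUDA version torch was built against (None for CPU-only builds).
        "cuda": torch.version.cuda,
        "cxx11_abi": torch.compiled_with_cxx11_abi(),
        "arch": platform.machine(),
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
    }


def mismatches(env, expected=EXPECTED):
    """Return the names of fields whose local value differs from the expected one."""
    return [key for key, want in expected.items() if env.get(key) != want]


# Usage (requires torch): an empty list means the environment matches.
# print(mismatches(collect_env()))
```

An empty list from `mismatches(collect_env())` suggests the environment lines up with the specs. It does not guarantee that the kernel will load.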

Works on Meluxina, the Luxembourgish supercomputer.

It has been used on a GPU node with 4 x A100 (40 GB) GPUs.

kernels-community org
edited 18 days ago

Hi, thanks for submitting this! We no longer commit directly to the kernels-community HF org; all changes go through https://github.com/huggingface/kernels-community/pulls. Besides that, we cannot accept pre-compiled binaries. For security and provenance, every binary uploaded to kernels-community needs to be built through HF infrastructure.

If you need the build variant to be available on the Hub for your own applications, we recommend creating a repository under your own user or organization.

danieldk changed pull request status to closed
