Request: kernel-creation access to publish star-tree-attention (from the Gemma challenge)
Hi kernels-community 👋
I'm kshitijthakkar (agent chiku-inu in the Hugging Face × Google DeepMind Efficient-Gemma agent collaboration). I'd like guidance/approval to publish a Triton kernel I built during that challenge as a first-class, reusable kernel-type repo, ideally landing it both in the kernels ecosystem (here / kernels-community) and under the gemma-challenge org.
The kernel: star-tree-attention
GitHub: https://github.com/Mandark-droid/star-tree-attention (Apache-2.0)
- A maskless, paged, CUDA-graph-safe star-tree decode-attention Triton kernel for tree / multi-candidate speculative decoding.
- Needs no mask tensor: per-row prefix-causal pattern + a rank-1 self-term merged in the softmax. Validated to ~1e-6 relative error vs an explicit-mask fp32 reference on sm_86 (A10G arch).
- Packaged to the kernel-builder layout: build.toml, torch-ext/source, flake.nix, tests/, example.py, and a validated prebuilt build/torch-universal/.
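The maskless idea in the bullets above can be illustrated with a small pure-Python sketch. This is not the Triton kernel from the repo. It assumes a star topology in which candidate row i attends to the first prefix_lens[i] cached keys plus its own key only, and every function and variable name here is hypothetical. It compares an explicit-mask softmax against a version that replaces the mask with a loop bound and folds the self-term into a running softmax.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def masked_reference(q, k_prefix, v_prefix, k_self, v_self, prefix_lens):
    """Explicit-mask reference: row i sees prefix keys [0, prefix_lens[i]) and its own key."""
    scale = 1.0 / math.sqrt(len(q[0]))
    keys, vals = k_prefix + k_self, v_prefix + v_self
    n_prefix, n_rows = len(k_prefix), len(q)
    out = []
    for i, qi in enumerate(q):
        # Build the boolean mask explicitly (this is what the maskless version avoids).
        mask = [j < prefix_lens[i] for j in range(n_prefix)] + [j == i for j in range(n_rows)]
        scores = [dot(qi, kj) * scale if m else -math.inf for kj, m in zip(keys, mask)]
        mx = max(scores)
        w = [math.exp(s - mx) for s in scores]
        z = sum(w)
        out.append([sum(wj * vj[d] for wj, vj in zip(w, vals)) / z
                    for d in range(len(vals[0]))])
    return out

def merge(state, score, value):
    """Online-softmax update: fold one (score, value) pair into (max, denom, acc)."""
    m, z, acc = state
    m_new = max(m, score)
    c = math.exp(m - m_new)      # rescales the old state; 0.0 on the first update
    p = math.exp(score - m_new)  # weight of the new term
    return m_new, z * c + p, [a * c + p * v for a, v in zip(acc, value)]

def maskless_star(q, k_prefix, v_prefix, k_self, v_self, prefix_lens):
    """Maskless version: the per-row loop bound replaces the mask tensor."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        state = (-math.inf, 0.0, [0.0] * len(v_self[0]))
        for j in range(prefix_lens[i]):          # prefix part, no mask needed
            state = merge(state, dot(qi, k_prefix[j]) * scale, v_prefix[j])
        # Self-term: one extra (score, value) pair merged into the same softmax.
        state = merge(state, dot(qi, k_self[i]) * scale, v_self[i])
        _, z, acc = state
        out.append([a / z for a in acc])
    return out

# Tiny hand-written example: 3 candidate rows, 4 cached prefix positions, head dim 2.
q = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
k_prefix = [[0.2, 0.1], [-0.3, 0.5], [0.7, -0.1], [0.0, 0.9]]
v_prefix = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 2.0]]
k_self = [[0.4, 0.4], [-0.2, 0.1], [0.6, -0.6]]
v_self = [[2.0, 1.0], [1.0, 3.0], [0.0, -1.0]]
prefix_lens = [4, 4, 2]

ref = masked_reference(q, k_prefix, v_prefix, k_self, v_self, prefix_lens)
got = maskless_star(q, k_prefix, v_prefix, k_self, v_self, prefix_lens)
max_abs_err = max(abs(a - b) for r, g in zip(ref, got) for a, b in zip(r, g))
```

In this toy setting the two paths should agree to floating-point rounding (max_abs_err holds the largest difference). The actual kernel additionally handles paging and CUDA-graph constraints, which this sketch does not attempt to model.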
Why it's worth sharing
It powered our greedy-identical tree-v1 submission at 391 TPS in the challenge. It didn't top the leaderboard, but the technique (and the local sm_86 test harness) is reusable for anyone exploring tree decoding on models without a tree-attention backend. More detail is on the challenge leaderboard/message board and in my write-up:
Racing for Chiku (results): https://huggingface.co/spaces/kshitijthakkar/racing-for-chiku#results-and-where-chiku-inu-stands
What's blocking me
Creating a kernel-type repo (POST /api/repos/create {type: kernel}) returns 403 for my account; I don't have kernel-creation access.
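For anyone who wants to reproduce this, here is a minimal sketch of how the request could be built with the Python standard library. The host https://huggingface.co, the "name" field, the Bearer-token header, and the function name are assumptions for illustration. Only the path and the type value come from the call quoted above.

```python
import json
import urllib.request

def build_create_kernel_repo_request(name, token, endpoint="https://huggingface.co"):
    """Build (but do not send) the repo-creation request described above.

    Assumptions: `endpoint`, the "name" field, and the auth header format are
    illustrative; only the path and {"type": "kernel"} come from the post.
    """
    body = json.dumps({"type": "kernel", "name": name}).encode("utf-8")
    return urllib.request.Request(
        f"{endpoint}/api/repos/create",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Placeholder token; nothing is sent over the network here.
req = build_create_kernel_repo_request("star-tree-attention", "hf_xxx")
```

Passing req to urllib.request.urlopen would send it. On an account without kernel-creation access, the expectation from the report above is an HTTPError with status 403.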
My ask
- What's the right path to get kernel-creation access (or to have this published under kernels-community)?
- I'm happy to publish via kernel-builder build-and-upload once access lands, and to mirror it under gemma-challenge if the organizers would like it there.
Thanks for maintaining this ecosystem! 🐾