Kernels

Commit History

bench: MLA RoPE fused vs vanilla sweep
536f0b2

Jangwoong Kim, Claude Opus 4.6 (1M context) committed

feat: replace triton do_bench with torch.profiler for kernel timing
7d51e61

wyldecat, Claude Opus 4.6 (1M context) committed

style: apply yapf, isort, and clang-format
6436ad6

wyldecat, Claude Opus 4.6 (1M context) committed

fix: rename stale references and clean up Triton remnants
5a9d09d

wyldecat, Claude Opus 4.6 (1M context) committed

feat: add GroupedFusedMulPolyNorm Triton kernel for MoE models (#16)
e195bbb

TaehyunKim, Claude Opus 4.6, github-actions[bot] committed

Fix fused add rms norm (#4)
a1e5ca8

TaehyunKim, TaehyunKimMotif committed

Add fusion (#3)
e5e2eeb

TaehyunKim, TaehyunKimMotif committed