Merge pull request #22 from MotifTechnologies/jangwoong/mla-rope-fa4-port 5adea7d unverified Jangwoong Kim committed on 27 days ago
bench: MLA RoPE fused vs vanilla sweep 536f0b2 Jangwoong Kim Claude Opus 4.6 (1M context) committed on 27 days ago
style: fix yapf/isort/clang-format for CI --all-files 9dcee96 wyldecat Claude Opus 4.6 (1M context) committed on 29 days ago
feat: add RMSNorm benchmark scripts and K8s job a5e85e1 wyldecat Claude Opus 4.6 (1M context) committed on 29 days ago
style: apply yapf + isort formatting 60615a0 wyldecat Claude Opus 4.6 (1M context) committed on 29 days ago
feat: replace triton do_bench with torch.profiler for kernel timing 7d51e61 wyldecat Claude Opus 4.6 (1M context) committed on Apr 10
style: apply yapf, isort, and clang-format 6436ad6 wyldecat Claude Opus 4.6 (1M context) committed on Apr 6
fix: rename stale references and clean up Triton remnants 5a9d09d wyldecat Claude Opus 4.6 (1M context) committed on Apr 6
feat: add GroupedFusedMulPolyNorm Triton kernel for MoE models (#16) e195bbb unverified TaehyunKim Claude Opus 4.6 github-actions[bot] committed on Mar 6