Major revision: add phantom momentum ablation, compute-matched baselines, multi-seed predictor accuracy 96bc237 verified theapemachine committed 9 days ago
Add sparse transformer v19 with Triton-backed KNN scheduler and various backward modes. Includes utilities for synthetic data generation and model training. Implements chunked sparse updates and integrates with existing sparse linear layers. bc1b8eb theapemachine committed 10 days ago