Fix: backward() inside @torch.no_grad(); use torch.enable_grad() for dense gradient computation 39301d6 verified theapemachine committed 6 days ago
Fix: compute_relaxer_diagnostics called backward inside no_grad context 1f4765b verified theapemachine committed 6 days ago
Major revision: add phantom momentum ablation, compute-matched baselines, multi-seed predictor accuracy 96bc237 verified theapemachine committed 7 days ago
Add definitive ablation suite addressing all critique gaps 7853236 verified theapemachine committed 7 days ago
Add sparse transformer v19 with Triton-backed KNN scheduler and various backward modes. Includes utilities for synthetic data generation and model training. Implements chunked sparse updates and integrates with existing sparse linear layers. bc1b8eb theapemachine committed 7 days ago
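
The two most recent fixes in this history concern calling backward() while autograd is disabled. The sketch below shows the general pattern those commit messages describe, not the repository's actual code: `relaxer_diagnostics`, the `Linear` model, and the MSE loss are hypothetical stand-ins chosen for illustration. Under `torch.no_grad()` the forward pass records no graph, so `loss.backward()` raises a RuntimeError; wrapping only the gradient-dependent part in `torch.enable_grad()` restores graph recording for that block.

```python
import torch


@torch.no_grad()
def relaxer_diagnostics(model, x, target):
    """Hypothetical diagnostics routine that runs under no_grad by default."""
    # Gradient-free statistic: no autograd graph is built here.
    pred_norm = model(x).norm().item()

    # Re-enable autograd only for the part that needs dense gradients.
    with torch.enable_grad():
        model.zero_grad(set_to_none=True)
        loss = torch.nn.functional.mse_loss(model(x), target)
        loss.backward()  # works: the forward pass above recorded a graph
        grad_norm = sum(
            p.grad.norm().item() ** 2 for p in model.parameters()
        ) ** 0.5

    # Clear the gradients so the diagnostics leave no side effects behind.
    model.zero_grad(set_to_none=True)
    return {"pred_norm": pred_norm, "grad_norm": grad_norm}


torch.manual_seed(0)
model = torch.nn.Linear(4, 2)  # stand-in model, assumed for illustration
x = torch.randn(8, 4)
target = torch.randn(8, 2)
stats = relaxer_diagnostics(model, x, target)
```

Keeping the decorator and scoping `torch.enable_grad()` to the smallest possible block means the rest of the diagnostics still avoids the memory cost of graph construction.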