theapemachine committed on
Commit 7cf627f · verified · 1 Parent(s): e5e3719

Upload README.md

Files changed (1)
  1. README.md +34 -0
README.md ADDED
@@ -0,0 +1,34 @@
# Sparse Transformer: Experiment Suite + Triton Kernels

Comprehensive experiment infrastructure for the Chunked Sparse Backward Pass paper.

## Files

| File | Description |
|------|-------------|
| `triton_sparse.py` | Triton-fused sparse backward kernels (dW, dX, dBias) + Python-loop baseline + correctness tests + microbenchmark |
| `e2e_full.py` | End-to-end training benchmark: Dense vs PyLoop vs Triton at d_model ∈ {512, 1024, 2048} |
| `full_experiments.py` | 7-experiment ablation suite (baselines, predictor accuracy, chunk ablation, compute-matched, exploration, attention sparsification, sparsity sweep) |
| `analyze_results.py` | Publication figure generator (matplotlib) |

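To give a rough idea of what a chunked sparse backward pass computes, the sketch below writes one possible Python-loop version for a linear layer `y = x @ W.T + b` and pairs it with a dense reference for checking. It is not taken from `triton_sparse.py`: the function names, the choice to chunk along output features, and the `active_chunks` argument are assumptions made purely for illustration, and NumPy is used so the example runs on a CPU.

```python
import numpy as np


def chunked_sparse_linear_backward(x, grad_out, weight, active_chunks, chunk_size):
    """Hypothetical sketch: gradients of y = x @ W.T + b restricted to some chunks.

    x:        (batch, in_features) layer input
    grad_out: (batch, out_features) gradient arriving from the next layer
    weight:   (out_features, in_features)
    active_chunks: indices of output-feature chunks to backpropagate through
    """
    d_w = np.zeros_like(weight)
    d_b = np.zeros(weight.shape[0], dtype=weight.dtype)
    d_x = np.zeros_like(x)
    for c in active_chunks:
        rows = slice(c * chunk_size, (c + 1) * chunk_size)
        g = grad_out[:, rows]        # (batch, chunk_size) slice of the gradient
        d_w[rows] = g.T @ x          # weight gradient for this chunk only
        d_b[rows] = g.sum(axis=0)    # bias gradient for this chunk only
        d_x += g @ weight[rows]      # this chunk's contribution to the input gradient
    return d_w, d_x, d_b


def dense_linear_backward(x, grad_out, weight):
    """Dense reference: the same gradients with every output feature active."""
    return grad_out.T @ x, grad_out @ weight, grad_out.sum(axis=0)
```

With every chunk active the loop should reproduce the dense gradients. With a subset, the skipped rows of `d_w` and `d_b` stay zero and `d_x` only accumulates the active chunks, which is where the compute saving in this sketch comes from.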
## Quick Start

```bash
pip install torch triton tiktoken matplotlib numpy

# Correctness test + microbenchmark
python triton_sparse.py

# End-to-end training (needs a GPU with ≥24 GB of memory for d_model = 2048)
python e2e_full.py

# Full ablation suite (7 experiments, ~4-6 hours on an A10G)
python full_experiments.py --experiment all --device cuda --steps 2000 --seeds "42,123,456"

# Single experiment
python full_experiments.py --experiment baselines --device cuda
```

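When scripting several runs, it can help to assemble the `full_experiments.py` command line programmatically. The helper below is a minimal sketch: `build_command` is a hypothetical name, and it uses only the flags that appear in the commands above (`--experiment`, `--device`, `--steps`, `--seeds`). It only builds the argument list and does not launch anything.

```python
import shlex


def build_command(experiment, device="cuda", steps=None, seeds=None):
    """Hypothetical helper: build an argv list for full_experiments.py."""
    cmd = ["python", "full_experiments.py",
           "--experiment", experiment,
           "--device", device]
    if steps is not None:
        cmd += ["--steps", str(steps)]
    if seeds:
        # Seeds are passed as one comma-separated string, as in the Quick Start.
        cmd += ["--seeds", ",".join(str(s) for s in seeds)]
    return cmd


if __name__ == "__main__":
    # Print the command for the full ablation suite in shell-escaped form.
    print(shlex.join(build_command("all", steps=2000, seeds=[42, 123, 456])))
```

The returned list can be handed to `subprocess.run` directly, which avoids shell quoting issues with the seeds string.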
## Results

See `RESULTS.md` for collected tables.