yitongl committed on
Commit fda7c8f · verified · 1 Parent(s): 697fddf

Document inference code and training attention settings

Files changed (1): README.md (+12 -0)
README.md CHANGED
@@ -27,3 +27,15 @@ training resume state is required.
 checkpoint, including `SPARSE_FP4_OURS_P_ATTN`, its Triton forward/backward
 kernel, FP4 quant helpers, VSA metadata helper, backend wiring, and the exact
 SFT launch scripts.
+
+It also includes the inference entrypoint snapshot, an example inference script, and the training attention settings:
+
+- `backend_snapshot/scripts/inference/run_sfp4_ours_p_checkpoint_750.sh`
+- `backend_snapshot/training_attention_settings.json`
+
+Attention setup for this checkpoint:
+
+- self-attention: `SPARSE_FP4_OURS_P_ATTN`, FP4 Q/K/V, sparse 64-token VSA
+  tiles, group-local P quant, dropped-tile mean compensation
+- cross-attention: dense SDPA fallback, not FP4/sparse
+- force-dense paths: dense SDPA
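
The FP4 quant helpers themselves are not part of this diff. As a rough illustration of what group-local FP4 quantization means, here is a minimal per-group E2M1 fake-quant sketch; the function name, `group_size` default, and max-abs scaling rule are assumptions, not the snapshot's API:

```python
# A minimal sketch of group-local FP4 (E2M1) fake quantization. All names and
# the scaling rule are illustrative assumptions; the snapshot's real FP4 quant
# helpers are not shown in this commit.
import torch

# The positive magnitudes representable in E2M1 FP4 (sign handled separately).
FP4_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fp4_fake_quant(x: torch.Tensor, group_size: int = 64) -> torch.Tensor:
    """Quantize-dequantize x with one scale per contiguous group of values."""
    assert x.numel() % group_size == 0
    g = x.reshape(-1, group_size)
    # Map each group's max magnitude to the FP4 max (6.0).
    scale = g.abs().amax(dim=-1, keepdim=True).clamp_min(1e-12) / 6.0
    grid = FP4_GRID.to(device=x.device, dtype=x.dtype)
    scaled = g / scale
    # Snap each magnitude to the nearest FP4 grid point, keep the sign.
    idx = (scaled.abs().unsqueeze(-1) - grid).abs().argmin(dim=-1)
    return (grid[idx] * scaled.sign() * scale).reshape(x.shape)
```

A real FP4 kernel would pack two 4-bit codes per byte and carry the scales separately; the quantize-dequantize form above is only useful for checking numerics against a dense baseline.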
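Likewise, dropped-tile mean compensation is only named, not defined, in the diff. Below is a plausible dense reference (not the Triton kernel): each query attends exactly to its top-scoring 64-token tiles, while every dropped tile is collapsed to one mean-key/mean-value pseudo-token up-weighted by `log(tile)` so it counts like the tokens it replaces. The tile-selection rule and all names are assumptions:

```python
# Dense reference sketch for sparse tile attention with dropped-tile mean
# compensation. Assumes seq divisible by `tile` and keep_tiles <= n_tiles.
import math
import torch

def sparse_tile_attn_reference(q, k, v, tile=64, keep_tiles=8):
    """q, k, v: (seq, dim). One query row at a time, for clarity not speed."""
    s, d = k.shape
    n_tiles = s // tile
    kt = k.reshape(n_tiles, tile, d)
    vt = v.reshape(n_tiles, tile, d)
    k_mean, v_mean = kt.mean(dim=1), vt.mean(dim=1)   # one summary per tile

    # Assumed selection rule: rank tiles by similarity to the tile-mean key.
    keep = (q @ k_mean.T).topk(keep_tiles, dim=-1).indices

    scale = d ** -0.5
    out = torch.empty_like(q)
    for i in range(q.shape[0]):
        mask = torch.zeros(n_tiles, dtype=torch.bool, device=q.device)
        mask[keep[i]] = True
        k_exact = kt[mask].reshape(-1, d)   # kept tiles: exact tokens
        v_exact = vt[mask].reshape(-1, d)
        # Each dropped tile contributes one mean pseudo-token; adding
        # log(tile) to its logit makes it weigh like `tile` real tokens
        # in the softmax instead of just one.
        logits = torch.cat([
            (q[i] @ k_exact.T) * scale,
            (q[i] @ k_mean[~mask].T) * scale + math.log(tile),
        ])
        w = torch.softmax(logits, dim=-1)
        out[i] = w @ torch.cat([v_exact, v_mean[~mask]])
    return out
```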
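Finally, the backend wiring implied by the attention setup list can be read as a simple dispatch: self-attention goes to the sparse FP4 path, while cross-attention and force-dense paths fall back to dense SDPA. The sparse entrypoint below is a hypothetical stand-in, not the snapshot's real function:

```python
# Dispatch sketch matching the attention setup listed above.
import torch.nn.functional as F

def sparse_fp4_ours_p_attention(q, k, v):
    # Hypothetical placeholder for the snapshot's Triton kernel; the real
    # kernel (FP4 Q/K/V, sparse 64-token VSA tiles) is not shown here.
    return F.scaled_dot_product_attention(q, k, v)

def dispatch_attention(q, k, v, *, is_cross_attn: bool, force_dense: bool = False):
    if is_cross_attn or force_dense:
        # Cross-attention and force-dense paths stay dense: no FP4, no sparsity.
        return F.scaled_dot_product_attention(q, k, v)
    # Self-attention path: sparse FP4 backend.
    return sparse_fp4_ours_p_attention(q, k, v)
```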