Document inference code and training attention settings
README.md CHANGED
@@ -27,3 +27,15 @@ training resume state is required.
 checkpoint, including `SPARSE_FP4_OURS_P_ATTN`, its Triton forward/backward
 kernel, FP4 quant helpers, VSA metadata helper, backend wiring, and the exact
 SFT launch scripts.
+
+It also includes the inference entrypoint snapshot, an example launch script, and the recorded attention settings:
+
+- `backend_snapshot/scripts/inference/run_sfp4_ours_p_checkpoint_750.sh`
+- `backend_snapshot/training_attention_settings.json`
+
+Attention setup for this checkpoint:
+
+- self-attention: `SPARSE_FP4_OURS_P_ATTN`, FP4 Q/K/V, sparse 64-token VSA
+  tiles, group-local P quant, dropped-tile mean compensation
+- cross-attention: dense SDPA fallback, not FP4/sparse
+- force-dense paths: dense SDPA
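
The self-attention line above compresses several mechanisms into one bullet. As a rough illustration of the two easiest to misread, FP4 quantization of Q/K/V and dropped-tile mean compensation over 64-token VSA tiles, here is a dense PyTorch emulation of the idea. This is a sketch, not the Triton kernel: the E2M1 value grid, the group size of 16, the mean-query tile-scoring rule, and the 0.25 keep ratio are all assumptions, and the real kernel fuses all of this tile-wise.

```python
import math

import torch
import torch.nn.functional as F

# Representable magnitudes of the FP4 E2M1 format; assuming E2M1 is the
# FP4 variant the kernel uses.
E2M1 = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def fp4_fake_quant(x: torch.Tensor, group: int = 16) -> torch.Tensor:
    """Quantize-dequantize `x` to FP4 with a per-group absmax scale.

    group=16 is an assumption; "group-local P quant" in the README suggests
    the kernel applies the same per-group scaling to the attention
    probabilities P as well.
    """
    grid = E2M1.to(x.device, x.dtype)
    flat = x.reshape(-1, group)          # numel must be divisible by group
    scale = flat.abs().amax(dim=1, keepdim=True).clamp_min(1e-12) / 6.0
    mag = (flat / scale).abs()
    nearest = (mag.unsqueeze(-1) - grid).abs().argmin(dim=-1)
    return (grid[nearest] * flat.sign() * scale).reshape(x.shape)


def sparse_attn_mean_comp(q, k, v, tile=64, keep_ratio=0.25):
    """Sparse tile attention with dropped-tile mean compensation.

    q, k, v: [B, H, S, D], S divisible by `tile`. Kept tiles get exact
    attention; each dropped tile is replaced by one surrogate token (its
    mean key/value) whose logit is biased by log(tile) so it stands in
    for the whole tile's softmax mass.
    """
    B, H, S, D = k.shape
    n_tiles = S // tile
    kt = k.view(B, H, n_tiles, tile, D)
    vt = v.view(B, H, n_tiles, tile, D)
    k_mean, v_mean = kt.mean(dim=3), vt.mean(dim=3)  # [B, H, n_tiles, D]

    # Tile selection: score tiles against the mean query. This stands in
    # for the VSA metadata helper's real selection rule, which the commit
    # doesn't show.
    q_mean = q.mean(dim=2, keepdim=True)             # [B, H, 1, D]
    n_keep = max(1, int(n_tiles * keep_ratio))
    keep_idx = (q_mean * k_mean).sum(-1).topk(n_keep, dim=-1).indices

    idx = keep_idx[..., None, None].expand(B, H, n_keep, tile, D)
    k_keep = kt.gather(2, idx).reshape(B, H, n_keep * tile, D)
    v_keep = vt.gather(2, idx).reshape(B, H, n_keep * tile, D)

    # One mean-K/mean-V surrogate token per dropped tile.
    kept = torch.zeros(B, H, n_tiles, dtype=torch.bool, device=k.device)
    kept.scatter_(2, keep_idx, True)
    n_drop = n_tiles - n_keep
    drop_idx = (~kept).float().topk(n_drop, dim=-1).indices
    d_idx = drop_idx[..., None].expand(B, H, n_drop, D)
    k_all = torch.cat([k_keep, k_mean.gather(2, d_idx)], dim=2)
    v_all = torch.cat([v_keep, v_mean.gather(2, d_idx)], dim=2)

    # Additive logit bias: each surrogate stands in for `tile` real tokens.
    bias = q.new_zeros(B, H, 1, k_all.shape[2])
    bias[..., n_keep * tile:] = math.log(tile)
    return F.scaled_dot_product_attention(q, k_all, v_all, attn_mask=bias)


def sparse_fp4_attention(q, k, v, **kw):
    # FP4 Q/K/V as in the README: fake-quantize the inputs, then run the
    # sparse path. The real kernel keeps them in packed FP4.
    return sparse_attn_mean_comp(fp4_fake_quant(q), fp4_fake_quant(k),
                                 fp4_fake_quant(v), **kw)
```

The `log(tile)` bias is the compensation step: without it, one mean token would carry the softmax weight of a single token rather than of the 64 tokens it replaces.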
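
The backend wiring in the last three bullets then reduces to a small routing decision. A minimal sketch, assuming placeholder JSON keys (the real schema lives in `backend_snapshot/training_attention_settings.json` and is not shown in this commit) and reusing `sparse_fp4_attention` from the sketch above:

```python
import json

import torch.nn.functional as F

# Placeholder schema -- the real keys live in
# backend_snapshot/training_attention_settings.json.
with open("backend_snapshot/training_attention_settings.json") as fh:
    cfg = json.load(fh)


def run_attention(q, k, v, *, kind: str, force_dense: bool = False):
    """Route one attention call the way this checkpoint's README describes."""
    if force_dense or kind == "cross":
        # Cross-attention and force-dense paths: plain dense SDPA,
        # no FP4 quantization, no VSA sparsity.
        return F.scaled_dot_product_attention(q, k, v)
    # Self-attention: the sparse FP4 path (SPARSE_FP4_OURS_P_ATTN in the
    # real backend; sparse_fp4_attention is the emulation sketched above).
    return sparse_fp4_attention(
        q, k, v,
        tile=cfg.get("vsa_tile_size", 64),           # placeholder key
        keep_ratio=cfg.get("vsa_keep_ratio", 0.25),  # placeholder key
    )
```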