Document inference code and training attention settings
README.md CHANGED
@@ -27,3 +27,15 @@ training resume state is required.
 checkpoint, including `SPARSE_FP4_OURS_P_ATTN`, its Triton forward/backward
 kernel, FP4 quant helpers, VSA metadata helper, backend wiring, and the exact
 SFT launch scripts.
+
+It also includes the inference entrypoint snapshot, an example launch script, and the recorded attention settings:
+
+- `backend_snapshot/scripts/inference/run_sfp4_ours_p_checkpoint_750.sh`
+- `backend_snapshot/training_attention_settings.json`
+
+Attention setup for this checkpoint:
+
+- self-attention: `SPARSE_FP4_OURS_P_ATTN`, FP4 Q/K/V, sparse 64-token VSA
+  tiles, group-local P quant, dropped-tile mean compensation
+- cross-attention: dense SDPA fallback, not FP4/sparse
+- force-dense paths: dense SDPA
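
The self-attention line above compresses several mechanisms into one bullet. As a rough illustration of the two easiest to misread, FP4 quantization of Q/K/V and dropped-tile mean compensation over 64-token VSA tiles, here is a dense PyTorch emulation of the idea. This is a sketch, not the Triton kernel: the E2M1 value grid, the group size of 16, the mean-query tile-scoring rule, and the 0.25 keep ratio are all assumptions, and the real kernel fuses all of this tile-wise.

```python
import math

import torch
import torch.nn.functional as F

# Representable magnitudes of the FP4 E2M1 format; assuming E2M1 is the
# FP4 variant the kernel uses.
E2M1 = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def fp4_fake_quant(x: torch.Tensor, group: int = 16) -> torch.Tensor:
    """Quantize-dequantize `x` to FP4 with a per-group absmax scale.

    group=16 is an assumption; "group-local P quant" in the README suggests
    the kernel applies the same per-group scaling to the attention
    probabilities P as well.
    """
    grid = E2M1.to(x.device, x.dtype)
    flat = x.reshape(-1, group)          # numel must be divisible by group
    scale = flat.abs().amax(dim=1, keepdim=True).clamp_min(1e-12) / 6.0
    mag = (flat / scale).abs()
    nearest = (mag.unsqueeze(-1) - grid).abs().argmin(dim=-1)
    return (grid[nearest] * flat.sign() * scale).reshape(x.shape)


def sparse_attn_mean_comp(q, k, v, tile=64, keep_ratio=0.25):
    """Sparse tile attention with dropped-tile mean compensation.

    q, k, v: [B, H, S, D], S divisible by `tile`. Kept tiles get exact
    attention; each dropped tile is replaced by one surrogate token (its
    mean key/value) whose logit is biased by log(tile) so it stands in
    for the whole tile's softmax mass.
    """
    B, H, S, D = k.shape
    n_tiles = S // tile
    kt = k.view(B, H, n_tiles, tile, D)
    vt = v.view(B, H, n_tiles, tile, D)
    k_mean, v_mean = kt.mean(dim=3), vt.mean(dim=3)  # [B, H, n_tiles, D]

    # Tile selection: score tiles against the mean query. This stands in
    # for the VSA metadata helper's real selection rule, which the commit
    # doesn't show.
    q_mean = q.mean(dim=2, keepdim=True)             # [B, H, 1, D]
    n_keep = max(1, int(n_tiles * keep_ratio))
    keep_idx = (q_mean * k_mean).sum(-1).topk(n_keep, dim=-1).indices

    idx = keep_idx[..., None, None].expand(B, H, n_keep, tile, D)
    k_keep = kt.gather(2, idx).reshape(B, H, n_keep * tile, D)
    v_keep = vt.gather(2, idx).reshape(B, H, n_keep * tile, D)

    # One mean-K/mean-V surrogate token per dropped tile.
    kept = torch.zeros(B, H, n_tiles, dtype=torch.bool, device=k.device)
    kept.scatter_(2, keep_idx, True)
    n_drop = n_tiles - n_keep
    drop_idx = (~kept).float().topk(n_drop, dim=-1).indices
    d_idx = drop_idx[..., None].expand(B, H, n_drop, D)
    k_all = torch.cat([k_keep, k_mean.gather(2, d_idx)], dim=2)
    v_all = torch.cat([v_keep, v_mean.gather(2, d_idx)], dim=2)

    # Additive logit bias: each surrogate stands in for `tile` real tokens.
    bias = q.new_zeros(B, H, 1, k_all.shape[2])
    bias[..., n_keep * tile:] = math.log(tile)
    return F.scaled_dot_product_attention(q, k_all, v_all, attn_mask=bias)


def sparse_fp4_attention(q, k, v, **kw):
    # FP4 Q/K/V as in the README: fake-quantize the inputs, then run the
    # sparse path. The real kernel keeps them in packed FP4.
    return sparse_attn_mean_comp(fp4_fake_quant(q), fp4_fake_quant(k),
                                 fp4_fake_quant(v), **kw)
```

The `log(tile)` bias is the compensation step: without it, one mean token would carry the softmax weight of a single token rather than of the 64 tokens it replaces.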
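
The backend wiring in the last three bullets then reduces to a small routing decision. A minimal sketch, assuming placeholder JSON keys (the real schema lives in `backend_snapshot/training_attention_settings.json` and is not shown in this commit) and reusing `sparse_fp4_attention` from the sketch above:

```python
import json

import torch.nn.functional as F

# Placeholder schema -- the real keys live in
# backend_snapshot/training_attention_settings.json.
with open("backend_snapshot/training_attention_settings.json") as fh:
    cfg = json.load(fh)


def run_attention(q, k, v, *, kind: str, force_dense: bool = False):
    """Route one attention call the way this checkpoint's README describes."""
    if force_dense or kind == "cross":
        # Cross-attention and force-dense paths: plain dense SDPA,
        # no FP4 quantization, no VSA sparsity.
        return F.scaled_dot_product_attention(q, k, v)
    # Self-attention: the sparse FP4 path (SPARSE_FP4_OURS_P_ATTN in the
    # real backend; sparse_fp4_attention is the emulation sketched above).
    return sparse_fp4_attention(
        q, k, v,
        tile=cfg.get("vsa_tile_size", 64),           # placeholder key
        keep_ratio=cfg.get("vsa_keep_ratio", 0.25),  # placeholder key
    )
```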