Add tension heatmap documentation

Files changed (3)

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+docs/tension_heatmap.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -37,6 +37,16 @@ Safetensors is preferred for distribution because it is a data-only tensor conta
 The converted file was validated locally by reloading `model.safetensors` and checking every tensor against the original PyTorch state dictionary with exact tensor equality. If the original PyTorch checkpoint used shared tensor storage for tied weights, those state-dictionary entries are materialized as separate tensors in the safetensors file because raw safetensors state dictionaries do not encode Python storage aliasing. The matching model implementation should still apply its normal weight tying after `load_state_dict`.
+## Tension heatmap
+The image below is an average causal tension-field heatmap rendered from this safetensors checkpoint with the prompt:
+`If all mammals are warm blooded and all whales are mammals then`
+Rows are target tokens whose hidden states are being constrained. Columns are earlier source tokens contributing those constraints. Brighter cells indicate larger τ values, averaged across layers and heads. This is not a softmax attention map: source positions do not compete for a single probability mass, so several earlier tokens can stay bright for the same target token.
+![Average causal tension heatmap](docs/tension_heatmap.png)
 ## Loading the weights
 This repository contains a raw PyTorch state dictionary rather than a fully packaged Transformers model class. Load the safetensors file into the matching TensionLM model implementation used to train the checkpoint.

docs/tension_heatmap.png ADDED
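
The README text added in this change describes the heatmap as τ values averaged across layers and heads, with rows as target tokens and columns as earlier source tokens. A minimal sketch of that reduction follows. It is not the script used to render `docs/tension_heatmap.png`: the function name `average_causal_heatmap`, the `[layers, heads, T, T]` layout, the `include_self` flag, and the sigmoid used to generate demo scores are all assumptions made for illustration, since the TensionLM implementation is not shown here.

```python
import numpy as np


def average_causal_heatmap(tau, include_self=False):
    """Reduce per-layer, per-head tension scores to one [T, T] map.

    tau: array of shape [layers, heads, T, T] (assumed layout), where
    tau[l, h, i, j] is the score source token j contributes to target token i.
    Cells outside the causal region are set to NaN so a plot can leave them blank.
    """
    tau = np.asarray(tau, dtype=float)
    if tau.ndim != 4 or tau.shape[-1] != tau.shape[-2]:
        raise ValueError("expected shape [layers, heads, T, T]")
    seq_len = tau.shape[-1]

    # Mean over the layer and head axes leaves a [T, T] target-by-source grid.
    heat = tau.mean(axis=(0, 1))

    # Keep only sources at earlier positions (j < i), or j <= i if include_self.
    # Whether the diagonal belongs in the real heatmap is an assumption.
    offset = 0 if include_self else -1
    causal = np.tril(np.ones((seq_len, seq_len), dtype=bool), k=offset)
    return np.where(causal, heat, np.nan)


# Demo with synthetic scores; the sigmoid is only a stand-in for "each cell is
# scored independently", not a claim about how TensionLM computes tau.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 3, 5, 5))
tau = 1.0 / (1.0 + np.exp(-logits))
heat = average_causal_heatmap(tau)

# Unlike softmax attention, nothing forces a row to sum to 1.
row_sums = np.nansum(heat, axis=1)
```

The resulting array can be passed to any image-plotting routine. Because the scores in a row are not normalized against each other, `row_sums` varies from row to row, which matches the README's point that several source tokens can be bright for the same target.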