BoggersTheFish committed
Commit cc720c7 · verified · 1 Parent(s): 849d034

Add tension heatmap documentation

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +10 -0
  3. docs/tension_heatmap.png +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+docs/tension_heatmap.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -37,6 +37,16 @@ Safetensors is preferred for distribution because it is a data-only tensor conta
 
 The converted file was validated locally by reloading `model.safetensors` and checking every tensor against the original PyTorch state dictionary with exact tensor equality. If the original PyTorch checkpoint used shared tensor storage for tied weights, those state-dictionary entries are materialized as separate tensors in the safetensors file because raw safetensors state dictionaries do not encode Python storage aliasing. The matching model implementation should still apply its normal weight tying after `load_state_dict`.
 
+## Tension heatmap
+
+The image below is an average causal tension-field heatmap rendered from this safetensors checkpoint with the prompt:
+
+`If all mammals are warm blooded and all whales are mammals then`
+
+Rows are target tokens whose hidden states are being constrained. Columns are earlier source tokens contributing those constraints. Brighter cells indicate larger mean τ values averaged across layers and heads. This is not a softmax attention map: source positions do not compete for one probability mass, so multiple prior tokens can remain bright for the same target token.
+
+![Average causal tension heatmap](docs/tension_heatmap.png)
+
 ## Loading the weights
 
 This repository contains a raw PyTorch state dictionary rather than a fully packaged Transformers model class. Load the safetensors file into the matching TensionLM model implementation used to train the checkpoint.
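
Reviewer note: the added README paragraph describes two things that a small sketch may make concrete, namely the averaging over layers and heads and the contrast with softmax attention. The commit does not show how τ is computed or how the image was rendered, so everything below is an assumption made for illustration: the nested-list layout `tau[layer][head][target][source]`, the reading of "earlier" as strictly earlier, the helper names, and the use of an independent sigmoid as a stand-in for a non-competing per-source score.

```python
import math


def sigmoid(x):
    """Squash one score into (0, 1) independently of any other score."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    """Normalise scores so they compete for a total mass of 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mean_causal_map(tau):
    """Average tau[layer][head][target][source] over layers and heads.

    Only cells whose source index is strictly earlier than the target
    index are accumulated (an assumed reading of "earlier source tokens");
    every other cell is left at 0.
    """
    n_layers = len(tau)
    n_heads = len(tau[0])
    n_tokens = len(tau[0][0])
    count = n_layers * n_heads
    heat = [[0.0] * n_tokens for _ in range(n_tokens)]
    for layer in tau:
        for head in layer:
            for t in range(n_tokens):
                for s in range(t):  # sources strictly before the target
                    heat[t][s] += head[t][s] / count
    return heat


# Hypothetical raw scores from one target token to three earlier sources.
scores = [2.0, 2.0, 2.0]
independent = [sigmoid(x) for x in scores]  # each value can stay high
competing = softmax(scores)                 # values must share a mass of 1

# Tiny made-up tensor: 2 layers, 1 head, 2 tokens.
toy_tau = [
    [[[0.9, 0.9], [0.2, 0.9]]],
    [[[0.1, 0.1], [0.6, 0.3]]],
]
toy_heat = mean_causal_map(toy_tau)
```

With equal scores, the softmax row is forced to split its mass evenly, while the independent values all stay large. That is the behaviour the paragraph attributes to the tension map, where several prior tokens can be bright for the same target.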
docs/tension_heatmap.png ADDED

Git LFS Details

  • SHA256: 0702225832ecdbf52167b84b690f2a7c80e3e24476352822e96aa7cada8d7e51
  • Pointer size: 131 Bytes
  • Size of remote file: 109 kB
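
Reviewer note: the LFS details above can be checked against a downloaded copy of the image. The sketch below assumes the repository stores a standard three-line Git LFS pointer (`version`, `oid sha256:…`, `size`). The helper names are hypothetical, and the demo uses placeholder bytes instead of the real PNG because the exact byte count is not shown on this page.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and SHA-256 named by the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError("only sha256 oids are handled in this sketch")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Placeholder payload standing in for docs/tension_heatmap.png.
payload = b"placeholder bytes standing in for the PNG"
pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
    f"size {len(payload)}\n"
)
pointer = parse_lfs_pointer(pointer_text)
ok = matches_pointer(payload, pointer)
```

For the real file, the digest to compare against is the SHA256 listed above. Under the same layout assumption, the 131-byte pointer size is consistent with a six-digit byte count: the version line is 43 bytes, the oid line is 76 bytes, and the remaining 12 bytes hold `size `, six digits, and a newline. A six-digit count matches the reported 109 kB.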