pointbreak3000/gpt2-clt-layer0

@@ -10,26 +10,25 @@ tags:
 # GPT-2 Cross-Layer Transcoder — Layer 0
-Trained to reconstruct MLP output at layer 0 of GPT-2 from the residual stream input.
 ## Architecture
 - **Input**: residual stream before layer 0 (d_model=768)
 - **Output**: MLP output at layer 0 (d_model=768)
-- **Features**: 4096 (JumpReLU sparse)
 - **Training data**: WikiText-103 (5,000 documents, seq_len=64)
 ## Metrics
-- **R²**: 0.8907
-- **Dead features**: 105/4096
-- **Training steps**: 2000
 ## Usage
 ```python
-import torch
-import json
 from huggingface_hub import hf_hub_download
-# Download
 weights_path = hf_hub_download("pointbreak3000/gpt2-clt-layer0", "model.pt")
 config_path  = hf_hub_download("pointbreak3000/gpt2-clt-layer0", "config.json")
@@ -40,8 +39,3 @@ clt = ProperCLT(d_model=config["d_model"], n_features=config["n_features"])
 clt.load_state_dict(torch.load(weights_path, map_location="cpu"))
 clt.eval()
 ```
-## Purpose
-Part of an experiment extending the circuit-tracing paper
-(Ameisen et al. 2025) to include attention circuit attribution
-via feature-space Jacobians.

 # GPT-2 Cross-Layer Transcoder — Layer 0
+Trained to reconstruct MLP output at layer 0 of GPT-2 from the residual stream.
 ## Architecture
 - **Input**: residual stream before layer 0 (d_model=768)
 - **Output**: MLP output at layer 0 (d_model=768)
+- **Features**: 8192 (JumpReLU sparse)
 - **Training data**: WikiText-103 (5,000 documents, seq_len=64)
 ## Metrics
+- **R²**: 0.9409
+- **Dead features**: 38/8192
+- **Training steps**: 5000
+- **Sparsity coef**: 0.02
 ## Usage
 ```python
+import torch, json
 from huggingface_hub import hf_hub_download
 weights_path = hf_hub_download("pointbreak3000/gpt2-clt-layer0", "model.pt")
 config_path  = hf_hub_download("pointbreak3000/gpt2-clt-layer0", "config.json")
@@ -40,8 +39,3 @@ clt = ProperCLT(d_model=config["d_model"], n_features=config["n_features"])
 clt.load_state_dict(torch.load(weights_path, map_location="cpu"))
 clt.eval()
 ```
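
The card reports R² and a dead-feature count and names a JumpReLU activation, but the diff does not show how any of these are computed. The sketch below gives one common reading of each, in plain Python so the arithmetic is easy to follow. It assumes R² means fraction of variance explained (1 - SS_res / SS_tot), that a "dead" feature is one that never activates on the evaluation tokens, and that JumpReLU passes a pre-activation through only when it exceeds a threshold. The function names, the single scalar threshold, and the toy numbers are all hypothetical and are not taken from this repository; a real evaluation would presumably use tensors and a learned per-feature threshold.

```python
from typing import List

Matrix = List[List[float]]  # rows = tokens, columns = dimensions or features


def jumprelu(pre_acts: Matrix, threshold: float) -> Matrix:
    """Keep a pre-activation only if it exceeds the threshold, else emit 0.

    Assumption: one scalar threshold shared by all features.
    """
    return [[v if v > threshold else 0.0 for v in row] for row in pre_acts]


def r_squared(target: Matrix, recon: Matrix) -> float:
    """Fraction of variance explained: 1 - SS_res / SS_tot.

    SS_tot is measured around the per-dimension mean of the target.
    """
    n_rows, n_cols = len(target), len(target[0])
    means = [sum(row[j] for row in target) / n_rows for j in range(n_cols)]
    ss_res = sum(
        (t - r) ** 2
        for t_row, r_row in zip(target, recon)
        for t, r in zip(t_row, r_row)
    )
    ss_tot = sum((row[j] - means[j]) ** 2 for row in target for j in range(n_cols))
    return 1.0 - ss_res / ss_tot


def count_dead(acts: Matrix) -> int:
    """Count feature columns that are zero on every row (never fire)."""
    n_cols = len(acts[0])
    return sum(1 for j in range(n_cols) if all(row[j] == 0.0 for row in acts))


# Toy data (hypothetical): 2 tokens, 3 features, 2 output dimensions.
pre = [[0.5, -1.0, 0.01], [2.0, 0.02, 0.0]]
acts = jumprelu(pre, threshold=0.03)

target = [[1.0, 2.0], [3.0, 4.0]]
recon = [[1.0, 2.0], [3.0, 3.0]]

print(acts)                       # only feature 0 survives the threshold
print(count_dead(acts))           # features 1 and 2 never fire
print(r_squared(target, recon))   # one of four entries is off by 1
```

Under these definitions, the new side's `38/8192` would mean 38 feature columns stayed at zero across the evaluation set, and `0.9409` would mean the reconstruction leaves about 6% of the MLP output variance unexplained.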