pointbreak3000 committed on
Commit 83fd8a0 · verified · 1 Parent(s): 90728ad

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +7 -13
README.md CHANGED
@@ -10,26 +10,25 @@ tags:
 
 # GPT-2 Cross-Layer Transcoder — Layer 0
 
-Trained to reconstruct MLP output at layer 0 of GPT-2 from the residual stream input.
+Trained to reconstruct MLP output at layer 0 of GPT-2 from the residual stream.
 
 ## Architecture
 - **Input**: residual stream before layer 0 (d_model=768)
 - **Output**: MLP output at layer 0 (d_model=768)
-- **Features**: 4096 (JumpReLU sparse)
+- **Features**: 8192 (JumpReLU sparse)
 - **Training data**: WikiText-103 (5,000 documents, seq_len=64)
 
 ## Metrics
-- **R²**: 0.8907
-- **Dead features**: 105/4096
-- **Training steps**: 2000
+- **R²**: 0.9409
+- **Dead features**: 38/8192
+- **Training steps**: 5000
+- **Sparsity coef**: 0.02
 
 ## Usage
 ```python
-import torch
-import json
+import torch, json
 from huggingface_hub import hf_hub_download
 
-# Download
 weights_path = hf_hub_download("pointbreak3000/gpt2-clt-layer0", "model.pt")
 config_path = hf_hub_download("pointbreak3000/gpt2-clt-layer0", "config.json")
 
@@ -40,8 +39,3 @@ clt = ProperCLT(d_model=config["d_model"], n_features=config["n_features"])
 clt.load_state_dict(torch.load(weights_path, map_location="cpu"))
 clt.eval()
 ```
-
-## Purpose
-Part of an experiment extending the circuit-tracing paper
-(Ameisen et al. 2025) to include attention circuit attribution
-via feature-space Jacobians.
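
The README in this diff reports JumpReLU features, an R² score, and a dead-feature count, but the commit does not show how any of them were computed. The sketch below is one plausible reading in plain Python, not the author's code. The function names `jump_relu`, `r_squared`, and `count_dead_features` are hypothetical. It assumes R² is pooled over all output dimensions against the per-dimension mean, and that a feature is "dead" if it never activates on the evaluation sample.

```python
def jump_relu(pre_acts, thresholds):
    """JumpReLU: keep a pre-activation unchanged if it exceeds its
    per-feature threshold, otherwise output zero."""
    return [z if z > t else 0.0 for z, t in zip(pre_acts, thresholds)]


def r_squared(targets, preds):
    """1 - SS_res / SS_tot, pooled over all vectors and dimensions.
    SS_tot is measured around the per-dimension mean of the targets
    (an assumption; the README does not state the exact variant)."""
    n, dim = len(targets), len(targets[0])
    means = [sum(row[j] for row in targets) / n for j in range(dim)]
    ss_res = sum((t - p) ** 2
                 for t_row, p_row in zip(targets, preds)
                 for t, p in zip(t_row, p_row))
    ss_tot = sum((t - means[j]) ** 2
                 for t_row in targets
                 for j, t in enumerate(t_row))
    return 1.0 - ss_res / ss_tot


def count_dead_features(activations):
    """Number of features that are zero on every row of `activations`
    (one row of feature activations per token)."""
    n_features = len(activations[0])
    return sum(1 for k in range(n_features)
               if all(row[k] == 0.0 for row in activations))


# Toy example: 3 features with a shared threshold of 1.0, two tokens.
thresholds = [1.0, 1.0, 1.0]
acts = [jump_relu([0.5, 1.2, -0.3], thresholds),
        jump_relu([0.9, 2.0, 0.1], thresholds)]
# acts == [[0.0, 1.2, 0.0], [0.0, 2.0, 0.0]]; features 0 and 2 never fire.
dead = count_dead_features(acts)

# A perfect reconstruction gives R² = 1; predicting the mean gives R² = 0.
targets = [[1.0, 2.0], [3.0, 4.0]]
perfect = r_squared(targets, targets)
baseline = r_squared(targets, [[2.0, 3.0], [2.0, 3.0]])
```

Under these assumptions, the new figure "38/8192" would mean 38 of the 8192 features never crossed their threshold on the evaluation tokens, and an R² of 0.9409 would mean the transcoder explains about 94% of the variance in the layer-0 MLP output.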