luciaquirke committed
Commit af6f73e · verified · 1 Parent(s): 42047e5

Upload affine transforms with MSE metrics

Files changed (3)
  1. README.md +57 -0
  2. affine_transforms.safetensors +3 -0
  3. metadata.json +12 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ tags:
+ - affine-transform
+ - activation-mapping
+ library_name: safetensors
+ ---
+
+ # Affine Transform: EleutherAI/deep-ignorance-pretraining-stage-unfiltered@global_step54832 → EleutherAI/deep-ignorance-unfiltered@main
+
+ Learned affine transformation mapping hidden state activations from a source checkpoint to a target model.
+
+ ## Usage
+
+ ```python
+ from safetensors.torch import load_file
+ import torch.nn as nn
+ from huggingface_hub import hf_hub_download
+
+ # Download files
+ weights_path = hf_hub_download(repo_id="EleutherAI/affine-checkpoint-transfer-step54832", filename="affine_transforms.safetensors")
+ metadata_path = hf_hub_download(repo_id="EleutherAI/affine-checkpoint-transfer-step54832", filename="metadata.json")
+
+ # Load
+ import json
+ with open(metadata_path) as f:
+     metadata = json.load(f)
+
+ weights = load_file(weights_path)
+ affine_transforms = {}
+ for layer_idx in metadata["layer_indices"]:
+     linear = nn.Linear(metadata["hidden_dim"], metadata["hidden_dim"], bias=True)
+     linear.weight.data = weights[f"layer_{layer_idx}.weight"]
+     linear.bias.data = weights[f"layer_{layer_idx}.bias"]
+     affine_transforms[layer_idx] = linear
+ ```
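The usage snippet builds one `nn.Linear` per layer but stops before applying it. A minimal sketch of the application step follows, assuming source activations arrive as a tensor of shape `(batch, seq_len, hidden_dim)`. The README does not say how activations are extracted, so that shape is an assumption. `apply_affine` is a hypothetical helper name, and a randomly initialised stand-in layer with a small `hidden_dim` replaces the downloaded weights so the sketch runs offline.

```python
import torch
import torch.nn as nn


def apply_affine(linear: nn.Linear, hidden: torch.Tensor) -> torch.Tensor:
    """Map source-checkpoint hidden states through one layer's affine transform."""
    with torch.no_grad():
        # Cast to the transform's dtype so half-precision activations do not error.
        return linear(hidden.to(linear.weight.dtype))


# Stand-in for affine_transforms[layer_idx]; the published transforms use hidden_dim=4096.
hidden_dim = 8
linear = nn.Linear(hidden_dim, hidden_dim, bias=True)

# Stand-in for source activations, assumed shape (batch, seq_len, hidden_dim).
source_hidden = torch.randn(2, 5, hidden_dim)
mapped = apply_affine(linear, source_hidden)

# nn.Linear computes x @ W.T + b over the last dimension.
expected = source_hidden @ linear.weight.T + linear.bias
```

With the real weights, `linear` would be `affine_transforms[layer_idx]` and `source_hidden` would come from the matching layer of the source checkpoint.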
+
+ ## MSE Metrics
+
+ | Layer | MSE |
+ |-------|-----|
+ | 5 | 0.061545 |
+ | 10 | 0.225621 |
+ | 15 | 0.552396 |
+ | 20 | 0.800604 |
+ | 25 | 1.295939 |
+ | 30 | 3.541077 |
+
+ **Mean MSE: 1.079530**
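As a quick consistency check, the reported mean can be recomputed from the six per-layer values in the table. The short sketch below does only that arithmetic; it also shows that layer 30 contributes more than half of the summed error, so the mean is dominated by the deepest layer.

```python
# Per-layer MSE values copied from the table above.
layer_mse = {
    5: 0.061545,
    10: 0.225621,
    15: 0.552396,
    20: 0.800604,
    25: 1.295939,
    30: 3.541077,
}

total = sum(layer_mse.values())
mean_mse = total / len(layer_mse)
layer_30_share = layer_mse[30] / total  # fraction of the summed error from layer 30

print(f"{mean_mse:.6f}")  # 1.079530
```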
+
+ ## Training Details
+
+ - **Source Model:** EleutherAI/deep-ignorance-pretraining-stage-unfiltered@global_step54832
+ - **Target Model:** EleutherAI/deep-ignorance-unfiltered@main
+ - **Hidden Dimension:** 4096
+ - **Ridge Alpha:** 0.01
+ - **Layers:** [5, 10, 15, 20, 25, 30]
+ - **Training Examples:** 100000
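The README lists a ridge alpha but not the fitting procedure. One way an affine map of this kind could be fit is closed-form ridge regression on paired source and target activations. The sketch below illustrates that idea under stated assumptions and is not the authors' training code: `fit_affine_ridge` is a hypothetical name, the bias is left unpenalised by centring the data, and synthetic data with small dimensions stands in for real activations.

```python
import numpy as np


def fit_affine_ridge(X, Y, alpha=0.01):
    """Closed-form ridge fit of Y ~= X @ W.T + b, with the bias left unpenalised."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    d = X.shape[1]
    # Solve (Xc^T Xc + alpha * I) A = Xc^T Yc, where A has shape (d_in, d_out).
    A = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ Yc)
    W = A.T  # (d_out, d_in), the layout nn.Linear uses for .weight
    b = y_mean - W @ x_mean
    return W, b


# Synthetic paired "activations"; the README reports 100000 examples at dimension 4096.
rng = np.random.default_rng(0)
n, d = 2000, 16
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, d))
b_true = rng.normal(size=d)
Y = X @ W_true.T + b_true + 0.01 * rng.normal(size=(n, d))

W, b = fit_affine_ridge(X, Y, alpha=0.01)

# Mean squared error of the fitted map, averaged over examples and dimensions.
mse = float(np.mean((X @ W.T + b - Y) ** 2))
```

Whether the published transforms penalise the bias, or scale alpha by the number of examples, is not stated, so both choices here are illustrative.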
affine_transforms.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9c3034a34ce55a3ffc65aff2f5a5f376c76e3fefae0a5462df78f412373b391d
+ size 402752528
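The pointer's `size` field can be cross-checked against the README's hidden dimension and layer list. The arithmetic below assumes each layer stores one 4096×4096 weight and one 4096-element bias in float32; the README does not state the dtype, so that is an assumption. Under it, the tensors account for all but 1040 bytes of the file, which is plausible for the length prefix and JSON header that a safetensors file carries before its tensor data.

```python
num_layers = 6           # layers [5, 10, 15, 20, 25, 30]
hidden_dim = 4096
bytes_per_param = 4      # assumption: float32 storage

params_per_layer = hidden_dim * hidden_dim + hidden_dim  # weight + bias
tensor_bytes = num_layers * params_per_layer * bytes_per_param

file_size = 402752528    # "size" field of the Git LFS pointer
header_bytes = file_size - tensor_bytes

print(params_per_layer)  # 16781312
print(tensor_bytes)      # 402751488
print(header_bytes)      # 1040
```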
metadata.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "layer_indices": [
+     5,
+     10,
+     15,
+     20,
+     25,
+     30
+   ],
+   "hidden_dim": 4096,
+   "alpha": 0.01
+ }
+ }