maximuspowers committed (verified)
Commit 08226c4 · 1 Parent(s): 2d17341

Upload weight-space autoencoder (encoder + decoder) and configuration

Files changed (5)
  1. README.md +42 -0
  2. config.yaml +122 -0
  3. decoder.pt +3 -0
  4. encoder.pt +3 -0
  5. tokenizer_config.json +7 -0
README.md ADDED
@@ -0,0 +1,42 @@
+ ---
+ tags:
+ - weight-space-learning
+ - neural-network-autoencoder
+ - autoencoder
+ - transformer
+ datasets:
+ - maximuspowers/muat-fourier-5
+ ---
+
+ # Weight-Space Autoencoder (Transformer)
+
+ This model is a weight-space autoencoder trained on neural network activation weights/signatures.
+ It includes both an encoder (compresses weights into latent representations) and a decoder (reconstructs weights from latent codes).
+
+ ## Model Description
+
+ - **Architecture**: Transformer encoder-decoder
+ - **Training Dataset**: maximuspowers/muat-fourier-5
+ - **Input Mode**: signature
+ - **Latent Dimension**: 256
+
+ ## Tokenization
+
+ - **Chunk Size**: 64 weight values per token
+ - **Max Tokens**: 512
+ - **Metadata**: True
+
+ ## Training Config
+
+ - **Loss Function**: contrastive
+ - **Optimizer**: adam
+ - **Learning Rate**: 0.0001
+ - **Batch Size**: 8
+
+ ## Performance Metrics (Test Set)
+
+ - **MSE**: 0.088272
+ - **MAE**: 0.213148
+ - **RMSE**: 0.297107
+ - **Cosine Similarity**: 0.8509
+ - **R² Score**: 0.7242
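The tokenization scheme described in the README can be sketched in plain Python. This is a minimal, hypothetical illustration (`tokenize_signature` is an invented name; the real tokenizer's padding and metadata layout may differ), showing how a flat signature vector becomes up to 512 tokens of dimension 64 + 5 = 69:

```python
def tokenize_signature(weights, chunk_size=64, max_tokens=512, metadata=None):
    """Chunk a flat weight/signature vector into fixed-size tokens (sketch).

    Each token is one chunk of `chunk_size` values, zero-padded at the end;
    when metadata is given, it is appended to every token, matching
    token_dim = chunk_size + metadata_features (64 + 5 = 69 here).
    Mirrors the documented tokenizer settings, not the actual code.
    """
    metadata = metadata or []
    tokens = []
    for start in range(0, len(weights), chunk_size):
        chunk = list(weights[start:start + chunk_size])
        chunk += [0.0] * (chunk_size - len(chunk))  # pad the final chunk
        tokens.append(chunk + list(metadata))
    return tokens[:max_tokens]                      # truncate to max_tokens

# a 100-value signature yields 2 tokens, each of dimension 69
toks = tokenize_signature(list(range(100)), metadata=[1, 2, 3, 4, 5])
print(len(toks), len(toks[0]))  # 2 69
```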
config.yaml ADDED
@@ -0,0 +1,122 @@
+ architecture:
+   latent_dim: 256
+   mlp:
+     decoder:
+       activation: relu
+       batch_norm: true
+       dropout: 0.2
+       hidden_dims:
+       - 256
+       - 384
+       - 512
+     encoder:
+       activation: relu
+       batch_norm: true
+       dropout: 0.2
+       hidden_dims:
+       - 512
+       - 384
+       - 256
+     token_pooling: mean
+   transformer:
+     decoder:
+       activation: relu
+       d_model: 512
+       dim_feedforward: 2048
+       dropout: 0.1
+       num_heads: 8
+       num_layers: 6
+     encoder:
+       activation: relu
+       d_model: 512
+       dim_feedforward: 2048
+       dropout: 0.1
+       num_heads: 8
+       num_layers: 6
+     pooling: mean
+     positional_encoding: learned
+   type: transformer
+ dataloader:
+   num_workers: 0
+   pin_memory: true
+ dataset:
+   hf_dataset: maximuspowers/muat-fourier-5
+   input_mode: signature
+   max_dimensions:
+     max_hidden_layers: 6
+     max_neurons_per_layer: 8
+     max_sequence_length: 5
+   neuron_profile:
+     methods:
+     - fourier
+   random_seed: 42
+   test_split: 0.1
+   train_split: 0.8
+   val_split: 0.1
+ device:
+   type: auto
+ evaluation:
+   metrics:
+   - mse
+   - mae
+   - rmse
+   - cosine_similarity
+   - relative_error
+   - r2_score
+   per_layer_metrics: false
+ hub:
+   enabled: true
+   private: false
+   push_logs: true
+   push_metrics: true
+   push_model: true
+   repo_id: maximuspowers/sig-autoencoder-fourier-5-simclr-mse
+   token: <REDACTED>
+ logging:
+   checkpoint:
+     enabled: true
+     mode: min
+     monitor: val_loss
+     save_best_only: true
+   tensorboard:
+     auto_launch: true
+     enabled: true
+     log_interval: 10
+     port: 6006
+   verbose: true
+ loss:
+   augmentation_type: noise
+   contrast_type: simclr
+   dropout_prob: 0.1
+   gamma: 0.4
+   noise_std: 0.01
+   projection_head:
+     hidden_dim: 256
+     input_dim: 256
+     output_dim: 128
+   reconstruction_type: mse
+   temperature: 0.1
+   type: contrastive
+ run_dir: /Users/max/Desktop/muat/model_zoo/runs/train-encoder-decoder_config_2025-12-11_16-05-33
+ run_log_cleanup: false
+ tokenization:
+   chunk_size: 64
+   include_metadata: true
+   max_tokens: 512
+ training:
+   batch_size: 8
+   early_stopping:
+     enabled: true
+     mode: min
+     monitor: val_loss
+     patience: 5
+   epochs: 100
+   learning_rate: 0.0001
+   lr_scheduler:
+     enabled: true
+     factor: 0.5
+     min_lr: 1.0e-06
+     patience: 3
+     type: reduce_on_plateau
+   optimizer: adam
+   weight_decay: 0.0001
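The `loss` block names a contrastive objective (`contrast_type: simclr`, `temperature: 0.1`) alongside MSE reconstruction (`reconstruction_type: mse`, `gamma: 0.4`). A minimal pure-Python sketch of what such a combined loss could look like — the role of `gamma` as the contrastive weight and the exact combination are assumptions, and the real implementation surely runs on batched tensors through the projection head:

```python
import math

def nt_xent(z1, z2, temperature=0.1):
    """SimCLR NT-Xent loss over paired latent vectors (pure-Python sketch).

    z1[i] and z2[i] are two augmented views of the same sample; each view's
    positive is its counterpart, and all other views act as negatives.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(x * x for x in b)))

    z = z1 + z2                       # 2N views
    n2 = len(z)
    loss = 0.0
    for i in range(n2):
        j = (i + len(z1)) % n2        # index of the positive pair
        pos = math.exp(cos(z[i], z[j]) / temperature)
        denom = sum(math.exp(cos(z[i], z[k]) / temperature)
                    for k in range(n2) if k != i)
        loss += -math.log(pos / denom)
    return loss / n2

def combined_loss(recon_mse, z1, z2, gamma=0.4):
    # assumed weighting: gamma scales the contrastive term against MSE
    return recon_mse + gamma * nt_xent(z1, z2)
```

With perfectly aligned view pairs the contrastive term is near zero, so `combined_loss` reduces to (approximately) the reconstruction MSE.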
decoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:591946e91abb22e3de70e569745392acb6bcbd6fd7cf4a7abda07f71f8b86f01
+ size 102658318
encoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0702f10bfe8da77b1af3c044704fd597145b295a0acf3b0689b1251f5cf38c1d
+ size 77406188
tokenizer_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "chunk_size": 64,
+   "max_tokens": 512,
+   "include_metadata": true,
+   "metadata_features": 5,
+   "token_dim": 69
+ }
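The tokenizer config ties the numbers together: each 69-dim token is one 64-value weight chunk plus 5 metadata features. A quick stdlib check of that invariant, and of the largest flattened signature a single sequence can carry (assuming no special tokens — an assumption, since the config does not mention any):

```python
import json

cfg = json.loads("""{
  "chunk_size": 64,
  "max_tokens": 512,
  "include_metadata": true,
  "metadata_features": 5,
  "token_dim": 69
}""")

# invariant: token_dim = chunk_size + metadata_features (64 + 5 = 69)
assert cfg["token_dim"] == cfg["chunk_size"] + cfg["metadata_features"]

# maximum number of raw weight values one sequence can represent
max_weights = cfg["chunk_size"] * cfg["max_tokens"]
print(max_weights)  # 32768
```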