maximuspowers committed on
Commit 857f6a8 · verified · 1 parent: d6f25d2

Upload weight-space autoencoder (encoder + decoder) and configuration

Files changed (5)
  1. README.md +42 -0
  2. config.yaml +111 -0
  3. decoder.pt +3 -0
  4. encoder.pt +3 -0
  5. tokenizer_config.json +7 -0
README.md ADDED
@@ -0,0 +1,42 @@
+ ---
+ tags:
+ - weight-space-learning
+ - neural-network-autoencoder
+ - autoencoder
+ - transformer
+ datasets:
+ - maximuspowers/muat-fourier-5
+ ---
+
+ # Weight-Space Autoencoder (Transformer)
+
+ This model is a weight-space autoencoder trained on neural-network weight signatures.
+ It includes both an encoder (which compresses weights into latent representations) and a decoder (which reconstructs weights from latent codes).
+
+ ## Model Description
+
+ - **Architecture**: Transformer encoder-decoder
+ - **Training Dataset**: maximuspowers/muat-fourier-5
+ - **Input Mode**: signature
+ - **Latent Dimension**: 256
+
+ ## Tokenization
+
+ - **Chunk Size**: 64 weight values per token
+ - **Max Tokens**: 512
+ - **Metadata**: True
+
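The chunking scheme above can be sketched in a few lines. This is a hypothetical illustration (function and variable names are not from the actual codebase): a flat weight vector is split into 64-value chunks, the final partial chunk is zero-padded, and the sequence is capped at 512 tokens.

```python
CHUNK_SIZE = 64   # weight values per token (from the tokenizer config)
MAX_TOKENS = 512  # sequence-length cap

def chunk_weights(weights, chunk_size=CHUNK_SIZE, max_tokens=MAX_TOKENS):
    """Split a flat list of weight values into fixed-size token chunks."""
    tokens = []
    for start in range(0, len(weights), chunk_size):
        chunk = weights[start:start + chunk_size]
        # Zero-pad the final partial chunk so every token is uniform.
        chunk = chunk + [0.0] * (chunk_size - len(chunk))
        tokens.append(chunk)
    return tokens[:max_tokens]  # truncate overly long weight vectors

tokens = chunk_weights([0.1] * 150)  # 150 values -> 3 tokens (64 + 64 + 22 padded)
```

Per `tokenizer_config.json`, each chunk is then extended with 5 metadata features, giving the final token dimension of 64 + 5 = 69.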
+ ## Training Config
+
+ - **Loss Function**: cosine
+ - **Optimizer**: adam
+ - **Learning Rate**: 0.0001
+ - **Batch Size**: 16
+
+ ## Performance Metrics (Test Set)
+
+ - **MSE**: 0.299696
+ - **MAE**: 0.303521
+ - **RMSE**: 0.547445
+ - **Cosine Similarity**: 0.8642
+ - **R² Score**: 0.0638
config.yaml ADDED
@@ -0,0 +1,111 @@
+ architecture:
+   latent_dim: 256
+   mlp:
+     decoder:
+       activation: relu
+       batch_norm: true
+       dropout: 0.2
+       hidden_dims:
+       - 256
+       - 384
+       - 512
+     encoder:
+       activation: relu
+       batch_norm: true
+       dropout: 0.2
+       hidden_dims:
+       - 512
+       - 384
+       - 256
+     token_pooling: mean
+   transformer:
+     decoder:
+       activation: relu
+       d_model: 512
+       dim_feedforward: 2048
+       dropout: 0.1
+       num_heads: 8
+       num_layers: 6
+     encoder:
+       activation: relu
+       d_model: 512
+       dim_feedforward: 2048
+       dropout: 0.1
+       num_heads: 8
+       num_layers: 6
+     pooling: mean
+     positional_encoding: learned
+   type: transformer
+ dataloader:
+   num_workers: 0
+   pin_memory: true
+ dataset:
+   hf_dataset: maximuspowers/muat-fourier-5
+   input_mode: signature
+   max_dimensions:
+     max_hidden_layers: 6
+     max_neurons_per_layer: 8
+     max_sequence_length: 5
+   neuron_profile:
+     methods:
+     - fourier
+   random_seed: 42
+   test_split: 0.1
+   train_split: 0.8
+   val_split: 0.1
+ device:
+   type: auto
+ evaluation:
+   metrics:
+   - mse
+   - mae
+   - rmse
+   - cosine_similarity
+   - relative_error
+   - r2_score
+   per_layer_metrics: false
+ hub:
+   enabled: true
+   private: false
+   push_logs: true
+   push_metrics: true
+   push_model: true
+   repo_id: maximuspowers/sig-autoencoder-fourier-5
+   token: <REDACTED>
+ logging:
+   checkpoint:
+     enabled: true
+     mode: min
+     monitor: val_loss
+     save_best_only: true
+   tensorboard:
+     auto_launch: true
+     enabled: true
+     log_interval: 10
+     port: 6006
+   verbose: true
+ loss:
+   type: cosine
+ run_dir: /Users/max/Desktop/muat/model_zoo/runs/train-encoder-decoder_config_2025-12-10_13-14-11
+ run_log_cleanup: false
+ tokenization:
+   chunk_size: 64
+   include_metadata: true
+   max_tokens: 512
+ training:
+   batch_size: 16
+   early_stopping:
+     enabled: true
+     mode: min
+     monitor: val_loss
+     patience: 5
+   epochs: 100
+   learning_rate: 0.0001
+   lr_scheduler:
+     enabled: true
+     factor: 0.5
+     min_lr: 1.0e-06
+     patience: 3
+     type: reduce_on_plateau
+   optimizer: adam
+   weight_decay: 0.0001
decoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a45a2663fc4303166220a9800ddc9166f0649d6709331b5a8b0207c5d8ae41cb
+ size 102657998
encoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9deab4af43381f8b2516638e872c770b3b7fee2f44e57a1c86f0b4e4b8fce98e
+ size 77405804
tokenizer_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "chunk_size": 64,
+   "max_tokens": 512,
+   "include_metadata": true,
+   "metadata_features": 5,
+   "token_dim": 69
+ }