IlayMalinyak committed · Commit 946e951 · 1 Parent(s): ef81795

add pretrained model
pretrained_models/MultiTaskRegressor_spectra__decode_1_complete_config.yaml ADDED
@@ -0,0 +1,138 @@
+ conformer_args:
+   dropout_p: 0.2
+   encoder:
+   - mhsa_pro
+   - conv
+   - ffn
+   encoder_dim: 2048
+   kernel_size: 3
+   norm: postnorm
+   num_heads: 8
+   num_layers: 8
+   timeshift: false
+   beta: 0.866 # Calculated as (num_layers/6)^(-0.5) for 8 layers
+ data_args:
+   batch_size: 64
+   continuum_norm: true
+   create_umap: false
+   data_dir: /data/lamost/data
+   dataset: SpectraDataset
+   exp_num: 1
+   lc_freq: 0.0208
+   log_dir: /data/lightSpec/logs
+   max_days_lc: 720
+   max_len_spectra: 4096
+   model_name: MultiTaskRegressor
+   num_epochs: 1000
+   test_run: false
+ model_args:
+   activation: silu
+   avg_output: false
+   beta: 1
+   checkpoint_num: 1
+   checkpoint_path: /data/lightSpec/logs/spec_decode2_2025-01-11/MultiTaskRegressor_spectra_decode_1.pth
+   dropout_p: 0.2
+   encoder_dims:
+   - 64
+   - 128
+   - 256
+   - 1024
+   - 2048
+   in_channels: 1
+   kernel_size: 3
+   load_checkpoint: false
+   num_layers: 5
+   num_quantiles: 5
+   output_dim: 3
+   stride: 1
+   transformer_layers: 4
+ model_name: MultiTaskRegressor
+ model_structure: "DistributedDataParallel(\n (module): MultiTaskRegressor(\n (encoder):\
+ \ MultiEncoder(\n (backbone): CNNEncoder(\n (activation): SiLU()\n \
+ \ (embedding): Sequential(\n (0): Conv1d(1, 64, kernel_size=(3,),\
+ \ stride=(1,), padding=same, bias=False)\n (1): BatchNorm1d(64, eps=1e-05,\
+ \ momentum=0.1, affine=True, track_running_stats=True)\n (2): SiLU()\n\
+ \ )\n (layers): ModuleList(\n (0): ConvBlock(\n \
+ \ (activation): SiLU()\n (layers): Sequential(\n (0):\
+ \ Conv1d(64, 128, kernel_size=(3,), stride=(1,), padding=same, bias=False)\n \
+ \ (1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n )\n (1): ConvBlock(\n\
+ \ (activation): SiLU()\n (layers): Sequential(\n \
+ \ (0): Conv1d(128, 256, kernel_size=(3,), stride=(1,), padding=same, bias=False)\n\
+ \ (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n )\n (2): ConvBlock(\n\
+ \ (activation): SiLU()\n (layers): Sequential(\n \
+ \ (0): Conv1d(256, 1024, kernel_size=(3,), stride=(1,), padding=same, bias=False)\n\
+ \ (1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n )\n (3): ConvBlock(\n\
+ \ (activation): SiLU()\n (layers): Sequential(\n \
+ \ (0): Conv1d(1024, 2048, kernel_size=(3,), stride=(1,), padding=same, bias=False)\n\
+ \ (1): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n )\n (4): ConvBlock(\n\
+ \ (activation): SiLU()\n (layers): Sequential(\n \
+ \ (0): Conv1d(2048, 2048, kernel_size=(3,), stride=(1,), padding=same, bias=False)\n\
+ \ (1): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n )\n )\n (pool):\
+ \ MaxPool1d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n \
+ \ )\n (pe): RotaryEmbedding()\n (encoder): ConformerEncoder(\n \
+ \ (blocks): ModuleList(\n (0-7): 8 x ConformerBlock(\n (modlist):\
+ \ ModuleList(\n (0): PostNorm(\n (module): MHA_rotary(\n\
+ \ (query): Linear(in_features=2048, out_features=2048, bias=True)\n\
+ \ (key): Linear(in_features=2048, out_features=2048, bias=True)\n\
+ \ (value): Linear(in_features=2048, out_features=2048, bias=True)\n\
+ \ (rotary_emb): RotaryEmbedding()\n (output):\
+ \ Linear(in_features=2048, out_features=2048, bias=True)\n )\n \
+ \ (norm): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)\n\
+ \ )\n (1): PostNorm(\n (module): ConvBlock(\n\
+ \ (layers): Sequential(\n (0): Conv1d(2048,\
+ \ 2048, kernel_size=(3,), stride=(1,), padding=same, bias=False)\n \
+ \ (1): BatchNorm1d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n )\n \
+ \ (norm): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)\n \
+ \ )\n (2): PostNorm(\n (module): FeedForwardModule(\n\
+ \ (sequential): Sequential(\n (0): LayerNorm((2048,),\
+ \ eps=1e-05, elementwise_affine=True)\n (1): Linear(\n \
+ \ (linear): Linear(in_features=2048, out_features=8192, bias=True)\n\
+ \ )\n (2): SiLU()\n (3):\
+ \ Dropout(p=0.2, inplace=False)\n (4): Linear(\n \
+ \ (linear): Linear(in_features=8192, out_features=2048, bias=True)\n \
+ \ )\n (5): Dropout(p=0.2, inplace=False)\n \
+ \ )\n )\n (norm): LayerNorm((2048,),\
+ \ eps=1e-05, elementwise_affine=True)\n )\n )\n \
+ \ )\n )\n )\n )\n (decoder): CNNDecoder(\n (activation):\
+ \ SiLU()\n (initial_expand): Linear(in_features=2048, out_features=8192, bias=True)\n\
+ \ (layers): ModuleList(\n (0): Sequential(\n (0): ConvTranspose1d(2048,\
+ \ 1024, kernel_size=(4,), stride=(2,), padding=(1,), bias=False)\n (1):\
+ \ BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n (1): Sequential(\n (0): ConvTranspose1d(1024,\
+ \ 256, kernel_size=(4,), stride=(2,), padding=(1,), bias=False)\n (1):\
+ \ BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n (2): Sequential(\n (0): ConvTranspose1d(256,\
+ \ 128, kernel_size=(4,), stride=(2,), padding=(1,), bias=False)\n (1):\
+ \ BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n )\n (3): Sequential(\n (0): ConvTranspose1d(128,\
+ \ 64, kernel_size=(4,), stride=(2,), padding=(1,), bias=False)\n (1): BatchNorm1d(64,\
+ \ eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n (2):\
+ \ SiLU()\n )\n (4): Sequential(\n (0): ConvTranspose1d(64,\
+ \ 64, kernel_size=(4,), stride=(2,), padding=(1,), bias=False)\n (1): BatchNorm1d(64,\
+ \ eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n (2):\
+ \ SiLU()\n )\n )\n (final_conv): ConvTranspose1d(64, 1, kernel_size=(3,),\
+ \ stride=(1,), padding=(1,))\n )\n (activation): SiLU()\n (regressor):\
+ \ Sequential(\n (0): Linear(in_features=2048, out_features=1024, bias=True)\n\
+ \ (1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n\
+ \ (2): SiLU()\n (3): Dropout(p=0.2, inplace=False)\n (4): Linear(in_features=1024,\
+ \ out_features=15, bias=True)\n )\n )\n)"
+ num_params: 551944464
+ optim_args:
+   max_lr: 2e-5
+   quantiles:
+   - 0.1
+   - 0.25
+   - 0.5
+   - 0.75
+   - 0.9
+   steps_per_epoch: 3500
+   warmup_pct: 0.3
+   weight_decay: 5e-6
+ transforms: "Compose(\n LAMOSTSpectrumPreprocessor(blue_range=(3841, 5800), red_range=(5800,\
+ \ 8798), resample_step=0.0001)\n ToTensor\n)"
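Taken together, `output_dim: 3`, `num_quantiles: 5`, the five `quantiles` levels, and the regressor head ending in `Linear(in_features=1024, out_features=15)` indicate the model predicts 5 quantiles for each of 3 regression targets. A minimal NumPy sketch of the pinball (quantile) loss such a head is typically trained with; the function name and sample data are illustrative, not taken from the repository:

```python
import numpy as np

# Quantile levels copied from optim_args in the config above.
QUANTILES = [0.1, 0.25, 0.5, 0.75, 0.9]

def pinball_loss(y_true, y_pred, quantiles=QUANTILES):
    """Pinball (quantile) loss averaged over samples and quantile levels.

    y_true: shape (N,) targets; y_pred: shape (N, Q) predicted quantiles.
    Under-prediction is penalized by q, over-prediction by (1 - q), which
    pushes each output column toward its quantile level.
    """
    q = np.asarray(quantiles)            # (Q,)
    err = y_true[:, None] - y_pred       # (N, Q)
    return float(np.mean(np.maximum(q * err, (q - 1.0) * err)))

# Sanity check: output_dim * num_quantiles matches the 15-unit regressor head.
assert 3 * len(QUANTILES) == 15
```

With perfect predictions the loss is zero; the asymmetric penalty is what makes each of the 15 outputs converge to a distinct quantile rather than the mean.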
pretrained_models/spectra.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c1b00ff8c3ab815ed26942d238e71cebd3165e9bd468cca7461ff1351f408fda
+ size 2208107954
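The weights file itself is not stored in the repository: the three lines above are a Git LFS pointer (spec v1) recording the object's SHA-256 digest and its size in bytes (about 2.2 GB). A minimal sketch of parsing such a pointer file:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS v1 pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer stored as pretrained_models/spectra.pth in this commit.
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:c1b00ff8c3ab815ed26942d238e71cebd3165e9bd468cca7461ff1351f408fda\n"
    "size 2208107954\n"
)
info = parse_lfs_pointer(pointer)
algorithm, _, digest = info["oid"].partition(":")  # "sha256" and a 64-hex-char digest
size_gb = int(info["size"]) / 1e9                  # about 2.2 GB
```

Cloning without `git lfs install` (or downloading the raw file instead of the resolved one) yields this small pointer text rather than the checkpoint, so `torch.load` on it will fail.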