Upload folder using huggingface_hub
- CALF/PEMS03_sl96_pl192_gpt12/checkpoint.pth +3 -0
- CALF/PEMS03_sl96_pl192_gpt12/log.txt +23 -0
- CALF/PEMS03_sl96_pl336_gpt12/checkpoint.pth +3 -0
- CALF/PEMS03_sl96_pl336_gpt12/log.txt +22 -0
- CALF/traffic_sl96_pl720_gpt12/checkpoint.pth +3 -0
- CALF/traffic_sl96_pl720_gpt12/log.txt +13 -0
- CALF/traffic_sl96_pl96_gpt12/checkpoint.pth +3 -0
- CALF/traffic_sl96_pl96_gpt12/log.txt +12 -0
- FSCA/ETTm2_96/checkpoint.pth +3 -0
- FSCA/Electricity_96/checkpoint.pth +3 -0
- FSCA/Solar_96/checkpoint.pth +3 -0
- FSCA/weather_96/checkpoint.pth +3 -0
- OFA/PEMS04_336/checkpoint-65624/pytorch_model.bin +3 -0
- OFA/PEMS04_336/checkpoint-65624/trainer_state.json +1142 -0
- OFA/PEMS04_336/checkpoint-65624/training_args.bin +3 -0
- OFA/Solar_192/checkpoint-14556/pytorch_model.bin +3 -0
- OFA/Solar_192/checkpoint-14556/trainer_state.json +330 -0
- OFA/Solar_192/checkpoint-14556/training_args.bin +3 -0
- OFA/exchange_rate_192/checkpoint-299/pytorch_model.bin +3 -0
- OFA/exchange_rate_192/checkpoint-299/trainer_state.json +57 -0
- OFA/exchange_rate_192/checkpoint-299/training_args.bin +3 -0
- OFA/weather_720/checkpoint-368/pytorch_model.bin +3 -0
- OFA/weather_720/checkpoint-368/trainer_state.json +57 -0
- OFA/weather_720/checkpoint-368/training_args.bin +3 -0
- TimeLLM/ETTm1_512_192_TimeLLM_ETTm1_sl512_pl192_dm32_nh8_df128/checkpoint.pth +3 -0
- TimeLLM/PEMS07_512_336_TimeLLM_PEMS07_sl512_pl336_dm16_nh8_df32/checkpoint.pth +3 -0
- TimeLLM/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/checkpoint.pth +3 -0
- TimeLLM/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/log.txt +28 -0
- TimeLLM/electricity_512_192_TimeLLM_electricity_sl512_pl192_dm16_nh8_df32/checkpoint.pth +3 -0
CALF/PEMS03_sl96_pl192_gpt12/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0e9523a29e41abdf6d4cfd48d79b51becbf0b1e0fe249664a00ceac7dff3333d
+size 1090570197
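The `checkpoint.pth` entries in this commit are Git LFS pointer files, not the weights themselves: three key/value lines (`version`, `oid`, `size`) per the git-lfs pointer spec. A minimal sketch of parsing one — the `parse_lfs_pointer` helper is hypothetical, not part of this repo:

```python
# Sketch: parse a Git LFS pointer file like the ones in this commit.
# Pointer files are plain text, one "key value" pair per line.
def parse_lfs_pointer(text: str) -> dict:
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),  # hex digest of the real file
        "size": int(fields["size"]),                   # real file size in bytes
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:0e9523a29e41abdf6d4cfd48d79b51becbf0b1e0fe249664a00ceac7dff3333d
size 1090570197
"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # 1090570197 (~1.0 GiB)
```

The `size` field explains why each checkpoint shows only `+3` lines in the diff: Git stores the tiny pointer, while the ~1 GB payload lives in LFS storage.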
CALF/PEMS03_sl96_pl192_gpt12/log.txt
ADDED
@@ -0,0 +1,23 @@
+>>>>>>>start training>>>>>>>>>>>>>>
+Namespace(is_training=1, model_id='PEMS03_96_192', model='CALF', data='PEMS03', features='M', target='OT', freq='h', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/', seq_len=96, pred_len=192, d_model=768, d_ff=768, dropout=0.3, embed='timeF', num_workers=20, train_epochs=100, batch_size=512, patience=3, learning_rate=0.001, lradj='type1', task_loss='smooth_l1', feature_loss='smooth_l1', output_loss='smooth_l1', tmax=20, cos=1, r=8, lora_alpha=32, lora_dropout=0.1, word_embedding_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/wte_pca_500.pt', task_w=1.0, feature_w=0.01, output_w=1.0, gpt_layers=12, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/test_metrics_path/96_192.txt', multi=0, block_or_sublayer='no', load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/PEMS03_sl96_pl192_gpt12/log.txt')
+Epoch: 1 cost time: 1213.5586137771606Epoch: 1, Steps: 12627 | Train Loss: 0.2405325 Vali Loss: 0.4243636lr = 0.0009045095Validation loss decreased (inf --> 0.424364). Saving model ...
+Epoch: 2 cost time: 1214.5051691532135Epoch: 2, Steps: 12627 | Train Loss: 0.1846573 Vali Loss: 0.4036770lr = 0.0006545120Validation loss decreased (0.424364 --> 0.403677). Saving model ...
+Epoch: 3 cost time: 1216.8372721672058Epoch: 3, Steps: 12627 | Train Loss: 0.1736200 Vali Loss: 0.3934794lr = 0.0003454980Validation loss decreased (0.403677 --> 0.393479). Saving model ...
+Epoch: 4 cost time: 1209.1885206699371Epoch: 4, Steps: 12627 | Train Loss: 0.1654215 Vali Loss: 0.3756141lr = 0.0000955005Validation loss decreased (0.393479 --> 0.375614). Saving model ...
+Epoch: 5 cost time: 1221.2066152095795Epoch: 5, Steps: 12627 | Train Loss: 0.1578077 Vali Loss: 0.3667139lr = 0.0000000100Validation loss decreased (0.375614 --> 0.366714). Saving model ...
+Epoch: 6 cost time: 1215.5747256278992Epoch: 6, Steps: 12627 | Train Loss: 0.1550683 Vali Loss: 0.3650614lr = 0.0000955005Validation loss decreased (0.366714 --> 0.365061). Saving model ...
+Epoch: 7 cost time: 1222.5011975765228Epoch: 7, Steps: 12627 | Train Loss: 0.1548908 Vali Loss: 0.3642367lr = 0.0003454980Validation loss decreased (0.365061 --> 0.364237). Saving model ...
+Epoch: 8 cost time: 1218.6213192939758Epoch: 8, Steps: 12627 | Train Loss: 0.1566171 Vali Loss: 0.3638320lr = 0.0006545120Validation loss decreased (0.364237 --> 0.363832). Saving model ...
+Epoch: 9 cost time: 1217.3795993328094Epoch: 9, Steps: 12627 | Train Loss: 0.1578000 Vali Loss: 0.3680637lr = 0.0009045095EarlyStopping counter: 1 out of 3
+Epoch: 10 cost time: 1219.9044802188873Epoch: 10, Steps: 12627 | Train Loss: 0.1567791 Vali Loss: 0.3629943lr = 0.0010000000Validation loss decreased (0.363832 --> 0.362994). Saving model ...
+Epoch: 11 cost time: 1219.7232382297516Epoch: 11, Steps: 12627 | Train Loss: 0.1545983 Vali Loss: 0.3605110lr = 0.0009045095Validation loss decreased (0.362994 --> 0.360511). Saving model ...
+Epoch: 12 cost time: 1214.502541065216Epoch: 12, Steps: 12627 | Train Loss: 0.1509807 Vali Loss: 0.3564376lr = 0.0006545120Validation loss decreased (0.360511 --> 0.356438). Saving model ...
+Epoch: 13 cost time: 1218.4142220020294Epoch: 13, Steps: 12627 | Train Loss: 0.1465258 Vali Loss: 0.3490480lr = 0.0003454980Validation loss decreased (0.356438 --> 0.349048). Saving model ...
+Epoch: 14 cost time: 1216.8537709712982Epoch: 14, Steps: 12627 | Train Loss: 0.1419440 Vali Loss: 0.3439195lr = 0.0000955005Validation loss decreased (0.349048 --> 0.343919). Saving model ...
+Epoch: 15 cost time: 1218.0101554393768Epoch: 15, Steps: 12627 | Train Loss: 0.1383972 Vali Loss: 0.3422484lr = 0.0000000100Validation loss decreased (0.343919 --> 0.342248). Saving model ...
+Epoch: 16 cost time: 1222.7025911808014Epoch: 16, Steps: 12627 | Train Loss: 0.1374614 Vali Loss: 0.3408269lr = 0.0000955005Validation loss decreased (0.342248 --> 0.340827). Saving model ...
+Epoch: 17 cost time: 1218.2776758670807Epoch: 17, Steps: 12627 | Train Loss: 0.1378242 Vali Loss: 0.3398700lr = 0.0003454980Validation loss decreased (0.340827 --> 0.339870). Saving model ...
+Epoch: 18 cost time: 1214.6523876190186Epoch: 18, Steps: 12627 | Train Loss: 0.1396362 Vali Loss: 0.3457815lr = 0.0006545120EarlyStopping counter: 1 out of 3
+Epoch: 19 cost time: 1215.1394522190094Epoch: 19, Steps: 12627 | Train Loss: 0.1422571 Vali Loss: 0.3470419lr = 0.0009045095EarlyStopping counter: 2 out of 3
+Epoch: 20 cost time: 1215.5405178070068Epoch: 20, Steps: 12627 | Train Loss: 0.1439515 Vali Loss: 0.3507722lr = 0.0010000000EarlyStopping counter: 3 out of 3
+Early stopping>>>>>>>testing>>>>>>>>>>>>>>
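The epoch lines in these logs are written without separating newlines, so one diff line carries the timing, losses, learning rate, and early-stopping message together. A sketch of pulling the key numbers back out with a regex — the pattern is an assumption based only on the lines shown here:

```python
import re

# Sketch: extract (epoch, train loss, vali loss) from a run-on CALF log line.
# The pattern assumes the exact field order seen in the logs above.
LINE_RE = re.compile(
    r"Epoch: (\d+) cost time: [\d.]+"      # "Epoch: 1 cost time: 1213.55..."
    r"Epoch: \d+, Steps: \d+ \| "          # "...Epoch: 1, Steps: 12627 | "
    r"Train Loss: ([\d.]+) Vali Loss: ([\d.]+)"
)

line = ("Epoch: 1 cost time: 1213.5586137771606Epoch: 1, Steps: 12627 | "
        "Train Loss: 0.2405325 Vali Loss: 0.4243636lr = 0.0009045095"
        "Validation loss decreased (inf --> 0.424364). Saving model ...")
epoch, train, vali = LINE_RE.match(line).groups()
print(epoch, train, vali)  # 1 0.2405325 0.4243636
```

Applied over a whole log file this recovers the loss curve, e.g. to confirm the best validation loss before early stopping fired.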
CALF/PEMS03_sl96_pl336_gpt12/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fde94b09516373bbaf8b1651e40a6fe4c09d400e21a0dc5c6f5d8363931198e1
+size 1091013141
CALF/PEMS03_sl96_pl336_gpt12/log.txt
ADDED
@@ -0,0 +1,22 @@
+>>>>>>>start training>>>>>>>>>>>>>>
+Namespace(is_training=1, model_id='PEMS03_96_336', model='CALF', data='PEMS03', features='M', target='OT', freq='h', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/', seq_len=96, pred_len=336, d_model=768, d_ff=768, dropout=0.3, embed='timeF', num_workers=20, train_epochs=100, batch_size=512, patience=3, learning_rate=0.001, lradj='type1', task_loss='smooth_l1', feature_loss='smooth_l1', output_loss='smooth_l1', tmax=20, cos=1, r=8, lora_alpha=32, lora_dropout=0.1, word_embedding_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/wte_pca_500.pt', task_w=1.0, feature_w=0.01, output_w=1.0, gpt_layers=12, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/test_metrics_path/96_336.txt', multi=0, block_or_sublayer='no', load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/PEMS03_sl96_pl336_gpt12/log.txt')
+Epoch: 1 cost time: 1228.0863962173462Epoch: 1, Steps: 12526 | Train Loss: 0.1989904 Vali Loss: 0.3550510lr = 0.0009045095Validation loss decreased (inf --> 0.355051). Saving model ...
+Epoch: 2 cost time: 1205.9942526817322Epoch: 2, Steps: 12526 | Train Loss: 0.1547784 Vali Loss: 0.3398961lr = 0.0006545120Validation loss decreased (0.355051 --> 0.339896). Saving model ...
+Epoch: 3 cost time: 1219.171292066574Epoch: 3, Steps: 12526 | Train Loss: 0.1466388 Vali Loss: 0.3313598lr = 0.0003454980Validation loss decreased (0.339896 --> 0.331360). Saving model ...
+Epoch: 4 cost time: 1218.2594282627106Epoch: 4, Steps: 12526 | Train Loss: 0.1410800 Vali Loss: 0.3234892lr = 0.0000955005Validation loss decreased (0.331360 --> 0.323489). Saving model ...
+Epoch: 5 cost time: 1217.5273847579956Epoch: 5, Steps: 12526 | Train Loss: 0.1370407 Vali Loss: 0.3151883lr = 0.0000000100Validation loss decreased (0.323489 --> 0.315188). Saving model ...
+Epoch: 6 cost time: 1220.6260316371918Epoch: 6, Steps: 12526 | Train Loss: 0.1349816 Vali Loss: 0.3143618lr = 0.0000955005Validation loss decreased (0.315188 --> 0.314362). Saving model ...
+Epoch: 7 cost time: 1219.025268316269Epoch: 7, Steps: 12526 | Train Loss: 0.1350453 Vali Loss: 0.3129805lr = 0.0003454980Validation loss decreased (0.314362 --> 0.312980). Saving model ...
+Epoch: 8 cost time: 1220.592013835907Epoch: 8, Steps: 12526 | Train Loss: 0.1358405 Vali Loss: 0.3148754lr = 0.0006545120EarlyStopping counter: 1 out of 3
+Epoch: 9 cost time: 1220.2405898571014Epoch: 9, Steps: 12526 | Train Loss: 0.1357594 Vali Loss: 0.3106849lr = 0.0009045095Validation loss decreased (0.312980 --> 0.310685). Saving model ...
+Epoch: 10 cost time: 1222.7074172496796Epoch: 10, Steps: 12526 | Train Loss: 0.1350104 Vali Loss: 0.3121745lr = 0.0010000000EarlyStopping counter: 1 out of 3
+Epoch: 11 cost time: 1220.4871294498444Epoch: 11, Steps: 12526 | Train Loss: 0.1332038 Vali Loss: 0.3067603lr = 0.0009045095Validation loss decreased (0.310685 --> 0.306760). Saving model ...
+Epoch: 12 cost time: 1214.7389903068542Epoch: 12, Steps: 12526 | Train Loss: 0.1303220 Vali Loss: 0.3014657lr = 0.0006545120Validation loss decreased (0.306760 --> 0.301466). Saving model ...
+Epoch: 13 cost time: 1224.0501599311829Epoch: 13, Steps: 12526 | Train Loss: 0.1271204 Vali Loss: 0.2972590lr = 0.0003454980Validation loss decreased (0.301466 --> 0.297259). Saving model ...
+Epoch: 14 cost time: 1222.7768981456757Epoch: 14, Steps: 12526 | Train Loss: 0.1238745 Vali Loss: 0.2942495lr = 0.0000955005Validation loss decreased (0.297259 --> 0.294249). Saving model ...
+Epoch: 15 cost time: 1225.6895768642426Epoch: 15, Steps: 12526 | Train Loss: 0.1216603 Vali Loss: 0.2913513lr = 0.0000000100Validation loss decreased (0.294249 --> 0.291351). Saving model ...
+Epoch: 16 cost time: 1216.2601499557495Epoch: 16, Steps: 12526 | Train Loss: 0.1207567 Vali Loss: 0.2911408lr = 0.0000955005Validation loss decreased (0.291351 --> 0.291141). Saving model ...
+Epoch: 17 cost time: 1222.1440660953522Epoch: 17, Steps: 12526 | Train Loss: 0.1208047 Vali Loss: 0.2921268lr = 0.0003454980EarlyStopping counter: 1 out of 3
+Epoch: 18 cost time: 1219.8968963623047Epoch: 18, Steps: 12526 | Train Loss: 0.1223113 Vali Loss: 0.2934834lr = 0.0006545120EarlyStopping counter: 2 out of 3
+Epoch: 19 cost time: 1224.0830509662628Epoch: 19, Steps: 12526 | Train Loss: 0.1239117 Vali Loss: 0.2936475lr = 0.0009045095EarlyStopping counter: 3 out of 3
+Early stopping>>>>>>>testing>>>>>>>>>>>>>>
CALF/traffic_sl96_pl720_gpt12/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e051c17a4676999ab8d29b31fa984319dad9a4cdf81e53d8687c1348fe06754
+size 1092194325
CALF/traffic_sl96_pl720_gpt12/log.txt
ADDED
@@ -0,0 +1,13 @@
+>>>>>>>start training>>>>>>>>>>>>>>
+Namespace(is_training=1, model_id='traffic_96_720', model='CALF', data='traffic', features='M', target='OT', freq='h', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/', seq_len=96, pred_len=720, d_model=768, d_ff=768, dropout=0.3, embed='timeF', num_workers=20, train_epochs=100, batch_size=512, patience=3, learning_rate=0.001, lradj='type1', task_loss='smooth_l1', feature_loss='smooth_l1', output_loss='smooth_l1', tmax=20, cos=1, r=8, lora_alpha=32, lora_dropout=0.1, word_embedding_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/wte_pca_500.pt', task_w=1.0, feature_w=0.01, output_w=1.0, gpt_layers=12, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/test_metrics_path/96_720.txt', multi=0, block_or_sublayer='no', load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/traffic_sl96_pl720_gpt12/log.txt')
+Epoch: 1 cost time: 1706.1240828037262Epoch: 1, Steps: 19303 | Train Loss: 0.1389736 Vali Loss: 0.4617771lr = 0.0009045095Validation loss decreased (inf --> 0.461777). Saving model ...
+Epoch: 2 cost time: 1691.999606847763Epoch: 2, Steps: 19303 | Train Loss: 0.1101452 Vali Loss: 0.4579828lr = 0.0006545120Validation loss decreased (0.461777 --> 0.457983). Saving model ...
+Epoch: 3 cost time: 1694.3127541542053Epoch: 3, Steps: 19303 | Train Loss: 0.1069227 Vali Loss: 0.4562370lr = 0.0003454980Validation loss decreased (0.457983 --> 0.456237). Saving model ...
+Epoch: 4 cost time: 1685.3278629779816Epoch: 4, Steps: 19303 | Train Loss: 0.1050713 Vali Loss: 0.4541570lr = 0.0000955005Validation loss decreased (0.456237 --> 0.454157). Saving model ...
+Epoch: 5 cost time: 1705.2223370075226Epoch: 5, Steps: 19303 | Train Loss: 0.1038729 Vali Loss: 0.4523360lr = 0.0000000100Validation loss decreased (0.454157 --> 0.452336). Saving model ...
+Epoch: 6 cost time: 1688.0082385540009Epoch: 6, Steps: 19303 | Train Loss: 0.1035937 Vali Loss: 0.4521805lr = 0.0000955005Validation loss decreased (0.452336 --> 0.452181). Saving model ...
+Epoch: 7 cost time: 1696.0097732543945Epoch: 7, Steps: 19303 | Train Loss: 0.1033980 Vali Loss: 0.4520068lr = 0.0003454980Validation loss decreased (0.452181 --> 0.452007). Saving model ...
+Epoch: 8 cost time: 1705.5962662696838Epoch: 8, Steps: 19303 | Train Loss: 0.1036567 Vali Loss: 0.4533293lr = 0.0006545120EarlyStopping counter: 1 out of 3
+Epoch: 9 cost time: 1708.6191654205322Epoch: 9, Steps: 19303 | Train Loss: 0.1039235 Vali Loss: 0.4534400lr = 0.0009045095EarlyStopping counter: 2 out of 3
+Epoch: 10 cost time: 1716.0039064884186Epoch: 10, Steps: 19303 | Train Loss: 0.1037365 Vali Loss: 0.4530395lr = 0.0010000000EarlyStopping counter: 3 out of 3
+Early stopping>>>>>>>testing>>>>>>>>>>>>>>
CALF/traffic_sl96_pl96_gpt12/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd53739b9377458f71a36d60e6d70a31f0d2566752372fc58c57b329bf9c9d2c
+size 1090274901
CALF/traffic_sl96_pl96_gpt12/log.txt
ADDED
@@ -0,0 +1,12 @@
+>>>>>>>start training>>>>>>>>>>>>>>
+Namespace(is_training=1, model_id='traffic_96_96', model='CALF', data='traffic', features='M', target='OT', freq='h', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/', seq_len=96, pred_len=96, d_model=768, d_ff=768, dropout=0.3, embed='timeF', num_workers=20, train_epochs=100, batch_size=512, patience=3, learning_rate=0.001, lradj='type1', task_loss='smooth_l1', feature_loss='smooth_l1', output_loss='smooth_l1', tmax=20, cos=1, r=8, lora_alpha=32, lora_dropout=0.1, word_embedding_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/wte_pca_500.pt', task_w=1.0, feature_w=0.01, output_w=1.0, gpt_layers=12, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/test_metrics_path/96_96.txt', multi=0, block_or_sublayer='no', load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/traffic_sl96_pl96_gpt12/log.txt')
+Epoch: 1 cost time: 1765.7570405006409Epoch: 1, Steps: 20353 | Train Loss: 0.1241191 Vali Loss: 0.4174732lr = 0.0009045095Validation loss decreased (inf --> 0.417473). Saving model ...
+Epoch: 2 cost time: 1800.9101555347443Epoch: 2, Steps: 20353 | Train Loss: 0.0977185 Vali Loss: 0.4121245lr = 0.0006545120Validation loss decreased (0.417473 --> 0.412125). Saving model ...
+Epoch: 3 cost time: 1806.9728829860687Epoch: 3, Steps: 20353 | Train Loss: 0.0939899 Vali Loss: 0.4083741lr = 0.0003454980Validation loss decreased (0.412125 --> 0.408374). Saving model ...
+Epoch: 4 cost time: 1796.8674626350403Epoch: 4, Steps: 20353 | Train Loss: 0.0918263 Vali Loss: 0.4060673lr = 0.0000955005Validation loss decreased (0.408374 --> 0.406067). Saving model ...
+Epoch: 5 cost time: 1797.072764635086Epoch: 5, Steps: 20353 | Train Loss: 0.0907320 Vali Loss: 0.4039504lr = 0.0000000100Validation loss decreased (0.406067 --> 0.403950). Saving model ...
+Epoch: 6 cost time: 1812.847502231598Epoch: 6, Steps: 20353 | Train Loss: 0.0902542 Vali Loss: 0.4035453lr = 0.0000955005Validation loss decreased (0.403950 --> 0.403545). Saving model ...
+Epoch: 7 cost time: 1838.9145092964172Epoch: 7, Steps: 20353 | Train Loss: 0.0903154 Vali Loss: 0.4039853lr = 0.0003454980EarlyStopping counter: 1 out of 3
+Epoch: 8 cost time: 1832.54545378685Epoch: 8, Steps: 20353 | Train Loss: 0.0908403 Vali Loss: 0.4056821lr = 0.0006545120EarlyStopping counter: 2 out of 3
+Epoch: 9 cost time: 1830.4206190109253Epoch: 9, Steps: 20353 | Train Loss: 0.0911053 Vali Loss: 0.4059569lr = 0.0009045095EarlyStopping counter: 3 out of 3
+Early stopping>>>>>>>testing>>>>>>>>>>>>>>
FSCA/ETTm2_96/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:859c5748206934aaac6e851a0cd0384d33944ff164848650735f68a05a8a85cb
+size 552849378
FSCA/Electricity_96/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dbfef391437cc4efc56bfadebc64763f1d85404336e1744f4e07ba3dfe12a969
+size 553242594
FSCA/Solar_96/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c19ef293164bbcd336d7af250fef398a71cde93e7380189ba074fb4344d6fb1
+size 553242594
FSCA/weather_96/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1a69d95cea6110fb596ac710742e01f219cb8e20d22678e3f70e3823e59a1be6
+size 553242594
OFA/PEMS04_336/checkpoint-65624/pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:221dba394b4fab3926f62dd85f226fbeab914ffb70e9a93524aa149c89cbdd80
+size 270628842
OFA/PEMS04_336/checkpoint-65624/trainer_state.json
ADDED
@@ -0,0 +1,1142 @@
+{
+  "best_global_step": 65624,
+  "best_metric": 0.4831537902355194,
+  "best_model_checkpoint": "/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/OFA_layer/haspara/336/PEMS04/checkpoint-65624",
+  "epoch": 13.0,
+  "eval_steps": 500,
+  "global_step": 65624,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.09904912836767037,
+      "grad_norm": 0.06085604056715965,
+      "learning_rate": 0.0009999803182724851,
+      "loss": 0.6841,
+      "step": 500
+    },
+    {
+      "epoch": 0.19809825673534073,
+      "grad_norm": 0.11020142585039139,
+      "learning_rate": 0.0009999211167982776,
+      "loss": 0.466,
+      "step": 1000
+    },
+    {
+      "epoch": 0.2971473851030111,
+      "grad_norm": 0.2978723347187042,
+      "learning_rate": 0.0009998224001777822,
+      "loss": 0.4559,
+      "step": 1500
+    },
+    {
+      "epoch": 0.39619651347068147,
+      "grad_norm": 0.13569457828998566,
+      "learning_rate": 0.0009996841762138337,
+      "loss": 0.4501,
+      "step": 2000
+    },
+    {
+      "epoch": 0.49524564183835185,
+      "grad_norm": 0.11234386265277863,
+      "learning_rate": 0.0009995064558320358,
+      "loss": 0.4456,
+      "step": 2500
+    },
+    {
+      "epoch": 0.5942947702060222,
+      "grad_norm": 0.23295480012893677,
+      "learning_rate": 0.0009992892530798984,
+      "loss": 0.4427,
+      "step": 3000
+    },
+    {
+      "epoch": 0.6933438985736925,
+      "grad_norm": 0.07768674939870834,
+      "learning_rate": 0.0009990325851257273,
+      "loss": 0.4412,
+      "step": 3500
+    },
+    {
+      "epoch": 0.7923930269413629,
+      "grad_norm": 0.13024824857711792,
+      "learning_rate": 0.000998736472257267,
+      "loss": 0.4392,
+      "step": 4000
+    },
+    {
+      "epoch": 0.8914421553090333,
+      "grad_norm": 0.21133571863174438,
+      "learning_rate": 0.0009984009378800963,
+      "loss": 0.438,
+      "step": 4500
+    },
+    {
+      "epoch": 0.9904912836767037,
+      "grad_norm": 0.3099469542503357,
+      "learning_rate": 0.0009980260085157794,
+      "loss": 0.4366,
+      "step": 5000
+    },
+    {
+      "epoch": 1.0,
+      "eval_MAE": 0.3764905631542206,
+      "eval_MAPE": 349.27911376953125,
+      "eval_MSE": 0.5023903846740723,
+      "eval_MSPE": 154474.234375,
+      "eval_ND": 0.4538463056087494,
+      "eval_RMSE": 0.7087950110435486,
+      "eval_SMAPE": 61.74858474731445,
+      "eval_runtime": 67.2343,
+      "eval_samples_per_second": 18698.269,
+      "eval_steps_per_second": 9.132,
+      "step": 5048
+    },
+    {
+      "epoch": 1.089540412044374,
+      "grad_norm": 0.167328879237175,
+      "learning_rate": 0.0009976117137997695,
+      "loss": 0.4365,
+      "step": 5500
+    },
+    {
+      "epoch": 1.1885895404120443,
+      "grad_norm": 0.15923786163330078,
+      "learning_rate": 0.0009971580864790652,
+      "loss": 0.4335,
+      "step": 6000
+    },
+    {
+      "epoch": 1.2876386687797148,
+      "grad_norm": 0.23647330701351166,
+      "learning_rate": 0.0009966651624096236,
+      "loss": 0.433,
+      "step": 6500
+    },
+    {
+      "epoch": 1.3866877971473852,
+      "grad_norm": 0.08678620308637619,
+      "learning_rate": 0.0009961329805535251,
+      "loss": 0.4323,
+      "step": 7000
+    },
+    {
+      "epoch": 1.4857369255150554,
+      "grad_norm": 0.10297006368637085,
+      "learning_rate": 0.0009955615829758935,
+      "loss": 0.4327,
+      "step": 7500
+    },
+    {
+      "epoch": 1.5847860538827259,
+      "grad_norm": 0.12103519588708878,
+      "learning_rate": 0.0009949510148415722,
+      "loss": 0.4312,
+      "step": 8000
+    },
+    {
+      "epoch": 1.6838351822503963,
+      "grad_norm": 0.1411912441253662,
+      "learning_rate": 0.0009943013244115538,
+      "loss": 0.4317,
+      "step": 8500
+    },
+    {
+      "epoch": 1.7828843106180665,
+      "grad_norm": 0.16333575546741486,
+      "learning_rate": 0.0009936125630391644,
+      "loss": 0.4306,
+      "step": 9000
+    },
+    {
+      "epoch": 1.881933438985737,
+      "grad_norm": 0.25319305062294006,
+      "learning_rate": 0.000992884785166006,
+      "loss": 0.428,
+      "step": 9500
+    },
+    {
+      "epoch": 1.9809825673534074,
+      "grad_norm": 0.08323787152767181,
+      "learning_rate": 0.0009921180483176526,
+      "loss": 0.4294,
+      "step": 10000
+    },
+    {
+      "epoch": 2.0,
+      "eval_MAE": 0.37166666984558105,
+      "eval_MAPE": 348.12994384765625,
+      "eval_MSE": 0.49643462896347046,
+      "eval_MSPE": 161231.390625,
+      "eval_ND": 0.4480312764644623,
+      "eval_RMSE": 0.7045812010765076,
+      "eval_SMAPE": 61.25733947753906,
+      "eval_runtime": 54.411,
+      "eval_samples_per_second": 23104.958,
+      "eval_steps_per_second": 11.284,
+      "step": 10096
+    },
+    {
+      "epoch": 2.0800316957210776,
+      "grad_norm": 0.07184210419654846,
+      "learning_rate": 0.000991312413099103,
+      "loss": 0.4284,
+      "step": 10500
+    },
+    {
+      "epoch": 2.179080824088748,
+      "grad_norm": 0.3041013479232788,
+      "learning_rate": 0.0009904679431899906,
+      "loss": 0.4279,
+      "step": 11000
+    },
+    {
+      "epoch": 2.2781299524564185,
+      "grad_norm": 0.11070399731397629,
+      "learning_rate": 0.0009895847053395504,
+      "loss": 0.4286,
+      "step": 11500
+    },
+    {
+      "epoch": 2.3771790808240887,
+      "grad_norm": 0.14005513489246368,
+      "learning_rate": 0.0009886627693613424,
+      "loss": 0.427,
+      "step": 12000
+    },
+    {
+      "epoch": 2.476228209191759,
+      "grad_norm": 0.09888464212417603,
+      "learning_rate": 0.0009877022081277332,
+      "loss": 0.4266,
+      "step": 12500
+    },
+    {
+      "epoch": 2.5752773375594296,
+      "grad_norm": 0.31440699100494385,
+      "learning_rate": 0.000986703097564137,
+      "loss": 0.4276,
+      "step": 13000
| 221 |
+
},
|
| 222 |
+
{
|
| 223 |
+
"epoch": 2.6743264659270998,
|
| 224 |
+
"grad_norm": 0.1942104548215866,
|
| 225 |
+
"learning_rate": 0.0009856655166430126,
|
| 226 |
+
"loss": 0.4269,
|
| 227 |
+
"step": 13500
|
| 228 |
+
},
|
| 229 |
+
{
|
| 230 |
+
"epoch": 2.7733755942947704,
|
| 231 |
+
"grad_norm": 0.3018036186695099,
|
| 232 |
+
"learning_rate": 0.0009845895473776232,
|
| 233 |
+
"loss": 0.4263,
|
| 234 |
+
"step": 14000
|
| 235 |
+
},
|
| 236 |
+
{
|
| 237 |
+
"epoch": 2.8724247226624406,
|
| 238 |
+
"grad_norm": 0.10349088907241821,
|
| 239 |
+
"learning_rate": 0.0009834752748155522,
|
| 240 |
+
"loss": 0.4259,
|
| 241 |
+
"step": 14500
|
| 242 |
+
},
|
| 243 |
+
{
|
| 244 |
+
"epoch": 2.971473851030111,
|
| 245 |
+
"grad_norm": 0.12177316844463348,
|
| 246 |
+
"learning_rate": 0.0009823227870319814,
|
| 247 |
+
"loss": 0.426,
|
| 248 |
+
"step": 15000
|
| 249 |
+
},
|
| 250 |
+
{
|
| 251 |
+
"epoch": 3.0,
|
| 252 |
+
"eval_MAE": 0.3759180009365082,
|
| 253 |
+
"eval_MAPE": 352.69488525390625,
|
| 254 |
+
"eval_MSE": 0.4976975917816162,
|
| 255 |
+
"eval_MSPE": 169782.796875,
|
| 256 |
+
"eval_ND": 0.45315608382225037,
|
| 257 |
+
"eval_RMSE": 0.7054768800735474,
|
| 258 |
+
"eval_SMAPE": 61.35142517089844,
|
| 259 |
+
"eval_runtime": 54.733,
|
| 260 |
+
"eval_samples_per_second": 22969.03,
|
| 261 |
+
"eval_steps_per_second": 11.218,
|
| 262 |
+
"step": 15144
|
| 263 |
+
},
|
| 264 |
+
{
|
| 265 |
+
"epoch": 3.070522979397781,
|
| 266 |
+
"grad_norm": 0.09670817106962204,
|
| 267 |
+
"learning_rate": 0.0009811321751227293,
|
| 268 |
+
"loss": 0.4253,
|
| 269 |
+
"step": 15500
|
| 270 |
+
},
|
| 271 |
+
{
|
| 272 |
+
"epoch": 3.1695721077654517,
|
| 273 |
+
"grad_norm": 0.1024642288684845,
|
| 274 |
+
"learning_rate": 0.000979903533197051,
|
| 275 |
+
"loss": 0.4254,
|
| 276 |
+
"step": 16000
|
| 277 |
+
},
|
| 278 |
+
{
|
| 279 |
+
"epoch": 3.268621236133122,
|
| 280 |
+
"grad_norm": 0.07792851328849792,
|
| 281 |
+
"learning_rate": 0.0009786369583701987,
|
| 282 |
+
"loss": 0.4256,
|
| 283 |
+
"step": 16500
|
| 284 |
+
},
|
| 285 |
+
{
|
| 286 |
+
"epoch": 3.3676703645007926,
|
| 287 |
+
"grad_norm": 0.1396850198507309,
|
| 288 |
+
"learning_rate": 0.000977332550755746,
|
| 289 |
+
"loss": 0.4246,
|
| 290 |
+
"step": 17000
|
| 291 |
+
},
|
| 292 |
+
{
|
| 293 |
+
"epoch": 3.466719492868463,
|
| 294 |
+
"grad_norm": 0.2117234766483307,
|
| 295 |
+
"learning_rate": 0.0009759904134576747,
|
| 296 |
+
"loss": 0.4242,
|
| 297 |
+
"step": 17500
|
| 298 |
+
},
|
| 299 |
+
{
|
| 300 |
+
"epoch": 3.565768621236133,
|
| 301 |
+
"grad_norm": 0.09312921017408371,
|
| 302 |
+
"learning_rate": 0.0009746106525622252,
|
| 303 |
+
"loss": 0.4233,
|
| 304 |
+
"step": 18000
|
| 305 |
+
},
|
| 306 |
+
{
|
| 307 |
+
"epoch": 3.6648177496038032,
|
| 308 |
+
"grad_norm": 0.12480303645133972,
|
| 309 |
+
"learning_rate": 0.0009731933771295105,
|
| 310 |
+
"loss": 0.4246,
|
| 311 |
+
"step": 18500
|
| 312 |
+
},
|
| 313 |
+
{
|
| 314 |
+
"epoch": 3.763866877971474,
|
| 315 |
+
"grad_norm": 0.06942948698997498,
|
| 316 |
+
"learning_rate": 0.0009717386991848969,
|
| 317 |
+
"loss": 0.4251,
|
| 318 |
+
"step": 19000
|
| 319 |
+
},
|
| 320 |
+
{
|
| 321 |
+
"epoch": 3.862916006339144,
|
| 322 |
+
"grad_norm": 0.11137889325618744,
|
| 323 |
+
"learning_rate": 0.0009702467337101477,
|
| 324 |
+
"loss": 0.4229,
|
| 325 |
+
"step": 19500
|
| 326 |
+
},
|
| 327 |
+
{
|
| 328 |
+
"epoch": 3.9619651347068148,
|
| 329 |
+
"grad_norm": 0.15101556479930878,
|
| 330 |
+
"learning_rate": 0.0009687175986343367,
|
| 331 |
+
"loss": 0.4242,
|
| 332 |
+
"step": 20000
|
| 333 |
+
},
|
| 334 |
+
{
|
| 335 |
+
"epoch": 4.0,
|
| 336 |
+
"eval_MAE": 0.37888333201408386,
|
| 337 |
+
"eval_MAPE": 356.0927429199219,
|
| 338 |
+
"eval_MSE": 0.49698570370674133,
|
| 339 |
+
"eval_MSPE": 187043.203125,
|
| 340 |
+
"eval_ND": 0.4567306935787201,
|
| 341 |
+
"eval_RMSE": 0.7049721479415894,
|
| 342 |
+
"eval_SMAPE": 61.83990478515625,
|
| 343 |
+
"eval_runtime": 54.2682,
|
| 344 |
+
"eval_samples_per_second": 23165.769,
|
| 345 |
+
"eval_steps_per_second": 11.314,
|
| 346 |
+
"step": 20192
|
| 347 |
+
},
|
| 348 |
+
{
|
| 349 |
+
"epoch": 4.061014263074485,
|
| 350 |
+
"grad_norm": 0.1452902853488922,
|
| 351 |
+
"learning_rate": 0.0009671514148245245,
|
| 352 |
+
"loss": 0.423,
|
| 353 |
+
"step": 20500
|
| 354 |
+
},
|
| 355 |
+
{
|
| 356 |
+
"epoch": 4.160063391442155,
|
| 357 |
+
"grad_norm": 0.08417252451181412,
|
| 358 |
+
"learning_rate": 0.000965548306076207,
|
| 359 |
+
"loss": 0.4241,
|
| 360 |
+
"step": 21000
|
| 361 |
+
},
|
| 362 |
+
{
|
| 363 |
+
"epoch": 4.259112519809825,
|
| 364 |
+
"grad_norm": 0.13033604621887207,
|
| 365 |
+
"learning_rate": 0.0009639083991035288,
|
| 366 |
+
"loss": 0.4226,
|
| 367 |
+
"step": 21500
|
| 368 |
+
},
|
| 369 |
+
{
|
| 370 |
+
"epoch": 4.358161648177496,
|
| 371 |
+
"grad_norm": 0.15337254106998444,
|
| 372 |
+
"learning_rate": 0.0009622318235292677,
|
| 373 |
+
"loss": 0.4222,
|
| 374 |
+
"step": 22000
|
| 375 |
+
},
|
| 376 |
+
{
|
| 377 |
+
"epoch": 4.457210776545167,
|
| 378 |
+
"grad_norm": 0.12474048137664795,
|
| 379 |
+
"learning_rate": 0.0009605187118745895,
|
| 380 |
+
"loss": 0.4227,
|
| 381 |
+
"step": 22500
|
| 382 |
+
},
|
| 383 |
+
{
|
| 384 |
+
"epoch": 4.556259904912837,
|
| 385 |
+
"grad_norm": 0.11578098684549332,
|
| 386 |
+
"learning_rate": 0.0009587691995485724,
|
| 387 |
+
"loss": 0.4204,
|
| 388 |
+
"step": 23000
|
| 389 |
+
},
|
| 390 |
+
{
|
| 391 |
+
"epoch": 4.655309033280507,
|
| 392 |
+
"grad_norm": 0.08201416581869125,
|
| 393 |
+
"learning_rate": 0.000956983424837504,
|
| 394 |
+
"loss": 0.4237,
|
| 395 |
+
"step": 23500
|
| 396 |
+
},
|
| 397 |
+
{
|
| 398 |
+
"epoch": 4.754358161648177,
|
| 399 |
+
"grad_norm": 0.13240130245685577,
|
| 400 |
+
"learning_rate": 0.0009551615288939518,
|
| 401 |
+
"loss": 0.4224,
|
| 402 |
+
"step": 24000
|
| 403 |
+
},
|
| 404 |
+
{
|
| 405 |
+
"epoch": 4.853407290015848,
|
| 406 |
+
"grad_norm": 0.11744749546051025,
|
| 407 |
+
"learning_rate": 0.0009533036557256045,
|
| 408 |
+
"loss": 0.4219,
|
| 409 |
+
"step": 24500
|
| 410 |
+
},
|
| 411 |
+
{
|
| 412 |
+
"epoch": 4.952456418383518,
|
| 413 |
+
"grad_norm": 0.1341642141342163,
|
| 414 |
+
"learning_rate": 0.0009514099521838906,
|
| 415 |
+
"loss": 0.4208,
|
| 416 |
+
"step": 25000
|
| 417 |
+
},
|
| 418 |
+
{
|
| 419 |
+
"epoch": 5.0,
|
| 420 |
+
"eval_MAE": 0.3666294813156128,
|
| 421 |
+
"eval_MAPE": 356.06121826171875,
|
| 422 |
+
"eval_MSE": 0.49152618646621704,
|
| 423 |
+
"eval_MSPE": 207269.796875,
|
| 424 |
+
"eval_ND": 0.44195911288261414,
|
| 425 |
+
"eval_RMSE": 0.7010892629623413,
|
| 426 |
+
"eval_SMAPE": 60.314064025878906,
|
| 427 |
+
"eval_runtime": 54.6859,
|
| 428 |
+
"eval_samples_per_second": 22988.83,
|
| 429 |
+
"eval_steps_per_second": 11.228,
|
| 430 |
+
"step": 25240
|
| 431 |
+
},
|
| 432 |
+
{
|
| 433 |
+
"epoch": 5.051505546751189,
|
| 434 |
+
"grad_norm": 0.07947442680597305,
|
| 435 |
+
"learning_rate": 0.00094948056795237,
|
| 436 |
+
"loss": 0.4194,
|
| 437 |
+
"step": 25500
|
| 438 |
+
},
|
| 439 |
+
{
|
| 440 |
+
"epoch": 5.150554675118859,
|
| 441 |
+
"grad_norm": 0.13597752153873444,
|
| 442 |
+
"learning_rate": 0.000947515655534903,
|
| 443 |
+
"loss": 0.4208,
|
| 444 |
+
"step": 26000
|
| 445 |
+
},
|
| 446 |
+
{
|
| 447 |
+
"epoch": 5.249603803486529,
|
| 448 |
+
"grad_norm": 0.07506517320871353,
|
| 449 |
+
"learning_rate": 0.0009455153702435957,
|
| 450 |
+
"loss": 0.4207,
|
| 451 |
+
"step": 26500
|
| 452 |
+
},
|
| 453 |
+
{
|
| 454 |
+
"epoch": 5.3486529318541995,
|
| 455 |
+
"grad_norm": 0.204156756401062,
|
| 456 |
+
"learning_rate": 0.0009434798701865242,
|
| 457 |
+
"loss": 0.421,
|
| 458 |
+
"step": 27000
|
| 459 |
+
},
|
| 460 |
+
{
|
| 461 |
+
"epoch": 5.44770206022187,
|
| 462 |
+
"grad_norm": 0.10013840347528458,
|
| 463 |
+
"learning_rate": 0.000941409316255237,
|
| 464 |
+
"loss": 0.4206,
|
| 465 |
+
"step": 27500
|
| 466 |
+
},
|
| 467 |
+
{
|
| 468 |
+
"epoch": 5.546751188589541,
|
| 469 |
+
"grad_norm": 0.11126238107681274,
|
| 470 |
+
"learning_rate": 0.0009393038721120373,
|
| 471 |
+
"loss": 0.4209,
|
| 472 |
+
"step": 28000
|
| 473 |
+
},
|
| 474 |
+
{
|
| 475 |
+
"epoch": 5.645800316957211,
|
| 476 |
+
"grad_norm": 0.07826147228479385,
|
| 477 |
+
"learning_rate": 0.0009371637041770472,
|
| 478 |
+
"loss": 0.4199,
|
| 479 |
+
"step": 28500
|
| 480 |
+
},
|
| 481 |
+
{
|
| 482 |
+
"epoch": 5.744849445324881,
|
| 483 |
+
"grad_norm": 0.1696319282054901,
|
| 484 |
+
"learning_rate": 0.0009349889816150534,
|
| 485 |
+
"loss": 0.4202,
|
| 486 |
+
"step": 29000
|
| 487 |
+
},
|
| 488 |
+
{
|
| 489 |
+
"epoch": 5.8438985736925515,
|
| 490 |
+
"grad_norm": 0.13377857208251953,
|
| 491 |
+
"learning_rate": 0.0009327798763221355,
|
| 492 |
+
"loss": 0.4198,
|
| 493 |
+
"step": 29500
|
| 494 |
+
},
|
| 495 |
+
{
|
| 496 |
+
"epoch": 5.942947702060222,
|
| 497 |
+
"grad_norm": 0.09888964146375656,
|
| 498 |
+
"learning_rate": 0.0009305365629120796,
|
| 499 |
+
"loss": 0.4209,
|
| 500 |
+
"step": 30000
|
| 501 |
+
},
|
| 502 |
+
{
|
| 503 |
+
"epoch": 6.0,
|
| 504 |
+
"eval_MAE": 0.3674909174442291,
|
| 505 |
+
"eval_MAPE": 365.8409118652344,
|
| 506 |
+
"eval_MSE": 0.49068397283554077,
|
| 507 |
+
"eval_MSPE": 221797.953125,
|
| 508 |
+
"eval_ND": 0.442997545003891,
|
| 509 |
+
"eval_RMSE": 0.7004883885383606,
|
| 510 |
+
"eval_SMAPE": 60.21101379394531,
|
| 511 |
+
"eval_runtime": 53.7469,
|
| 512 |
+
"eval_samples_per_second": 23390.462,
|
| 513 |
+
"eval_steps_per_second": 11.424,
|
| 514 |
+
"step": 30288
|
| 515 |
+
},
|
| 516 |
+
{
|
| 517 |
+
"epoch": 6.041996830427892,
|
| 518 |
+
"grad_norm": 0.10051033645868301,
|
| 519 |
+
"learning_rate": 0.0009282592187025753,
|
| 520 |
+
"loss": 0.4202,
|
| 521 |
+
"step": 30500
|
| 522 |
+
},
|
| 523 |
+
{
|
| 524 |
+
"epoch": 6.141045958795562,
|
| 525 |
+
"grad_norm": 0.1751776486635208,
|
| 526 |
+
"learning_rate": 0.0009259480237012013,
|
| 527 |
+
"loss": 0.4205,
|
| 528 |
+
"step": 31000
|
| 529 |
+
},
|
| 530 |
+
{
|
| 531 |
+
"epoch": 6.240095087163233,
|
| 532 |
+
"grad_norm": 0.15444861352443695,
|
| 533 |
+
"learning_rate": 0.0009236031605911957,
|
| 534 |
+
"loss": 0.4184,
|
| 535 |
+
"step": 31500
|
| 536 |
+
},
|
| 537 |
+
{
|
| 538 |
+
"epoch": 6.3391442155309035,
|
| 539 |
+
"grad_norm": 0.09780099242925644,
|
| 540 |
+
"learning_rate": 0.0009212248147170174,
|
| 541 |
+
"loss": 0.42,
|
| 542 |
+
"step": 32000
|
| 543 |
+
},
|
| 544 |
+
{
|
| 545 |
+
"epoch": 6.438193343898574,
|
| 546 |
+
"grad_norm": 0.12165773659944534,
|
| 547 |
+
"learning_rate": 0.0009188131740696953,
|
| 548 |
+
"loss": 0.4191,
|
| 549 |
+
"step": 32500
|
| 550 |
+
},
|
| 551 |
+
{
|
| 552 |
+
"epoch": 6.537242472266244,
|
| 553 |
+
"grad_norm": 0.09260477870702744,
|
| 554 |
+
"learning_rate": 0.0009163684292719692,
|
| 555 |
+
"loss": 0.4193,
|
| 556 |
+
"step": 33000
|
| 557 |
+
},
|
| 558 |
+
{
|
| 559 |
+
"epoch": 6.636291600633914,
|
| 560 |
+
"grad_norm": 0.09463346749544144,
|
| 561 |
+
"learning_rate": 0.0009138907735632225,
|
| 562 |
+
"loss": 0.4184,
|
| 563 |
+
"step": 33500
|
| 564 |
+
},
|
| 565 |
+
{
|
| 566 |
+
"epoch": 6.735340729001585,
|
| 567 |
+
"grad_norm": 0.10615640878677368,
|
| 568 |
+
"learning_rate": 0.0009113804027842078,
|
| 569 |
+
"loss": 0.4179,
|
| 570 |
+
"step": 34000
|
| 571 |
+
},
|
| 572 |
+
{
|
| 573 |
+
"epoch": 6.834389857369255,
|
| 574 |
+
"grad_norm": 0.10410912334918976,
|
| 575 |
+
"learning_rate": 0.0009088375153615673,
|
| 576 |
+
"loss": 0.4189,
|
| 577 |
+
"step": 34500
|
| 578 |
+
},
|
| 579 |
+
{
|
| 580 |
+
"epoch": 6.933438985736926,
|
| 581 |
+
"grad_norm": 0.10201963037252426,
|
| 582 |
+
"learning_rate": 0.0009062623122921485,
|
| 583 |
+
"loss": 0.4187,
|
| 584 |
+
"step": 35000
|
| 585 |
+
},
|
| 586 |
+
{
|
| 587 |
+
"epoch": 7.0,
|
| 588 |
+
"eval_MAE": 0.36573904752731323,
|
| 589 |
+
"eval_MAPE": 352.3959045410156,
|
| 590 |
+
"eval_MSE": 0.48773863911628723,
|
| 591 |
+
"eval_MSPE": 185352.046875,
|
| 592 |
+
"eval_ND": 0.4408857226371765,
|
| 593 |
+
"eval_RMSE": 0.6983828544616699,
|
| 594 |
+
"eval_SMAPE": 59.932762145996094,
|
| 595 |
+
"eval_runtime": 54.521,
|
| 596 |
+
"eval_samples_per_second": 23058.35,
|
| 597 |
+
"eval_steps_per_second": 11.262,
|
| 598 |
+
"step": 35336
|
| 599 |
+
},
|
| 600 |
+
{
|
| 601 |
+
"epoch": 7.032488114104596,
|
| 602 |
+
"grad_norm": 0.11093030869960785,
|
| 603 |
+
"learning_rate": 0.000903654997127117,
|
| 604 |
+
"loss": 0.4191,
|
| 605 |
+
"step": 35500
|
| 606 |
+
},
|
| 607 |
+
{
|
| 608 |
+
"epoch": 7.131537242472266,
|
| 609 |
+
"grad_norm": 0.1134202629327774,
|
| 610 |
+
"learning_rate": 0.0009010157759558673,
|
| 611 |
+
"loss": 0.4186,
|
| 612 |
+
"step": 36000
|
| 613 |
+
},
|
| 614 |
+
{
|
| 615 |
+
"epoch": 7.230586370839936,
|
| 616 |
+
"grad_norm": 0.06474833935499191,
|
| 617 |
+
"learning_rate": 0.0008983448573897322,
|
| 618 |
+
"loss": 0.4191,
|
| 619 |
+
"step": 36500
|
| 620 |
+
},
|
| 621 |
+
{
|
| 622 |
+
"epoch": 7.329635499207607,
|
| 623 |
+
"grad_norm": 0.11325617134571075,
|
| 624 |
+
"learning_rate": 0.0008956424525454949,
|
| 625 |
+
"loss": 0.4164,
|
| 626 |
+
"step": 37000
|
| 627 |
+
},
|
| 628 |
+
{
|
| 629 |
+
"epoch": 7.428684627575278,
|
| 630 |
+
"grad_norm": 0.07661303877830505,
|
| 631 |
+
"learning_rate": 0.0008929087750287004,
|
| 632 |
+
"loss": 0.4179,
|
| 633 |
+
"step": 37500
|
| 634 |
+
},
|
| 635 |
+
{
|
| 636 |
+
"epoch": 7.527733755942948,
|
| 637 |
+
"grad_norm": 0.07534985989332199,
|
| 638 |
+
"learning_rate": 0.0008901440409167727,
|
| 639 |
+
"loss": 0.4191,
|
| 640 |
+
"step": 38000
|
| 641 |
+
},
|
| 642 |
+
{
|
| 643 |
+
"epoch": 7.626782884310618,
|
| 644 |
+
"grad_norm": 0.09289834648370743,
|
| 645 |
+
"learning_rate": 0.0008873484687419344,
|
| 646 |
+
"loss": 0.4177,
|
| 647 |
+
"step": 38500
|
| 648 |
+
},
|
| 649 |
+
{
|
| 650 |
+
"epoch": 7.725832012678288,
|
| 651 |
+
"grad_norm": 0.12024533003568649,
|
| 652 |
+
"learning_rate": 0.0008845222794739341,
|
| 653 |
+
"loss": 0.417,
|
| 654 |
+
"step": 39000
|
| 655 |
+
},
|
| 656 |
+
{
|
| 657 |
+
"epoch": 7.824881141045958,
|
| 658 |
+
"grad_norm": 0.0927654430270195,
|
| 659 |
+
"learning_rate": 0.00088166569650258,
|
| 660 |
+
"loss": 0.4181,
|
| 661 |
+
"step": 39500
|
| 662 |
+
},
|
| 663 |
+
{
|
| 664 |
+
"epoch": 7.9239302694136295,
|
| 665 |
+
"grad_norm": 0.14443668723106384,
|
| 666 |
+
"learning_rate": 0.0008787789456200823,
|
| 667 |
+
"loss": 0.4175,
|
| 668 |
+
"step": 40000
|
| 669 |
+
},
|
| 670 |
+
{
|
| 671 |
+
"epoch": 8.0,
|
| 672 |
+
"eval_MAE": 0.36536094546318054,
|
| 673 |
+
"eval_MAPE": 354.2453918457031,
|
| 674 |
+
"eval_MSE": 0.48749861121177673,
|
| 675 |
+
"eval_MSPE": 195818.65625,
|
| 676 |
+
"eval_ND": 0.4404299259185791,
|
| 677 |
+
"eval_RMSE": 0.6982110142707825,
|
| 678 |
+
"eval_SMAPE": 59.863468170166016,
|
| 679 |
+
"eval_runtime": 54.3633,
|
| 680 |
+
"eval_samples_per_second": 23125.25,
|
| 681 |
+
"eval_steps_per_second": 11.294,
|
| 682 |
+
"step": 40384
|
| 683 |
+
},
|
| 684 |
+
{
|
| 685 |
+
"epoch": 8.022979397781299,
|
| 686 |
+
"grad_norm": 0.0843660980463028,
|
| 687 |
+
"learning_rate": 0.0008758622550032065,
|
| 688 |
+
"loss": 0.418,
|
| 689 |
+
"step": 40500
|
| 690 |
+
},
|
| 691 |
+
{
|
| 692 |
+
"epoch": 8.12202852614897,
|
| 693 |
+
"grad_norm": 0.14738310873508453,
|
| 694 |
+
"learning_rate": 0.0008729158551952377,
|
| 695 |
+
"loss": 0.4173,
|
| 696 |
+
"step": 41000
|
| 697 |
+
},
|
| 698 |
+
{
|
| 699 |
+
"epoch": 8.221077654516641,
|
| 700 |
+
"grad_norm": 0.07845776528120041,
|
| 701 |
+
"learning_rate": 0.0008699399790877566,
|
| 702 |
+
"loss": 0.4176,
|
| 703 |
+
"step": 41500
|
| 704 |
+
},
|
| 705 |
+
{
|
| 706 |
+
"epoch": 8.32012678288431,
|
| 707 |
+
"grad_norm": 0.1038828119635582,
|
| 708 |
+
"learning_rate": 0.0008669348619022335,
|
| 709 |
+
"loss": 0.4175,
|
| 710 |
+
"step": 42000
|
| 711 |
+
},
|
| 712 |
+
{
|
| 713 |
+
"epoch": 8.419175911251982,
|
| 714 |
+
"grad_norm": 0.14202700555324554,
|
| 715 |
+
"learning_rate": 0.000863900741171433,
|
| 716 |
+
"loss": 0.417,
|
| 717 |
+
"step": 42500
|
| 718 |
+
},
|
| 719 |
+
{
|
| 720 |
+
"epoch": 8.51822503961965,
|
| 721 |
+
"grad_norm": 0.1151699423789978,
|
| 722 |
+
"learning_rate": 0.0008608378567206405,
|
| 723 |
+
"loss": 0.4181,
|
| 724 |
+
"step": 43000
|
| 725 |
+
},
|
| 726 |
+
{
|
| 727 |
+
"epoch": 8.617274167987322,
|
| 728 |
+
"grad_norm": 0.19360342621803284,
|
| 729 |
+
"learning_rate": 0.0008577464506487054,
|
| 730 |
+
"loss": 0.4153,
|
| 731 |
+
"step": 43500
|
| 732 |
+
},
|
| 733 |
+
{
|
| 734 |
+
"epoch": 8.716323296354991,
|
| 735 |
+
"grad_norm": 0.11765541136264801,
|
| 736 |
+
"learning_rate": 0.0008546267673089049,
|
| 737 |
+
"loss": 0.4159,
|
| 738 |
+
"step": 44000
|
| 739 |
+
},
|
| 740 |
+
{
|
| 741 |
+
"epoch": 8.815372424722662,
|
| 742 |
+
"grad_norm": 0.15739209949970245,
|
| 743 |
+
"learning_rate": 0.0008514790532896294,
|
| 744 |
+
"loss": 0.4162,
|
| 745 |
+
"step": 44500
|
| 746 |
+
},
|
| 747 |
+
{
|
| 748 |
+
"epoch": 8.914421553090333,
|
| 749 |
+
"grad_norm": 0.12929894030094147,
|
| 750 |
+
"learning_rate": 0.0008483035573948916,
|
| 751 |
+
"loss": 0.4161,
|
| 752 |
+
"step": 45000
|
| 753 |
+
},
|
| 754 |
+
{
|
| 755 |
+
"epoch": 9.0,
|
| 756 |
+
"eval_MAE": 0.3685202896595001,
|
| 757 |
+
"eval_MAPE": 360.7921447753906,
|
| 758 |
+
"eval_MSE": 0.48982617259025574,
|
| 759 |
+
"eval_MSPE": 237005.59375,
|
| 760 |
+
"eval_ND": 0.44423842430114746,
|
| 761 |
+
"eval_RMSE": 0.6998758316040039,
|
| 762 |
+
"eval_SMAPE": 60.21327590942383,
|
| 763 |
+
"eval_runtime": 55.2004,
|
| 764 |
+
"eval_samples_per_second": 22774.547,
|
| 765 |
+
"eval_steps_per_second": 11.123,
|
| 766 |
+
"step": 45432
|
| 767 |
+
},
|
| 768 |
+
{
|
| 769 |
+
"epoch": 9.013470681458003,
|
| 770 |
+
"grad_norm": 0.08673319220542908,
|
| 771 |
+
"learning_rate": 0.0008451005306246607,
|
| 772 |
+
"loss": 0.4164,
|
| 773 |
+
"step": 45500
|
| 774 |
+
},
|
| 775 |
+
{
|
| 776 |
+
"epoch": 9.112519809825674,
|
| 777 |
+
"grad_norm": 0.08345810323953629,
|
| 778 |
+
"learning_rate": 0.000841870226155022,
|
| 779 |
+
"loss": 0.4154,
|
| 780 |
+
"step": 46000
|
| 781 |
+
},
|
| 782 |
+
{
|
| 783 |
+
"epoch": 9.211568938193343,
|
| 784 |
+
"grad_norm": 0.12091132998466492,
|
| 785 |
+
"learning_rate": 0.0008386128993181656,
|
| 786 |
+
"loss": 0.4162,
|
| 787 |
+
"step": 46500
|
| 788 |
+
},
|
| 789 |
+
{
|
| 790 |
+
"epoch": 9.310618066561014,
|
| 791 |
+
"grad_norm": 0.09233805537223816,
|
| 792 |
+
"learning_rate": 0.0008353288075822044,
|
| 793 |
+
"loss": 0.417,
|
| 794 |
+
"step": 47000
|
| 795 |
+
},
|
| 796 |
+
{
|
| 797 |
+
"epoch": 9.409667194928685,
|
| 798 |
+
"grad_norm": 0.14294278621673584,
|
| 799 |
+
"learning_rate": 0.0008320182105308227,
|
| 800 |
+
"loss": 0.4164,
|
| 801 |
+
"step": 47500
|
| 802 |
+
},
|
| 803 |
+
{
|
| 804 |
+
"epoch": 9.508716323296355,
|
| 805 |
+
"grad_norm": 0.08588852733373642,
|
| 806 |
+
"learning_rate": 0.0008286813698427583,
|
| 807 |
+
"loss": 0.4151,
|
| 808 |
+
"step": 48000
|
| 809 |
+
},
|
| 810 |
+
{
|
| 811 |
+
"epoch": 9.607765451664026,
|
| 812 |
+
"grad_norm": 0.09648016840219498,
|
| 813 |
+
"learning_rate": 0.0008253185492711182,
|
| 814 |
+
"loss": 0.4158,
|
| 815 |
+
"step": 48500
|
| 816 |
+
},
|
| 817 |
+
{
|
| 818 |
+
"epoch": 9.706814580031695,
|
| 819 |
+
"grad_norm": 0.10604752600193024,
|
| 820 |
+
"learning_rate": 0.0008219300146225315,
|
| 821 |
+
"loss": 0.416,
|
| 822 |
+
"step": 49000
|
| 823 |
+
},
|
| 824 |
+
{
|
| 825 |
+
"epoch": 9.805863708399366,
|
| 826 |
+
"grad_norm": 0.07823313027620316,
|
| 827 |
+
"learning_rate": 0.0008185160337361388,
|
| 828 |
+
"loss": 0.414,
|
| 829 |
+
"step": 49500
|
| 830 |
+
},
|
| 831 |
+
{
|
| 832 |
+
"epoch": 9.904912836767036,
|
| 833 |
+
"grad_norm": 0.09862257540225983,
|
| 834 |
+
"learning_rate": 0.000815076876462422,
|
| 835 |
+
"loss": 0.4153,
|
| 836 |
+
"step": 50000
|
| 837 |
+
},
|
| 838 |
+
{
|
| 839 |
+
"epoch": 10.0,
|
| 840 |
+
"eval_MAE": 0.35969898104667664,
|
| 841 |
+
"eval_MAPE": 347.6515808105469,
|
| 842 |
+
"eval_MSE": 0.4838204085826874,
|
| 843 |
+
"eval_MSPE": 197941.25,
|
| 844 |
+
"eval_ND": 0.4336046278476715,
|
| 845 |
+
"eval_RMSE": 0.695572018623352,
|
| 846 |
+
"eval_SMAPE": 59.28236389160156,
|
| 847 |
+
"eval_runtime": 55.2356,
|
| 848 |
+
"eval_samples_per_second": 22760.038,
|
| 849 |
+
"eval_steps_per_second": 11.116,
|
| 850 |
+
"step": 50480
|
| 851 |
+
},
|
| 852 |
+
{
|
| 853 |
+
"epoch": 10.003961965134707,
|
| 854 |
+
"grad_norm": 0.09266933053731918,
|
| 855 |
+
"learning_rate": 0.0008116128146418738,
|
| 856 |
+
"loss": 0.4163,
|
| 857 |
+
"step": 50500
|
| 858 |
+
},
|
| 859 |
+
{
|
| 860 |
+
"epoch": 10.103011093502378,
|
| 861 |
+
"grad_norm": 0.1360657662153244,
|
| 862 |
+
"learning_rate": 0.0008081241220835112,
|
| 863 |
+
"loss": 0.4138,
|
| 864 |
+
"step": 51000
|
| 865 |
+
},
|
| 866 |
+
{
|
| 867 |
+
"epoch": 10.202060221870047,
|
| 868 |
+
"grad_norm": 0.12415830045938492,
|
| 869 |
+
"learning_rate": 0.0008046110745432329,
|
| 870 |
+
"loss": 0.4154,
|
| 871 |
+
"step": 51500
|
| 872 |
+
},
|
| 873 |
+
{
|
| 874 |
+
"epoch": 10.301109350237718,
|
| 875 |
+
"grad_norm": 0.12350622564554214,
|
| 876 |
+
"learning_rate": 0.0008010739497020226,
|
| 877 |
+
"loss": 0.4152,
|
| 878 |
+
"step": 52000
|
| 879 |
+
},
|
| 880 |
+
{
|
| 881 |
+
"epoch": 10.400158478605388,
|
| 882 |
+
"grad_norm": 0.09407506883144379,
|
| 883 |
+
"learning_rate": 0.0007975130271440001,
|
| 884 |
+
"loss": 0.4142,
|
| 885 |
+
"step": 52500
|
| 886 |
+
},
|
| 887 |
+
{
|
| 888 |
+
"epoch": 10.499207606973059,
|
| 889 |
+
"grad_norm": 0.07152153551578522,
|
| 890 |
+
"learning_rate": 0.000793928588334323,
|
| 891 |
+
"loss": 0.4152,
|
| 892 |
+
"step": 53000
|
| 893 |
+
},
|
| 894 |
+
{
|
| 895 |
+
"epoch": 10.59825673534073,
|
| 896 |
+
"grad_norm": 0.11555545032024384,
|
| 897 |
+
"learning_rate": 0.0007903209165969381,
|
| 898 |
+
"loss": 0.4149,
|
| 899 |
+
"step": 53500
|
| 900 |
+
},
|
| 901 |
+
{
|
| 902 |
+
"epoch": 10.697305863708399,
|
| 903 |
+
"grad_norm": 0.1005263477563858,
|
| 904 |
+
"learning_rate": 0.0007866902970921869,
|
| 905 |
+
"loss": 0.4151,
|
| 906 |
+
"step": 54000
|
| 907 |
+
},
|
| 908 |
+
{
|
| 909 |
+
"epoch": 10.79635499207607,
|
| 910 |
+
"grad_norm": 0.151838481426239,
|
| 911 |
+
"learning_rate": 0.0007830370167942662,
|
| 912 |
+
"loss": 0.4157,
|
| 913 |
+
"step": 54500
|
| 914 |
+
},
|
| 915 |
+
{
|
| 916 |
+
"epoch": 10.89540412044374,
|
| 917 |
+
"grad_norm": 0.12097581475973129,
|
| 918 |
+
"learning_rate": 0.0007793613644685442,
|
| 919 |
+
"loss": 0.4142,
|
| 920 |
+
"step": 55000
|
| 921 |
+
},
|
| 922 |
+
{
|
| 923 |
+
"epoch": 10.99445324881141,
|
| 924 |
+
"grad_norm": 0.07319813221693039,
|
| 925 |
+
"learning_rate": 0.0007756636306487361,
|
| 926 |
+
"loss": 0.4145,
|
| 927 |
+
"step": 55500
|
| 928 |
+
},
|
| 929 |
+
{
|
| 930 |
+
"epoch": 11.0,
|
| 931 |
+
"eval_MAE": 0.36152151226997375,
|
| 932 |
+
"eval_MAPE": 355.8778991699219,
|
| 933 |
+
"eval_MSE": 0.4874439239501953,
|
| 934 |
+
"eval_MSPE": 242275.75,
|
| 935 |
+
"eval_ND": 0.43580162525177,
|
| 936 |
+
"eval_RMSE": 0.698171854019165,
|
| 937 |
+
"eval_SMAPE": 59.33519744873047,
|
| 938 |
+
"eval_runtime": 54.9921,
|
| 939 |
+
"eval_samples_per_second": 22860.849,
|
| 940 |
+
"eval_steps_per_second": 11.165,
|
| 941 |
+
"step": 55528
|
| 942 |
+
},
|
| 943 |
+
{
|
| 944 |
+
"epoch": 11.09350237717908,
|
| 945 |
+
"grad_norm": 0.08475903421640396,
|
| 946 |
+
"learning_rate": 0.0007719441076139392,
|
| 947 |
+
"loss": 0.4144,
|
| 948 |
+
"step": 56000
|
| 949 |
+
},
|
| 950 |
+
{
|
| 951 |
+
"epoch": 11.192551505546751,
|
| 952 |
+
"grad_norm": 0.09286442399024963,
|
| 953 |
+
"learning_rate": 0.000768203089365531,
|
| 954 |
+
"loss": 0.414,
|
| 955 |
+
"step": 56500
|
| 956 |
+
},
|
| 957 |
+
{
|
| 958 |
+
"epoch": 11.291600633914422,
|
| 959 |
+
"grad_norm": 0.07604731619358063,
|
| 960 |
+
"learning_rate": 0.0007644408716039295,
|
| 961 |
+
"loss": 0.4132,
|
| 962 |
+
"step": 57000
|
| 963 |
+
},
|
| 964 |
+
{
|
| 965 |
+
"epoch": 11.390649762282091,
|
| 966 |
+
"grad_norm": 0.10099564492702484,
|
| 967 |
+
"learning_rate": 0.0007606577517052212,
|
| 968 |
+
"loss": 0.4128,
|
| 969 |
+
"step": 57500
|
| 970 |
+
},
|
| 971 |
+
{
|
| 972 |
+
"epoch": 11.489698890649763,
|
| 973 |
+
"grad_norm": 0.0886906161904335,
|
| 974 |
+
"learning_rate": 0.0007568540286976551,
|
| 975 |
+
"loss": 0.4144,
|
| 976 |
+
"step": 58000
|
| 977 |
+
},
|
| 978 |
+
{
|
| 979 |
+
"epoch": 11.588748019017432,
|
| 980 |
+
"grad_norm": 0.09952269494533539,
|
| 981 |
+
"learning_rate": 0.0007530300032380071,
|
| 982 |
+
"loss": 0.4153,
|
| 983 |
+
"step": 58500
|
| 984 |
+
},
|
| 985 |
+
{
|
| 986 |
+
"epoch": 11.687797147385103,
|
| 987 |
+
"grad_norm": 0.09850025922060013,
|
| 988 |
+
"learning_rate": 0.0007491859775878146,
|
| 989 |
+
"loss": 0.414,
|
| 990 |
+
"step": 59000
|
| 991 |
+
},
|
| 992 |
+
{
|
| 993 |
+
"epoch": 11.786846275752774,
|
| 994 |
+
"grad_norm": 0.16440285742282867,
|
| 995 |
+
"learning_rate": 0.0007453222555894856,
|
| 996 |
+
"loss": 0.4135,
|
| 997 |
+
"step": 59500
|
| 998 |
+
},
|
| 999 |
+
{
|
| 1000 |
+
"epoch": 11.885895404120443,
|
| 1001 |
+
"grad_norm": 0.09019125998020172,
|
| 1002 |
+
"learning_rate": 0.000741439142642282,
|
| 1003 |
+
"loss": 0.4141,
|
| 1004 |
+
"step": 60000
|
| 1005 |
+
},
|
| 1006 |
+
{
|
| 1007 |
+
"epoch": 11.984944532488115,
|
| 1008 |
+
"grad_norm": 0.11564897000789642,
|
| 1009 |
+
"learning_rate": 0.0007375369456781793,
|
| 1010 |
+
"loss": 0.4135,
|
| 1011 |
+
"step": 60500
|
| 1012 |
+
},
|
| 1013 |
+
{
|
| 1014 |
+
"epoch": 12.0,
|
| 1015 |
+
"eval_MAE": 0.3628363013267517,
|
| 1016 |
+
"eval_MAPE": 352.1346740722656,
|
| 1017 |
+
"eval_MSE": 0.48805665969848633,
|
| 1018 |
+
"eval_MSPE": 232840.859375,
|
| 1019 |
+
"eval_ND": 0.43738657236099243,
|
| 1020 |
+
"eval_RMSE": 0.6986105442047119,
|
| 1021 |
+
"eval_SMAPE": 59.68527603149414,
|
| 1022 |
+
"eval_runtime": 54.4659,
|
| 1023 |
+
"eval_samples_per_second": 23081.687,
|
| 1024 |
+
"eval_steps_per_second": 11.273,
|
| 1025 |
+
"step": 60576
|
| 1026 |
+
},
|
| 1027 |
+
{
|
| 1028 |
+
"epoch": 12.083993660855784,
|
| 1029 |
+
"grad_norm": 0.14516101777553558,
|
| 1030 |
+
"learning_rate": 0.0007336159731376071,
|
| 1031 |
+
"loss": 0.4132,
|
| 1032 |
+
"step": 61000
|
| 1033 |
+
},
|
| 1034 |
+
{
|
| 1035 |
+
"epoch": 12.183042789223455,
|
| 1036 |
+
"grad_norm": 0.10409526526927948,
|
| 1037 |
+
"learning_rate": 0.0007296765349450678,
|
| 1038 |
+
"loss": 0.4143,
|
| 1039 |
+
"step": 61500
|
| 1040 |
+
},
|
| 1041 |
+
{
|
| 1042 |
+
"epoch": 12.282091917591124,
|
| 1043 |
+
"grad_norm": 0.13454560935497284,
|
| 1044 |
+
"learning_rate": 0.0007257189424846407,
|
| 1045 |
+
"loss": 0.413,
|
| 1046 |
+
"step": 62000
|
| 1047 |
+
},
|
| 1048 |
+
{
|
| 1049 |
+
"epoch": 12.381141045958795,
|
| 1050 |
+
"grad_norm": 0.12388956546783447,
|
| 1051 |
+
"learning_rate": 0.0007217435085753679,
|
| 1052 |
+
"loss": 0.4144,
|
| 1053 |
+
"step": 62500
|
| 1054 |
+
},
|
| 1055 |
+
{
|
| 1056 |
+
"epoch": 12.480190174326466,
|
| 1057 |
+
"grad_norm": 0.12922845780849457,
|
| 1058 |
+
"learning_rate": 0.0007177505474465294,
|
| 1059 |
+
"loss": 0.412,
|
| 1060 |
+
"step": 63000
|
| 1061 |
+
},
|
| 1062 |
+
{
|
| 1063 |
+
"epoch": 12.579239302694136,
|
| 1064 |
+
"grad_norm": 0.11674097180366516,
|
| 1065 |
+
"learning_rate": 0.0007137403747128044,
|
| 1066 |
+
"loss": 0.4128,
|
| 1067 |
+
"step": 63500
|
| 1068 |
+
},
|
| 1069 |
+
{
|
| 1070 |
+
"epoch": 12.678288431061807,
|
| 1071 |
+
"grad_norm": 0.12132911384105682,
|
| 1072 |
+
"learning_rate": 0.000709713307349326,
|
| 1073 |
+
"loss": 0.4133,
|
| 1074 |
+
"step": 64000
|
| 1075 |
+
},
|
| 1076 |
+
{
|
| 1077 |
+
"epoch": 12.777337559429476,
|
| 1078 |
+
"grad_norm": 0.0879695862531662,
|
| 1079 |
+
"learning_rate": 0.0007056696636666243,
|
| 1080 |
+
"loss": 0.4134,
|
| 1081 |
+
"step": 64500
|
| 1082 |
+
},
|
| 1083 |
+
{
|
| 1084 |
+
"epoch": 12.876386687797147,
|
| 1085 |
+
"grad_norm": 0.0823468267917633,
|
| 1086 |
+
"learning_rate": 0.0007016097632854684,
|
| 1087 |
+
"loss": 0.4117,
|
| 1088 |
+
"step": 65000
|
| 1089 |
+
},
|
| 1090 |
+
{
|
| 1091 |
+
"epoch": 12.975435816164818,
|
| 1092 |
+
"grad_norm": 0.11856217682361603,
|
| 1093 |
+
"learning_rate": 0.0006975339271116012,
|
| 1094 |
+
"loss": 0.4126,
|
| 1095 |
+
"step": 65500
|
| 1096 |
+
},
|
| 1097 |
+
{
|
| 1098 |
+
"epoch": 13.0,
|
| 1099 |
+
"eval_MAE": 0.3578062355518341,
|
| 1100 |
+
"eval_MAPE": 340.23406982421875,
|
| 1101 |
+
"eval_MSE": 0.4831537902355194,
|
| 1102 |
+
"eval_MSPE": 227388.484375,
|
| 1103 |
+
"eval_ND": 0.43132299184799194,
|
| 1104 |
+
"eval_RMSE": 0.6950926780700684,
|
| 1105 |
+
"eval_SMAPE": 59.14822769165039,
|
| 1106 |
+
"eval_runtime": 54.2424,
|
| 1107 |
+
"eval_samples_per_second": 23176.786,
|
| 1108 |
+
"eval_steps_per_second": 11.32,
|
| 1109 |
+
"step": 65624
|
| 1110 |
+
}
|
| 1111 |
+
],
|
| 1112 |
+
"logging_steps": 500,
|
| 1113 |
+
"max_steps": 176680,
|
| 1114 |
+
"num_input_tokens_seen": 0,
|
| 1115 |
+
"num_train_epochs": 35,
|
| 1116 |
+
"save_steps": 500,
|
| 1117 |
+
"stateful_callbacks": {
|
| 1118 |
+
"EarlyStoppingCallback": {
|
| 1119 |
+
"args": {
|
| 1120 |
+
"early_stopping_patience": 3,
|
| 1121 |
+
"early_stopping_threshold": 0.0
|
| 1122 |
+
},
|
| 1123 |
+
"attributes": {
|
| 1124 |
+
"early_stopping_patience_counter": 0
|
| 1125 |
+
}
|
| 1126 |
+
},
|
| 1127 |
+
"TrainerControl": {
|
| 1128 |
+
"args": {
|
| 1129 |
+
"should_epoch_stop": false,
|
| 1130 |
+
"should_evaluate": false,
|
| 1131 |
+
"should_log": false,
|
| 1132 |
+
"should_save": true,
|
| 1133 |
+
"should_training_stop": false
|
| 1134 |
+
},
|
| 1135 |
+
"attributes": {}
|
| 1136 |
+
}
|
| 1137 |
+
},
|
| 1138 |
+
"total_flos": 0.0,
|
| 1139 |
+
"train_batch_size": 512,
|
| 1140 |
+
"trial_name": null,
|
| 1141 |
+
"trial_params": null
|
| 1142 |
+
}
|
OFA/PEMS04_336/checkpoint-65624/training_args.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f57783d1c775922679e371b39a13b1f0e1b0afa7e63698d2b8e6f438f59bab39
+size 6584
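Each `.bin` and `.pth` entry above is not the weight file itself but a three-line Git LFS pointer (`version` / `oid` / `size`) that the Hub stores in place of the large blob. A minimal sketch of parsing that pointer format into its fields — the `parse_lfs_pointer` helper name is an assumption for illustration, not part of this repo:

```python
# Minimal sketch: parse a Git LFS pointer file (the "version / oid / size"
# three-line format shown in this commit) into a dict. The helper name
# parse_lfs_pointer is illustrative, not from this repository.

def parse_lfs_pointer(text: str) -> dict:
    # Each pointer line is "key value"; split on the first space.
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # "oid" carries the hash algorithm prefix, e.g. "sha256:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

# Pointer contents taken verbatim from the training_args.bin entry above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:f57783d1c775922679e371b39a13b1f0e1b0afa7e63698d2b8e6f438f59bab39
size 6584"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size_bytes"])  # sha256 6584
```

The `size` field is the byte count of the real object, which is why the tiny `training_args.bin` pointers report 6584 bytes while the model checkpoints report hundreds of megabytes.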
OFA/Solar_192/checkpoint-14556/pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:60bd6475e443ba58da1fe6bff4125da6ebd4daa4f4026be81dc6ffae8a5d6965
+size 261338858
OFA/Solar_192/checkpoint-14556/trainer_state.json
ADDED
@@ -0,0 +1,330 @@
+{
+  "best_global_step": 14556,
+  "best_metric": 0.16441404819488525,
+  "best_model_checkpoint": "/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/OFA_layer/haspara/Solar/checkpoint-14556",
+  "epoch": 6.0,
+  "eval_steps": 500,
+  "global_step": 14556,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.20610057708161583,
+      "grad_norm": 0.13613636791706085,
+      "learning_rate": 0.0009999147860244627,
+      "loss": 0.4442,
+      "step": 500
+    },
+    {
+      "epoch": 0.41220115416323166,
+      "grad_norm": 0.09557799249887466,
+      "learning_rate": 0.0009996584898593923,
+      "loss": 0.1748,
+      "step": 1000
+    },
+    {
+      "epoch": 0.6183017312448474,
+      "grad_norm": 0.16066046059131622,
+      "learning_rate": 0.000999231198873098,
+      "loss": 0.1707,
+      "step": 1500
+    },
+    {
+      "epoch": 0.8244023083264633,
+      "grad_norm": 0.1717931628227234,
+      "learning_rate": 0.0009986330592945485,
+      "loss": 0.1679,
+      "step": 2000
+    },
+    {
+      "epoch": 1.0,
+      "eval_MAE": 0.24874718487262726,
+      "eval_MAPE": 180.4757843017578,
+      "eval_MSE": 0.16690030694007874,
+      "eval_MSPE": 9273.771484375,
+      "eval_ND": 0.2907600700855255,
+      "eval_RMSE": 0.40853434801101685,
+      "eval_SMAPE": 36.44480895996094,
+      "eval_runtime": 58.8031,
+      "eval_samples_per_second": 11800.483,
+      "eval_steps_per_second": 5.765,
+      "step": 2426
+    },
+    {
+      "epoch": 1.030502885408079,
+      "grad_norm": 0.11462152004241943,
+      "learning_rate": 0.000997864275821097,
+      "loss": 0.166,
+      "step": 2500
+    },
+    {
+      "epoch": 1.2366034624896949,
+      "grad_norm": 0.05391139164566994,
+      "learning_rate": 0.0009969251115484285,
+      "loss": 0.164,
+      "step": 3000
+    },
+    {
+      "epoch": 1.4427040395713109,
+      "grad_norm": 0.08242031186819077,
+      "learning_rate": 0.0009958158878805223,
+      "loss": 0.1628,
+      "step": 3500
+    },
+    {
+      "epoch": 1.6488046166529267,
+      "grad_norm": 0.09068718552589417,
+      "learning_rate": 0.0009945369844196596,
+      "loss": 0.1614,
+      "step": 4000
+    },
+    {
+      "epoch": 1.8549051937345424,
+      "grad_norm": 0.12416058033704758,
+      "learning_rate": 0.000993088838836516,
+      "loss": 0.1604,
+      "step": 4500
+    },
+    {
+      "epoch": 2.0,
+      "eval_MAE": 0.2530505359172821,
+      "eval_MAPE": 184.0513153076172,
+      "eval_MSE": 0.1713208258152008,
+      "eval_MSPE": 8967.3408203125,
+      "eval_ND": 0.29579025506973267,
+      "eval_RMSE": 0.4139091968536377,
+      "eval_SMAPE": 37.25147247314453,
+      "eval_runtime": 58.5412,
+      "eval_samples_per_second": 11853.276,
+      "eval_steps_per_second": 5.791,
+      "step": 4852
+    },
+    {
+      "epoch": 2.061005770816158,
+      "grad_norm": 0.11636342853307724,
+      "learning_rate": 0.000991471946720379,
+      "loss": 0.1596,
+      "step": 5000
+    },
+    {
+      "epoch": 2.267106347897774,
+      "grad_norm": 0.13195322453975677,
+      "learning_rate": 0.0009896868614095468,
+      "loss": 0.1585,
+      "step": 5500
+    },
+    {
+      "epoch": 2.4732069249793898,
+      "grad_norm": 0.1224469393491745,
+      "learning_rate": 0.0009877341938019622,
+      "loss": 0.1582,
+      "step": 6000
+    },
+    {
+      "epoch": 2.6793075020610058,
+      "grad_norm": 0.12972772121429443,
+      "learning_rate": 0.0009856146121461496,
+      "loss": 0.1574,
+      "step": 6500
+    },
+    {
+      "epoch": 2.8854080791426218,
+      "grad_norm": 0.12829646468162537,
+      "learning_rate": 0.0009833288418125239,
+      "loss": 0.1567,
+      "step": 7000
+    },
+    {
+      "epoch": 3.0,
+      "eval_MAE": 0.2434338480234146,
+      "eval_MAPE": 180.72421264648438,
+      "eval_MSE": 0.16737966239452362,
+      "eval_MSPE": 9376.3095703125,
+      "eval_ND": 0.2845493257045746,
+      "eval_RMSE": 0.4091205894947052,
+      "eval_SMAPE": 35.947811126708984,
+      "eval_runtime": 58.2785,
+      "eval_samples_per_second": 11906.705,
+      "eval_steps_per_second": 5.817,
+      "step": 7278
+    },
+    {
+      "epoch": 3.0915086562242373,
+      "grad_norm": 0.07943403720855713,
+      "learning_rate": 0.000980877665045153,
+      "loss": 0.1559,
+      "step": 7500
+    },
+    {
+      "epoch": 3.2976092333058533,
+      "grad_norm": 0.06782522052526474,
+      "learning_rate": 0.0009782619206940547,
+      "loss": 0.1552,
+      "step": 8000
+    },
+    {
+      "epoch": 3.503709810387469,
+      "grad_norm": 0.12171012163162231,
+      "learning_rate": 0.000975482503928123,
+      "loss": 0.155,
+      "step": 8500
+    },
+    {
+      "epoch": 3.709810387469085,
+      "grad_norm": 0.16907528042793274,
+      "learning_rate": 0.0009725403659287799,
+      "loss": 0.1543,
+      "step": 9000
+    },
+    {
+      "epoch": 3.915910964550701,
+      "grad_norm": 0.1479276567697525,
+      "learning_rate": 0.0009694365135644595,
+      "loss": 0.1538,
+      "step": 9500
+    },
+    {
+      "epoch": 4.0,
+      "eval_MAE": 0.23490256071090698,
+      "eval_MAPE": 181.50564575195312,
+      "eval_MSE": 0.16538722813129425,
+      "eval_MSPE": 9649.654296875,
+      "eval_ND": 0.27457714080810547,
+      "eval_RMSE": 0.40667828917503357,
+      "eval_SMAPE": 34.649436950683594,
+      "eval_runtime": 58.6782,
+      "eval_samples_per_second": 11825.598,
+      "eval_steps_per_second": 5.777,
+      "step": 9704
+    },
+    {
+      "epoch": 4.122011541632316,
+      "grad_norm": 0.09500300139188766,
+      "learning_rate": 0.0009661720090460337,
+      "loss": 0.1535,
+      "step": 10000
+    },
+    {
+      "epoch": 4.328112118713932,
+      "grad_norm": 0.0786258801817894,
+      "learning_rate": 0.0009627479695632988,
+      "loss": 0.153,
+      "step": 10500
+    },
+    {
+      "epoch": 4.534212695795548,
+      "grad_norm": 0.07586020231246948,
+      "learning_rate": 0.0009591655669026469,
+      "loss": 0.1523,
+      "step": 11000
+    },
+    {
+      "epoch": 4.740313272877164,
+      "grad_norm": 0.14045332372188568,
+      "learning_rate": 0.0009554260270460539,
+      "loss": 0.1517,
+      "step": 11500
+    },
+    {
+      "epoch": 4.9464138499587795,
+      "grad_norm": 0.08372505754232407,
+      "learning_rate": 0.0009515306297515187,
+      "loss": 0.1512,
+      "step": 12000
+    },
+    {
+      "epoch": 5.0,
+      "eval_MAE": 0.2414267361164093,
+      "eval_MAPE": 181.71209716796875,
+      "eval_MSE": 0.16475139558315277,
+      "eval_MSPE": 9144.029296875,
+      "eval_ND": 0.28220322728157043,
+      "eval_RMSE": 0.40589579939842224,
+      "eval_SMAPE": 35.766761779785156,
+      "eval_runtime": 59.913,
+      "eval_samples_per_second": 11581.882,
+      "eval_steps_per_second": 5.658,
+      "step": 12130
+    },
+    {
+      "epoch": 5.152514427040396,
+      "grad_norm": 0.12079860270023346,
+      "learning_rate": 0.0009474807081151011,
+      "loss": 0.1507,
+      "step": 12500
+    },
+    {
+      "epoch": 5.3586150041220115,
+      "grad_norm": 0.08449820429086685,
+      "learning_rate": 0.0009432776481147042,
+      "loss": 0.1504,
+      "step": 13000
+    },
+    {
+      "epoch": 5.564715581203627,
+      "grad_norm": 0.06070152297616005,
+      "learning_rate": 0.0009389228881357614,
+      "loss": 0.1501,
+      "step": 13500
+    },
+    {
+      "epoch": 5.7708161582852435,
+      "grad_norm": 0.1490793377161026,
+      "learning_rate": 0.0009344179184789862,
+      "loss": 0.1493,
+      "step": 14000
+    },
+    {
+      "epoch": 5.976916735366859,
+      "grad_norm": 0.15414516627788544,
+      "learning_rate": 0.0009297642808503576,
+      "loss": 0.1494,
+      "step": 14500
+    },
+    {
+      "epoch": 6.0,
+      "eval_MAE": 0.23061439394950867,
+      "eval_MAPE": 180.4387969970703,
+      "eval_MSE": 0.16441404819488525,
+      "eval_MSPE": 9619.1962890625,
+      "eval_ND": 0.269564688205719,
+      "eval_RMSE": 0.4054800271987915,
+      "eval_SMAPE": 33.97864532470703,
+      "eval_runtime": 58.9946,
+      "eval_samples_per_second": 11762.182,
+      "eval_steps_per_second": 5.746,
+      "step": 14556
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 84910,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 35,
+  "save_steps": 500,
+  "stateful_callbacks": {
+    "EarlyStoppingCallback": {
+      "args": {
+        "early_stopping_patience": 3,
+        "early_stopping_threshold": 0.0
+      },
+      "attributes": {
+        "early_stopping_patience_counter": 0
+      }
+    },
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 0.0,
+  "train_batch_size": 512,
+  "trial_name": null,
+  "trial_params": null
+}
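In the trainer states above, `best_metric` records the lowest `eval_MSE` seen so far, and `best_model_checkpoint` points at the step where it occurred. A minimal sketch of recovering that best entry from `log_history` — the inline JSON is an abridged stand-in for a real `trainer_state.json`, keeping only the fields used here:

```python
# Minimal sketch: find the log_history entry whose eval_MSE equals
# "best_metric" in a Hugging Face trainer_state.json. The inline sample
# is abridged from the Solar_192 state above; field names follow the
# Trainer state format shown in this commit.
import json

state_json = """
{
  "best_metric": 0.16441404819488525,
  "log_history": [
    {"epoch": 1.0, "eval_MSE": 0.16690030694007874, "step": 2426},
    {"epoch": 4.0, "eval_MSE": 0.16538722813129425, "step": 9704},
    {"epoch": 6.0, "eval_MSE": 0.16441404819488525, "step": 14556}
  ]
}
"""

state = json.loads(state_json)
# Only eval entries carry eval_MSE; training-loss entries would not.
evals = [e for e in state["log_history"] if "eval_MSE" in e]
best = min(evals, key=lambda e: e["eval_MSE"])

print(best["epoch"], best["step"])  # 6.0 14556
```

The same lookup works against any of the `trainer_state.json` files in this commit, since they all share the Trainer's `log_history` / `best_metric` layout.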
OFA/Solar_192/checkpoint-14556/training_args.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c9af1751dc8b83f8179385ed6c0777b909ae98e57808be77ef242c73b56ec973
+size 6584
OFA/exchange_rate_192/checkpoint-299/pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ace47295691bb7ba9ffee694abb9f152a520eb239be59f22bfe08a529a9d34dc
+size 261338858
OFA/exchange_rate_192/checkpoint-299/trainer_state.json
ADDED
@@ -0,0 +1,57 @@
+{
+  "best_global_step": 299,
+  "best_metric": 0.25842341780662537,
+  "best_model_checkpoint": "/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/OFA_layer/haspara/192/exchange_rate/checkpoint-299",
+  "epoch": 1.0,
+  "eval_steps": 500,
+  "global_step": 299,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "eval_MAE": 0.3695564270019531,
+      "eval_MAPE": 25.642061233520508,
+      "eval_MSE": 0.25842341780662537,
+      "eval_MSPE": 2.612428903579712,
+      "eval_ND": 0.16934385895729065,
+      "eval_RMSE": 0.5083536505699158,
+      "eval_SMAPE": 21.528039932250977,
+      "eval_runtime": 1.45,
+      "eval_samples_per_second": 3139.386,
+      "eval_steps_per_second": 2.069,
+      "step": 299
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 10465,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 35,
+  "save_steps": 500,
+  "stateful_callbacks": {
+    "EarlyStoppingCallback": {
+      "args": {
+        "early_stopping_patience": 3,
+        "early_stopping_threshold": 0.0
+      },
+      "attributes": {
+        "early_stopping_patience_counter": 0
+      }
+    },
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 0.0,
+  "train_batch_size": 32,
+  "trial_name": null,
+  "trial_params": null
+}
OFA/exchange_rate_192/checkpoint-299/training_args.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:adab5dda675b19d540a6cd1fafe424edba1cec784a550ddf4f810ab1b4070ff1
+size 6584
OFA/weather_720/checkpoint-368/pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d051349812c6ac062d248ad978de78868a2c6d603ebb19c4f43c0c6f40366c50
+size 295402218
OFA/weather_720/checkpoint-368/trainer_state.json
ADDED
@@ -0,0 +1,57 @@
+{
+  "best_global_step": 368,
+  "best_metric": 0.6688529253005981,
+  "best_model_checkpoint": "/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/OFA_layer/haspara/720/weather/checkpoint-368",
+  "epoch": 1.0,
+  "eval_steps": 500,
+  "global_step": 368,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 1.0,
+      "eval_MAE": 0.454274982213974,
+      "eval_MAPE": 1555.0623779296875,
+      "eval_MSE": 0.6688529253005981,
+      "eval_MSPE": 14232647.0,
+      "eval_ND": 0.8133898377418518,
+      "eval_RMSE": 0.8178343176841736,
+      "eval_SMAPE": 97.3488540649414,
+      "eval_runtime": 11.3069,
+      "eval_samples_per_second": 8452.448,
+      "eval_steps_per_second": 4.157,
+      "step": 368
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 12880,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 35,
+  "save_steps": 500,
+  "stateful_callbacks": {
+    "EarlyStoppingCallback": {
+      "args": {
+        "early_stopping_patience": 3,
+        "early_stopping_threshold": 0.0
+      },
+      "attributes": {
+        "early_stopping_patience_counter": 0
+      }
+    },
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 0.0,
+  "train_batch_size": 512,
+  "trial_name": null,
+  "trial_params": null
+}
OFA/weather_720/checkpoint-368/training_args.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0d346ca809d2df8f0ecac6e575996ddf769302bed020738e91f3c3cc6371672c
+size 6584
TimeLLM/ETTm1_512_192_TimeLLM_ETTm1_sl512_pl192_dm32_nh8_df128/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e257ac465464b06bd92e3e56d0b6dac59b9d7e8cea42d2f9916f02edae805960
+size 714733599
TimeLLM/PEMS07_512_336_TimeLLM_PEMS07_sl512_pl336_dm16_nh8_df32/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8687faa15ce2fbfaccdb59744cc1fc6e0a068bf89df6d2ff4a61482e1c136674
+size 703990367
TimeLLM/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:468dfc1f9e03632771833a9d83fb8524132277f5752ee38633952285427d1237
+size 707137631
TimeLLM/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/log.txt
ADDED
@@ -0,0 +1,28 @@
+Namespace(model_id='PEMS08_512_720', model='TimeLLM', seed=2021, data='PEMS08', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/TimeLLM/hasparacheckpoints720/', load_ckp_base='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/TimeLLM/hasparacheckpoints96/', seq_len=512, pred_len=720, d_model=16, n_heads=8, d_ff=32, dropout=0.1, patch_size=16, stride=8, llm_dim=768, num_workers=16, train_epochs=10, batch_size=48, patience=3, learning_rate=0.01, lradj='type1', pct_start=0.2, gpt2_llama2='gpt2', part=0, pretrain=1, freeze=1, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/Time-LLM/scripts/test_metrics/720.txt', dual_FT=0, selected_layers=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/TimeLLM/hasparacheckpoints720/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/log.txt', content='PEMS08 is a traffic dataset collected by 170 sensors over a period of 62 days. It includes three types of features: flow, average speed, and average occupancy.')
+Epoch: 1 cost time: 5270.287646770477
+Epoch: 1 | Train Loss: 0.5284430 Vali Loss: 0.6219723
+lr = 0.0004000000
+Epoch: 2 cost time: 5211.907626390457
+Epoch: 2 | Train Loss: 0.5061403 Vali Loss: 0.6203775
+Epoch: 3 cost time: 5205.380863666534
+Epoch: 3 | Train Loss: 0.4967325 Vali Loss: 0.6154433
+Epoch: 4 cost time: 5146.1251401901245
+Epoch: 4 | Train Loss: 0.4909644 Vali Loss: 0.6087528
+Epoch: 5 cost time: 5147.690908670425
+Epoch: 5 | Train Loss: 0.4880833 Vali Loss: 0.6050516
+Epoch: 6 cost time: 5146.230623722076
+Epoch: 6 | Train Loss: 0.4862469 Vali Loss: 0.6055145
+EarlyStopping counter: 1 out of 3
+Epoch: 7 cost time: 5129.8446407318115
+Epoch: 7 | Train Loss: 0.4850045 Vali Loss: 0.6044727
+Epoch: 8 cost time: 5116.925268650055
+Epoch: 8 | Train Loss: 0.4858361 Vali Loss: 0.6041182
+Epoch: 9 cost time: 5114.435469150543
+Epoch: 9 | Train Loss: 0.4855596 Vali Loss: 0.6045282
+EarlyStopping counter: 1 out of 3
+Epoch: 10 cost time: 5103.860981225967
+Epoch: 10 | Train Loss: 0.4847863 Vali Loss: 0.6048156
+EarlyStopping counter: 2 out of 3
+test shape: (1454520, 720, 1) (1454520, 720, 1)
+PEMS08--MAE: 0.4079, MSE: 0.6099
+finish
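The "EarlyStopping counter" lines in the PEMS08 log follow the usual patience scheme: the counter increments when validation loss fails to improve, resets on any improvement, and training stops once it reaches the patience of 3 (here it peaks at 2, so all 10 epochs run). A minimal sketch of that logic, replayed against the Vali Loss column above — `early_stopping_trace` is an illustrative helper, not the repo's actual callback:

```python
# Minimal sketch of the patience counter behavior visible in the log above.
# early_stopping_trace is a hypothetical helper; the losses list is the
# Vali Loss column from the PEMS08 log (patience=3, as in the Namespace).

def early_stopping_trace(vali_losses, patience=3):
    best = float("inf")
    counter = 0
    events = []  # (epoch, counter) each time the counter increments
    for epoch, loss in enumerate(vali_losses, start=1):
        if loss < best:
            best, counter = loss, 0  # improvement resets the counter
        else:
            counter += 1
            events.append((epoch, counter))
        if counter >= patience:
            events.append((epoch, "stop"))
            break
    return events

losses = [0.6219723, 0.6203775, 0.6154433, 0.6087528, 0.6050516,
          0.6055145, 0.6044727, 0.6041182, 0.6045282, 0.6048156]
print(early_stopping_trace(losses))
# [(6, 1), (9, 1), (10, 2)] — matching the three counter lines in the log
```

Note the counter resets between epochs 6 and 9 because epochs 7 and 8 both improve on the epoch-5 best, which is why the log shows "1 out of 3" twice rather than a monotone count.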
TimeLLM/electricity_512_192_TimeLLM_electricity_sl512_pl192_dm16_nh8_df32/checkpoint.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:88e322b8eb75cf0165dff18c4c154d7815a328b26845e57c52931c9ed93190cb
+size 702810143