Mickey25 committed · Commit da9a01b · verified · 1 parent: f68a279

Upload folder using huggingface_hub

Files changed (29):
  1. CALF/PEMS03_sl96_pl192_gpt12/checkpoint.pth +3 -0
  2. CALF/PEMS03_sl96_pl192_gpt12/log.txt +23 -0
  3. CALF/PEMS03_sl96_pl336_gpt12/checkpoint.pth +3 -0
  4. CALF/PEMS03_sl96_pl336_gpt12/log.txt +22 -0
  5. CALF/traffic_sl96_pl720_gpt12/checkpoint.pth +3 -0
  6. CALF/traffic_sl96_pl720_gpt12/log.txt +13 -0
  7. CALF/traffic_sl96_pl96_gpt12/checkpoint.pth +3 -0
  8. CALF/traffic_sl96_pl96_gpt12/log.txt +12 -0
  9. FSCA/ETTm2_96/checkpoint.pth +3 -0
  10. FSCA/Electricity_96/checkpoint.pth +3 -0
  11. FSCA/Solar_96/checkpoint.pth +3 -0
  12. FSCA/weather_96/checkpoint.pth +3 -0
  13. OFA/PEMS04_336/checkpoint-65624/pytorch_model.bin +3 -0
  14. OFA/PEMS04_336/checkpoint-65624/trainer_state.json +1142 -0
  15. OFA/PEMS04_336/checkpoint-65624/training_args.bin +3 -0
  16. OFA/Solar_192/checkpoint-14556/pytorch_model.bin +3 -0
  17. OFA/Solar_192/checkpoint-14556/trainer_state.json +330 -0
  18. OFA/Solar_192/checkpoint-14556/training_args.bin +3 -0
  19. OFA/exchange_rate_192/checkpoint-299/pytorch_model.bin +3 -0
  20. OFA/exchange_rate_192/checkpoint-299/trainer_state.json +57 -0
  21. OFA/exchange_rate_192/checkpoint-299/training_args.bin +3 -0
  22. OFA/weather_720/checkpoint-368/pytorch_model.bin +3 -0
  23. OFA/weather_720/checkpoint-368/trainer_state.json +57 -0
  24. OFA/weather_720/checkpoint-368/training_args.bin +3 -0
  25. TimeLLM/ETTm1_512_192_TimeLLM_ETTm1_sl512_pl192_dm32_nh8_df128/checkpoint.pth +3 -0
  26. TimeLLM/PEMS07_512_336_TimeLLM_PEMS07_sl512_pl336_dm16_nh8_df32/checkpoint.pth +3 -0
  27. TimeLLM/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/checkpoint.pth +3 -0
  28. TimeLLM/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/log.txt +28 -0
  29. TimeLLM/electricity_512_192_TimeLLM_electricity_sl512_pl192_dm16_nh8_df32/checkpoint.pth +3 -0
CALF/PEMS03_sl96_pl192_gpt12/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e9523a29e41abdf6d4cfd48d79b51becbf0b1e0fe249664a00ceac7dff3333d
+ size 1090570197
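The `checkpoint.pth` entries above are Git LFS pointer files rather than raw weights: three `key value` lines giving the spec version, a SHA-256 object id, and the true byte size. A minimal sketch of parsing such a pointer (the `parse_lfs_pointer` helper is my own, not part of any of these repos):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer stored for CALF/PEMS03_sl96_pl192_gpt12/checkpoint.pth.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:0e9523a29e41abdf6d4cfd48d79b51becbf0b1e0fe249664a00ceac7dff3333d
size 1090570197
"""
info = parse_lfs_pointer(pointer)
algo, digest = info["oid"].split(":", 1)   # hash algorithm, hex digest
size_bytes = int(info["size"])             # ~1.09 GB checkpoint
```

Downloading through `git lfs` or `huggingface_hub` resolves the pointer to the actual blob; the digest lets you verify the download.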
CALF/PEMS03_sl96_pl192_gpt12/log.txt ADDED
@@ -0,0 +1,23 @@
+ >>>>>>>start training>>>>>>>>>>>>>>
+ Namespace(is_training=1, model_id='PEMS03_96_192', model='CALF', data='PEMS03', features='M', target='OT', freq='h', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/', seq_len=96, pred_len=192, d_model=768, d_ff=768, dropout=0.3, embed='timeF', num_workers=20, train_epochs=100, batch_size=512, patience=3, learning_rate=0.001, lradj='type1', task_loss='smooth_l1', feature_loss='smooth_l1', output_loss='smooth_l1', tmax=20, cos=1, r=8, lora_alpha=32, lora_dropout=0.1, word_embedding_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/wte_pca_500.pt', task_w=1.0, feature_w=0.01, output_w=1.0, gpt_layers=12, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/test_metrics_path/96_192.txt', multi=0, block_or_sublayer='no', load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/PEMS03_sl96_pl192_gpt12/log.txt')
+ Epoch: 1 cost time: 1213.5586137771606Epoch: 1, Steps: 12627 | Train Loss: 0.2405325 Vali Loss: 0.4243636lr = 0.0009045095Validation loss decreased (inf --> 0.424364). Saving model ...
+ Epoch: 2 cost time: 1214.5051691532135Epoch: 2, Steps: 12627 | Train Loss: 0.1846573 Vali Loss: 0.4036770lr = 0.0006545120Validation loss decreased (0.424364 --> 0.403677). Saving model ...
+ Epoch: 3 cost time: 1216.8372721672058Epoch: 3, Steps: 12627 | Train Loss: 0.1736200 Vali Loss: 0.3934794lr = 0.0003454980Validation loss decreased (0.403677 --> 0.393479). Saving model ...
+ Epoch: 4 cost time: 1209.1885206699371Epoch: 4, Steps: 12627 | Train Loss: 0.1654215 Vali Loss: 0.3756141lr = 0.0000955005Validation loss decreased (0.393479 --> 0.375614). Saving model ...
+ Epoch: 5 cost time: 1221.2066152095795Epoch: 5, Steps: 12627 | Train Loss: 0.1578077 Vali Loss: 0.3667139lr = 0.0000000100Validation loss decreased (0.375614 --> 0.366714). Saving model ...
+ Epoch: 6 cost time: 1215.5747256278992Epoch: 6, Steps: 12627 | Train Loss: 0.1550683 Vali Loss: 0.3650614lr = 0.0000955005Validation loss decreased (0.366714 --> 0.365061). Saving model ...
+ Epoch: 7 cost time: 1222.5011975765228Epoch: 7, Steps: 12627 | Train Loss: 0.1548908 Vali Loss: 0.3642367lr = 0.0003454980Validation loss decreased (0.365061 --> 0.364237). Saving model ...
+ Epoch: 8 cost time: 1218.6213192939758Epoch: 8, Steps: 12627 | Train Loss: 0.1566171 Vali Loss: 0.3638320lr = 0.0006545120Validation loss decreased (0.364237 --> 0.363832). Saving model ...
+ Epoch: 9 cost time: 1217.3795993328094Epoch: 9, Steps: 12627 | Train Loss: 0.1578000 Vali Loss: 0.3680637lr = 0.0009045095EarlyStopping counter: 1 out of 3
+ Epoch: 10 cost time: 1219.9044802188873Epoch: 10, Steps: 12627 | Train Loss: 0.1567791 Vali Loss: 0.3629943lr = 0.0010000000Validation loss decreased (0.363832 --> 0.362994). Saving model ...
+ Epoch: 11 cost time: 1219.7232382297516Epoch: 11, Steps: 12627 | Train Loss: 0.1545983 Vali Loss: 0.3605110lr = 0.0009045095Validation loss decreased (0.362994 --> 0.360511). Saving model ...
+ Epoch: 12 cost time: 1214.502541065216Epoch: 12, Steps: 12627 | Train Loss: 0.1509807 Vali Loss: 0.3564376lr = 0.0006545120Validation loss decreased (0.360511 --> 0.356438). Saving model ...
+ Epoch: 13 cost time: 1218.4142220020294Epoch: 13, Steps: 12627 | Train Loss: 0.1465258 Vali Loss: 0.3490480lr = 0.0003454980Validation loss decreased (0.356438 --> 0.349048). Saving model ...
+ Epoch: 14 cost time: 1216.8537709712982Epoch: 14, Steps: 12627 | Train Loss: 0.1419440 Vali Loss: 0.3439195lr = 0.0000955005Validation loss decreased (0.349048 --> 0.343919). Saving model ...
+ Epoch: 15 cost time: 1218.0101554393768Epoch: 15, Steps: 12627 | Train Loss: 0.1383972 Vali Loss: 0.3422484lr = 0.0000000100Validation loss decreased (0.343919 --> 0.342248). Saving model ...
+ Epoch: 16 cost time: 1222.7025911808014Epoch: 16, Steps: 12627 | Train Loss: 0.1374614 Vali Loss: 0.3408269lr = 0.0000955005Validation loss decreased (0.342248 --> 0.340827). Saving model ...
+ Epoch: 17 cost time: 1218.2776758670807Epoch: 17, Steps: 12627 | Train Loss: 0.1378242 Vali Loss: 0.3398700lr = 0.0003454980Validation loss decreased (0.340827 --> 0.339870). Saving model ...
+ Epoch: 18 cost time: 1214.6523876190186Epoch: 18, Steps: 12627 | Train Loss: 0.1396362 Vali Loss: 0.3457815lr = 0.0006545120EarlyStopping counter: 1 out of 3
+ Epoch: 19 cost time: 1215.1394522190094Epoch: 19, Steps: 12627 | Train Loss: 0.1422571 Vali Loss: 0.3470419lr = 0.0009045095EarlyStopping counter: 2 out of 3
+ Epoch: 20 cost time: 1215.5405178070068Epoch: 20, Steps: 12627 | Train Loss: 0.1439515 Vali Loss: 0.3507722lr = 0.0010000000EarlyStopping counter: 3 out of 3
+ Early stopping>>>>>>>testing>>>>>>>>>>>>>>
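Each epoch line in these logs runs several messages together without separators (cost time, step summary, learning rate, then a checkpoint or early-stopping message). A hedged sketch of pulling the numeric fields back out with a regex; the field names in the returned dict are my own:

```python
import re

# Matches the concatenated per-epoch line format seen in the CALF logs above.
LINE_RE = re.compile(
    r"Epoch: (\d+) cost time: ([\d.]+)"
    r"Epoch: \d+, Steps: (\d+) \| "
    r"Train Loss: ([\d.]+) Vali Loss: ([\d.]+)"
    r"lr = ([\d.]+)"
)

def parse_epoch_line(line: str) -> dict:
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not an epoch summary line")
    epoch, cost, steps, train, vali, lr = m.groups()
    return {
        "epoch": int(epoch),
        "cost_time_s": float(cost),
        "steps": int(steps),
        "train_loss": float(train),
        "vali_loss": float(vali),
        "lr": float(lr),
    }

# First epoch line of the PEMS03_sl96_pl192 log.
rec = parse_epoch_line(
    "Epoch: 1 cost time: 1213.5586137771606Epoch: 1, Steps: 12627 | "
    "Train Loss: 0.2405325 Vali Loss: 0.4243636lr = 0.0009045095"
    "Validation loss decreased (inf --> 0.424364). Saving model ..."
)
```

The greedy `[\d.]+` groups stop at the first letter of the next run-on message, which is what makes the unseparated format parseable.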
CALF/PEMS03_sl96_pl336_gpt12/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fde94b09516373bbaf8b1651e40a6fe4c09d400e21a0dc5c6f5d8363931198e1
+ size 1091013141
CALF/PEMS03_sl96_pl336_gpt12/log.txt ADDED
@@ -0,0 +1,22 @@
+ >>>>>>>start training>>>>>>>>>>>>>>
+ Namespace(is_training=1, model_id='PEMS03_96_336', model='CALF', data='PEMS03', features='M', target='OT', freq='h', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/', seq_len=96, pred_len=336, d_model=768, d_ff=768, dropout=0.3, embed='timeF', num_workers=20, train_epochs=100, batch_size=512, patience=3, learning_rate=0.001, lradj='type1', task_loss='smooth_l1', feature_loss='smooth_l1', output_loss='smooth_l1', tmax=20, cos=1, r=8, lora_alpha=32, lora_dropout=0.1, word_embedding_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/wte_pca_500.pt', task_w=1.0, feature_w=0.01, output_w=1.0, gpt_layers=12, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/test_metrics_path/96_336.txt', multi=0, block_or_sublayer='no', load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/PEMS03_sl96_pl336_gpt12/log.txt')
+ Epoch: 1 cost time: 1228.0863962173462Epoch: 1, Steps: 12526 | Train Loss: 0.1989904 Vali Loss: 0.3550510lr = 0.0009045095Validation loss decreased (inf --> 0.355051). Saving model ...
+ Epoch: 2 cost time: 1205.9942526817322Epoch: 2, Steps: 12526 | Train Loss: 0.1547784 Vali Loss: 0.3398961lr = 0.0006545120Validation loss decreased (0.355051 --> 0.339896). Saving model ...
+ Epoch: 3 cost time: 1219.171292066574Epoch: 3, Steps: 12526 | Train Loss: 0.1466388 Vali Loss: 0.3313598lr = 0.0003454980Validation loss decreased (0.339896 --> 0.331360). Saving model ...
+ Epoch: 4 cost time: 1218.2594282627106Epoch: 4, Steps: 12526 | Train Loss: 0.1410800 Vali Loss: 0.3234892lr = 0.0000955005Validation loss decreased (0.331360 --> 0.323489). Saving model ...
+ Epoch: 5 cost time: 1217.5273847579956Epoch: 5, Steps: 12526 | Train Loss: 0.1370407 Vali Loss: 0.3151883lr = 0.0000000100Validation loss decreased (0.323489 --> 0.315188). Saving model ...
+ Epoch: 6 cost time: 1220.6260316371918Epoch: 6, Steps: 12526 | Train Loss: 0.1349816 Vali Loss: 0.3143618lr = 0.0000955005Validation loss decreased (0.315188 --> 0.314362). Saving model ...
+ Epoch: 7 cost time: 1219.025268316269Epoch: 7, Steps: 12526 | Train Loss: 0.1350453 Vali Loss: 0.3129805lr = 0.0003454980Validation loss decreased (0.314362 --> 0.312980). Saving model ...
+ Epoch: 8 cost time: 1220.592013835907Epoch: 8, Steps: 12526 | Train Loss: 0.1358405 Vali Loss: 0.3148754lr = 0.0006545120EarlyStopping counter: 1 out of 3
+ Epoch: 9 cost time: 1220.2405898571014Epoch: 9, Steps: 12526 | Train Loss: 0.1357594 Vali Loss: 0.3106849lr = 0.0009045095Validation loss decreased (0.312980 --> 0.310685). Saving model ...
+ Epoch: 10 cost time: 1222.7074172496796Epoch: 10, Steps: 12526 | Train Loss: 0.1350104 Vali Loss: 0.3121745lr = 0.0010000000EarlyStopping counter: 1 out of 3
+ Epoch: 11 cost time: 1220.4871294498444Epoch: 11, Steps: 12526 | Train Loss: 0.1332038 Vali Loss: 0.3067603lr = 0.0009045095Validation loss decreased (0.310685 --> 0.306760). Saving model ...
+ Epoch: 12 cost time: 1214.7389903068542Epoch: 12, Steps: 12526 | Train Loss: 0.1303220 Vali Loss: 0.3014657lr = 0.0006545120Validation loss decreased (0.306760 --> 0.301466). Saving model ...
+ Epoch: 13 cost time: 1224.0501599311829Epoch: 13, Steps: 12526 | Train Loss: 0.1271204 Vali Loss: 0.2972590lr = 0.0003454980Validation loss decreased (0.301466 --> 0.297259). Saving model ...
+ Epoch: 14 cost time: 1222.7768981456757Epoch: 14, Steps: 12526 | Train Loss: 0.1238745 Vali Loss: 0.2942495lr = 0.0000955005Validation loss decreased (0.297259 --> 0.294249). Saving model ...
+ Epoch: 15 cost time: 1225.6895768642426Epoch: 15, Steps: 12526 | Train Loss: 0.1216603 Vali Loss: 0.2913513lr = 0.0000000100Validation loss decreased (0.294249 --> 0.291351). Saving model ...
+ Epoch: 16 cost time: 1216.2601499557495Epoch: 16, Steps: 12526 | Train Loss: 0.1207567 Vali Loss: 0.2911408lr = 0.0000955005Validation loss decreased (0.291351 --> 0.291141). Saving model ...
+ Epoch: 17 cost time: 1222.1440660953522Epoch: 17, Steps: 12526 | Train Loss: 0.1208047 Vali Loss: 0.2921268lr = 0.0003454980EarlyStopping counter: 1 out of 3
+ Epoch: 18 cost time: 1219.8968963623047Epoch: 18, Steps: 12526 | Train Loss: 0.1223113 Vali Loss: 0.2934834lr = 0.0006545120EarlyStopping counter: 2 out of 3
+ Epoch: 19 cost time: 1224.0830509662628Epoch: 19, Steps: 12526 | Train Loss: 0.1239117 Vali Loss: 0.2936475lr = 0.0009045095EarlyStopping counter: 3 out of 3
+ Early stopping>>>>>>>testing>>>>>>>>>>>>>>
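The per-epoch learning rates in these logs sweep from 0.001 down to 1e-8 and back over a 10-epoch cycle (0.0009045095, 0.0006545120, 0.0003454980, ...), consistent with the cosine schedule implied by `cos=1, tmax=20` in the Namespace. A rough reconstruction, assuming eta_max=1e-3, eta_min=1e-8 and a 10-epoch period; this is inferred from the logged values, not taken from CALF's scheduler code:

```python
import math

ETA_MAX, ETA_MIN, PERIOD = 1e-3, 1e-8, 10  # inferred from the logged lr values

def cosine_lr(epoch: int) -> float:
    """Cosine-shaped learning rate for a 1-based epoch, cycling every PERIOD epochs."""
    # Cosine factor goes 1 -> 0 over the first half-cycle, back to 1 at the end.
    factor = (1 + math.cos(math.pi * (epoch % PERIOD) / (PERIOD / 2))) / 2
    return ETA_MIN + (ETA_MAX - ETA_MIN) * factor
```

Under these assumptions the formula reproduces the logged values: epoch 1 gives ~0.0009045095, epoch 5 bottoms out at 1e-8, and epoch 10 returns to 0.001.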
CALF/traffic_sl96_pl720_gpt12/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e051c17a4676999ab8d29b31fa984319dad9a4cdf81e53d8687c1348fe06754
+ size 1092194325
CALF/traffic_sl96_pl720_gpt12/log.txt ADDED
@@ -0,0 +1,13 @@
+ >>>>>>>start training>>>>>>>>>>>>>>
+ Namespace(is_training=1, model_id='traffic_96_720', model='CALF', data='traffic', features='M', target='OT', freq='h', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/', seq_len=96, pred_len=720, d_model=768, d_ff=768, dropout=0.3, embed='timeF', num_workers=20, train_epochs=100, batch_size=512, patience=3, learning_rate=0.001, lradj='type1', task_loss='smooth_l1', feature_loss='smooth_l1', output_loss='smooth_l1', tmax=20, cos=1, r=8, lora_alpha=32, lora_dropout=0.1, word_embedding_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/wte_pca_500.pt', task_w=1.0, feature_w=0.01, output_w=1.0, gpt_layers=12, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/test_metrics_path/96_720.txt', multi=0, block_or_sublayer='no', load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/traffic_sl96_pl720_gpt12/log.txt')
+ Epoch: 1 cost time: 1706.1240828037262Epoch: 1, Steps: 19303 | Train Loss: 0.1389736 Vali Loss: 0.4617771lr = 0.0009045095Validation loss decreased (inf --> 0.461777). Saving model ...
+ Epoch: 2 cost time: 1691.999606847763Epoch: 2, Steps: 19303 | Train Loss: 0.1101452 Vali Loss: 0.4579828lr = 0.0006545120Validation loss decreased (0.461777 --> 0.457983). Saving model ...
+ Epoch: 3 cost time: 1694.3127541542053Epoch: 3, Steps: 19303 | Train Loss: 0.1069227 Vali Loss: 0.4562370lr = 0.0003454980Validation loss decreased (0.457983 --> 0.456237). Saving model ...
+ Epoch: 4 cost time: 1685.3278629779816Epoch: 4, Steps: 19303 | Train Loss: 0.1050713 Vali Loss: 0.4541570lr = 0.0000955005Validation loss decreased (0.456237 --> 0.454157). Saving model ...
+ Epoch: 5 cost time: 1705.2223370075226Epoch: 5, Steps: 19303 | Train Loss: 0.1038729 Vali Loss: 0.4523360lr = 0.0000000100Validation loss decreased (0.454157 --> 0.452336). Saving model ...
+ Epoch: 6 cost time: 1688.0082385540009Epoch: 6, Steps: 19303 | Train Loss: 0.1035937 Vali Loss: 0.4521805lr = 0.0000955005Validation loss decreased (0.452336 --> 0.452181). Saving model ...
+ Epoch: 7 cost time: 1696.0097732543945Epoch: 7, Steps: 19303 | Train Loss: 0.1033980 Vali Loss: 0.4520068lr = 0.0003454980Validation loss decreased (0.452181 --> 0.452007). Saving model ...
+ Epoch: 8 cost time: 1705.5962662696838Epoch: 8, Steps: 19303 | Train Loss: 0.1036567 Vali Loss: 0.4533293lr = 0.0006545120EarlyStopping counter: 1 out of 3
+ Epoch: 9 cost time: 1708.6191654205322Epoch: 9, Steps: 19303 | Train Loss: 0.1039235 Vali Loss: 0.4534400lr = 0.0009045095EarlyStopping counter: 2 out of 3
+ Epoch: 10 cost time: 1716.0039064884186Epoch: 10, Steps: 19303 | Train Loss: 0.1037365 Vali Loss: 0.4530395lr = 0.0010000000EarlyStopping counter: 3 out of 3
+ Early stopping>>>>>>>testing>>>>>>>>>>>>>>
CALF/traffic_sl96_pl96_gpt12/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bd53739b9377458f71a36d60e6d70a31f0d2566752372fc58c57b329bf9c9d2c
+ size 1090274901
CALF/traffic_sl96_pl96_gpt12/log.txt ADDED
@@ -0,0 +1,12 @@
+ >>>>>>>start training>>>>>>>>>>>>>>
+ Namespace(is_training=1, model_id='traffic_96_96', model='CALF', data='traffic', features='M', target='OT', freq='h', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/', seq_len=96, pred_len=96, d_model=768, d_ff=768, dropout=0.3, embed='timeF', num_workers=20, train_epochs=100, batch_size=512, patience=3, learning_rate=0.001, lradj='type1', task_loss='smooth_l1', feature_loss='smooth_l1', output_loss='smooth_l1', tmax=20, cos=1, r=8, lora_alpha=32, lora_dropout=0.1, word_embedding_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/wte_pca_500.pt', task_w=1.0, feature_w=0.01, output_w=1.0, gpt_layers=12, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/CALF/test_metrics_path/96_96.txt', multi=0, block_or_sublayer='no', load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/CALF/checkpoints/traffic_sl96_pl96_gpt12/log.txt')
+ Epoch: 1 cost time: 1765.7570405006409Epoch: 1, Steps: 20353 | Train Loss: 0.1241191 Vali Loss: 0.4174732lr = 0.0009045095Validation loss decreased (inf --> 0.417473). Saving model ...
+ Epoch: 2 cost time: 1800.9101555347443Epoch: 2, Steps: 20353 | Train Loss: 0.0977185 Vali Loss: 0.4121245lr = 0.0006545120Validation loss decreased (0.417473 --> 0.412125). Saving model ...
+ Epoch: 3 cost time: 1806.9728829860687Epoch: 3, Steps: 20353 | Train Loss: 0.0939899 Vali Loss: 0.4083741lr = 0.0003454980Validation loss decreased (0.412125 --> 0.408374). Saving model ...
+ Epoch: 4 cost time: 1796.8674626350403Epoch: 4, Steps: 20353 | Train Loss: 0.0918263 Vali Loss: 0.4060673lr = 0.0000955005Validation loss decreased (0.408374 --> 0.406067). Saving model ...
+ Epoch: 5 cost time: 1797.072764635086Epoch: 5, Steps: 20353 | Train Loss: 0.0907320 Vali Loss: 0.4039504lr = 0.0000000100Validation loss decreased (0.406067 --> 0.403950). Saving model ...
+ Epoch: 6 cost time: 1812.847502231598Epoch: 6, Steps: 20353 | Train Loss: 0.0902542 Vali Loss: 0.4035453lr = 0.0000955005Validation loss decreased (0.403950 --> 0.403545). Saving model ...
+ Epoch: 7 cost time: 1838.9145092964172Epoch: 7, Steps: 20353 | Train Loss: 0.0903154 Vali Loss: 0.4039853lr = 0.0003454980EarlyStopping counter: 1 out of 3
+ Epoch: 8 cost time: 1832.54545378685Epoch: 8, Steps: 20353 | Train Loss: 0.0908403 Vali Loss: 0.4056821lr = 0.0006545120EarlyStopping counter: 2 out of 3
+ Epoch: 9 cost time: 1830.4206190109253Epoch: 9, Steps: 20353 | Train Loss: 0.0911053 Vali Loss: 0.4059569lr = 0.0009045095EarlyStopping counter: 3 out of 3
+ Early stopping>>>>>>>testing>>>>>>>>>>>>>>
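All four CALF runs above end the same way: the validation loss stops improving, the `EarlyStopping counter` climbs to the `patience=3` limit, and training halts with the best checkpoint already saved. A minimal sketch of that patience logic; the class is my own illustration, not CALF's actual implementation:

```python
class EarlyStopping:
    """Stop after `patience` consecutive epochs without val-loss improvement."""

    def __init__(self, patience: int = 3):
        self.patience = patience
        self.best = float("inf")
        self.counter = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's val loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss   # improvement: this is where a checkpoint is saved
            self.counter = 0
            return False
        self.counter += 1          # logged as "EarlyStopping counter: N out of patience"
        return self.counter >= self.patience

# Replay the tail of the traffic_sl96_pl96 log: val losses for epochs 6-9.
stopper = EarlyStopping(patience=3)
decisions = [stopper.step(v) for v in (0.4035453, 0.4039853, 0.4056821, 0.4059569)]
# Epoch 6 improves; epochs 7-9 do not, so the fourth call signals the stop.
```

Note the counter resets on any improvement, which is why the PEMS03 runs survive isolated bad epochs (counter 1, then back to 0) before finally stopping.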
FSCA/ETTm2_96/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:859c5748206934aaac6e851a0cd0384d33944ff164848650735f68a05a8a85cb
+ size 552849378
FSCA/Electricity_96/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dbfef391437cc4efc56bfadebc64763f1d85404336e1744f4e07ba3dfe12a969
+ size 553242594
FSCA/Solar_96/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8c19ef293164bbcd336d7af250fef398a71cde93e7380189ba074fb4344d6fb1
+ size 553242594
FSCA/weather_96/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1a69d95cea6110fb596ac710742e01f219cb8e20d22678e3f70e3823e59a1be6
+ size 553242594
OFA/PEMS04_336/checkpoint-65624/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:221dba394b4fab3926f62dd85f226fbeab914ffb70e9a93524aa149c89cbdd80
+ size 270628842
OFA/PEMS04_336/checkpoint-65624/trainer_state.json ADDED
@@ -0,0 +1,1142 @@
+ {
+ "best_global_step": 65624,
+ "best_metric": 0.4831537902355194,
+ "best_model_checkpoint": "/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/OFA_layer/haspara/336/PEMS04/checkpoint-65624",
+ "epoch": 13.0,
+ "eval_steps": 500,
+ "global_step": 65624,
+ "is_hyper_param_search": false,
+ "is_local_process_zero": true,
+ "is_world_process_zero": true,
+ "log_history": [
+ {
+ "epoch": 0.09904912836767037,
+ "grad_norm": 0.06085604056715965,
+ "learning_rate": 0.0009999803182724851,
+ "loss": 0.6841,
+ "step": 500
+ },
+ {
+ "epoch": 0.19809825673534073,
+ "grad_norm": 0.11020142585039139,
+ "learning_rate": 0.0009999211167982776,
+ "loss": 0.466,
+ "step": 1000
+ },
+ {
+ "epoch": 0.2971473851030111,
+ "grad_norm": 0.2978723347187042,
+ "learning_rate": 0.0009998224001777822,
+ "loss": 0.4559,
+ "step": 1500
+ },
+ {
+ "epoch": 0.39619651347068147,
+ "grad_norm": 0.13569457828998566,
+ "learning_rate": 0.0009996841762138337,
+ "loss": 0.4501,
+ "step": 2000
+ },
+ {
+ "epoch": 0.49524564183835185,
+ "grad_norm": 0.11234386265277863,
+ "learning_rate": 0.0009995064558320358,
+ "loss": 0.4456,
+ "step": 2500
+ },
+ {
+ "epoch": 0.5942947702060222,
+ "grad_norm": 0.23295480012893677,
+ "learning_rate": 0.0009992892530798984,
+ "loss": 0.4427,
+ "step": 3000
+ },
+ {
+ "epoch": 0.6933438985736925,
+ "grad_norm": 0.07768674939870834,
+ "learning_rate": 0.0009990325851257273,
+ "loss": 0.4412,
+ "step": 3500
+ },
+ {
+ "epoch": 0.7923930269413629,
+ "grad_norm": 0.13024824857711792,
+ "learning_rate": 0.000998736472257267,
+ "loss": 0.4392,
+ "step": 4000
+ },
+ {
+ "epoch": 0.8914421553090333,
+ "grad_norm": 0.21133571863174438,
+ "learning_rate": 0.0009984009378800963,
+ "loss": 0.438,
+ "step": 4500
+ },
+ {
+ "epoch": 0.9904912836767037,
+ "grad_norm": 0.3099469542503357,
+ "learning_rate": 0.0009980260085157794,
+ "loss": 0.4366,
+ "step": 5000
+ },
+ {
+ "epoch": 1.0,
+ "eval_MAE": 0.3764905631542206,
+ "eval_MAPE": 349.27911376953125,
+ "eval_MSE": 0.5023903846740723,
+ "eval_MSPE": 154474.234375,
+ "eval_ND": 0.4538463056087494,
+ "eval_RMSE": 0.7087950110435486,
+ "eval_SMAPE": 61.74858474731445,
+ "eval_runtime": 67.2343,
+ "eval_samples_per_second": 18698.269,
+ "eval_steps_per_second": 9.132,
+ "step": 5048
+ },
+ {
+ "epoch": 1.089540412044374,
+ "grad_norm": 0.167328879237175,
+ "learning_rate": 0.0009976117137997695,
+ "loss": 0.4365,
+ "step": 5500
+ },
+ {
+ "epoch": 1.1885895404120443,
+ "grad_norm": 0.15923786163330078,
+ "learning_rate": 0.0009971580864790652,
+ "loss": 0.4335,
+ "step": 6000
+ },
+ {
+ "epoch": 1.2876386687797148,
+ "grad_norm": 0.23647330701351166,
+ "learning_rate": 0.0009966651624096236,
+ "loss": 0.433,
+ "step": 6500
+ },
+ {
+ "epoch": 1.3866877971473852,
+ "grad_norm": 0.08678620308637619,
+ "learning_rate": 0.0009961329805535251,
+ "loss": 0.4323,
+ "step": 7000
+ },
+ {
+ "epoch": 1.4857369255150554,
+ "grad_norm": 0.10297006368637085,
+ "learning_rate": 0.0009955615829758935,
+ "loss": 0.4327,
+ "step": 7500
+ },
+ {
+ "epoch": 1.5847860538827259,
+ "grad_norm": 0.12103519588708878,
+ "learning_rate": 0.0009949510148415722,
+ "loss": 0.4312,
+ "step": 8000
+ },
+ {
+ "epoch": 1.6838351822503963,
+ "grad_norm": 0.1411912441253662,
+ "learning_rate": 0.0009943013244115538,
+ "loss": 0.4317,
+ "step": 8500
+ },
+ {
+ "epoch": 1.7828843106180665,
+ "grad_norm": 0.16333575546741486,
+ "learning_rate": 0.0009936125630391644,
+ "loss": 0.4306,
+ "step": 9000
+ },
+ {
+ "epoch": 1.881933438985737,
+ "grad_norm": 0.25319305062294006,
+ "learning_rate": 0.000992884785166006,
+ "loss": 0.428,
+ "step": 9500
+ },
+ {
+ "epoch": 1.9809825673534074,
+ "grad_norm": 0.08323787152767181,
+ "learning_rate": 0.0009921180483176526,
+ "loss": 0.4294,
+ "step": 10000
+ },
+ {
+ "epoch": 2.0,
+ "eval_MAE": 0.37166666984558105,
+ "eval_MAPE": 348.12994384765625,
+ "eval_MSE": 0.49643462896347046,
+ "eval_MSPE": 161231.390625,
+ "eval_ND": 0.4480312764644623,
+ "eval_RMSE": 0.7045812010765076,
+ "eval_SMAPE": 61.25733947753906,
+ "eval_runtime": 54.411,
+ "eval_samples_per_second": 23104.958,
+ "eval_steps_per_second": 11.284,
+ "step": 10096
+ },
+ {
+ "epoch": 2.0800316957210776,
+ "grad_norm": 0.07184210419654846,
+ "learning_rate": 0.000991312413099103,
+ "loss": 0.4284,
+ "step": 10500
+ },
+ {
+ "epoch": 2.179080824088748,
+ "grad_norm": 0.3041013479232788,
+ "learning_rate": 0.0009904679431899906,
+ "loss": 0.4279,
+ "step": 11000
+ },
+ {
+ "epoch": 2.2781299524564185,
+ "grad_norm": 0.11070399731397629,
+ "learning_rate": 0.0009895847053395504,
+ "loss": 0.4286,
+ "step": 11500
+ },
+ {
+ "epoch": 2.3771790808240887,
+ "grad_norm": 0.14005513489246368,
+ "learning_rate": 0.0009886627693613424,
+ "loss": 0.427,
+ "step": 12000
+ },
+ {
+ "epoch": 2.476228209191759,
+ "grad_norm": 0.09888464212417603,
+ "learning_rate": 0.0009877022081277332,
+ "loss": 0.4266,
+ "step": 12500
+ },
+ {
+ "epoch": 2.5752773375594296,
+ "grad_norm": 0.31440699100494385,
+ "learning_rate": 0.000986703097564137,
+ "loss": 0.4276,
+ "step": 13000
+ },
+ {
+ "epoch": 2.6743264659270998,
+ "grad_norm": 0.1942104548215866,
+ "learning_rate": 0.0009856655166430126,
+ "loss": 0.4269,
+ "step": 13500
+ },
+ {
+ "epoch": 2.7733755942947704,
+ "grad_norm": 0.3018036186695099,
+ "learning_rate": 0.0009845895473776232,
+ "loss": 0.4263,
+ "step": 14000
+ },
+ {
+ "epoch": 2.8724247226624406,
+ "grad_norm": 0.10349088907241821,
+ "learning_rate": 0.0009834752748155522,
+ "loss": 0.4259,
+ "step": 14500
+ },
+ {
+ "epoch": 2.971473851030111,
+ "grad_norm": 0.12177316844463348,
+ "learning_rate": 0.0009823227870319814,
+ "loss": 0.426,
+ "step": 15000
+ },
+ {
+ "epoch": 3.0,
+ "eval_MAE": 0.3759180009365082,
+ "eval_MAPE": 352.69488525390625,
+ "eval_MSE": 0.4976975917816162,
+ "eval_MSPE": 169782.796875,
+ "eval_ND": 0.45315608382225037,
+ "eval_RMSE": 0.7054768800735474,
+ "eval_SMAPE": 61.35142517089844,
+ "eval_runtime": 54.733,
+ "eval_samples_per_second": 22969.03,
+ "eval_steps_per_second": 11.218,
+ "step": 15144
+ },
+ {
+ "epoch": 3.070522979397781,
+ "grad_norm": 0.09670817106962204,
+ "learning_rate": 0.0009811321751227293,
+ "loss": 0.4253,
+ "step": 15500
+ },
+ {
+ "epoch": 3.1695721077654517,
+ "grad_norm": 0.1024642288684845,
+ "learning_rate": 0.000979903533197051,
+ "loss": 0.4254,
+ "step": 16000
+ },
+ {
+ "epoch": 3.268621236133122,
+ "grad_norm": 0.07792851328849792,
+ "learning_rate": 0.0009786369583701987,
+ "loss": 0.4256,
+ "step": 16500
+ },
+ {
+ "epoch": 3.3676703645007926,
+ "grad_norm": 0.1396850198507309,
+ "learning_rate": 0.000977332550755746,
+ "loss": 0.4246,
+ "step": 17000
+ },
+ {
+ "epoch": 3.466719492868463,
+ "grad_norm": 0.2117234766483307,
+ "learning_rate": 0.0009759904134576747,
+ "loss": 0.4242,
+ "step": 17500
+ },
+ {
+ "epoch": 3.565768621236133,
+ "grad_norm": 0.09312921017408371,
+ "learning_rate": 0.0009746106525622252,
+ "loss": 0.4233,
+ "step": 18000
+ },
+ {
+ "epoch": 3.6648177496038032,
+ "grad_norm": 0.12480303645133972,
+ "learning_rate": 0.0009731933771295105,
+ "loss": 0.4246,
+ "step": 18500
+ },
+ {
+ "epoch": 3.763866877971474,
+ "grad_norm": 0.06942948698997498,
+ "learning_rate": 0.0009717386991848969,
+ "loss": 0.4251,
+ "step": 19000
+ },
+ {
+ "epoch": 3.862916006339144,
+ "grad_norm": 0.11137889325618744,
+ "learning_rate": 0.0009702467337101477,
+ "loss": 0.4229,
+ "step": 19500
+ },
+ {
+ "epoch": 3.9619651347068148,
+ "grad_norm": 0.15101556479930878,
+ "learning_rate": 0.0009687175986343367,
+ "loss": 0.4242,
+ "step": 20000
+ },
+ {
+ "epoch": 4.0,
+ "eval_MAE": 0.37888333201408386,
+ "eval_MAPE": 356.0927429199219,
+ "eval_MSE": 0.49698570370674133,
+ "eval_MSPE": 187043.203125,
+ "eval_ND": 0.4567306935787201,
+ "eval_RMSE": 0.7049721479415894,
+ "eval_SMAPE": 61.83990478515625,
+ "eval_runtime": 54.2682,
+ "eval_samples_per_second": 23165.769,
+ "eval_steps_per_second": 11.314,
+ "step": 20192
+ },
+ {
+ "epoch": 4.061014263074485,
+ "grad_norm": 0.1452902853488922,
+ "learning_rate": 0.0009671514148245245,
+ "loss": 0.423,
+ "step": 20500
+ },
+ {
+ "epoch": 4.160063391442155,
+ "grad_norm": 0.08417252451181412,
+ "learning_rate": 0.000965548306076207,
+ "loss": 0.4241,
+ "step": 21000
+ },
+ {
+ "epoch": 4.259112519809825,
+ "grad_norm": 0.13033604621887207,
+ "learning_rate": 0.0009639083991035288,
+ "loss": 0.4226,
+ "step": 21500
+ },
+ {
+ "epoch": 4.358161648177496,
+ "grad_norm": 0.15337254106998444,
+ "learning_rate": 0.0009622318235292677,
+ "loss": 0.4222,
+ "step": 22000
+ },
+ {
+ "epoch": 4.457210776545167,
+ "grad_norm": 0.12474048137664795,
+ "learning_rate": 0.0009605187118745895,
+ "loss": 0.4227,
+ "step": 22500
+ },
+ {
+ "epoch": 4.556259904912837,
+ "grad_norm": 0.11578098684549332,
+ "learning_rate": 0.0009587691995485724,
+ "loss": 0.4204,
+ "step": 23000
+ },
+ {
+ "epoch": 4.655309033280507,
+ "grad_norm": 0.08201416581869125,
+ "learning_rate": 0.000956983424837504,
+ "loss": 0.4237,
+ "step": 23500
+ },
+ {
+ "epoch": 4.754358161648177,
+ "grad_norm": 0.13240130245685577,
+ "learning_rate": 0.0009551615288939518,
+ "loss": 0.4224,
+ "step": 24000
+ },
+ {
+ "epoch": 4.853407290015848,
+ "grad_norm": 0.11744749546051025,
+ "learning_rate": 0.0009533036557256045,
+ "loss": 0.4219,
+ "step": 24500
+ },
+ {
+ "epoch": 4.952456418383518,
+ "grad_norm": 0.1341642141342163,
+ "learning_rate": 0.0009514099521838906,
+ "loss": 0.4208,
+ "step": 25000
+ },
+ {
+ "epoch": 5.0,
+ "eval_MAE": 0.3666294813156128,
+ "eval_MAPE": 356.06121826171875,
+ "eval_MSE": 0.49152618646621704,
+ "eval_MSPE": 207269.796875,
+ "eval_ND": 0.44195911288261414,
+ "eval_RMSE": 0.7010892629623413,
+ "eval_SMAPE": 60.314064025878906,
+ "eval_runtime": 54.6859,
+ "eval_samples_per_second": 22988.83,
+ "eval_steps_per_second": 11.228,
+ "step": 25240
+ },
+ {
+ "epoch": 5.051505546751189,
+ "grad_norm": 0.07947442680597305,
+ "learning_rate": 0.00094948056795237,
+ "loss": 0.4194,
+ "step": 25500
+ },
+ {
+ "epoch": 5.150554675118859,
+ "grad_norm": 0.13597752153873444,
+ "learning_rate": 0.000947515655534903,
+ "loss": 0.4208,
+ "step": 26000
+ },
+ {
+ "epoch": 5.249603803486529,
+ "grad_norm": 0.07506517320871353,
+ "learning_rate": 0.0009455153702435957,
+ "loss": 0.4207,
+ "step": 26500
+ },
+ {
+ "epoch": 5.3486529318541995,
+ "grad_norm": 0.204156756401062,
+ "learning_rate": 0.0009434798701865242,
+ "loss": 0.421,
+ "step": 27000
+ },
+ {
+ "epoch": 5.44770206022187,
+ "grad_norm": 0.10013840347528458,
+ "learning_rate": 0.000941409316255237,
+ "loss": 0.4206,
+ "step": 27500
+ },
+ {
+ "epoch": 5.546751188589541,
+ "grad_norm": 0.11126238107681274,
+ "learning_rate": 0.0009393038721120373,
+ "loss": 0.4209,
+ "step": 28000
+ },
+ {
+ "epoch": 5.645800316957211,
+ "grad_norm": 0.07826147228479385,
+ "learning_rate": 0.0009371637041770472,
478
+ "loss": 0.4199,
479
+ "step": 28500
480
+ },
481
+ {
482
+ "epoch": 5.744849445324881,
483
+ "grad_norm": 0.1696319282054901,
484
+ "learning_rate": 0.0009349889816150534,
485
+ "loss": 0.4202,
486
+ "step": 29000
487
+ },
488
+ {
489
+ "epoch": 5.8438985736925515,
490
+ "grad_norm": 0.13377857208251953,
491
+ "learning_rate": 0.0009327798763221355,
492
+ "loss": 0.4198,
493
+ "step": 29500
494
+ },
495
+ {
496
+ "epoch": 5.942947702060222,
497
+ "grad_norm": 0.09888964146375656,
498
+ "learning_rate": 0.0009305365629120796,
499
+ "loss": 0.4209,
500
+ "step": 30000
501
+ },
502
+ {
503
+ "epoch": 6.0,
504
+ "eval_MAE": 0.3674909174442291,
505
+ "eval_MAPE": 365.8409118652344,
506
+ "eval_MSE": 0.49068397283554077,
507
+ "eval_MSPE": 221797.953125,
508
+ "eval_ND": 0.442997545003891,
509
+ "eval_RMSE": 0.7004883885383606,
510
+ "eval_SMAPE": 60.21101379394531,
511
+ "eval_runtime": 53.7469,
512
+ "eval_samples_per_second": 23390.462,
513
+ "eval_steps_per_second": 11.424,
514
+ "step": 30288
515
+ },
516
+ {
517
+ "epoch": 6.041996830427892,
518
+ "grad_norm": 0.10051033645868301,
519
+ "learning_rate": 0.0009282592187025753,
520
+ "loss": 0.4202,
521
+ "step": 30500
522
+ },
523
+ {
524
+ "epoch": 6.141045958795562,
525
+ "grad_norm": 0.1751776486635208,
526
+ "learning_rate": 0.0009259480237012013,
527
+ "loss": 0.4205,
528
+ "step": 31000
529
+ },
530
+ {
531
+ "epoch": 6.240095087163233,
532
+ "grad_norm": 0.15444861352443695,
533
+ "learning_rate": 0.0009236031605911957,
534
+ "loss": 0.4184,
535
+ "step": 31500
536
+ },
537
+ {
538
+ "epoch": 6.3391442155309035,
539
+ "grad_norm": 0.09780099242925644,
540
+ "learning_rate": 0.0009212248147170174,
541
+ "loss": 0.42,
542
+ "step": 32000
543
+ },
544
+ {
545
+ "epoch": 6.438193343898574,
546
+ "grad_norm": 0.12165773659944534,
547
+ "learning_rate": 0.0009188131740696953,
548
+ "loss": 0.4191,
549
+ "step": 32500
550
+ },
551
+ {
552
+ "epoch": 6.537242472266244,
553
+ "grad_norm": 0.09260477870702744,
554
+ "learning_rate": 0.0009163684292719692,
555
+ "loss": 0.4193,
556
+ "step": 33000
557
+ },
558
+ {
559
+ "epoch": 6.636291600633914,
560
+ "grad_norm": 0.09463346749544144,
561
+ "learning_rate": 0.0009138907735632225,
562
+ "loss": 0.4184,
563
+ "step": 33500
564
+ },
565
+ {
566
+ "epoch": 6.735340729001585,
567
+ "grad_norm": 0.10615640878677368,
568
+ "learning_rate": 0.0009113804027842078,
569
+ "loss": 0.4179,
570
+ "step": 34000
571
+ },
572
+ {
573
+ "epoch": 6.834389857369255,
574
+ "grad_norm": 0.10410912334918976,
575
+ "learning_rate": 0.0009088375153615673,
576
+ "loss": 0.4189,
577
+ "step": 34500
578
+ },
579
+ {
580
+ "epoch": 6.933438985736926,
581
+ "grad_norm": 0.10201963037252426,
582
+ "learning_rate": 0.0009062623122921485,
583
+ "loss": 0.4187,
584
+ "step": 35000
585
+ },
586
+ {
587
+ "epoch": 7.0,
588
+ "eval_MAE": 0.36573904752731323,
589
+ "eval_MAPE": 352.3959045410156,
590
+ "eval_MSE": 0.48773863911628723,
591
+ "eval_MSPE": 185352.046875,
592
+ "eval_ND": 0.4408857226371765,
593
+ "eval_RMSE": 0.6983828544616699,
594
+ "eval_SMAPE": 59.932762145996094,
595
+ "eval_runtime": 54.521,
596
+ "eval_samples_per_second": 23058.35,
597
+ "eval_steps_per_second": 11.262,
598
+ "step": 35336
599
+ },
600
+ {
601
+ "epoch": 7.032488114104596,
602
+ "grad_norm": 0.11093030869960785,
603
+ "learning_rate": 0.000903654997127117,
604
+ "loss": 0.4191,
605
+ "step": 35500
606
+ },
607
+ {
608
+ "epoch": 7.131537242472266,
609
+ "grad_norm": 0.1134202629327774,
610
+ "learning_rate": 0.0009010157759558673,
611
+ "loss": 0.4186,
612
+ "step": 36000
613
+ },
614
+ {
615
+ "epoch": 7.230586370839936,
616
+ "grad_norm": 0.06474833935499191,
617
+ "learning_rate": 0.0008983448573897322,
618
+ "loss": 0.4191,
619
+ "step": 36500
620
+ },
621
+ {
622
+ "epoch": 7.329635499207607,
623
+ "grad_norm": 0.11325617134571075,
624
+ "learning_rate": 0.0008956424525454949,
625
+ "loss": 0.4164,
626
+ "step": 37000
627
+ },
628
+ {
629
+ "epoch": 7.428684627575278,
630
+ "grad_norm": 0.07661303877830505,
631
+ "learning_rate": 0.0008929087750287004,
632
+ "loss": 0.4179,
633
+ "step": 37500
634
+ },
635
+ {
636
+ "epoch": 7.527733755942948,
637
+ "grad_norm": 0.07534985989332199,
638
+ "learning_rate": 0.0008901440409167727,
639
+ "loss": 0.4191,
640
+ "step": 38000
641
+ },
642
+ {
643
+ "epoch": 7.626782884310618,
644
+ "grad_norm": 0.09289834648370743,
645
+ "learning_rate": 0.0008873484687419344,
646
+ "loss": 0.4177,
647
+ "step": 38500
648
+ },
649
+ {
650
+ "epoch": 7.725832012678288,
651
+ "grad_norm": 0.12024533003568649,
652
+ "learning_rate": 0.0008845222794739341,
653
+ "loss": 0.417,
654
+ "step": 39000
655
+ },
656
+ {
657
+ "epoch": 7.824881141045958,
658
+ "grad_norm": 0.0927654430270195,
659
+ "learning_rate": 0.00088166569650258,
660
+ "loss": 0.4181,
661
+ "step": 39500
662
+ },
663
+ {
664
+ "epoch": 7.9239302694136295,
665
+ "grad_norm": 0.14443668723106384,
666
+ "learning_rate": 0.0008787789456200823,
667
+ "loss": 0.4175,
668
+ "step": 40000
669
+ },
670
+ {
671
+ "epoch": 8.0,
672
+ "eval_MAE": 0.36536094546318054,
673
+ "eval_MAPE": 354.2453918457031,
674
+ "eval_MSE": 0.48749861121177673,
675
+ "eval_MSPE": 195818.65625,
676
+ "eval_ND": 0.4404299259185791,
677
+ "eval_RMSE": 0.6982110142707825,
678
+ "eval_SMAPE": 59.863468170166016,
679
+ "eval_runtime": 54.3633,
680
+ "eval_samples_per_second": 23125.25,
681
+ "eval_steps_per_second": 11.294,
682
+ "step": 40384
683
+ },
684
+ {
685
+ "epoch": 8.022979397781299,
686
+ "grad_norm": 0.0843660980463028,
687
+ "learning_rate": 0.0008758622550032065,
688
+ "loss": 0.418,
689
+ "step": 40500
690
+ },
691
+ {
692
+ "epoch": 8.12202852614897,
693
+ "grad_norm": 0.14738310873508453,
694
+ "learning_rate": 0.0008729158551952377,
695
+ "loss": 0.4173,
696
+ "step": 41000
697
+ },
698
+ {
699
+ "epoch": 8.221077654516641,
700
+ "grad_norm": 0.07845776528120041,
701
+ "learning_rate": 0.0008699399790877566,
702
+ "loss": 0.4176,
703
+ "step": 41500
704
+ },
705
+ {
706
+ "epoch": 8.32012678288431,
707
+ "grad_norm": 0.1038828119635582,
708
+ "learning_rate": 0.0008669348619022335,
709
+ "loss": 0.4175,
710
+ "step": 42000
711
+ },
712
+ {
713
+ "epoch": 8.419175911251982,
714
+ "grad_norm": 0.14202700555324554,
715
+ "learning_rate": 0.000863900741171433,
716
+ "loss": 0.417,
717
+ "step": 42500
718
+ },
719
+ {
720
+ "epoch": 8.51822503961965,
721
+ "grad_norm": 0.1151699423789978,
722
+ "learning_rate": 0.0008608378567206405,
723
+ "loss": 0.4181,
724
+ "step": 43000
725
+ },
726
+ {
727
+ "epoch": 8.617274167987322,
728
+ "grad_norm": 0.19360342621803284,
729
+ "learning_rate": 0.0008577464506487054,
730
+ "loss": 0.4153,
731
+ "step": 43500
732
+ },
733
+ {
734
+ "epoch": 8.716323296354991,
735
+ "grad_norm": 0.11765541136264801,
736
+ "learning_rate": 0.0008546267673089049,
737
+ "loss": 0.4159,
738
+ "step": 44000
739
+ },
740
+ {
741
+ "epoch": 8.815372424722662,
742
+ "grad_norm": 0.15739209949970245,
743
+ "learning_rate": 0.0008514790532896294,
744
+ "loss": 0.4162,
745
+ "step": 44500
746
+ },
747
+ {
748
+ "epoch": 8.914421553090333,
749
+ "grad_norm": 0.12929894030094147,
750
+ "learning_rate": 0.0008483035573948916,
751
+ "loss": 0.4161,
752
+ "step": 45000
753
+ },
754
+ {
755
+ "epoch": 9.0,
756
+ "eval_MAE": 0.3685202896595001,
757
+ "eval_MAPE": 360.7921447753906,
758
+ "eval_MSE": 0.48982617259025574,
759
+ "eval_MSPE": 237005.59375,
760
+ "eval_ND": 0.44423842430114746,
761
+ "eval_RMSE": 0.6998758316040039,
762
+ "eval_SMAPE": 60.21327590942383,
763
+ "eval_runtime": 55.2004,
764
+ "eval_samples_per_second": 22774.547,
765
+ "eval_steps_per_second": 11.123,
766
+ "step": 45432
767
+ },
768
+ {
769
+ "epoch": 9.013470681458003,
770
+ "grad_norm": 0.08673319220542908,
771
+ "learning_rate": 0.0008451005306246607,
772
+ "loss": 0.4164,
773
+ "step": 45500
774
+ },
775
+ {
776
+ "epoch": 9.112519809825674,
777
+ "grad_norm": 0.08345810323953629,
778
+ "learning_rate": 0.000841870226155022,
779
+ "loss": 0.4154,
780
+ "step": 46000
781
+ },
782
+ {
783
+ "epoch": 9.211568938193343,
784
+ "grad_norm": 0.12091132998466492,
785
+ "learning_rate": 0.0008386128993181656,
786
+ "loss": 0.4162,
787
+ "step": 46500
788
+ },
789
+ {
790
+ "epoch": 9.310618066561014,
791
+ "grad_norm": 0.09233805537223816,
792
+ "learning_rate": 0.0008353288075822044,
793
+ "loss": 0.417,
794
+ "step": 47000
795
+ },
796
+ {
797
+ "epoch": 9.409667194928685,
798
+ "grad_norm": 0.14294278621673584,
799
+ "learning_rate": 0.0008320182105308227,
800
+ "loss": 0.4164,
801
+ "step": 47500
802
+ },
803
+ {
804
+ "epoch": 9.508716323296355,
805
+ "grad_norm": 0.08588852733373642,
806
+ "learning_rate": 0.0008286813698427583,
807
+ "loss": 0.4151,
808
+ "step": 48000
809
+ },
810
+ {
811
+ "epoch": 9.607765451664026,
812
+ "grad_norm": 0.09648016840219498,
813
+ "learning_rate": 0.0008253185492711182,
814
+ "loss": 0.4158,
815
+ "step": 48500
816
+ },
817
+ {
818
+ "epoch": 9.706814580031695,
819
+ "grad_norm": 0.10604752600193024,
820
+ "learning_rate": 0.0008219300146225315,
821
+ "loss": 0.416,
822
+ "step": 49000
823
+ },
824
+ {
825
+ "epoch": 9.805863708399366,
826
+ "grad_norm": 0.07823313027620316,
827
+ "learning_rate": 0.0008185160337361388,
828
+ "loss": 0.414,
829
+ "step": 49500
830
+ },
831
+ {
832
+ "epoch": 9.904912836767036,
833
+ "grad_norm": 0.09862257540225983,
834
+ "learning_rate": 0.000815076876462422,
835
+ "loss": 0.4153,
836
+ "step": 50000
837
+ },
838
+ {
839
+ "epoch": 10.0,
840
+ "eval_MAE": 0.35969898104667664,
841
+ "eval_MAPE": 347.6515808105469,
842
+ "eval_MSE": 0.4838204085826874,
843
+ "eval_MSPE": 197941.25,
844
+ "eval_ND": 0.4336046278476715,
845
+ "eval_RMSE": 0.695572018623352,
846
+ "eval_SMAPE": 59.28236389160156,
847
+ "eval_runtime": 55.2356,
848
+ "eval_samples_per_second": 22760.038,
849
+ "eval_steps_per_second": 11.116,
850
+ "step": 50480
851
+ },
852
+ {
853
+ "epoch": 10.003961965134707,
854
+ "grad_norm": 0.09266933053731918,
855
+ "learning_rate": 0.0008116128146418738,
856
+ "loss": 0.4163,
857
+ "step": 50500
858
+ },
859
+ {
860
+ "epoch": 10.103011093502378,
861
+ "grad_norm": 0.1360657662153244,
862
+ "learning_rate": 0.0008081241220835112,
863
+ "loss": 0.4138,
864
+ "step": 51000
865
+ },
866
+ {
867
+ "epoch": 10.202060221870047,
868
+ "grad_norm": 0.12415830045938492,
869
+ "learning_rate": 0.0008046110745432329,
870
+ "loss": 0.4154,
871
+ "step": 51500
872
+ },
873
+ {
874
+ "epoch": 10.301109350237718,
875
+ "grad_norm": 0.12350622564554214,
876
+ "learning_rate": 0.0008010739497020226,
877
+ "loss": 0.4152,
878
+ "step": 52000
879
+ },
880
+ {
881
+ "epoch": 10.400158478605388,
882
+ "grad_norm": 0.09407506883144379,
883
+ "learning_rate": 0.0007975130271440001,
884
+ "loss": 0.4142,
885
+ "step": 52500
886
+ },
887
+ {
888
+ "epoch": 10.499207606973059,
889
+ "grad_norm": 0.07152153551578522,
890
+ "learning_rate": 0.000793928588334323,
891
+ "loss": 0.4152,
892
+ "step": 53000
893
+ },
894
+ {
895
+ "epoch": 10.59825673534073,
896
+ "grad_norm": 0.11555545032024384,
897
+ "learning_rate": 0.0007903209165969381,
898
+ "loss": 0.4149,
899
+ "step": 53500
900
+ },
901
+ {
902
+ "epoch": 10.697305863708399,
903
+ "grad_norm": 0.1005263477563858,
904
+ "learning_rate": 0.0007866902970921869,
905
+ "loss": 0.4151,
906
+ "step": 54000
907
+ },
908
+ {
909
+ "epoch": 10.79635499207607,
910
+ "grad_norm": 0.151838481426239,
911
+ "learning_rate": 0.0007830370167942662,
912
+ "loss": 0.4157,
913
+ "step": 54500
914
+ },
915
+ {
916
+ "epoch": 10.89540412044374,
917
+ "grad_norm": 0.12097581475973129,
918
+ "learning_rate": 0.0007793613644685442,
919
+ "loss": 0.4142,
920
+ "step": 55000
921
+ },
922
+ {
923
+ "epoch": 10.99445324881141,
924
+ "grad_norm": 0.07319813221693039,
925
+ "learning_rate": 0.0007756636306487361,
926
+ "loss": 0.4145,
927
+ "step": 55500
928
+ },
929
+ {
930
+ "epoch": 11.0,
931
+ "eval_MAE": 0.36152151226997375,
932
+ "eval_MAPE": 355.8778991699219,
933
+ "eval_MSE": 0.4874439239501953,
934
+ "eval_MSPE": 242275.75,
935
+ "eval_ND": 0.43580162525177,
936
+ "eval_RMSE": 0.698171854019165,
937
+ "eval_SMAPE": 59.33519744873047,
938
+ "eval_runtime": 54.9921,
939
+ "eval_samples_per_second": 22860.849,
940
+ "eval_steps_per_second": 11.165,
941
+ "step": 55528
942
+ },
943
+ {
944
+ "epoch": 11.09350237717908,
945
+ "grad_norm": 0.08475903421640396,
946
+ "learning_rate": 0.0007719441076139392,
947
+ "loss": 0.4144,
948
+ "step": 56000
949
+ },
950
+ {
951
+ "epoch": 11.192551505546751,
952
+ "grad_norm": 0.09286442399024963,
953
+ "learning_rate": 0.000768203089365531,
954
+ "loss": 0.414,
955
+ "step": 56500
956
+ },
957
+ {
958
+ "epoch": 11.291600633914422,
959
+ "grad_norm": 0.07604731619358063,
960
+ "learning_rate": 0.0007644408716039295,
961
+ "loss": 0.4132,
962
+ "step": 57000
963
+ },
964
+ {
965
+ "epoch": 11.390649762282091,
966
+ "grad_norm": 0.10099564492702484,
967
+ "learning_rate": 0.0007606577517052212,
968
+ "loss": 0.4128,
969
+ "step": 57500
970
+ },
971
+ {
972
+ "epoch": 11.489698890649763,
973
+ "grad_norm": 0.0886906161904335,
974
+ "learning_rate": 0.0007568540286976551,
975
+ "loss": 0.4144,
976
+ "step": 58000
977
+ },
978
+ {
979
+ "epoch": 11.588748019017432,
980
+ "grad_norm": 0.09952269494533539,
981
+ "learning_rate": 0.0007530300032380071,
982
+ "loss": 0.4153,
983
+ "step": 58500
984
+ },
985
+ {
986
+ "epoch": 11.687797147385103,
987
+ "grad_norm": 0.09850025922060013,
988
+ "learning_rate": 0.0007491859775878146,
989
+ "loss": 0.414,
990
+ "step": 59000
991
+ },
992
+ {
993
+ "epoch": 11.786846275752774,
994
+ "grad_norm": 0.16440285742282867,
995
+ "learning_rate": 0.0007453222555894856,
996
+ "loss": 0.4135,
997
+ "step": 59500
998
+ },
999
+ {
1000
+ "epoch": 11.885895404120443,
1001
+ "grad_norm": 0.09019125998020172,
1002
+ "learning_rate": 0.000741439142642282,
1003
+ "loss": 0.4141,
1004
+ "step": 60000
1005
+ },
1006
+ {
1007
+ "epoch": 11.984944532488115,
1008
+ "grad_norm": 0.11564897000789642,
1009
+ "learning_rate": 0.0007375369456781793,
1010
+ "loss": 0.4135,
1011
+ "step": 60500
1012
+ },
1013
+ {
1014
+ "epoch": 12.0,
1015
+ "eval_MAE": 0.3628363013267517,
1016
+ "eval_MAPE": 352.1346740722656,
1017
+ "eval_MSE": 0.48805665969848633,
1018
+ "eval_MSPE": 232840.859375,
1019
+ "eval_ND": 0.43738657236099243,
1020
+ "eval_RMSE": 0.6986105442047119,
1021
+ "eval_SMAPE": 59.68527603149414,
1022
+ "eval_runtime": 54.4659,
1023
+ "eval_samples_per_second": 23081.687,
1024
+ "eval_steps_per_second": 11.273,
1025
+ "step": 60576
1026
+ },
1027
+ {
1028
+ "epoch": 12.083993660855784,
1029
+ "grad_norm": 0.14516101777553558,
1030
+ "learning_rate": 0.0007336159731376071,
1031
+ "loss": 0.4132,
1032
+ "step": 61000
1033
+ },
1034
+ {
1035
+ "epoch": 12.183042789223455,
1036
+ "grad_norm": 0.10409526526927948,
1037
+ "learning_rate": 0.0007296765349450678,
1038
+ "loss": 0.4143,
1039
+ "step": 61500
1040
+ },
1041
+ {
1042
+ "epoch": 12.282091917591124,
1043
+ "grad_norm": 0.13454560935497284,
1044
+ "learning_rate": 0.0007257189424846407,
1045
+ "loss": 0.413,
1046
+ "step": 62000
1047
+ },
1048
+ {
1049
+ "epoch": 12.381141045958795,
1050
+ "grad_norm": 0.12388956546783447,
1051
+ "learning_rate": 0.0007217435085753679,
1052
+ "loss": 0.4144,
1053
+ "step": 62500
1054
+ },
1055
+ {
1056
+ "epoch": 12.480190174326466,
1057
+ "grad_norm": 0.12922845780849457,
1058
+ "learning_rate": 0.0007177505474465294,
1059
+ "loss": 0.412,
1060
+ "step": 63000
1061
+ },
1062
+ {
1063
+ "epoch": 12.579239302694136,
1064
+ "grad_norm": 0.11674097180366516,
1065
+ "learning_rate": 0.0007137403747128044,
1066
+ "loss": 0.4128,
1067
+ "step": 63500
1068
+ },
1069
+ {
1070
+ "epoch": 12.678288431061807,
1071
+ "grad_norm": 0.12132911384105682,
1072
+ "learning_rate": 0.000709713307349326,
1073
+ "loss": 0.4133,
1074
+ "step": 64000
1075
+ },
1076
+ {
1077
+ "epoch": 12.777337559429476,
1078
+ "grad_norm": 0.0879695862531662,
1079
+ "learning_rate": 0.0007056696636666243,
1080
+ "loss": 0.4134,
1081
+ "step": 64500
1082
+ },
1083
+ {
1084
+ "epoch": 12.876386687797147,
1085
+ "grad_norm": 0.0823468267917633,
1086
+ "learning_rate": 0.0007016097632854684,
1087
+ "loss": 0.4117,
1088
+ "step": 65000
1089
+ },
1090
+ {
1091
+ "epoch": 12.975435816164818,
1092
+ "grad_norm": 0.11856217682361603,
1093
+ "learning_rate": 0.0006975339271116012,
1094
+ "loss": 0.4126,
1095
+ "step": 65500
1096
+ },
1097
+ {
1098
+ "epoch": 13.0,
1099
+ "eval_MAE": 0.3578062355518341,
1100
+ "eval_MAPE": 340.23406982421875,
1101
+ "eval_MSE": 0.4831537902355194,
1102
+ "eval_MSPE": 227388.484375,
1103
+ "eval_ND": 0.43132299184799194,
1104
+ "eval_RMSE": 0.6950926780700684,
1105
+ "eval_SMAPE": 59.14822769165039,
1106
+ "eval_runtime": 54.2424,
1107
+ "eval_samples_per_second": 23176.786,
1108
+ "eval_steps_per_second": 11.32,
1109
+ "step": 65624
1110
+ }
1111
+ ],
1112
+ "logging_steps": 500,
1113
+ "max_steps": 176680,
1114
+ "num_input_tokens_seen": 0,
1115
+ "num_train_epochs": 35,
1116
+ "save_steps": 500,
1117
+ "stateful_callbacks": {
1118
+ "EarlyStoppingCallback": {
1119
+ "args": {
1120
+ "early_stopping_patience": 3,
1121
+ "early_stopping_threshold": 0.0
1122
+ },
1123
+ "attributes": {
1124
+ "early_stopping_patience_counter": 0
1125
+ }
1126
+ },
1127
+ "TrainerControl": {
1128
+ "args": {
1129
+ "should_epoch_stop": false,
1130
+ "should_evaluate": false,
1131
+ "should_log": false,
1132
+ "should_save": true,
1133
+ "should_training_stop": false
1134
+ },
1135
+ "attributes": {}
1136
+ }
1137
+ },
1138
+ "total_flos": 0.0,
1139
+ "train_batch_size": 512,
1140
+ "trial_name": null,
1141
+ "trial_params": null
1142
+ }
OFA/PEMS04_336/checkpoint-65624/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f57783d1c775922679e371b39a13b1f0e1b0afa7e63698d2b8e6f438f59bab39
3
+ size 6584
OFA/Solar_192/checkpoint-14556/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:60bd6475e443ba58da1fe6bff4125da6ebd4daa4f4026be81dc6ffae8a5d6965
3
+ size 261338858
OFA/Solar_192/checkpoint-14556/trainer_state.json ADDED
@@ -0,0 +1,330 @@
1
+ {
2
+ "best_global_step": 14556,
3
+ "best_metric": 0.16441404819488525,
4
+ "best_model_checkpoint": "/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/OFA_layer/haspara/Solar/checkpoint-14556",
5
+ "epoch": 6.0,
6
+ "eval_steps": 500,
7
+ "global_step": 14556,
8
+ "is_hyper_param_search": false,
9
+ "is_local_process_zero": true,
10
+ "is_world_process_zero": true,
11
+ "log_history": [
12
+ {
13
+ "epoch": 0.20610057708161583,
14
+ "grad_norm": 0.13613636791706085,
15
+ "learning_rate": 0.0009999147860244627,
16
+ "loss": 0.4442,
17
+ "step": 500
18
+ },
19
+ {
20
+ "epoch": 0.41220115416323166,
21
+ "grad_norm": 0.09557799249887466,
22
+ "learning_rate": 0.0009996584898593923,
23
+ "loss": 0.1748,
24
+ "step": 1000
25
+ },
26
+ {
27
+ "epoch": 0.6183017312448474,
28
+ "grad_norm": 0.16066046059131622,
29
+ "learning_rate": 0.000999231198873098,
30
+ "loss": 0.1707,
31
+ "step": 1500
32
+ },
33
+ {
34
+ "epoch": 0.8244023083264633,
35
+ "grad_norm": 0.1717931628227234,
36
+ "learning_rate": 0.0009986330592945485,
37
+ "loss": 0.1679,
38
+ "step": 2000
39
+ },
40
+ {
41
+ "epoch": 1.0,
42
+ "eval_MAE": 0.24874718487262726,
43
+ "eval_MAPE": 180.4757843017578,
44
+ "eval_MSE": 0.16690030694007874,
45
+ "eval_MSPE": 9273.771484375,
46
+ "eval_ND": 0.2907600700855255,
47
+ "eval_RMSE": 0.40853434801101685,
48
+ "eval_SMAPE": 36.44480895996094,
49
+ "eval_runtime": 58.8031,
50
+ "eval_samples_per_second": 11800.483,
51
+ "eval_steps_per_second": 5.765,
52
+ "step": 2426
53
+ },
54
+ {
55
+ "epoch": 1.030502885408079,
56
+ "grad_norm": 0.11462152004241943,
57
+ "learning_rate": 0.000997864275821097,
58
+ "loss": 0.166,
59
+ "step": 2500
60
+ },
61
+ {
62
+ "epoch": 1.2366034624896949,
63
+ "grad_norm": 0.05391139164566994,
64
+ "learning_rate": 0.0009969251115484285,
65
+ "loss": 0.164,
66
+ "step": 3000
67
+ },
68
+ {
69
+ "epoch": 1.4427040395713109,
70
+ "grad_norm": 0.08242031186819077,
71
+ "learning_rate": 0.0009958158878805223,
72
+ "loss": 0.1628,
73
+ "step": 3500
74
+ },
75
+ {
76
+ "epoch": 1.6488046166529267,
77
+ "grad_norm": 0.09068718552589417,
78
+ "learning_rate": 0.0009945369844196596,
79
+ "loss": 0.1614,
80
+ "step": 4000
81
+ },
82
+ {
83
+ "epoch": 1.8549051937345424,
84
+ "grad_norm": 0.12416058033704758,
85
+ "learning_rate": 0.000993088838836516,
86
+ "loss": 0.1604,
87
+ "step": 4500
88
+ },
89
+ {
90
+ "epoch": 2.0,
91
+ "eval_MAE": 0.2530505359172821,
92
+ "eval_MAPE": 184.0513153076172,
93
+ "eval_MSE": 0.1713208258152008,
94
+ "eval_MSPE": 8967.3408203125,
95
+ "eval_ND": 0.29579025506973267,
96
+ "eval_RMSE": 0.4139091968536377,
97
+ "eval_SMAPE": 37.25147247314453,
98
+ "eval_runtime": 58.5412,
99
+ "eval_samples_per_second": 11853.276,
100
+ "eval_steps_per_second": 5.791,
101
+ "step": 4852
102
+ },
103
+ {
104
+ "epoch": 2.061005770816158,
105
+ "grad_norm": 0.11636342853307724,
106
+ "learning_rate": 0.000991471946720379,
107
+ "loss": 0.1596,
108
+ "step": 5000
109
+ },
110
+ {
111
+ "epoch": 2.267106347897774,
112
+ "grad_norm": 0.13195322453975677,
113
+ "learning_rate": 0.0009896868614095468,
114
+ "loss": 0.1585,
115
+ "step": 5500
116
+ },
117
+ {
118
+ "epoch": 2.4732069249793898,
119
+ "grad_norm": 0.1224469393491745,
120
+ "learning_rate": 0.0009877341938019622,
121
+ "loss": 0.1582,
122
+ "step": 6000
123
+ },
124
+ {
125
+ "epoch": 2.6793075020610058,
126
+ "grad_norm": 0.12972772121429443,
127
+ "learning_rate": 0.0009856146121461496,
128
+ "loss": 0.1574,
129
+ "step": 6500
130
+ },
131
+ {
132
+ "epoch": 2.8854080791426218,
133
+ "grad_norm": 0.12829646468162537,
134
+ "learning_rate": 0.0009833288418125239,
135
+ "loss": 0.1567,
136
+ "step": 7000
137
+ },
138
+ {
139
+ "epoch": 3.0,
140
+ "eval_MAE": 0.2434338480234146,
141
+ "eval_MAPE": 180.72421264648438,
142
+ "eval_MSE": 0.16737966239452362,
143
+ "eval_MSPE": 9376.3095703125,
144
+ "eval_ND": 0.2845493257045746,
145
+ "eval_RMSE": 0.4091205894947052,
146
+ "eval_SMAPE": 35.947811126708984,
147
+ "eval_runtime": 58.2785,
148
+ "eval_samples_per_second": 11906.705,
149
+ "eval_steps_per_second": 5.817,
150
+ "step": 7278
151
+ },
152
+ {
153
+ "epoch": 3.0915086562242373,
154
+ "grad_norm": 0.07943403720855713,
155
+ "learning_rate": 0.000980877665045153,
156
+ "loss": 0.1559,
157
+ "step": 7500
158
+ },
159
+ {
160
+ "epoch": 3.2976092333058533,
161
+ "grad_norm": 0.06782522052526474,
162
+ "learning_rate": 0.0009782619206940547,
163
+ "loss": 0.1552,
164
+ "step": 8000
165
+ },
166
+ {
167
+ "epoch": 3.503709810387469,
168
+ "grad_norm": 0.12171012163162231,
169
+ "learning_rate": 0.000975482503928123,
170
+ "loss": 0.155,
171
+ "step": 8500
172
+ },
173
+ {
174
+ "epoch": 3.709810387469085,
175
+ "grad_norm": 0.16907528042793274,
176
+ "learning_rate": 0.0009725403659287799,
177
+ "loss": 0.1543,
178
+ "step": 9000
179
+ },
180
+ {
181
+ "epoch": 3.915910964550701,
182
+ "grad_norm": 0.1479276567697525,
183
+ "learning_rate": 0.0009694365135644595,
184
+ "loss": 0.1538,
185
+ "step": 9500
186
+ },
187
+ {
188
+ "epoch": 4.0,
189
+ "eval_MAE": 0.23490256071090698,
190
+ "eval_MAPE": 181.50564575195312,
191
+ "eval_MSE": 0.16538722813129425,
192
+ "eval_MSPE": 9649.654296875,
193
+ "eval_ND": 0.27457714080810547,
194
+ "eval_RMSE": 0.40667828917503357,
195
+ "eval_SMAPE": 34.649436950683594,
196
+ "eval_runtime": 58.6782,
197
+ "eval_samples_per_second": 11825.598,
198
+ "eval_steps_per_second": 5.777,
199
+ "step": 9704
200
+ },
201
+ {
202
+ "epoch": 4.122011541632316,
203
+ "grad_norm": 0.09500300139188766,
204
+ "learning_rate": 0.0009661720090460337,
205
+ "loss": 0.1535,
206
+ "step": 10000
207
+ },
208
+ {
209
+ "epoch": 4.328112118713932,
210
+ "grad_norm": 0.0786258801817894,
211
+ "learning_rate": 0.0009627479695632988,
212
+ "loss": 0.153,
213
+ "step": 10500
214
+ },
215
+ {
216
+ "epoch": 4.534212695795548,
217
+ "grad_norm": 0.07586020231246948,
218
+ "learning_rate": 0.0009591655669026469,
219
+ "loss": 0.1523,
220
+ "step": 11000
221
+ },
222
+ {
223
+ "epoch": 4.740313272877164,
224
+ "grad_norm": 0.14045332372188568,
225
+ "learning_rate": 0.0009554260270460539,
226
+ "loss": 0.1517,
227
+ "step": 11500
228
+ },
229
+ {
230
+ "epoch": 4.9464138499587795,
231
+ "grad_norm": 0.08372505754232407,
232
+ "learning_rate": 0.0009515306297515187,
233
+ "loss": 0.1512,
234
+ "step": 12000
235
+ },
236
+ {
237
+ "epoch": 5.0,
238
+ "eval_MAE": 0.2414267361164093,
239
+ "eval_MAPE": 181.71209716796875,
240
+ "eval_MSE": 0.16475139558315277,
241
+ "eval_MSPE": 9144.029296875,
242
+ "eval_ND": 0.28220322728157043,
243
+ "eval_RMSE": 0.40589579939842224,
244
+ "eval_SMAPE": 35.766761779785156,
245
+ "eval_runtime": 59.913,
246
+ "eval_samples_per_second": 11581.882,
247
+ "eval_steps_per_second": 5.658,
248
+ "step": 12130
249
+ },
250
+ {
251
+ "epoch": 5.152514427040396,
252
+ "grad_norm": 0.12079860270023346,
253
+ "learning_rate": 0.0009474807081151011,
254
+ "loss": 0.1507,
255
+ "step": 12500
256
+ },
257
+ {
258
+ "epoch": 5.3586150041220115,
259
+ "grad_norm": 0.08449820429086685,
260
+ "learning_rate": 0.0009432776481147042,
261
+ "loss": 0.1504,
262
+ "step": 13000
263
+ },
264
+ {
265
+ "epoch": 5.564715581203627,
266
+ "grad_norm": 0.06070152297616005,
267
+ "learning_rate": 0.0009389228881357614,
268
+ "loss": 0.1501,
269
+ "step": 13500
270
+ },
271
+ {
272
+ "epoch": 5.7708161582852435,
273
+ "grad_norm": 0.1490793377161026,
274
+ "learning_rate": 0.0009344179184789862,
275
+ "loss": 0.1493,
276
+ "step": 14000
277
+ },
278
+ {
279
+ "epoch": 5.976916735366859,
280
+ "grad_norm": 0.15414516627788544,
281
+ "learning_rate": 0.0009297642808503576,
282
+ "loss": 0.1494,
283
+ "step": 14500
284
+ },
285
+ {
286
+ "epoch": 6.0,
287
+ "eval_MAE": 0.23061439394950867,
288
+ "eval_MAPE": 180.4387969970703,
289
+ "eval_MSE": 0.16441404819488525,
290
+ "eval_MSPE": 9619.1962890625,
291
+ "eval_ND": 0.269564688205719,
292
+ "eval_RMSE": 0.4054800271987915,
293
+ "eval_SMAPE": 33.97864532470703,
294
+ "eval_runtime": 58.9946,
295
+ "eval_samples_per_second": 11762.182,
296
+ "eval_steps_per_second": 5.746,
297
+ "step": 14556
298
+ }
299
+ ],
300
+ "logging_steps": 500,
301
+ "max_steps": 84910,
302
+ "num_input_tokens_seen": 0,
303
+ "num_train_epochs": 35,
304
+ "save_steps": 500,
305
+ "stateful_callbacks": {
306
+ "EarlyStoppingCallback": {
307
+ "args": {
308
+ "early_stopping_patience": 3,
309
+ "early_stopping_threshold": 0.0
310
+ },
311
+ "attributes": {
312
+ "early_stopping_patience_counter": 0
313
+ }
314
+ },
315
+ "TrainerControl": {
316
+ "args": {
317
+ "should_epoch_stop": false,
318
+ "should_evaluate": false,
319
+ "should_log": false,
320
+ "should_save": true,
321
+ "should_training_stop": false
322
+ },
323
+ "attributes": {}
324
+ }
325
+ },
326
+ "total_flos": 0.0,
327
+ "train_batch_size": 512,
328
+ "trial_name": null,
329
+ "trial_params": null
330
+ }
OFA/Solar_192/checkpoint-14556/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c9af1751dc8b83f8179385ed6c0777b909ae98e57808be77ef242c73b56ec973
3
+ size 6584
OFA/exchange_rate_192/checkpoint-299/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ace47295691bb7ba9ffee694abb9f152a520eb239be59f22bfe08a529a9d34dc
3
+ size 261338858
OFA/exchange_rate_192/checkpoint-299/trainer_state.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "best_global_step": 299,
3
+ "best_metric": 0.25842341780662537,
4
+ "best_model_checkpoint": "/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/OFA_layer/haspara/192/exchange_rate/checkpoint-299",
5
+ "epoch": 1.0,
6
+ "eval_steps": 500,
7
+ "global_step": 299,
8
+ "is_hyper_param_search": false,
9
+ "is_local_process_zero": true,
10
+ "is_world_process_zero": true,
11
+ "log_history": [
12
+ {
13
+ "epoch": 1.0,
14
+ "eval_MAE": 0.3695564270019531,
15
+ "eval_MAPE": 25.642061233520508,
16
+ "eval_MSE": 0.25842341780662537,
17
+ "eval_MSPE": 2.612428903579712,
18
+ "eval_ND": 0.16934385895729065,
19
+ "eval_RMSE": 0.5083536505699158,
20
+ "eval_SMAPE": 21.528039932250977,
21
+ "eval_runtime": 1.45,
22
+ "eval_samples_per_second": 3139.386,
23
+ "eval_steps_per_second": 2.069,
24
+ "step": 299
25
+ }
26
+ ],
27
+ "logging_steps": 500,
28
+ "max_steps": 10465,
29
+ "num_input_tokens_seen": 0,
30
+ "num_train_epochs": 35,
31
+ "save_steps": 500,
32
+ "stateful_callbacks": {
33
+ "EarlyStoppingCallback": {
34
+ "args": {
35
+ "early_stopping_patience": 3,
36
+ "early_stopping_threshold": 0.0
37
+ },
38
+ "attributes": {
39
+ "early_stopping_patience_counter": 0
40
+ }
41
+ },
42
+ "TrainerControl": {
43
+ "args": {
44
+ "should_epoch_stop": false,
45
+ "should_evaluate": false,
46
+ "should_log": false,
47
+ "should_save": true,
48
+ "should_training_stop": false
49
+ },
50
+ "attributes": {}
51
+ }
52
+ },
53
+ "total_flos": 0.0,
54
+ "train_batch_size": 32,
55
+ "trial_name": null,
56
+ "trial_params": null
57
+ }
OFA/exchange_rate_192/checkpoint-299/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:adab5dda675b19d540a6cd1fafe424edba1cec784a550ddf4f810ab1b4070ff1
+ size 6584
OFA/weather_720/checkpoint-368/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d051349812c6ac062d248ad978de78868a2c6d603ebb19c4f43c0c6f40366c50
+ size 295402218
OFA/weather_720/checkpoint-368/trainer_state.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "best_global_step": 368,
+   "best_metric": 0.6688529253005981,
+   "best_model_checkpoint": "/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/OFA_layer/haspara/720/weather/checkpoint-368",
+   "epoch": 1.0,
+   "eval_steps": 500,
+   "global_step": 368,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 1.0,
+       "eval_MAE": 0.454274982213974,
+       "eval_MAPE": 1555.0623779296875,
+       "eval_MSE": 0.6688529253005981,
+       "eval_MSPE": 14232647.0,
+       "eval_ND": 0.8133898377418518,
+       "eval_RMSE": 0.8178343176841736,
+       "eval_SMAPE": 97.3488540649414,
+       "eval_runtime": 11.3069,
+       "eval_samples_per_second": 8452.448,
+       "eval_steps_per_second": 4.157,
+       "step": 368
+     }
+   ],
+   "logging_steps": 500,
+   "max_steps": 12880,
+   "num_input_tokens_seen": 0,
+   "num_train_epochs": 35,
+   "save_steps": 500,
+   "stateful_callbacks": {
+     "EarlyStoppingCallback": {
+       "args": {
+         "early_stopping_patience": 3,
+         "early_stopping_threshold": 0.0
+       },
+       "attributes": {
+         "early_stopping_patience_counter": 0
+       }
+     },
+     "TrainerControl": {
+       "args": {
+         "should_epoch_stop": false,
+         "should_evaluate": false,
+         "should_log": false,
+         "should_save": true,
+         "should_training_stop": false
+       },
+       "attributes": {}
+     }
+   },
+   "total_flos": 0.0,
+   "train_batch_size": 512,
+   "trial_name": null,
+   "trial_params": null
+ }
OFA/weather_720/checkpoint-368/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0d346ca809d2df8f0ecac6e575996ddf769302bed020738e91f3c3cc6371672c
+ size 6584
TimeLLM/ETTm1_512_192_TimeLLM_ETTm1_sl512_pl192_dm32_nh8_df128/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e257ac465464b06bd92e3e56d0b6dac59b9d7e8cea42d2f9916f02edae805960
+ size 714733599
TimeLLM/PEMS07_512_336_TimeLLM_PEMS07_sl512_pl336_dm16_nh8_df32/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8687faa15ce2fbfaccdb59744cc1fc6e0a068bf89df6d2ff4a61482e1c136674
+ size 703990367
TimeLLM/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:468dfc1f9e03632771833a9d83fb8524132277f5752ee38633952285427d1237
+ size 707137631
TimeLLM/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/log.txt ADDED
@@ -0,0 +1,28 @@
+ Namespace(model_id='PEMS08_512_720', model='TimeLLM', seed=2021, data='PEMS08', checkpoints='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/TimeLLM/hasparacheckpoints720/', load_ckp_base='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/TimeLLM/hasparacheckpoints96/', seq_len=512, pred_len=720, d_model=16, n_heads=8, d_ff=32, dropout=0.1, patch_size=16, stride=8, llm_dim=768, num_workers=16, train_epochs=10, batch_size=48, patience=3, learning_rate=0.01, lradj='type1', pct_start=0.2, gpt2_llama2='gpt2', part=0, pretrain=1, freeze=1, test_metrics_path='/home/hk-project-p0022189/tum_yvc3016/junlong/qx/Time-LLM/scripts/test_metrics/720.txt', dual_FT=0, selected_layers=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], load_path='/hkfs/work/workspace/scratch/tum_yvc3016-R1/qx_data/TimeLLM/hasparacheckpoints720/PEMS08_512_720_TimeLLM_PEMS08_sl512_pl720_dm16_nh8_df32/log.txt', content='PEMS08 is a traffic dataset collected by 170 sensors over a period of 62 days. It includes three types of features: flow, average speed, and average occupancy.')
+ Epoch: 1 cost time: 5270.287646770477
+ Epoch: 1 | Train Loss: 0.5284430 Vali Loss: 0.6219723
+ lr = 0.0004000000
+ Epoch: 2 cost time: 5211.907626390457
+ Epoch: 2 | Train Loss: 0.5061403 Vali Loss: 0.6203775
+ Epoch: 3 cost time: 5205.380863666534
+ Epoch: 3 | Train Loss: 0.4967325 Vali Loss: 0.6154433
+ Epoch: 4 cost time: 5146.1251401901245
+ Epoch: 4 | Train Loss: 0.4909644 Vali Loss: 0.6087528
+ Epoch: 5 cost time: 5147.690908670425
+ Epoch: 5 | Train Loss: 0.4880833 Vali Loss: 0.6050516
+ Epoch: 6 cost time: 5146.230623722076
+ Epoch: 6 | Train Loss: 0.4862469 Vali Loss: 0.6055145
+ EarlyStopping counter: 1 out of 3
+ Epoch: 7 cost time: 5129.8446407318115
+ Epoch: 7 | Train Loss: 0.4850045 Vali Loss: 0.6044727
+ Epoch: 8 cost time: 5116.925268650055
+ Epoch: 8 | Train Loss: 0.4858361 Vali Loss: 0.6041182
+ Epoch: 9 cost time: 5114.435469150543
+ Epoch: 9 | Train Loss: 0.4855596 Vali Loss: 0.6045282
+ EarlyStopping counter: 1 out of 3
+ Epoch: 10 cost time: 5103.860981225967
+ Epoch: 10 | Train Loss: 0.4847863 Vali Loss: 0.6048156
+ EarlyStopping counter: 2 out of 3
+ test shape: (1454520, 720, 1) (1454520, 720, 1)
+ PEMS08--MAE: 0.4079, MSE: 0.6099
+ finish
TimeLLM/electricity_512_192_TimeLLM_electricity_sl512_pl192_dm16_nh8_df32/checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:88e322b8eb75cf0165dff18c4c154d7815a328b26845e57c52931c9ed93190cb
+ size 702810143