| 2023-04-10 13:00:16,297 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:00:16,300 Model: "SequenceTagger( | |
| (embeddings): TransformerWordEmbeddings( | |
| (model): RobertaModel( | |
| (embeddings): RobertaEmbeddings( | |
| (word_embeddings): Embedding(50263, 768) | |
| (position_embeddings): Embedding(514, 768, padding_idx=1) | |
| (token_type_embeddings): Embedding(1, 768) | |
| (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) | |
| (dropout): Dropout(p=0.1, inplace=False) | |
| ) | |
| (encoder): RobertaEncoder( | |
| (layer): ModuleList( | |
| (0-11): 12 x RobertaLayer( | |
| (attention): RobertaAttention( | |
| (self): RobertaSelfAttention( | |
| (query): Linear(in_features=768, out_features=768, bias=True) | |
| (key): Linear(in_features=768, out_features=768, bias=True) | |
| (value): Linear(in_features=768, out_features=768, bias=True) | |
| (dropout): Dropout(p=0.1, inplace=False) | |
| ) | |
| (output): RobertaSelfOutput( | |
| (dense): Linear(in_features=768, out_features=768, bias=True) | |
| (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) | |
| (dropout): Dropout(p=0.1, inplace=False) | |
| ) | |
| ) | |
| (intermediate): RobertaIntermediate( | |
| (dense): Linear(in_features=768, out_features=3072, bias=True) | |
| (intermediate_act_fn): GELUActivation() | |
| ) | |
| (output): RobertaOutput( | |
| (dense): Linear(in_features=3072, out_features=768, bias=True) | |
| (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) | |
| (dropout): Dropout(p=0.1, inplace=False) | |
| ) | |
| ) | |
| ) | |
| ) | |
| (pooler): RobertaPooler( | |
| (dense): Linear(in_features=768, out_features=768, bias=True) | |
| (activation): Tanh() | |
| ) | |
| ) | |
| ) | |
| (locked_dropout): LockedDropout(p=0.5) | |
| (linear): Linear(in_features=768, out_features=17, bias=True) | |
| (loss_function): CrossEntropyLoss() | |
| )" | |
| 2023-04-10 13:00:16,300 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:00:16,302 Corpus: "Corpus: 12554 train + 4549 dev + 4505 test sentences" | |
| 2023-04-10 13:00:16,303 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:00:16,305 Parameters: | |
| 2023-04-10 13:00:16,305 - learning_rate: "0.000005" | |
| 2023-04-10 13:00:16,307 - mini_batch_size: "16" | |
| 2023-04-10 13:00:16,308 - patience: "3" | |
| 2023-04-10 13:00:16,310 - anneal_factor: "0.5" | |
| 2023-04-10 13:00:16,312 - max_epochs: "20" | |
| 2023-04-10 13:00:16,313 - shuffle: "True" | |
| 2023-04-10 13:00:16,315 - train_with_dev: "False" | |
| 2023-04-10 13:00:16,316 - batch_growth_annealing: "False" | |
| 2023-04-10 13:00:16,317 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:00:16,319 Model training base path: "CREBMSP_results" | |
| 2023-04-10 13:00:16,320 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:00:16,322 Device: cuda | |
| 2023-04-10 13:00:16,323 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:00:16,329 Embeddings storage mode: none | |
| 2023-04-10 13:00:16,329 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:01:05,878 epoch 1 - iter 78/785 - loss 2.72638388 - time (sec): 49.55 - samples/sec: 640.40 - lr: 0.000000 | |
| 2023-04-10 13:01:51,226 epoch 1 - iter 156/785 - loss 2.67369637 - time (sec): 94.89 - samples/sec: 661.65 - lr: 0.000000 | |
| 2023-04-10 13:02:36,982 epoch 1 - iter 234/785 - loss 2.59907317 - time (sec): 140.65 - samples/sec: 673.76 - lr: 0.000001 | |
| 2023-04-10 13:03:22,714 epoch 1 - iter 312/785 - loss 2.54109267 - time (sec): 186.38 - samples/sec: 607.84 - lr: 0.000001 | |
| 2023-04-10 13:04:08,546 epoch 1 - iter 390/785 - loss 2.44853293 - time (sec): 232.21 - samples/sec: 553.27 - lr: 0.000001 | |
| 2023-04-10 13:04:54,366 epoch 1 - iter 468/785 - loss 2.33490476 - time (sec): 278.03 - samples/sec: 516.47 - lr: 0.000001 | |
| 2023-04-10 13:05:39,909 epoch 1 - iter 546/785 - loss 2.23039567 - time (sec): 323.58 - samples/sec: 493.22 - lr: 0.000002 | |
| 2023-04-10 13:06:25,708 epoch 1 - iter 624/785 - loss 2.13753946 - time (sec): 369.38 - samples/sec: 474.69 - lr: 0.000002 | |
| 2023-04-10 13:07:11,470 epoch 1 - iter 702/785 - loss 2.05023541 - time (sec): 415.14 - samples/sec: 459.71 - lr: 0.000002 | |
| 2023-04-10 13:07:57,005 epoch 1 - iter 780/785 - loss 1.96306759 - time (sec): 460.67 - samples/sec: 449.14 - lr: 0.000002 | |
| 2023-04-10 13:07:59,739 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:07:59,740 EPOCH 1 done: loss 1.9575 - lr 0.000002 | |
| 2023-04-10 13:08:24,560 Evaluating as a multi-label problem: False | |
| 2023-04-10 13:08:24,631 DEV : loss 0.7945712804794312 - f1-score (micro avg) 0.1859 | |
| 2023-04-10 13:08:24,716 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:09:10,697 epoch 2 - iter 78/785 - loss 0.80433163 - time (sec): 45.98 - samples/sec: 433.02 - lr: 0.000003 | |
| 2023-04-10 13:09:56,456 epoch 2 - iter 156/785 - loss 0.77925513 - time (sec): 91.74 - samples/sec: 446.90 - lr: 0.000003 | |
| 2023-04-10 13:10:42,110 epoch 2 - iter 234/785 - loss 0.75449825 - time (sec): 137.39 - samples/sec: 451.07 - lr: 0.000003 | |
| 2023-04-10 13:11:28,041 epoch 2 - iter 312/785 - loss 0.73931223 - time (sec): 183.32 - samples/sec: 452.65 - lr: 0.000003 | |
| 2023-04-10 13:12:13,619 epoch 2 - iter 390/785 - loss 0.71775506 - time (sec): 228.90 - samples/sec: 453.04 - lr: 0.000004 | |
| 2023-04-10 13:12:59,428 epoch 2 - iter 468/785 - loss 0.70077066 - time (sec): 274.71 - samples/sec: 454.23 - lr: 0.000004 | |
| 2023-04-10 13:13:45,255 epoch 2 - iter 546/785 - loss 0.67654616 - time (sec): 320.54 - samples/sec: 453.06 - lr: 0.000004 | |
| 2023-04-10 13:14:31,147 epoch 2 - iter 624/785 - loss 0.65446315 - time (sec): 366.43 - samples/sec: 452.84 - lr: 0.000004 | |
| 2023-04-10 13:15:17,010 epoch 2 - iter 702/785 - loss 0.63531321 - time (sec): 412.29 - samples/sec: 453.08 - lr: 0.000005 | |
| 2023-04-10 13:16:02,846 epoch 2 - iter 780/785 - loss 0.61564112 - time (sec): 458.13 - samples/sec: 451.65 - lr: 0.000005 | |
| 2023-04-10 13:16:05,578 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:16:05,580 EPOCH 2 done: loss 0.6148 - lr 0.000005 | |
| 2023-04-10 13:16:31,519 Evaluating as a multi-label problem: False | |
| 2023-04-10 13:16:31,599 DEV : loss 0.3734082877635956 - f1-score (micro avg) 0.6995 | |
| 2023-04-10 13:16:31,683 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:17:17,398 epoch 3 - iter 78/785 - loss 0.42211544 - time (sec): 45.71 - samples/sec: 460.06 - lr: 0.000005 | |
| 2023-04-10 13:18:03,476 epoch 3 - iter 156/785 - loss 0.39062543 - time (sec): 91.79 - samples/sec: 450.57 - lr: 0.000005 | |
| 2023-04-10 13:18:49,456 epoch 3 - iter 234/785 - loss 0.38367027 - time (sec): 137.77 - samples/sec: 448.39 - lr: 0.000005 | |
| 2023-04-10 13:19:35,061 epoch 3 - iter 312/785 - loss 0.37454659 - time (sec): 183.38 - samples/sec: 450.10 - lr: 0.000005 | |
| 2023-04-10 13:20:20,826 epoch 3 - iter 390/785 - loss 0.36558572 - time (sec): 229.14 - samples/sec: 448.39 - lr: 0.000005 | |
| 2023-04-10 13:21:06,862 epoch 3 - iter 468/785 - loss 0.36016623 - time (sec): 275.18 - samples/sec: 450.31 - lr: 0.000005 | |
| 2023-04-10 13:21:52,802 epoch 3 - iter 546/785 - loss 0.35301531 - time (sec): 321.12 - samples/sec: 450.40 - lr: 0.000005 | |
| 2023-04-10 13:22:38,631 epoch 3 - iter 624/785 - loss 0.34785227 - time (sec): 366.95 - samples/sec: 451.31 - lr: 0.000005 | |
| 2023-04-10 13:23:24,542 epoch 3 - iter 702/785 - loss 0.34289183 - time (sec): 412.86 - samples/sec: 450.14 - lr: 0.000005 | |
| 2023-04-10 13:24:10,698 epoch 3 - iter 780/785 - loss 0.33617042 - time (sec): 459.01 - samples/sec: 450.47 - lr: 0.000005 | |
| 2023-04-10 13:24:13,424 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:24:13,425 EPOCH 3 done: loss 0.3359 - lr 0.000005 | |
| 2023-04-10 13:24:39,219 Evaluating as a multi-label problem: False | |
| 2023-04-10 13:24:39,294 DEV : loss 0.2478274405002594 - f1-score (micro avg) 0.7669 | |
| 2023-04-10 13:24:39,378 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:25:25,057 epoch 4 - iter 78/785 - loss 0.24856806 - time (sec): 45.68 - samples/sec: 460.40 - lr: 0.000005 | |
| 2023-04-10 13:26:11,029 epoch 4 - iter 156/785 - loss 0.24520289 - time (sec): 91.65 - samples/sec: 457.73 - lr: 0.000005 | |
| 2023-04-10 13:26:56,833 epoch 4 - iter 234/785 - loss 0.24918267 - time (sec): 137.45 - samples/sec: 446.42 - lr: 0.000005 | |
| 2023-04-10 13:27:42,768 epoch 4 - iter 312/785 - loss 0.24994835 - time (sec): 183.39 - samples/sec: 445.87 - lr: 0.000005 | |
| 2023-04-10 13:28:28,729 epoch 4 - iter 390/785 - loss 0.24670791 - time (sec): 229.35 - samples/sec: 443.71 - lr: 0.000005 | |
| 2023-04-10 13:29:14,517 epoch 4 - iter 468/785 - loss 0.24363947 - time (sec): 275.14 - samples/sec: 447.73 - lr: 0.000005 | |
| 2023-04-10 13:30:00,442 epoch 4 - iter 546/785 - loss 0.24232958 - time (sec): 321.06 - samples/sec: 446.61 - lr: 0.000005 | |
| 2023-04-10 13:30:46,468 epoch 4 - iter 624/785 - loss 0.23891458 - time (sec): 367.09 - samples/sec: 447.11 - lr: 0.000005 | |
| 2023-04-10 13:31:32,246 epoch 4 - iter 702/785 - loss 0.23581434 - time (sec): 412.87 - samples/sec: 450.67 - lr: 0.000004 | |
| 2023-04-10 13:32:18,461 epoch 4 - iter 780/785 - loss 0.23410588 - time (sec): 459.08 - samples/sec: 450.67 - lr: 0.000004 | |
| 2023-04-10 13:32:21,126 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:32:21,128 EPOCH 4 done: loss 0.2340 - lr 0.000004 | |
| 2023-04-10 13:32:46,956 Evaluating as a multi-label problem: False | |
| 2023-04-10 13:32:47,034 DEV : loss 0.21353298425674438 - f1-score (micro avg) 0.7899 | |
| 2023-04-10 13:32:47,119 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:33:32,954 epoch 5 - iter 78/785 - loss 0.18971301 - time (sec): 45.83 - samples/sec: 455.71 - lr: 0.000004 | |
| 2023-04-10 13:34:18,919 epoch 5 - iter 156/785 - loss 0.19396453 - time (sec): 91.80 - samples/sec: 453.89 - lr: 0.000004 | |
| 2023-04-10 13:35:04,859 epoch 5 - iter 234/785 - loss 0.19108296 - time (sec): 137.74 - samples/sec: 453.20 - lr: 0.000004 | |
| 2023-04-10 13:35:50,732 epoch 5 - iter 312/785 - loss 0.18832768 - time (sec): 183.61 - samples/sec: 449.66 - lr: 0.000004 | |
| 2023-04-10 13:36:36,467 epoch 5 - iter 390/785 - loss 0.18825695 - time (sec): 229.35 - samples/sec: 452.64 - lr: 0.000004 | |
| 2023-04-10 13:37:22,590 epoch 5 - iter 468/785 - loss 0.18787454 - time (sec): 275.47 - samples/sec: 451.57 - lr: 0.000004 | |
| 2023-04-10 13:38:08,477 epoch 5 - iter 546/785 - loss 0.18615161 - time (sec): 321.36 - samples/sec: 451.47 - lr: 0.000004 | |
| 2023-04-10 13:38:54,257 epoch 5 - iter 624/785 - loss 0.18594722 - time (sec): 367.14 - samples/sec: 450.59 - lr: 0.000004 | |
| 2023-04-10 13:39:40,394 epoch 5 - iter 702/785 - loss 0.18508805 - time (sec): 413.27 - samples/sec: 450.44 - lr: 0.000004 | |
| 2023-04-10 13:40:26,352 epoch 5 - iter 780/785 - loss 0.18421189 - time (sec): 459.23 - samples/sec: 450.18 - lr: 0.000004 | |
| 2023-04-10 13:40:29,092 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:40:29,093 EPOCH 5 done: loss 0.1843 - lr 0.000004 | |
| 2023-04-10 13:40:54,758 Evaluating as a multi-label problem: False | |
| 2023-04-10 13:40:54,836 DEV : loss 0.19248297810554504 - f1-score (micro avg) 0.8091 | |
| 2023-04-10 13:40:54,920 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:41:40,954 epoch 6 - iter 78/785 - loss 0.16169561 - time (sec): 46.03 - samples/sec: 440.82 - lr: 0.000004 | |
| 2023-04-10 13:42:26,837 epoch 6 - iter 156/785 - loss 0.15868632 - time (sec): 91.92 - samples/sec: 441.25 - lr: 0.000004 | |
| 2023-04-10 13:43:12,870 epoch 6 - iter 234/785 - loss 0.16192991 - time (sec): 137.95 - samples/sec: 443.43 - lr: 0.000004 | |
| 2023-04-10 13:43:58,927 epoch 6 - iter 312/785 - loss 0.15837734 - time (sec): 184.01 - samples/sec: 445.62 - lr: 0.000004 | |
| 2023-04-10 13:44:44,996 epoch 6 - iter 390/785 - loss 0.15549650 - time (sec): 230.07 - samples/sec: 442.88 - lr: 0.000004 | |
| 2023-04-10 13:45:31,130 epoch 6 - iter 468/785 - loss 0.15509965 - time (sec): 276.21 - samples/sec: 440.68 - lr: 0.000004 | |
| 2023-04-10 13:46:17,430 epoch 6 - iter 546/785 - loss 0.15536700 - time (sec): 322.51 - samples/sec: 444.17 - lr: 0.000004 | |
| 2023-04-10 13:47:03,271 epoch 6 - iter 624/785 - loss 0.15596272 - time (sec): 368.35 - samples/sec: 447.46 - lr: 0.000004 | |
| 2023-04-10 13:47:49,333 epoch 6 - iter 702/785 - loss 0.15470882 - time (sec): 414.41 - samples/sec: 446.90 - lr: 0.000004 | |
| 2023-04-10 13:48:35,335 epoch 6 - iter 780/785 - loss 0.15353726 - time (sec): 460.41 - samples/sec: 449.07 - lr: 0.000004 | |
| 2023-04-10 13:48:38,091 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:48:38,093 EPOCH 6 done: loss 0.1537 - lr 0.000004 | |
| 2023-04-10 13:49:03,872 Evaluating as a multi-label problem: False | |
| 2023-04-10 13:49:03,948 DEV : loss 0.19085420668125153 - f1-score (micro avg) 0.8218 | |
| 2023-04-10 13:49:04,033 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:49:49,709 epoch 7 - iter 78/785 - loss 0.13949250 - time (sec): 45.68 - samples/sec: 451.69 - lr: 0.000004 | |
| 2023-04-10 13:50:35,575 epoch 7 - iter 156/785 - loss 0.13991533 - time (sec): 91.54 - samples/sec: 451.74 - lr: 0.000004 | |
| 2023-04-10 13:51:21,688 epoch 7 - iter 234/785 - loss 0.13727018 - time (sec): 137.65 - samples/sec: 446.11 - lr: 0.000004 | |
| 2023-04-10 13:52:07,734 epoch 7 - iter 312/785 - loss 0.13962965 - time (sec): 183.70 - samples/sec: 443.38 - lr: 0.000004 | |
| 2023-04-10 13:52:53,850 epoch 7 - iter 390/785 - loss 0.13871141 - time (sec): 229.82 - samples/sec: 444.60 - lr: 0.000004 | |
| 2023-04-10 13:53:39,863 epoch 7 - iter 468/785 - loss 0.13783456 - time (sec): 275.83 - samples/sec: 446.99 - lr: 0.000004 | |
| 2023-04-10 13:54:25,630 epoch 7 - iter 546/785 - loss 0.13700803 - time (sec): 321.60 - samples/sec: 449.44 - lr: 0.000004 | |
| 2023-04-10 13:55:11,589 epoch 7 - iter 624/785 - loss 0.13522205 - time (sec): 367.55 - samples/sec: 450.51 - lr: 0.000004 | |
| 2023-04-10 13:55:57,520 epoch 7 - iter 702/785 - loss 0.13499711 - time (sec): 413.49 - samples/sec: 449.78 - lr: 0.000004 | |
| 2023-04-10 13:56:43,432 epoch 7 - iter 780/785 - loss 0.13300228 - time (sec): 459.40 - samples/sec: 450.08 - lr: 0.000004 | |
| 2023-04-10 13:56:46,156 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:56:46,157 EPOCH 7 done: loss 0.1327 - lr 0.000004 | |
| 2023-04-10 13:57:11,974 Evaluating as a multi-label problem: False | |
| 2023-04-10 13:57:12,051 DEV : loss 0.188308447599411 - f1-score (micro avg) 0.8331 | |
| 2023-04-10 13:57:12,135 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 13:57:58,126 epoch 8 - iter 78/785 - loss 0.12315730 - time (sec): 45.99 - samples/sec: 450.64 - lr: 0.000004 | |
| 2023-04-10 13:58:44,253 epoch 8 - iter 156/785 - loss 0.11813120 - time (sec): 92.12 - samples/sec: 436.42 - lr: 0.000004 | |
| 2023-04-10 13:59:30,059 epoch 8 - iter 234/785 - loss 0.11978297 - time (sec): 137.92 - samples/sec: 444.20 - lr: 0.000004 | |
| 2023-04-10 14:00:16,019 epoch 8 - iter 312/785 - loss 0.11865398 - time (sec): 183.88 - samples/sec: 448.73 - lr: 0.000004 | |
| 2023-04-10 14:01:01,667 epoch 8 - iter 390/785 - loss 0.11649927 - time (sec): 229.53 - samples/sec: 449.84 - lr: 0.000003 | |
| 2023-04-10 14:01:47,632 epoch 8 - iter 468/785 - loss 0.11678690 - time (sec): 275.50 - samples/sec: 450.66 - lr: 0.000003 | |
| 2023-04-10 14:02:33,760 epoch 8 - iter 546/785 - loss 0.11768582 - time (sec): 321.62 - samples/sec: 451.69 - lr: 0.000003 | |
| 2023-04-10 14:03:19,469 epoch 8 - iter 624/785 - loss 0.11698818 - time (sec): 367.33 - samples/sec: 450.46 - lr: 0.000003 | |
| 2023-04-10 14:04:05,350 epoch 8 - iter 702/785 - loss 0.11655166 - time (sec): 413.21 - samples/sec: 449.45 - lr: 0.000003 | |
| 2023-04-10 14:04:51,185 epoch 8 - iter 780/785 - loss 0.11644185 - time (sec): 459.05 - samples/sec: 450.72 - lr: 0.000003 | |
| 2023-04-10 14:04:53,872 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:04:53,874 EPOCH 8 done: loss 0.1164 - lr 0.000003 | |
| 2023-04-10 14:05:19,663 Evaluating as a multi-label problem: False | |
| 2023-04-10 14:05:19,743 DEV : loss 0.18473857641220093 - f1-score (micro avg) 0.8406 | |
| 2023-04-10 14:05:19,828 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:06:05,453 epoch 9 - iter 78/785 - loss 0.10657376 - time (sec): 45.62 - samples/sec: 440.02 - lr: 0.000003 | |
| 2023-04-10 14:06:51,368 epoch 9 - iter 156/785 - loss 0.10863598 - time (sec): 91.54 - samples/sec: 439.74 - lr: 0.000003 | |
| 2023-04-10 14:07:36,988 epoch 9 - iter 234/785 - loss 0.10633920 - time (sec): 137.16 - samples/sec: 445.35 - lr: 0.000003 | |
| 2023-04-10 14:08:22,898 epoch 9 - iter 312/785 - loss 0.10460097 - time (sec): 183.07 - samples/sec: 446.32 - lr: 0.000003 | |
| 2023-04-10 14:09:08,636 epoch 9 - iter 390/785 - loss 0.10531387 - time (sec): 228.81 - samples/sec: 446.18 - lr: 0.000003 | |
| 2023-04-10 14:09:54,238 epoch 9 - iter 468/785 - loss 0.10648494 - time (sec): 274.41 - samples/sec: 446.35 - lr: 0.000003 | |
| 2023-04-10 14:10:39,806 epoch 9 - iter 546/785 - loss 0.10488251 - time (sec): 319.98 - samples/sec: 448.60 - lr: 0.000003 | |
| 2023-04-10 14:11:25,286 epoch 9 - iter 624/785 - loss 0.10527523 - time (sec): 365.46 - samples/sec: 450.60 - lr: 0.000003 | |
| 2023-04-10 14:12:10,986 epoch 9 - iter 702/785 - loss 0.10473876 - time (sec): 411.16 - samples/sec: 451.31 - lr: 0.000003 | |
| 2023-04-10 14:12:56,700 epoch 9 - iter 780/785 - loss 0.10399221 - time (sec): 456.87 - samples/sec: 452.89 - lr: 0.000003 | |
| 2023-04-10 14:12:59,380 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:12:59,382 EPOCH 9 done: loss 0.1042 - lr 0.000003 | |
| 2023-04-10 14:13:25,061 Evaluating as a multi-label problem: False | |
| 2023-04-10 14:13:25,138 DEV : loss 0.19332602620124817 - f1-score (micro avg) 0.8443 | |
| 2023-04-10 14:13:25,224 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:14:10,842 epoch 10 - iter 78/785 - loss 0.09722209 - time (sec): 45.62 - samples/sec: 457.53 - lr: 0.000003 | |
| 2023-04-10 14:14:56,574 epoch 10 - iter 156/785 - loss 0.09960375 - time (sec): 91.35 - samples/sec: 452.13 - lr: 0.000003 | |
| 2023-04-10 14:15:42,382 epoch 10 - iter 234/785 - loss 0.09791734 - time (sec): 137.16 - samples/sec: 451.64 - lr: 0.000003 | |
| 2023-04-10 14:16:28,359 epoch 10 - iter 312/785 - loss 0.09533145 - time (sec): 183.13 - samples/sec: 455.08 - lr: 0.000003 | |
| 2023-04-10 14:17:13,898 epoch 10 - iter 390/785 - loss 0.09546462 - time (sec): 228.67 - samples/sec: 455.61 - lr: 0.000003 | |
| 2023-04-10 14:17:59,305 epoch 10 - iter 468/785 - loss 0.09469376 - time (sec): 274.08 - samples/sec: 452.76 - lr: 0.000003 | |
| 2023-04-10 14:18:44,944 epoch 10 - iter 546/785 - loss 0.09461890 - time (sec): 319.72 - samples/sec: 453.17 - lr: 0.000003 | |
| 2023-04-10 14:19:30,380 epoch 10 - iter 624/785 - loss 0.09481641 - time (sec): 365.16 - samples/sec: 453.85 - lr: 0.000003 | |
| 2023-04-10 14:20:15,968 epoch 10 - iter 702/785 - loss 0.09459196 - time (sec): 410.74 - samples/sec: 453.32 - lr: 0.000003 | |
| 2023-04-10 14:21:01,721 epoch 10 - iter 780/785 - loss 0.09403540 - time (sec): 456.50 - samples/sec: 452.79 - lr: 0.000003 | |
| 2023-04-10 14:21:04,407 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:21:04,409 EPOCH 10 done: loss 0.0938 - lr 0.000003 | |
| 2023-04-10 14:21:30,389 Evaluating as a multi-label problem: False | |
| 2023-04-10 14:21:30,467 DEV : loss 0.18941430747509003 - f1-score (micro avg) 0.8458 | |
| 2023-04-10 14:21:30,553 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:22:16,229 epoch 11 - iter 78/785 - loss 0.07965436 - time (sec): 45.67 - samples/sec: 469.58 - lr: 0.000003 | |
| 2023-04-10 14:23:02,286 epoch 11 - iter 156/785 - loss 0.08242477 - time (sec): 91.73 - samples/sec: 467.46 - lr: 0.000003 | |
| 2023-04-10 14:23:47,919 epoch 11 - iter 234/785 - loss 0.08405410 - time (sec): 137.36 - samples/sec: 461.18 - lr: 0.000003 | |
| 2023-04-10 14:24:33,686 epoch 11 - iter 312/785 - loss 0.08238391 - time (sec): 183.13 - samples/sec: 455.77 - lr: 0.000003 | |
| 2023-04-10 14:25:19,484 epoch 11 - iter 390/785 - loss 0.08149592 - time (sec): 228.93 - samples/sec: 453.93 - lr: 0.000003 | |
| 2023-04-10 14:26:05,233 epoch 11 - iter 468/785 - loss 0.08168820 - time (sec): 274.68 - samples/sec: 452.09 - lr: 0.000003 | |
| 2023-04-10 14:26:50,901 epoch 11 - iter 546/785 - loss 0.08177046 - time (sec): 320.35 - samples/sec: 452.25 - lr: 0.000003 | |
| 2023-04-10 14:27:36,661 epoch 11 - iter 624/785 - loss 0.08271731 - time (sec): 366.11 - samples/sec: 452.99 - lr: 0.000003 | |
| 2023-04-10 14:28:22,099 epoch 11 - iter 702/785 - loss 0.08254577 - time (sec): 411.54 - samples/sec: 452.31 - lr: 0.000003 | |
| 2023-04-10 14:29:07,671 epoch 11 - iter 780/785 - loss 0.08371043 - time (sec): 457.12 - samples/sec: 453.09 - lr: 0.000003 | |
| 2023-04-10 14:29:10,316 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:29:10,318 EPOCH 11 done: loss 0.0837 - lr 0.000003 | |
| 2023-04-10 14:29:35,090 Evaluating as a multi-label problem: False | |
| 2023-04-10 14:29:35,166 DEV : loss 0.2022610902786255 - f1-score (micro avg) 0.8404 | |
| 2023-04-10 14:29:35,259 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:30:21,283 epoch 12 - iter 78/785 - loss 0.06767359 - time (sec): 46.02 - samples/sec: 454.91 - lr: 0.000002 | |
| 2023-04-10 14:31:07,146 epoch 12 - iter 156/785 - loss 0.07443837 - time (sec): 91.89 - samples/sec: 449.99 - lr: 0.000002 | |
| 2023-04-10 14:31:53,003 epoch 12 - iter 234/785 - loss 0.07629224 - time (sec): 137.74 - samples/sec: 451.86 - lr: 0.000002 | |
| 2023-04-10 14:32:38,934 epoch 12 - iter 312/785 - loss 0.07741157 - time (sec): 183.67 - samples/sec: 452.03 - lr: 0.000002 | |
| 2023-04-10 14:33:24,791 epoch 12 - iter 390/785 - loss 0.07706257 - time (sec): 229.53 - samples/sec: 454.13 - lr: 0.000002 | |
| 2023-04-10 14:34:10,546 epoch 12 - iter 468/785 - loss 0.07581749 - time (sec): 275.29 - samples/sec: 454.58 - lr: 0.000002 | |
| 2023-04-10 14:34:56,353 epoch 12 - iter 546/785 - loss 0.07615371 - time (sec): 321.09 - samples/sec: 453.24 - lr: 0.000002 | |
| 2023-04-10 14:35:41,715 epoch 12 - iter 624/785 - loss 0.07630547 - time (sec): 366.45 - samples/sec: 451.94 - lr: 0.000002 | |
| 2023-04-10 14:36:27,902 epoch 12 - iter 702/785 - loss 0.07703151 - time (sec): 412.64 - samples/sec: 451.11 - lr: 0.000002 | |
| 2023-04-10 14:37:13,513 epoch 12 - iter 780/785 - loss 0.07688972 - time (sec): 458.25 - samples/sec: 451.11 - lr: 0.000002 | |
| 2023-04-10 14:37:16,212 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:37:16,214 EPOCH 12 done: loss 0.0769 - lr 0.000002 | |
| 2023-04-10 14:37:42,267 Evaluating as a multi-label problem: False | |
| 2023-04-10 14:37:42,343 DEV : loss 0.19032613933086395 - f1-score (micro avg) 0.8513 | |
| 2023-04-10 14:37:42,429 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:38:28,309 epoch 13 - iter 78/785 - loss 0.06781882 - time (sec): 45.88 - samples/sec: 437.28 - lr: 0.000002 | |
| 2023-04-10 14:39:13,998 epoch 13 - iter 156/785 - loss 0.06953428 - time (sec): 91.57 - samples/sec: 442.07 - lr: 0.000002 | |
| 2023-04-10 14:39:59,400 epoch 13 - iter 234/785 - loss 0.06968786 - time (sec): 136.97 - samples/sec: 447.35 - lr: 0.000002 | |
| 2023-04-10 14:40:45,242 epoch 13 - iter 312/785 - loss 0.07032229 - time (sec): 182.81 - samples/sec: 449.08 - lr: 0.000002 | |
| 2023-04-10 14:41:30,932 epoch 13 - iter 390/785 - loss 0.07052987 - time (sec): 228.50 - samples/sec: 445.56 - lr: 0.000002 | |
| 2023-04-10 14:42:16,884 epoch 13 - iter 468/785 - loss 0.07176712 - time (sec): 274.45 - samples/sec: 444.54 - lr: 0.000002 | |
| 2023-04-10 14:43:02,911 epoch 13 - iter 546/785 - loss 0.07183614 - time (sec): 320.48 - samples/sec: 446.39 - lr: 0.000002 | |
| 2023-04-10 14:43:48,816 epoch 13 - iter 624/785 - loss 0.07253765 - time (sec): 366.39 - samples/sec: 446.93 - lr: 0.000002 | |
| 2023-04-10 14:44:34,491 epoch 13 - iter 702/785 - loss 0.07213498 - time (sec): 412.06 - samples/sec: 449.17 - lr: 0.000002 | |
| 2023-04-10 14:45:20,007 epoch 13 - iter 780/785 - loss 0.07218568 - time (sec): 457.58 - samples/sec: 451.86 - lr: 0.000002 | |
| 2023-04-10 14:45:22,772 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:45:22,774 EPOCH 13 done: loss 0.0722 - lr 0.000002 | |
| 2023-04-10 14:45:48,608 Evaluating as a multi-label problem: False | |
| 2023-04-10 14:45:48,685 DEV : loss 0.19682374596595764 - f1-score (micro avg) 0.853 | |
| 2023-04-10 14:45:48,772 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:46:34,526 epoch 14 - iter 78/785 - loss 0.05882194 - time (sec): 45.75 - samples/sec: 442.48 - lr: 0.000002 | |
| 2023-04-10 14:47:20,308 epoch 14 - iter 156/785 - loss 0.06553124 - time (sec): 91.53 - samples/sec: 446.65 - lr: 0.000002 | |
| 2023-04-10 14:48:06,130 epoch 14 - iter 234/785 - loss 0.06636154 - time (sec): 137.36 - samples/sec: 445.02 - lr: 0.000002 | |
| 2023-04-10 14:48:51,621 epoch 14 - iter 312/785 - loss 0.06544912 - time (sec): 182.85 - samples/sec: 448.03 - lr: 0.000002 | |
| 2023-04-10 14:49:37,323 epoch 14 - iter 390/785 - loss 0.06512617 - time (sec): 228.55 - samples/sec: 448.79 - lr: 0.000002 | |
| 2023-04-10 14:50:23,228 epoch 14 - iter 468/785 - loss 0.06536846 - time (sec): 274.46 - samples/sec: 448.59 - lr: 0.000002 | |
| 2023-04-10 14:51:08,762 epoch 14 - iter 546/785 - loss 0.06540547 - time (sec): 319.99 - samples/sec: 450.40 - lr: 0.000002 | |
| 2023-04-10 14:51:54,701 epoch 14 - iter 624/785 - loss 0.06641531 - time (sec): 365.93 - samples/sec: 448.39 - lr: 0.000002 | |
| 2023-04-10 14:52:40,613 epoch 14 - iter 702/785 - loss 0.06649606 - time (sec): 411.84 - samples/sec: 449.74 - lr: 0.000002 | |
| 2023-04-10 14:53:26,281 epoch 14 - iter 780/785 - loss 0.06663863 - time (sec): 457.51 - samples/sec: 452.11 - lr: 0.000002 | |
| 2023-04-10 14:53:29,011 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:53:29,013 EPOCH 14 done: loss 0.0665 - lr 0.000002 | |
| 2023-04-10 14:53:54,922 Evaluating as a multi-label problem: False | |
| 2023-04-10 14:53:54,995 DEV : loss 0.19152763485908508 - f1-score (micro avg) 0.8543 | |
| 2023-04-10 14:53:55,084 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 14:54:40,977 epoch 15 - iter 78/785 - loss 0.05893628 - time (sec): 45.89 - samples/sec: 434.64 - lr: 0.000002 | |
| 2023-04-10 14:55:26,530 epoch 15 - iter 156/785 - loss 0.06296731 - time (sec): 91.44 - samples/sec: 438.22 - lr: 0.000002 | |
| 2023-04-10 14:56:11,919 epoch 15 - iter 234/785 - loss 0.06296709 - time (sec): 136.83 - samples/sec: 449.49 - lr: 0.000002 | |
| 2023-04-10 14:56:57,537 epoch 15 - iter 312/785 - loss 0.06042086 - time (sec): 182.45 - samples/sec: 452.57 - lr: 0.000002 | |
| 2023-04-10 14:57:43,221 epoch 15 - iter 390/785 - loss 0.06400154 - time (sec): 228.13 - samples/sec: 453.81 - lr: 0.000002 | |
| 2023-04-10 14:58:29,151 epoch 15 - iter 468/785 - loss 0.06384428 - time (sec): 274.07 - samples/sec: 450.67 - lr: 0.000002 | |
| 2023-04-10 14:59:14,871 epoch 15 - iter 546/785 - loss 0.06211280 - time (sec): 319.79 - samples/sec: 452.05 - lr: 0.000001 | |
| 2023-04-10 15:00:00,212 epoch 15 - iter 624/785 - loss 0.06312445 - time (sec): 365.13 - samples/sec: 453.40 - lr: 0.000001 | |
| 2023-04-10 15:00:46,083 epoch 15 - iter 702/785 - loss 0.06365620 - time (sec): 411.00 - samples/sec: 453.08 - lr: 0.000001 | |
| 2023-04-10 15:01:31,831 epoch 15 - iter 780/785 - loss 0.06357545 - time (sec): 456.74 - samples/sec: 452.90 - lr: 0.000001 | |
| 2023-04-10 15:01:34,610 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 15:01:34,612 EPOCH 15 done: loss 0.0635 - lr 0.000001 | |
| 2023-04-10 15:02:00,571 Evaluating as a multi-label problem: False | |
| 2023-04-10 15:02:00,650 DEV : loss 0.19623318314552307 - f1-score (micro avg) 0.8562 | |
| 2023-04-10 15:02:00,739 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 15:02:46,516 epoch 16 - iter 78/785 - loss 0.05263069 - time (sec): 45.78 - samples/sec: 447.66 - lr: 0.000001 | |
| 2023-04-10 15:03:32,217 epoch 16 - iter 156/785 - loss 0.05540555 - time (sec): 91.48 - samples/sec: 458.04 - lr: 0.000001 | |
| 2023-04-10 15:04:17,802 epoch 16 - iter 234/785 - loss 0.05653095 - time (sec): 137.06 - samples/sec: 454.61 - lr: 0.000001 | |
| 2023-04-10 15:05:03,756 epoch 16 - iter 312/785 - loss 0.05690468 - time (sec): 183.01 - samples/sec: 453.77 - lr: 0.000001 | |
| 2023-04-10 15:05:49,407 epoch 16 - iter 390/785 - loss 0.05848835 - time (sec): 228.67 - samples/sec: 454.04 - lr: 0.000001 | |
| 2023-04-10 15:06:35,060 epoch 16 - iter 468/785 - loss 0.05897047 - time (sec): 274.32 - samples/sec: 453.35 - lr: 0.000001 | |
| 2023-04-10 15:07:20,765 epoch 16 - iter 546/785 - loss 0.05940641 - time (sec): 320.02 - samples/sec: 452.29 - lr: 0.000001 | |
| 2023-04-10 15:08:06,518 epoch 16 - iter 624/785 - loss 0.05878874 - time (sec): 365.78 - samples/sec: 452.31 - lr: 0.000001 | |
| 2023-04-10 15:08:52,406 epoch 16 - iter 702/785 - loss 0.05878710 - time (sec): 411.67 - samples/sec: 452.43 - lr: 0.000001 | |
| 2023-04-10 15:09:38,261 epoch 16 - iter 780/785 - loss 0.05871527 - time (sec): 457.52 - samples/sec: 452.47 - lr: 0.000001 | |
| 2023-04-10 15:09:41,139 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 15:09:41,141 EPOCH 16 done: loss 0.0587 - lr 0.000001 | |
| 2023-04-10 15:10:06,206 Evaluating as a multi-label problem: False | |
| 2023-04-10 15:10:06,282 DEV : loss 0.19955378770828247 - f1-score (micro avg) 0.8578 | |
| 2023-04-10 15:10:06,370 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 15:10:52,361 epoch 17 - iter 78/785 - loss 0.05076330 - time (sec): 45.99 - samples/sec: 462.06 - lr: 0.000001 | |
| 2023-04-10 15:11:38,184 epoch 17 - iter 156/785 - loss 0.05519241 - time (sec): 91.81 - samples/sec: 462.21 - lr: 0.000001 | |
| 2023-04-10 15:12:24,115 epoch 17 - iter 234/785 - loss 0.05342529 - time (sec): 137.74 - samples/sec: 457.55 - lr: 0.000001 | |
| 2023-04-10 15:13:09,882 epoch 17 - iter 312/785 - loss 0.05189467 - time (sec): 183.51 - samples/sec: 455.00 - lr: 0.000001 | |
| 2023-04-10 15:13:55,976 epoch 17 - iter 390/785 - loss 0.05405067 - time (sec): 229.60 - samples/sec: 453.17 - lr: 0.000001 | |
| 2023-04-10 15:14:41,579 epoch 17 - iter 468/785 - loss 0.05398715 - time (sec): 275.21 - samples/sec: 453.21 - lr: 0.000001 | |
| 2023-04-10 15:15:27,308 epoch 17 - iter 546/785 - loss 0.05539713 - time (sec): 320.94 - samples/sec: 454.08 - lr: 0.000001 | |
| 2023-04-10 15:16:13,512 epoch 17 - iter 624/785 - loss 0.05586570 - time (sec): 367.14 - samples/sec: 453.67 - lr: 0.000001 | |
| 2023-04-10 15:16:59,624 epoch 17 - iter 702/785 - loss 0.05576616 - time (sec): 413.25 - samples/sec: 452.94 - lr: 0.000001 | |
| 2023-04-10 15:17:45,460 epoch 17 - iter 780/785 - loss 0.05531521 - time (sec): 459.09 - samples/sec: 450.39 - lr: 0.000001 | |
| 2023-04-10 15:17:48,168 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 15:17:48,170 EPOCH 17 done: loss 0.0553 - lr 0.000001 | |
| 2023-04-10 15:18:14,080 Evaluating as a multi-label problem: False | |
| 2023-04-10 15:18:14,155 DEV : loss 0.20788049697875977 - f1-score (micro avg) 0.8562 | |
| 2023-04-10 15:18:14,243 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 15:18:48,097 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 15:18:48,099 Exiting from training early. | |
| 2023-04-10 15:18:48,100 Saving model ... | |
| 2023-04-10 15:18:48,949 Done. | |
| 2023-04-10 15:18:48,952 ---------------------------------------------------------------------------------------------------- | |
| 2023-04-10 15:18:48,954 Testing using last state of model ... | |
| 2023-04-10 15:19:14,468 Evaluating as a multi-label problem: False | |
| 2023-04-10 15:19:14,541 0.8346 0.868 0.851 0.7477 | |
| 2023-04-10 15:19:14,543 | |
| Results: | |
| - F-score (micro) 0.851 | |
| - F-score (macro) 0.8197 | |
| - Accuracy 0.7477 | |
| By class: | |
|               precision    recall  f1-score   support | |
|         PROC     0.8033    0.8731    0.8368      3364 | |
|         DISO     0.8552    0.8722    0.8636      2472 | |
|         CHEM     0.8973    0.8933    0.8953      1565 | |
|         ANAT     0.7138    0.6551    0.6832       316 | |
|    micro avg     0.8346    0.8680    0.8510      7717 | |
|    macro avg     0.8174    0.8234    0.8197      7717 | |
| weighted avg     0.8353    0.8680    0.8509      7717 | |
| 2023-04-10 15:19:14,544 ---------------------------------------------------------------------------------------------------- | |
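The aggregate rows of the test report can be recomputed from the per-class rows as a sanity check. The sketch below is not part of the training run. It copies the rounded values from the log and assumes the usual definitions: macro F1 as the unweighted mean of the class F1 scores, weighted F1 as the support-weighted mean, and micro F1 as the harmonic mean of micro precision and micro recall. Because the inputs are rounded to four decimals, the results may differ from the logged figures in the last digit.

```python
# Per-class test scores copied from the "By class" report in the log:
# label -> (precision, recall, f1, support)
report = {
    "PROC": (0.8033, 0.8731, 0.8368, 3364),
    "DISO": (0.8552, 0.8722, 0.8636, 2472),
    "CHEM": (0.8973, 0.8933, 0.8953, 1565),
    "ANAT": (0.7138, 0.6551, 0.6832, 316),
}

# Micro-averaged precision and recall, copied from the "micro avg" row.
MICRO_PRECISION = 0.8346
MICRO_RECALL = 0.8680


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Total number of gold entities across all classes.
total_support = sum(support for _, _, _, support in report.values())

# Macro F1: every class counts equally, regardless of its size.
macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

# Weighted F1: each class F1 is weighted by its share of the support.
weighted_f1 = sum(f1 * support for _, _, f1, support in report.values()) / total_support

# Micro F1: computed from the pooled precision and recall.
micro_f1 = f1_score(MICRO_PRECISION, MICRO_RECALL)

print(f"support:     {total_support}")
print(f"micro F1:    {micro_f1:.4f}")
print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
```

The gap between micro F1 and macro F1 comes mostly from ANAT: it has the lowest F1 and the smallest support (316 of 7717), so it pulls the macro average down while barely affecting the micro and weighted averages.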