CTEBMSP_20e_FLERT / training.log
2023-04-10 13:00:16,297 ----------------------------------------------------------------------------------------------------
2023-04-10 13:00:16,300 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): RobertaModel(
      (embeddings): RobertaEmbeddings(
        (word_embeddings): Embedding(50263, 768)
        (position_embeddings): Embedding(514, 768, padding_idx=1)
        (token_type_embeddings): Embedding(1, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): RobertaEncoder(
        (layer): ModuleList(
          (0-11): 12 x RobertaLayer(
            (attention): RobertaAttention(
              (self): RobertaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): RobertaSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): RobertaIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): RobertaOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): RobertaPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-04-10 13:00:16,300 ----------------------------------------------------------------------------------------------------
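The printout above fully determines the model size: a RoBERTa-base encoder plus a 17-label linear head. A quick back-of-the-envelope sketch of the parameter count, using only the shapes shown in the log (pure arithmetic, no model download needed):

```python
# Shapes taken from the architecture printout above.
V, P, H, I, L, T = 50263, 514, 768, 3072, 12, 17  # vocab, positions, hidden, FFN, layers, labels

embeddings = V * H + P * H + 1 * H + 2 * H          # word/pos/type tables + LayerNorm (weight+bias)
per_layer = (
    3 * (H * H + H)         # query/key/value projections
    + (H * H + H) + 2 * H   # attention output dense + LayerNorm
    + (H * I + I)           # intermediate dense
    + (I * H + H) + 2 * H   # output dense + LayerNorm
)
pooler = H * H + H
head = H * T + T            # the (linear) classifier in the SequenceTagger
total = embeddings + L * per_layer + pooler + head
print(f"{total:,}")  # -> 124,657,169 (~125M parameters)
```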
2023-04-10 13:00:16,302 Corpus: "Corpus: 12554 train + 4549 dev + 4505 test sentences"
2023-04-10 13:00:16,303 ----------------------------------------------------------------------------------------------------
2023-04-10 13:00:16,305 Parameters:
2023-04-10 13:00:16,305 - learning_rate: "0.000005"
2023-04-10 13:00:16,307 - mini_batch_size: "16"
2023-04-10 13:00:16,308 - patience: "3"
2023-04-10 13:00:16,310 - anneal_factor: "0.5"
2023-04-10 13:00:16,312 - max_epochs: "20"
2023-04-10 13:00:16,313 - shuffle: "True"
2023-04-10 13:00:16,315 - train_with_dev: "False"
2023-04-10 13:00:16,316 - batch_growth_annealing: "False"
2023-04-10 13:00:16,317 ----------------------------------------------------------------------------------------------------
2023-04-10 13:00:16,319 Model training base path: "CREBMSP_results"
2023-04-10 13:00:16,320 ----------------------------------------------------------------------------------------------------
2023-04-10 13:00:16,322 Device: cuda
2023-04-10 13:00:16,323 ----------------------------------------------------------------------------------------------------
2023-04-10 13:00:16,329 Embeddings storage mode: none
2023-04-10 13:00:16,329 ----------------------------------------------------------------------------------------------------
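For reference, a minimal sketch of how the parameters logged above map onto a Flair training script. The base checkpoint name, corpus path, and column layout are placeholders (not recorded in this log), and the warm-up visible in the lr column below suggests a scheduler with warm-up may actually have been used:

```python
# Hyperparameters exactly as logged above; everything else is an assumption.
params = dict(learning_rate=5e-6, mini_batch_size=16, patience=3,
              anneal_factor=0.5, max_epochs=20, shuffle=True,
              train_with_dev=False)

def train():
    # flair imports deferred so the parameter mapping can be read without GPU deps
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = ColumnCorpus("data/", {0: "text", 1: "ner"})  # assumed CoNLL-style layout
    embeddings = TransformerWordEmbeddings(
        "your-roberta-checkpoint",  # placeholder: log only shows a RoBERTa-base
        fine_tune=True,
        use_context=True,           # FLERT-style document context, per the run name
    )
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
        tag_type="ner",
        use_crf=False,              # printout shows a plain linear head + CrossEntropyLoss
        use_rnn=False,
        reproject_embeddings=False,
    )
    ModelTrainer(tagger, corpus).train("CREBMSP_results", **params)
```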
2023-04-10 13:01:05,878 epoch 1 - iter 78/785 - loss 2.72638388 - time (sec): 49.55 - samples/sec: 640.40 - lr: 0.000000
2023-04-10 13:01:51,226 epoch 1 - iter 156/785 - loss 2.67369637 - time (sec): 94.89 - samples/sec: 661.65 - lr: 0.000000
2023-04-10 13:02:36,982 epoch 1 - iter 234/785 - loss 2.59907317 - time (sec): 140.65 - samples/sec: 673.76 - lr: 0.000001
2023-04-10 13:03:22,714 epoch 1 - iter 312/785 - loss 2.54109267 - time (sec): 186.38 - samples/sec: 607.84 - lr: 0.000001
2023-04-10 13:04:08,546 epoch 1 - iter 390/785 - loss 2.44853293 - time (sec): 232.21 - samples/sec: 553.27 - lr: 0.000001
2023-04-10 13:04:54,366 epoch 1 - iter 468/785 - loss 2.33490476 - time (sec): 278.03 - samples/sec: 516.47 - lr: 0.000001
2023-04-10 13:05:39,909 epoch 1 - iter 546/785 - loss 2.23039567 - time (sec): 323.58 - samples/sec: 493.22 - lr: 0.000002
2023-04-10 13:06:25,708 epoch 1 - iter 624/785 - loss 2.13753946 - time (sec): 369.38 - samples/sec: 474.69 - lr: 0.000002
2023-04-10 13:07:11,470 epoch 1 - iter 702/785 - loss 2.05023541 - time (sec): 415.14 - samples/sec: 459.71 - lr: 0.000002
2023-04-10 13:07:57,005 epoch 1 - iter 780/785 - loss 1.96306759 - time (sec): 460.67 - samples/sec: 449.14 - lr: 0.000002
2023-04-10 13:07:59,739 ----------------------------------------------------------------------------------------------------
2023-04-10 13:07:59,740 EPOCH 1 done: loss 1.9575 - lr 0.000002
2023-04-10 13:08:24,560 Evaluating as a multi-label problem: False
2023-04-10 13:08:24,631 DEV : loss 0.7945712804794312 - f1-score (micro avg) 0.1859
2023-04-10 13:08:24,716 ----------------------------------------------------------------------------------------------------
2023-04-10 13:09:10,697 epoch 2 - iter 78/785 - loss 0.80433163 - time (sec): 45.98 - samples/sec: 433.02 - lr: 0.000003
2023-04-10 13:09:56,456 epoch 2 - iter 156/785 - loss 0.77925513 - time (sec): 91.74 - samples/sec: 446.90 - lr: 0.000003
2023-04-10 13:10:42,110 epoch 2 - iter 234/785 - loss 0.75449825 - time (sec): 137.39 - samples/sec: 451.07 - lr: 0.000003
2023-04-10 13:11:28,041 epoch 2 - iter 312/785 - loss 0.73931223 - time (sec): 183.32 - samples/sec: 452.65 - lr: 0.000003
2023-04-10 13:12:13,619 epoch 2 - iter 390/785 - loss 0.71775506 - time (sec): 228.90 - samples/sec: 453.04 - lr: 0.000004
2023-04-10 13:12:59,428 epoch 2 - iter 468/785 - loss 0.70077066 - time (sec): 274.71 - samples/sec: 454.23 - lr: 0.000004
2023-04-10 13:13:45,255 epoch 2 - iter 546/785 - loss 0.67654616 - time (sec): 320.54 - samples/sec: 453.06 - lr: 0.000004
2023-04-10 13:14:31,147 epoch 2 - iter 624/785 - loss 0.65446315 - time (sec): 366.43 - samples/sec: 452.84 - lr: 0.000004
2023-04-10 13:15:17,010 epoch 2 - iter 702/785 - loss 0.63531321 - time (sec): 412.29 - samples/sec: 453.08 - lr: 0.000005
2023-04-10 13:16:02,846 epoch 2 - iter 780/785 - loss 0.61564112 - time (sec): 458.13 - samples/sec: 451.65 - lr: 0.000005
2023-04-10 13:16:05,578 ----------------------------------------------------------------------------------------------------
2023-04-10 13:16:05,580 EPOCH 2 done: loss 0.6148 - lr 0.000005
2023-04-10 13:16:31,519 Evaluating as a multi-label problem: False
2023-04-10 13:16:31,599 DEV : loss 0.3734082877635956 - f1-score (micro avg) 0.6995
2023-04-10 13:16:31,683 ----------------------------------------------------------------------------------------------------
2023-04-10 13:17:17,398 epoch 3 - iter 78/785 - loss 0.42211544 - time (sec): 45.71 - samples/sec: 460.06 - lr: 0.000005
2023-04-10 13:18:03,476 epoch 3 - iter 156/785 - loss 0.39062543 - time (sec): 91.79 - samples/sec: 450.57 - lr: 0.000005
2023-04-10 13:18:49,456 epoch 3 - iter 234/785 - loss 0.38367027 - time (sec): 137.77 - samples/sec: 448.39 - lr: 0.000005
2023-04-10 13:19:35,061 epoch 3 - iter 312/785 - loss 0.37454659 - time (sec): 183.38 - samples/sec: 450.10 - lr: 0.000005
2023-04-10 13:20:20,826 epoch 3 - iter 390/785 - loss 0.36558572 - time (sec): 229.14 - samples/sec: 448.39 - lr: 0.000005
2023-04-10 13:21:06,862 epoch 3 - iter 468/785 - loss 0.36016623 - time (sec): 275.18 - samples/sec: 450.31 - lr: 0.000005
2023-04-10 13:21:52,802 epoch 3 - iter 546/785 - loss 0.35301531 - time (sec): 321.12 - samples/sec: 450.40 - lr: 0.000005
2023-04-10 13:22:38,631 epoch 3 - iter 624/785 - loss 0.34785227 - time (sec): 366.95 - samples/sec: 451.31 - lr: 0.000005
2023-04-10 13:23:24,542 epoch 3 - iter 702/785 - loss 0.34289183 - time (sec): 412.86 - samples/sec: 450.14 - lr: 0.000005
2023-04-10 13:24:10,698 epoch 3 - iter 780/785 - loss 0.33617042 - time (sec): 459.01 - samples/sec: 450.47 - lr: 0.000005
2023-04-10 13:24:13,424 ----------------------------------------------------------------------------------------------------
2023-04-10 13:24:13,425 EPOCH 3 done: loss 0.3359 - lr 0.000005
2023-04-10 13:24:39,219 Evaluating as a multi-label problem: False
2023-04-10 13:24:39,294 DEV : loss 0.2478274405002594 - f1-score (micro avg) 0.7669
2023-04-10 13:24:39,378 ----------------------------------------------------------------------------------------------------
2023-04-10 13:25:25,057 epoch 4 - iter 78/785 - loss 0.24856806 - time (sec): 45.68 - samples/sec: 460.40 - lr: 0.000005
2023-04-10 13:26:11,029 epoch 4 - iter 156/785 - loss 0.24520289 - time (sec): 91.65 - samples/sec: 457.73 - lr: 0.000005
2023-04-10 13:26:56,833 epoch 4 - iter 234/785 - loss 0.24918267 - time (sec): 137.45 - samples/sec: 446.42 - lr: 0.000005
2023-04-10 13:27:42,768 epoch 4 - iter 312/785 - loss 0.24994835 - time (sec): 183.39 - samples/sec: 445.87 - lr: 0.000005
2023-04-10 13:28:28,729 epoch 4 - iter 390/785 - loss 0.24670791 - time (sec): 229.35 - samples/sec: 443.71 - lr: 0.000005
2023-04-10 13:29:14,517 epoch 4 - iter 468/785 - loss 0.24363947 - time (sec): 275.14 - samples/sec: 447.73 - lr: 0.000005
2023-04-10 13:30:00,442 epoch 4 - iter 546/785 - loss 0.24232958 - time (sec): 321.06 - samples/sec: 446.61 - lr: 0.000005
2023-04-10 13:30:46,468 epoch 4 - iter 624/785 - loss 0.23891458 - time (sec): 367.09 - samples/sec: 447.11 - lr: 0.000005
2023-04-10 13:31:32,246 epoch 4 - iter 702/785 - loss 0.23581434 - time (sec): 412.87 - samples/sec: 450.67 - lr: 0.000004
2023-04-10 13:32:18,461 epoch 4 - iter 780/785 - loss 0.23410588 - time (sec): 459.08 - samples/sec: 450.67 - lr: 0.000004
2023-04-10 13:32:21,126 ----------------------------------------------------------------------------------------------------
2023-04-10 13:32:21,128 EPOCH 4 done: loss 0.2340 - lr 0.000004
2023-04-10 13:32:46,956 Evaluating as a multi-label problem: False
2023-04-10 13:32:47,034 DEV : loss 0.21353298425674438 - f1-score (micro avg) 0.7899
2023-04-10 13:32:47,119 ----------------------------------------------------------------------------------------------------
2023-04-10 13:33:32,954 epoch 5 - iter 78/785 - loss 0.18971301 - time (sec): 45.83 - samples/sec: 455.71 - lr: 0.000004
2023-04-10 13:34:18,919 epoch 5 - iter 156/785 - loss 0.19396453 - time (sec): 91.80 - samples/sec: 453.89 - lr: 0.000004
2023-04-10 13:35:04,859 epoch 5 - iter 234/785 - loss 0.19108296 - time (sec): 137.74 - samples/sec: 453.20 - lr: 0.000004
2023-04-10 13:35:50,732 epoch 5 - iter 312/785 - loss 0.18832768 - time (sec): 183.61 - samples/sec: 449.66 - lr: 0.000004
2023-04-10 13:36:36,467 epoch 5 - iter 390/785 - loss 0.18825695 - time (sec): 229.35 - samples/sec: 452.64 - lr: 0.000004
2023-04-10 13:37:22,590 epoch 5 - iter 468/785 - loss 0.18787454 - time (sec): 275.47 - samples/sec: 451.57 - lr: 0.000004
2023-04-10 13:38:08,477 epoch 5 - iter 546/785 - loss 0.18615161 - time (sec): 321.36 - samples/sec: 451.47 - lr: 0.000004
2023-04-10 13:38:54,257 epoch 5 - iter 624/785 - loss 0.18594722 - time (sec): 367.14 - samples/sec: 450.59 - lr: 0.000004
2023-04-10 13:39:40,394 epoch 5 - iter 702/785 - loss 0.18508805 - time (sec): 413.27 - samples/sec: 450.44 - lr: 0.000004
2023-04-10 13:40:26,352 epoch 5 - iter 780/785 - loss 0.18421189 - time (sec): 459.23 - samples/sec: 450.18 - lr: 0.000004
2023-04-10 13:40:29,092 ----------------------------------------------------------------------------------------------------
2023-04-10 13:40:29,093 EPOCH 5 done: loss 0.1843 - lr 0.000004
2023-04-10 13:40:54,758 Evaluating as a multi-label problem: False
2023-04-10 13:40:54,836 DEV : loss 0.19248297810554504 - f1-score (micro avg) 0.8091
2023-04-10 13:40:54,920 ----------------------------------------------------------------------------------------------------
2023-04-10 13:41:40,954 epoch 6 - iter 78/785 - loss 0.16169561 - time (sec): 46.03 - samples/sec: 440.82 - lr: 0.000004
2023-04-10 13:42:26,837 epoch 6 - iter 156/785 - loss 0.15868632 - time (sec): 91.92 - samples/sec: 441.25 - lr: 0.000004
2023-04-10 13:43:12,870 epoch 6 - iter 234/785 - loss 0.16192991 - time (sec): 137.95 - samples/sec: 443.43 - lr: 0.000004
2023-04-10 13:43:58,927 epoch 6 - iter 312/785 - loss 0.15837734 - time (sec): 184.01 - samples/sec: 445.62 - lr: 0.000004
2023-04-10 13:44:44,996 epoch 6 - iter 390/785 - loss 0.15549650 - time (sec): 230.07 - samples/sec: 442.88 - lr: 0.000004
2023-04-10 13:45:31,130 epoch 6 - iter 468/785 - loss 0.15509965 - time (sec): 276.21 - samples/sec: 440.68 - lr: 0.000004
2023-04-10 13:46:17,430 epoch 6 - iter 546/785 - loss 0.15536700 - time (sec): 322.51 - samples/sec: 444.17 - lr: 0.000004
2023-04-10 13:47:03,271 epoch 6 - iter 624/785 - loss 0.15596272 - time (sec): 368.35 - samples/sec: 447.46 - lr: 0.000004
2023-04-10 13:47:49,333 epoch 6 - iter 702/785 - loss 0.15470882 - time (sec): 414.41 - samples/sec: 446.90 - lr: 0.000004
2023-04-10 13:48:35,335 epoch 6 - iter 780/785 - loss 0.15353726 - time (sec): 460.41 - samples/sec: 449.07 - lr: 0.000004
2023-04-10 13:48:38,091 ----------------------------------------------------------------------------------------------------
2023-04-10 13:48:38,093 EPOCH 6 done: loss 0.1537 - lr 0.000004
2023-04-10 13:49:03,872 Evaluating as a multi-label problem: False
2023-04-10 13:49:03,948 DEV : loss 0.19085420668125153 - f1-score (micro avg) 0.8218
2023-04-10 13:49:04,033 ----------------------------------------------------------------------------------------------------
2023-04-10 13:49:49,709 epoch 7 - iter 78/785 - loss 0.13949250 - time (sec): 45.68 - samples/sec: 451.69 - lr: 0.000004
2023-04-10 13:50:35,575 epoch 7 - iter 156/785 - loss 0.13991533 - time (sec): 91.54 - samples/sec: 451.74 - lr: 0.000004
2023-04-10 13:51:21,688 epoch 7 - iter 234/785 - loss 0.13727018 - time (sec): 137.65 - samples/sec: 446.11 - lr: 0.000004
2023-04-10 13:52:07,734 epoch 7 - iter 312/785 - loss 0.13962965 - time (sec): 183.70 - samples/sec: 443.38 - lr: 0.000004
2023-04-10 13:52:53,850 epoch 7 - iter 390/785 - loss 0.13871141 - time (sec): 229.82 - samples/sec: 444.60 - lr: 0.000004
2023-04-10 13:53:39,863 epoch 7 - iter 468/785 - loss 0.13783456 - time (sec): 275.83 - samples/sec: 446.99 - lr: 0.000004
2023-04-10 13:54:25,630 epoch 7 - iter 546/785 - loss 0.13700803 - time (sec): 321.60 - samples/sec: 449.44 - lr: 0.000004
2023-04-10 13:55:11,589 epoch 7 - iter 624/785 - loss 0.13522205 - time (sec): 367.55 - samples/sec: 450.51 - lr: 0.000004
2023-04-10 13:55:57,520 epoch 7 - iter 702/785 - loss 0.13499711 - time (sec): 413.49 - samples/sec: 449.78 - lr: 0.000004
2023-04-10 13:56:43,432 epoch 7 - iter 780/785 - loss 0.13300228 - time (sec): 459.40 - samples/sec: 450.08 - lr: 0.000004
2023-04-10 13:56:46,156 ----------------------------------------------------------------------------------------------------
2023-04-10 13:56:46,157 EPOCH 7 done: loss 0.1327 - lr 0.000004
2023-04-10 13:57:11,974 Evaluating as a multi-label problem: False
2023-04-10 13:57:12,051 DEV : loss 0.188308447599411 - f1-score (micro avg) 0.8331
2023-04-10 13:57:12,135 ----------------------------------------------------------------------------------------------------
2023-04-10 13:57:58,126 epoch 8 - iter 78/785 - loss 0.12315730 - time (sec): 45.99 - samples/sec: 450.64 - lr: 0.000004
2023-04-10 13:58:44,253 epoch 8 - iter 156/785 - loss 0.11813120 - time (sec): 92.12 - samples/sec: 436.42 - lr: 0.000004
2023-04-10 13:59:30,059 epoch 8 - iter 234/785 - loss 0.11978297 - time (sec): 137.92 - samples/sec: 444.20 - lr: 0.000004
2023-04-10 14:00:16,019 epoch 8 - iter 312/785 - loss 0.11865398 - time (sec): 183.88 - samples/sec: 448.73 - lr: 0.000004
2023-04-10 14:01:01,667 epoch 8 - iter 390/785 - loss 0.11649927 - time (sec): 229.53 - samples/sec: 449.84 - lr: 0.000003
2023-04-10 14:01:47,632 epoch 8 - iter 468/785 - loss 0.11678690 - time (sec): 275.50 - samples/sec: 450.66 - lr: 0.000003
2023-04-10 14:02:33,760 epoch 8 - iter 546/785 - loss 0.11768582 - time (sec): 321.62 - samples/sec: 451.69 - lr: 0.000003
2023-04-10 14:03:19,469 epoch 8 - iter 624/785 - loss 0.11698818 - time (sec): 367.33 - samples/sec: 450.46 - lr: 0.000003
2023-04-10 14:04:05,350 epoch 8 - iter 702/785 - loss 0.11655166 - time (sec): 413.21 - samples/sec: 449.45 - lr: 0.000003
2023-04-10 14:04:51,185 epoch 8 - iter 780/785 - loss 0.11644185 - time (sec): 459.05 - samples/sec: 450.72 - lr: 0.000003
2023-04-10 14:04:53,872 ----------------------------------------------------------------------------------------------------
2023-04-10 14:04:53,874 EPOCH 8 done: loss 0.1164 - lr 0.000003
2023-04-10 14:05:19,663 Evaluating as a multi-label problem: False
2023-04-10 14:05:19,743 DEV : loss 0.18473857641220093 - f1-score (micro avg) 0.8406
2023-04-10 14:05:19,828 ----------------------------------------------------------------------------------------------------
2023-04-10 14:06:05,453 epoch 9 - iter 78/785 - loss 0.10657376 - time (sec): 45.62 - samples/sec: 440.02 - lr: 0.000003
2023-04-10 14:06:51,368 epoch 9 - iter 156/785 - loss 0.10863598 - time (sec): 91.54 - samples/sec: 439.74 - lr: 0.000003
2023-04-10 14:07:36,988 epoch 9 - iter 234/785 - loss 0.10633920 - time (sec): 137.16 - samples/sec: 445.35 - lr: 0.000003
2023-04-10 14:08:22,898 epoch 9 - iter 312/785 - loss 0.10460097 - time (sec): 183.07 - samples/sec: 446.32 - lr: 0.000003
2023-04-10 14:09:08,636 epoch 9 - iter 390/785 - loss 0.10531387 - time (sec): 228.81 - samples/sec: 446.18 - lr: 0.000003
2023-04-10 14:09:54,238 epoch 9 - iter 468/785 - loss 0.10648494 - time (sec): 274.41 - samples/sec: 446.35 - lr: 0.000003
2023-04-10 14:10:39,806 epoch 9 - iter 546/785 - loss 0.10488251 - time (sec): 319.98 - samples/sec: 448.60 - lr: 0.000003
2023-04-10 14:11:25,286 epoch 9 - iter 624/785 - loss 0.10527523 - time (sec): 365.46 - samples/sec: 450.60 - lr: 0.000003
2023-04-10 14:12:10,986 epoch 9 - iter 702/785 - loss 0.10473876 - time (sec): 411.16 - samples/sec: 451.31 - lr: 0.000003
2023-04-10 14:12:56,700 epoch 9 - iter 780/785 - loss 0.10399221 - time (sec): 456.87 - samples/sec: 452.89 - lr: 0.000003
2023-04-10 14:12:59,380 ----------------------------------------------------------------------------------------------------
2023-04-10 14:12:59,382 EPOCH 9 done: loss 0.1042 - lr 0.000003
2023-04-10 14:13:25,061 Evaluating as a multi-label problem: False
2023-04-10 14:13:25,138 DEV : loss 0.19332602620124817 - f1-score (micro avg) 0.8443
2023-04-10 14:13:25,224 ----------------------------------------------------------------------------------------------------
2023-04-10 14:14:10,842 epoch 10 - iter 78/785 - loss 0.09722209 - time (sec): 45.62 - samples/sec: 457.53 - lr: 0.000003
2023-04-10 14:14:56,574 epoch 10 - iter 156/785 - loss 0.09960375 - time (sec): 91.35 - samples/sec: 452.13 - lr: 0.000003
2023-04-10 14:15:42,382 epoch 10 - iter 234/785 - loss 0.09791734 - time (sec): 137.16 - samples/sec: 451.64 - lr: 0.000003
2023-04-10 14:16:28,359 epoch 10 - iter 312/785 - loss 0.09533145 - time (sec): 183.13 - samples/sec: 455.08 - lr: 0.000003
2023-04-10 14:17:13,898 epoch 10 - iter 390/785 - loss 0.09546462 - time (sec): 228.67 - samples/sec: 455.61 - lr: 0.000003
2023-04-10 14:17:59,305 epoch 10 - iter 468/785 - loss 0.09469376 - time (sec): 274.08 - samples/sec: 452.76 - lr: 0.000003
2023-04-10 14:18:44,944 epoch 10 - iter 546/785 - loss 0.09461890 - time (sec): 319.72 - samples/sec: 453.17 - lr: 0.000003
2023-04-10 14:19:30,380 epoch 10 - iter 624/785 - loss 0.09481641 - time (sec): 365.16 - samples/sec: 453.85 - lr: 0.000003
2023-04-10 14:20:15,968 epoch 10 - iter 702/785 - loss 0.09459196 - time (sec): 410.74 - samples/sec: 453.32 - lr: 0.000003
2023-04-10 14:21:01,721 epoch 10 - iter 780/785 - loss 0.09403540 - time (sec): 456.50 - samples/sec: 452.79 - lr: 0.000003
2023-04-10 14:21:04,407 ----------------------------------------------------------------------------------------------------
2023-04-10 14:21:04,409 EPOCH 10 done: loss 0.0938 - lr 0.000003
2023-04-10 14:21:30,389 Evaluating as a multi-label problem: False
2023-04-10 14:21:30,467 DEV : loss 0.18941430747509003 - f1-score (micro avg) 0.8458
2023-04-10 14:21:30,553 ----------------------------------------------------------------------------------------------------
2023-04-10 14:22:16,229 epoch 11 - iter 78/785 - loss 0.07965436 - time (sec): 45.67 - samples/sec: 469.58 - lr: 0.000003
2023-04-10 14:23:02,286 epoch 11 - iter 156/785 - loss 0.08242477 - time (sec): 91.73 - samples/sec: 467.46 - lr: 0.000003
2023-04-10 14:23:47,919 epoch 11 - iter 234/785 - loss 0.08405410 - time (sec): 137.36 - samples/sec: 461.18 - lr: 0.000003
2023-04-10 14:24:33,686 epoch 11 - iter 312/785 - loss 0.08238391 - time (sec): 183.13 - samples/sec: 455.77 - lr: 0.000003
2023-04-10 14:25:19,484 epoch 11 - iter 390/785 - loss 0.08149592 - time (sec): 228.93 - samples/sec: 453.93 - lr: 0.000003
2023-04-10 14:26:05,233 epoch 11 - iter 468/785 - loss 0.08168820 - time (sec): 274.68 - samples/sec: 452.09 - lr: 0.000003
2023-04-10 14:26:50,901 epoch 11 - iter 546/785 - loss 0.08177046 - time (sec): 320.35 - samples/sec: 452.25 - lr: 0.000003
2023-04-10 14:27:36,661 epoch 11 - iter 624/785 - loss 0.08271731 - time (sec): 366.11 - samples/sec: 452.99 - lr: 0.000003
2023-04-10 14:28:22,099 epoch 11 - iter 702/785 - loss 0.08254577 - time (sec): 411.54 - samples/sec: 452.31 - lr: 0.000003
2023-04-10 14:29:07,671 epoch 11 - iter 780/785 - loss 0.08371043 - time (sec): 457.12 - samples/sec: 453.09 - lr: 0.000003
2023-04-10 14:29:10,316 ----------------------------------------------------------------------------------------------------
2023-04-10 14:29:10,318 EPOCH 11 done: loss 0.0837 - lr 0.000003
2023-04-10 14:29:35,090 Evaluating as a multi-label problem: False
2023-04-10 14:29:35,166 DEV : loss 0.2022610902786255 - f1-score (micro avg) 0.8404
2023-04-10 14:29:35,259 ----------------------------------------------------------------------------------------------------
2023-04-10 14:30:21,283 epoch 12 - iter 78/785 - loss 0.06767359 - time (sec): 46.02 - samples/sec: 454.91 - lr: 0.000002
2023-04-10 14:31:07,146 epoch 12 - iter 156/785 - loss 0.07443837 - time (sec): 91.89 - samples/sec: 449.99 - lr: 0.000002
2023-04-10 14:31:53,003 epoch 12 - iter 234/785 - loss 0.07629224 - time (sec): 137.74 - samples/sec: 451.86 - lr: 0.000002
2023-04-10 14:32:38,934 epoch 12 - iter 312/785 - loss 0.07741157 - time (sec): 183.67 - samples/sec: 452.03 - lr: 0.000002
2023-04-10 14:33:24,791 epoch 12 - iter 390/785 - loss 0.07706257 - time (sec): 229.53 - samples/sec: 454.13 - lr: 0.000002
2023-04-10 14:34:10,546 epoch 12 - iter 468/785 - loss 0.07581749 - time (sec): 275.29 - samples/sec: 454.58 - lr: 0.000002
2023-04-10 14:34:56,353 epoch 12 - iter 546/785 - loss 0.07615371 - time (sec): 321.09 - samples/sec: 453.24 - lr: 0.000002
2023-04-10 14:35:41,715 epoch 12 - iter 624/785 - loss 0.07630547 - time (sec): 366.45 - samples/sec: 451.94 - lr: 0.000002
2023-04-10 14:36:27,902 epoch 12 - iter 702/785 - loss 0.07703151 - time (sec): 412.64 - samples/sec: 451.11 - lr: 0.000002
2023-04-10 14:37:13,513 epoch 12 - iter 780/785 - loss 0.07688972 - time (sec): 458.25 - samples/sec: 451.11 - lr: 0.000002
2023-04-10 14:37:16,212 ----------------------------------------------------------------------------------------------------
2023-04-10 14:37:16,214 EPOCH 12 done: loss 0.0769 - lr 0.000002
2023-04-10 14:37:42,267 Evaluating as a multi-label problem: False
2023-04-10 14:37:42,343 DEV : loss 0.19032613933086395 - f1-score (micro avg) 0.8513
2023-04-10 14:37:42,429 ----------------------------------------------------------------------------------------------------
2023-04-10 14:38:28,309 epoch 13 - iter 78/785 - loss 0.06781882 - time (sec): 45.88 - samples/sec: 437.28 - lr: 0.000002
2023-04-10 14:39:13,998 epoch 13 - iter 156/785 - loss 0.06953428 - time (sec): 91.57 - samples/sec: 442.07 - lr: 0.000002
2023-04-10 14:39:59,400 epoch 13 - iter 234/785 - loss 0.06968786 - time (sec): 136.97 - samples/sec: 447.35 - lr: 0.000002
2023-04-10 14:40:45,242 epoch 13 - iter 312/785 - loss 0.07032229 - time (sec): 182.81 - samples/sec: 449.08 - lr: 0.000002
2023-04-10 14:41:30,932 epoch 13 - iter 390/785 - loss 0.07052987 - time (sec): 228.50 - samples/sec: 445.56 - lr: 0.000002
2023-04-10 14:42:16,884 epoch 13 - iter 468/785 - loss 0.07176712 - time (sec): 274.45 - samples/sec: 444.54 - lr: 0.000002
2023-04-10 14:43:02,911 epoch 13 - iter 546/785 - loss 0.07183614 - time (sec): 320.48 - samples/sec: 446.39 - lr: 0.000002
2023-04-10 14:43:48,816 epoch 13 - iter 624/785 - loss 0.07253765 - time (sec): 366.39 - samples/sec: 446.93 - lr: 0.000002
2023-04-10 14:44:34,491 epoch 13 - iter 702/785 - loss 0.07213498 - time (sec): 412.06 - samples/sec: 449.17 - lr: 0.000002
2023-04-10 14:45:20,007 epoch 13 - iter 780/785 - loss 0.07218568 - time (sec): 457.58 - samples/sec: 451.86 - lr: 0.000002
2023-04-10 14:45:22,772 ----------------------------------------------------------------------------------------------------
2023-04-10 14:45:22,774 EPOCH 13 done: loss 0.0722 - lr 0.000002
2023-04-10 14:45:48,608 Evaluating as a multi-label problem: False
2023-04-10 14:45:48,685 DEV : loss 0.19682374596595764 - f1-score (micro avg) 0.853
2023-04-10 14:45:48,772 ----------------------------------------------------------------------------------------------------
2023-04-10 14:46:34,526 epoch 14 - iter 78/785 - loss 0.05882194 - time (sec): 45.75 - samples/sec: 442.48 - lr: 0.000002
2023-04-10 14:47:20,308 epoch 14 - iter 156/785 - loss 0.06553124 - time (sec): 91.53 - samples/sec: 446.65 - lr: 0.000002
2023-04-10 14:48:06,130 epoch 14 - iter 234/785 - loss 0.06636154 - time (sec): 137.36 - samples/sec: 445.02 - lr: 0.000002
2023-04-10 14:48:51,621 epoch 14 - iter 312/785 - loss 0.06544912 - time (sec): 182.85 - samples/sec: 448.03 - lr: 0.000002
2023-04-10 14:49:37,323 epoch 14 - iter 390/785 - loss 0.06512617 - time (sec): 228.55 - samples/sec: 448.79 - lr: 0.000002
2023-04-10 14:50:23,228 epoch 14 - iter 468/785 - loss 0.06536846 - time (sec): 274.46 - samples/sec: 448.59 - lr: 0.000002
2023-04-10 14:51:08,762 epoch 14 - iter 546/785 - loss 0.06540547 - time (sec): 319.99 - samples/sec: 450.40 - lr: 0.000002
2023-04-10 14:51:54,701 epoch 14 - iter 624/785 - loss 0.06641531 - time (sec): 365.93 - samples/sec: 448.39 - lr: 0.000002
2023-04-10 14:52:40,613 epoch 14 - iter 702/785 - loss 0.06649606 - time (sec): 411.84 - samples/sec: 449.74 - lr: 0.000002
2023-04-10 14:53:26,281 epoch 14 - iter 780/785 - loss 0.06663863 - time (sec): 457.51 - samples/sec: 452.11 - lr: 0.000002
2023-04-10 14:53:29,011 ----------------------------------------------------------------------------------------------------
2023-04-10 14:53:29,013 EPOCH 14 done: loss 0.0665 - lr 0.000002
2023-04-10 14:53:54,922 Evaluating as a multi-label problem: False
2023-04-10 14:53:54,995 DEV : loss 0.19152763485908508 - f1-score (micro avg) 0.8543
2023-04-10 14:53:55,084 ----------------------------------------------------------------------------------------------------
2023-04-10 14:54:40,977 epoch 15 - iter 78/785 - loss 0.05893628 - time (sec): 45.89 - samples/sec: 434.64 - lr: 0.000002
2023-04-10 14:55:26,530 epoch 15 - iter 156/785 - loss 0.06296731 - time (sec): 91.44 - samples/sec: 438.22 - lr: 0.000002
2023-04-10 14:56:11,919 epoch 15 - iter 234/785 - loss 0.06296709 - time (sec): 136.83 - samples/sec: 449.49 - lr: 0.000002
2023-04-10 14:56:57,537 epoch 15 - iter 312/785 - loss 0.06042086 - time (sec): 182.45 - samples/sec: 452.57 - lr: 0.000002
2023-04-10 14:57:43,221 epoch 15 - iter 390/785 - loss 0.06400154 - time (sec): 228.13 - samples/sec: 453.81 - lr: 0.000002
2023-04-10 14:58:29,151 epoch 15 - iter 468/785 - loss 0.06384428 - time (sec): 274.07 - samples/sec: 450.67 - lr: 0.000002
2023-04-10 14:59:14,871 epoch 15 - iter 546/785 - loss 0.06211280 - time (sec): 319.79 - samples/sec: 452.05 - lr: 0.000001
2023-04-10 15:00:00,212 epoch 15 - iter 624/785 - loss 0.06312445 - time (sec): 365.13 - samples/sec: 453.40 - lr: 0.000001
2023-04-10 15:00:46,083 epoch 15 - iter 702/785 - loss 0.06365620 - time (sec): 411.00 - samples/sec: 453.08 - lr: 0.000001
2023-04-10 15:01:31,831 epoch 15 - iter 780/785 - loss 0.06357545 - time (sec): 456.74 - samples/sec: 452.90 - lr: 0.000001
2023-04-10 15:01:34,610 ----------------------------------------------------------------------------------------------------
2023-04-10 15:01:34,612 EPOCH 15 done: loss 0.0635 - lr 0.000001
2023-04-10 15:02:00,571 Evaluating as a multi-label problem: False
2023-04-10 15:02:00,650 DEV : loss 0.19623318314552307 - f1-score (micro avg) 0.8562
2023-04-10 15:02:00,739 ----------------------------------------------------------------------------------------------------
2023-04-10 15:02:46,516 epoch 16 - iter 78/785 - loss 0.05263069 - time (sec): 45.78 - samples/sec: 447.66 - lr: 0.000001
2023-04-10 15:03:32,217 epoch 16 - iter 156/785 - loss 0.05540555 - time (sec): 91.48 - samples/sec: 458.04 - lr: 0.000001
2023-04-10 15:04:17,802 epoch 16 - iter 234/785 - loss 0.05653095 - time (sec): 137.06 - samples/sec: 454.61 - lr: 0.000001
2023-04-10 15:05:03,756 epoch 16 - iter 312/785 - loss 0.05690468 - time (sec): 183.01 - samples/sec: 453.77 - lr: 0.000001
2023-04-10 15:05:49,407 epoch 16 - iter 390/785 - loss 0.05848835 - time (sec): 228.67 - samples/sec: 454.04 - lr: 0.000001
2023-04-10 15:06:35,060 epoch 16 - iter 468/785 - loss 0.05897047 - time (sec): 274.32 - samples/sec: 453.35 - lr: 0.000001
2023-04-10 15:07:20,765 epoch 16 - iter 546/785 - loss 0.05940641 - time (sec): 320.02 - samples/sec: 452.29 - lr: 0.000001
2023-04-10 15:08:06,518 epoch 16 - iter 624/785 - loss 0.05878874 - time (sec): 365.78 - samples/sec: 452.31 - lr: 0.000001
2023-04-10 15:08:52,406 epoch 16 - iter 702/785 - loss 0.05878710 - time (sec): 411.67 - samples/sec: 452.43 - lr: 0.000001
2023-04-10 15:09:38,261 epoch 16 - iter 780/785 - loss 0.05871527 - time (sec): 457.52 - samples/sec: 452.47 - lr: 0.000001
2023-04-10 15:09:41,139 ----------------------------------------------------------------------------------------------------
2023-04-10 15:09:41,141 EPOCH 16 done: loss 0.0587 - lr 0.000001
2023-04-10 15:10:06,206 Evaluating as a multi-label problem: False
2023-04-10 15:10:06,282 DEV : loss 0.19955378770828247 - f1-score (micro avg) 0.8578
2023-04-10 15:10:06,370 ----------------------------------------------------------------------------------------------------
2023-04-10 15:10:52,361 epoch 17 - iter 78/785 - loss 0.05076330 - time (sec): 45.99 - samples/sec: 462.06 - lr: 0.000001
2023-04-10 15:11:38,184 epoch 17 - iter 156/785 - loss 0.05519241 - time (sec): 91.81 - samples/sec: 462.21 - lr: 0.000001
2023-04-10 15:12:24,115 epoch 17 - iter 234/785 - loss 0.05342529 - time (sec): 137.74 - samples/sec: 457.55 - lr: 0.000001
2023-04-10 15:13:09,882 epoch 17 - iter 312/785 - loss 0.05189467 - time (sec): 183.51 - samples/sec: 455.00 - lr: 0.000001
2023-04-10 15:13:55,976 epoch 17 - iter 390/785 - loss 0.05405067 - time (sec): 229.60 - samples/sec: 453.17 - lr: 0.000001
2023-04-10 15:14:41,579 epoch 17 - iter 468/785 - loss 0.05398715 - time (sec): 275.21 - samples/sec: 453.21 - lr: 0.000001
2023-04-10 15:15:27,308 epoch 17 - iter 546/785 - loss 0.05539713 - time (sec): 320.94 - samples/sec: 454.08 - lr: 0.000001
2023-04-10 15:16:13,512 epoch 17 - iter 624/785 - loss 0.05586570 - time (sec): 367.14 - samples/sec: 453.67 - lr: 0.000001
2023-04-10 15:16:59,624 epoch 17 - iter 702/785 - loss 0.05576616 - time (sec): 413.25 - samples/sec: 452.94 - lr: 0.000001
2023-04-10 15:17:45,460 epoch 17 - iter 780/785 - loss 0.05531521 - time (sec): 459.09 - samples/sec: 450.39 - lr: 0.000001
2023-04-10 15:17:48,168 ----------------------------------------------------------------------------------------------------
2023-04-10 15:17:48,170 EPOCH 17 done: loss 0.0553 - lr 0.000001
2023-04-10 15:18:14,080 Evaluating as a multi-label problem: False
2023-04-10 15:18:14,155 DEV : loss 0.20788049697875977 - f1-score (micro avg) 0.8562
2023-04-10 15:18:14,243 ----------------------------------------------------------------------------------------------------
2023-04-10 15:18:48,097 ----------------------------------------------------------------------------------------------------
2023-04-10 15:18:48,099 Exiting from training early.
2023-04-10 15:18:48,100 Saving model ...
2023-04-10 15:18:48,949 Done.
2023-04-10 15:18:48,952 ----------------------------------------------------------------------------------------------------
2023-04-10 15:18:48,954 Testing using last state of model ...
2023-04-10 15:19:14,468 Evaluating as a multi-label problem: False
2023-04-10 15:19:14,541 micro-precision 0.8346 - micro-recall 0.868 - micro-f1 0.851 - accuracy 0.7477
2023-04-10 15:19:14,543
Results:
- F-score (micro) 0.851
- F-score (macro) 0.8197
- Accuracy 0.7477
By class:
              precision    recall  f1-score   support

        PROC     0.8033    0.8731    0.8368      3364
        DISO     0.8552    0.8722    0.8636      2472
        CHEM     0.8973    0.8933    0.8953      1565
        ANAT     0.7138    0.6551    0.6832       316

   micro avg     0.8346    0.8680    0.8510      7717
   macro avg     0.8174    0.8234    0.8197      7717
weighted avg     0.8353    0.8680    0.8509      7717
2023-04-10 15:19:14,544 ----------------------------------------------------------------------------------------------------
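The aggregate scores in the final report can be sanity-checked from the per-class table alone. A short sketch (values copied from the log; small deviations in the last digit are expected since the log rounds per-class scores before printing):

```python
# (precision, recall, f1, support) per class, from the test report above.
by_class = {
    "PROC": (0.8033, 0.8731, 0.8368, 3364),
    "DISO": (0.8552, 0.8722, 0.8636, 2472),
    "CHEM": (0.8973, 0.8933, 0.8953, 1565),
    "ANAT": (0.7138, 0.6551, 0.6832, 316),
}
f1s = [f1 for _, _, f1, _ in by_class.values()]
supports = [n for _, _, _, n in by_class.values()]

macro_f1 = sum(f1s) / len(f1s)                                  # unweighted mean of class F1s
weighted_f1 = sum(f * n for f, n in zip(f1s, supports)) / sum(supports)
# micro F1 is the harmonic mean of the micro-averaged precision and recall
micro_p, micro_r = 0.8346, 0.8680
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
```

Note the macro/micro gap (0.8197 vs 0.8510): ANAT, the smallest class (316 mentions), drags the macro average down while barely affecting the micro average.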