vse-infty / coco_wsl_grid_bert /test_log.txt
2020-10-02 01:27:56,847 loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/tiger/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2020-10-02 01:27:58,327 Did not load checkpoints
2020-10-02 01:27:58,328 Resnet backbone now has fixed blocks 2
2020-10-02 01:27:59,708 loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/tiger/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
2020-10-02 01:27:59,708 Model config {
"architectures": [
"BertForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"finetuning_task": null,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"num_labels": 2,
"output_attentions": false,
"output_hidden_states": false,
"pad_token_id": 0,
"pruned_heads": {},
"torchscript": false,
"type_vocab_size": 2,
"use_bfloat16": false,
"vocab_size": 30522
}
2020-10-02 01:28:01,097 loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin from cache at /home/tiger/.cache/torch/transformers/aa1ef1aede4482d0dbcd4d52baad8ae300e60902e88fcb0bebdec09afd232066.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
2020-10-02 01:28:03,256 Use adam as the optimizer, with init lr 0.0005
2020-10-02 01:28:03,257 Image encoder is data paralleled now.
2020-10-02 01:28:03,400 Load full model with backbone
2020-10-02 01:28:03,402 Loading dataset
2020-10-02 01:28:07,517 Input mode small: scaled by factor 2.0
2020-10-02 01:28:14,586 Computing results...
2020-10-02 01:29:11,923 Test: [0/196] Le 63.6659 (63.6659) Time 57.334 (0.000)
2020-10-02 01:29:19,880 Test: [10/196] Le 60.8184 (62.7525) Time 0.850 (0.000)
2020-10-02 01:29:28,004 Test: [20/196] Le 64.4445 (62.7450) Time 0.781 (0.000)
2020-10-02 01:29:35,944 Test: [30/196] Le 64.8506 (62.9363) Time 0.825 (0.000)
2020-10-02 01:29:43,981 Test: [40/196] Le 63.3589 (63.1086) Time 0.791 (0.000)
2020-10-02 01:29:51,976 Test: [50/196] Le 64.0212 (63.1067) Time 0.816 (0.000)
2020-10-02 01:29:59,972 Test: [60/196] Le 61.2870 (62.9668) Time 0.794 (0.000)
2020-10-02 01:30:07,943 Test: [70/196] Le 59.9160 (62.8829) Time 0.826 (0.000)
2020-10-02 01:30:15,991 Test: [80/196] Le 63.5098 (63.0116) Time 0.785 (0.000)
2020-10-02 01:30:23,929 Test: [90/196] Le 65.1759 (63.0592) Time 0.811 (0.000)
2020-10-02 01:30:32,017 Test: [100/196] Le 62.5716 (63.0479) Time 0.792 (0.000)
2020-10-02 01:30:40,103 Test: [110/196] Le 60.8049 (63.0152) Time 0.814 (0.000)
2020-10-02 01:30:48,234 Test: [120/196] Le 64.6984 (63.0478) Time 0.788 (0.000)
2020-10-02 01:30:56,220 Test: [130/196] Le 66.0548 (63.0548) Time 0.826 (0.000)
2020-10-02 01:31:04,246 Test: [140/196] Le 68.9694 (63.0795) Time 0.818 (0.000)
2020-10-02 01:31:12,168 Test: [150/196] Le 63.2262 (63.0947) Time 0.801 (0.000)
2020-10-02 01:31:20,223 Test: [160/196] Le 61.3962 (63.0856) Time 0.787 (0.000)
2020-10-02 01:31:28,184 Test: [170/196] Le 62.4964 (63.0277) Time 0.840 (0.000)
2020-10-02 01:31:36,369 Test: [180/196] Le 62.8228 (63.0423) Time 0.840 (0.000)
2020-10-02 01:31:44,250 Test: [190/196] Le 62.2450 (62.9804) Time 0.785 (0.000)
2020-10-02 01:31:53,422 Images: 5000, Captions: 25000
2020-10-02 01:32:23,263 Align loss: 0.8650665053327701
2020-10-02 01:32:23,263 Image uniform loss: -3.7929412346140463
2020-10-02 01:32:23,263 Text uniform loss: -3.8632661255327365
2020-10-02 01:32:23,347 calculate similarity time:
2020-10-02 01:32:23,685 Image to text: 86.9, 98.9, 99.6, 1.0, 1.3
2020-10-02 01:32:23,972 Text to image: 74.0, 94.7, 97.5, 1.0, 3.0
2020-10-02 01:32:23,972 rsum: 551.6 ar: 95.1 ari: 88.7
2020-10-02 01:32:24,041 calculate similarity time:
2020-10-02 01:32:24,378 Image to text: 83.3, 97.7, 99.0, 1.0, 1.5
2020-10-02 01:32:24,666 Text to image: 71.5, 93.4, 97.4, 1.0, 3.3
2020-10-02 01:32:24,667 rsum: 542.3 ar: 93.3 ari: 87.4
2020-10-02 01:32:24,732 calculate similarity time:
2020-10-02 01:32:25,070 Image to text: 85.0, 98.1, 99.7, 1.0, 1.4
2020-10-02 01:32:25,358 Text to image: 72.2, 93.7, 97.2, 1.0, 3.6
2020-10-02 01:32:25,358 rsum: 545.9 ar: 94.3 ari: 87.7
2020-10-02 01:32:25,424 calculate similarity time:
2020-10-02 01:32:25,761 Image to text: 83.3, 97.4, 99.2, 1.0, 1.4
2020-10-02 01:32:26,049 Text to image: 69.7, 93.5, 97.5, 1.0, 2.7
2020-10-02 01:32:26,049 rsum: 540.7 ar: 93.3 ari: 86.9
2020-10-02 01:32:26,117 calculate similarity time:
2020-10-02 01:32:26,454 Image to text: 84.0, 98.3, 99.6, 1.0, 1.4
2020-10-02 01:32:26,741 Text to image: 72.8, 94.2, 97.7, 1.0, 3.1
2020-10-02 01:32:26,741 rsum: 546.6 ar: 94.0 ari: 88.2
2020-10-02 01:32:26,741 -----------------------------------
2020-10-02 01:32:26,741 Mean metrics:
2020-10-02 01:32:26,741 rsum: 545.4
2020-10-02 01:32:26,741 Average i2t Recall: 94.0
2020-10-02 01:32:26,741 Image to text: 84.5 98.1 99.4 1.0 1.4
2020-10-02 01:32:26,741 Average t2i Recall: 87.8
2020-10-02 01:32:26,741 Text to image: 72.0 93.9 97.5 1.0 3.1
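The summary numbers in the five fold reports above appear to follow directly from the logged recalls. The sketch below recomputes them under these assumptions, which the log itself does not state: the first three values on each "Image to text" / "Text to image" line are R@1, R@5 and R@10, `rsum` is the sum of all six recalls, and `ar` / `ari` are the three-way averages for each direction. The helper name `summarize` is hypothetical. Because the logged recalls are already rounded to one decimal, a recomputed value can differ from the logged one by about 0.1 (the fourth fold, for instance).

```python
# Recalls copied from the five fold reports in the log above:
# ((i2t R@1, R@5, R@10), (t2i R@1, R@5, R@10)) -- column meaning is an assumption.
folds = [
    ((86.9, 98.9, 99.6), (74.0, 94.7, 97.5)),
    ((83.3, 97.7, 99.0), (71.5, 93.4, 97.4)),
    ((85.0, 98.1, 99.7), (72.2, 93.7, 97.2)),
    ((83.3, 97.4, 99.2), (69.7, 93.5, 97.5)),
    ((84.0, 98.3, 99.6), (72.8, 94.2, 97.7)),
]


def summarize(i2t, t2i):
    """Return (rsum, ar, ari) for one fold (hypothetical helper)."""
    rsum = sum(i2t) + sum(t2i)   # sum of all six recall values
    ar = sum(i2t) / len(i2t)     # average image-to-text recall
    ari = sum(t2i) / len(t2i)    # average text-to-image recall
    return rsum, ar, ari


per_fold = [summarize(i2t, t2i) for i2t, t2i in folds]

# Averages over the five folds, mirroring the "Mean metrics" block.
mean_rsum = sum(r[0] for r in per_fold) / len(per_fold)
mean_ar = sum(r[1] for r in per_fold) / len(per_fold)
mean_ari = sum(r[2] for r in per_fold) / len(per_fold)

for rsum, ar, ari in per_fold:
    print(f"rsum: {rsum:.1f} ar: {ar:.1f} ari: {ari:.1f}")
print(f"mean rsum: {mean_rsum:.1f}  mean i2t: {mean_ar:.1f}  mean t2i: {mean_ari:.1f}")
```

Averaging per-fold summaries rather than pooling all queries matches the way the log reports one block per fold followed by a mean; whether the original evaluation script does exactly this is not confirmed by the log.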
2020-10-02 01:32:28,728 loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/tiger/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2020-10-02 01:32:30,167 Did not load checkpoints
2020-10-02 01:32:30,169 Resnet backbone now has fixed blocks 2
2020-10-02 01:32:31,618 loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/tiger/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
2020-10-02 01:32:31,619 Model config {
"architectures": [
"BertForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"finetuning_task": null,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"num_labels": 2,
"output_attentions": false,
"output_hidden_states": false,
"pad_token_id": 0,
"pruned_heads": {},
"torchscript": false,
"type_vocab_size": 2,
"use_bfloat16": false,
"vocab_size": 30522
}
2020-10-02 01:32:32,949 loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin from cache at /home/tiger/.cache/torch/transformers/aa1ef1aede4482d0dbcd4d52baad8ae300e60902e88fcb0bebdec09afd232066.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
2020-10-02 01:32:35,071 Use adam as the optimizer, with init lr 0.0005
2020-10-02 01:32:35,072 Image encoder is data paralleled now.
2020-10-02 01:32:35,210 Load full model with backbone
2020-10-02 01:32:35,213 Loading dataset
2020-10-02 01:32:38,512 Input mode small: scaled by factor 2.0
2020-10-02 01:32:47,138 Computing results...
2020-10-02 01:33:05,339 Test: [0/196] Le 63.6659 (63.6659) Time 18.197 (0.000)
2020-10-02 01:33:14,825 Test: [10/196] Le 60.8184 (62.7525) Time 0.847 (0.000)
2020-10-02 01:33:22,994 Test: [20/196] Le 64.4445 (62.7450) Time 0.791 (0.000)
2020-10-02 01:33:30,959 Test: [30/196] Le 64.8506 (62.9363) Time 0.820 (0.000)
2020-10-02 01:33:39,122 Test: [40/196] Le 63.3589 (63.1086) Time 0.793 (0.000)
2020-10-02 01:33:47,162 Test: [50/196] Le 64.0212 (63.1067) Time 0.833 (0.000)
2020-10-02 01:33:55,256 Test: [60/196] Le 61.2870 (62.9668) Time 0.788 (0.000)
2020-10-02 01:34:03,214 Test: [70/196] Le 59.9160 (62.8829) Time 0.813 (0.000)
2020-10-02 01:34:11,474 Test: [80/196] Le 63.5098 (63.0116) Time 0.813 (0.000)
2020-10-02 01:34:19,532 Test: [90/196] Le 65.1759 (63.0592) Time 0.820 (0.000)
2020-10-02 01:34:27,738 Test: [100/196] Le 62.5716 (63.0479) Time 0.813 (0.000)
2020-10-02 01:34:35,754 Test: [110/196] Le 60.8049 (63.0152) Time 0.831 (0.000)
2020-10-02 01:34:43,977 Test: [120/196] Le 64.6984 (63.0478) Time 0.794 (0.000)
2020-10-02 01:34:52,081 Test: [130/196] Le 66.0548 (63.0548) Time 0.834 (0.000)
2020-10-02 01:35:00,538 Test: [140/196] Le 68.9694 (63.0795) Time 0.878 (0.000)
2020-10-02 01:35:08,849 Test: [150/196] Le 63.2262 (63.0947) Time 0.886 (0.000)
2020-10-02 01:35:17,131 Test: [160/196] Le 61.3962 (63.0856) Time 0.788 (0.000)
2020-10-02 01:35:25,125 Test: [170/196] Le 62.4964 (63.0277) Time 0.822 (0.000)
2020-10-02 01:35:33,289 Test: [180/196] Le 62.8228 (63.0423) Time 0.811 (0.000)
2020-10-02 01:35:41,210 Test: [190/196] Le 62.2450 (62.9804) Time 0.790 (0.000)
2020-10-02 01:35:47,189 Images: 5000, Captions: 25000
2020-10-02 01:36:11,797 Align loss: 0.8650665053325651
2020-10-02 01:36:11,797 Image uniform loss: -3.7929412346064644
2020-10-02 01:36:11,797 Text uniform loss: -3.8632661255327365
2020-10-02 01:36:21,609 Save the similarity into runs/coco_vsepp_wsl_bert_var_gpool_2/results_testall_5k.npy
2020-10-02 01:36:21,609 calculate similarity time:
2020-10-02 01:36:40,196 rsum: 468.9
2020-10-02 01:36:40,196 Average i2t Recall: 83.5
2020-10-02 01:36:40,196 Image to text: 66.4 89.3 94.6 1.0 3.0
2020-10-02 01:36:40,197 Average t2i Recall: 72.9
2020-10-02 01:36:40,197 Text to image: 51.6 79.3 87.6 1.0 11.5
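For readers who want to reproduce numbers of this shape from a saved similarity matrix, the following is a minimal sketch of image-to-text recall. It is not taken from the evaluation script that produced this log. It assumes a dense matrix `sims` of shape (images, captions), that the captions of image `i` occupy a contiguous block of `caps_per_image` columns (consistent with the 5000 images and 25000 captions reported above), and that the five logged values are R@1, R@5, R@10, median rank and mean rank. The function name `i2t_recall` and the toy matrix are hypothetical, and the on-disk layout of the `.npy` file is not described in the log, so loading it is left out.

```python
import numpy as np


def i2t_recall(sims, caps_per_image=5):
    """Image-to-text retrieval metrics from a similarity matrix.

    sims: array of shape (n_images, n_images * caps_per_image).
    Assumes captions of image i sit in columns
    [caps_per_image * i, caps_per_image * (i + 1)).
    """
    n_images = sims.shape[0]
    ranks = np.empty(n_images)
    for i in range(n_images):
        order = np.argsort(sims[i])[::-1]  # highest-scoring caption first
        gt = np.arange(caps_per_image * i, caps_per_image * (i + 1))
        # 0-based position of the best-placed ground-truth caption
        ranks[i] = np.where(np.isin(order, gt))[0].min()

    r1 = 100.0 * np.mean(ranks < 1)
    r5 = 100.0 * np.mean(ranks < 5)
    r10 = 100.0 * np.mean(ranks < 10)
    medr = np.floor(np.median(ranks)) + 1  # reported as 1-based rank
    meanr = ranks.mean() + 1
    return r1, r5, r10, medr, meanr


# Toy example: 2 images, 2 captions each (made-up scores).
toy = np.array([
    [0.9, 0.1, 0.2, 0.3],  # image 0: its caption 0 is ranked first
    [0.8, 0.7, 0.1, 0.6],  # image 1: its best caption (3) is ranked third
])
r1, r5, r10, medr, meanr = i2t_recall(toy, caps_per_image=2)
print(r1, r5, r10, medr, meanr)
```

Text-to-image recall would follow the same pattern on the transposed matrix, with a single ground-truth image per caption.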