File size: 131,198 Bytes
f1c1609 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 | backup evironment completed !
Loading pth from /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal
******************** All args: *************************************************
align_contiguous = False
align_drop_z = 0
align_keep_percentile = 0.1
align_many_to_one = False
align_one_to_many = False
align_top_band_size = 0
att_hid_size = 512
aux_loss = True
backbone = None
base_cfg_path = cfgs_base/howto/howto-yc2_yc2_ori_(sim_op_order_v2)_UniVL_refine.yml
basic_ss_prob = 0
batch_size = 1
batch_size_for_eval = 1
bbox_loss_coef = 0
beta = 1
cap_dec_n_points = 4
cap_nheads = 1
cap_num_feature_levels = 4
cap_prob_clip = False
caption_cost_type = loss
caption_decoder_type = standard
caption_loss_coef = 2
cfg_path = cfgs_ft_gt/howto-yc2_yc2_univl_topk25_r1_iter3_th1_refine_aug(8,0.02)_top3_2stage_inscap.yml
cl_schedule_time = [0, 2]
cl_schedule_val = [0, 0.1]
clip_context_dim = 512
cls_loss_coef = 2
contrastive_hidden_size = 128
contrastive_loss_start_coef = 0.0
contrastive_loss_temperature = 0.1
cost_alpha = 0.25
cost_gamma = 2
count_loss_coef = 0.5
criteria_for_best_ckpt = overall
current_lr = 5e-05
data_norm = 0
data_rescale = 1
debug = False
dec_layers = 2
dec_n_points = 4
device = cuda
dict_file = data/howto/vocabulary_howto_rate2_yc2.json
dict_file_val = data/howto/vocabulary_howto_rate2_yc2.json
dilation = False
disable_contrastive_projection = 1
disable_cudnn = 0
disable_mid_caption_heads = False
disable_rematch = False
disable_tqdm = False
drop_prob = 0.5
ec_alpha = 1.0
enable_bg_for_cl = True
enable_contrastive = False
enable_cross_video_cl = True
enable_e2t_cl = True
enc_layers = 2
enc_n_points = 4
eos_coef = 0.1
epoch = 20
event_context_dim = None
feature_dim = 768
feature_sample_rate = 1
fix_xcw = 1
focal_alpha = 0.25
focal_gamma = 2.0
focal_mil = False
frame_embedding_num = 200
ft_gt_percent = 1.0
giou_loss_coef = 4
gpu_id = []
grad_clip = 100.0
gt_file_for_auc = data/anet/captiondata/val_all.json
gt_file_for_eval = ['data/yc2/captiondata/yc2_val.json']
gt_file_for_para_eval = ['data/yc2/captiondata/para/para_yc2_val.json']
gt_proposal_sample_num = 20
hidden_dim = 512
hidden_dropout_prob = 0.5
huggingface_cache_dir = .cache
id = seq2-ft(mix)-gt_percent-1.0
id_ori =
input_encoding_size = 512
invalid_video_json = []
iteration = 3
layer_norm_eps = 1e-12
learning_rate_decay_every = 3
learning_rate_decay_rate = 0.5
learning_rate_decay_start = 8
lloss_beta = 1
lloss_cross_entropy = 0
lloss_focal_loss = 0
lloss_gau_mask = 1
lr = 5e-05
lr_backbone = 2e-05
lr_backbone_names = ['None']
lr_linear_proj_mult = 0.1
lr_linear_proj_names = ['reference_points', 'sampling_offsets']
lr_proj = 0
map = True
matcher_type = default
max_caption_len = 50
max_eseq_length = 20
max_pos_num = 500
max_text_input_len = 32
merge_criterion = ins_cap_topk
merge_k_boxes = 3
merge_mode = weighted_sum
mil_loss_coef = 0
min_epoch_when_save = -1
nheads = 8
norm_ins_score = sigmoid
nthreads = 4
num_classes = 1
num_feature_levels = 4
num_layers = 1
num_neg_box = 10
num_queries = 100
optimizer_type = adam
position_embedding = sine
position_embedding_scale = 6.283185307179586
pre_percent = 1.0
pretrain = None
pretrain_path =
pretrained_language_model = UniVL
prior_anchor_duration_init = True
prior_manner = all
pseudo_box_aug = False
pseudo_box_aug_mode = random_range
pseudo_box_aug_num = 8
pseudo_box_aug_ratio = 0.02
pseudo_box_type = similarity_op_order_v2
random_anchor_init = True
random_seed = False
ref_rank_loss_coef = 0.0
refine_pseudo_box = False
refine_pseudo_stage_num = 2
rnn_size = 512
sample_method = nearest
save_all_checkpoint = 0
save_checkpoint_every = 1
save_dir = /mnt/data/pjlab-3090-sport/wuhao/logs/dibs
scheduled_sampling_increase_every = 2
scheduled_sampling_increase_prob = 0.05
scheduled_sampling_max_prob = 0.25
scheduled_sampling_start = -1
seed = 777
self_iou_loss_coef = 0.0
set_cost_bbox = 0
set_cost_caption = 0
set_cost_cl = 0.0
set_cost_class = 2
set_cost_giou = 4
set_cost_sim = 1.0
share_caption_head = 1
soft_attention = 1
start_from =
start_from_mode = last
start_refine_epoch = -1
statistic_mode = mode
test = False
text_encoder_learning_strategy = frozen
text_feature_folder = ['/mnt/data/Gvlab/wuhao/features/howto100m/UniVL/text', '/mnt/data/Gvlab/wuhao/features/yc2/UniVL_features/UniVL_text/']
text_feature_folder_val = ['/mnt/data/Gvlab/wuhao/features/yc2/UniVL_features/UniVL_text/']
text_hidden_dim = 768
top_frames = 25
train_caption_file = ['data/howto/captiondata/howto100m_train.json', 'data/yc2/captiondata/yc2_train.json']
train_proposal_sample_num = 30
train_proposal_type = gt
training_scheme = all
transformer_dropout_prob = 0.1
transformer_ff_dim = 512
transformer_input_type = queries
use_additional_cap_layer = False
use_additional_score_layer = False
use_anchor = 0
use_neg_pseudo_box = False
use_pseudo_box = False
use_query_box_for_refine = 0
val_caption_file = data/yc2/captiondata/yc2_val.json
visual_feature_folder = ['/mnt/data/Gvlab/wuhao/features/howto100m/UniVL/visual', '/mnt/data/Gvlab/wuhao/features/yc2/UniVL_features/UniVL_visual/']
visual_feature_folder_val = ['/mnt/data/Gvlab/wuhao/features/yc2/UniVL_features/UniVL_visual/']
visual_feature_type = ['UniVL']
vocab_size = 14538
vocab_size_val = 14538
weight_decay = 0.0001
weighted_mil_loss = False
width_ratio = 1
width_th = 1
window_size = 3
with_box_refine = 1
wordRNN_input_feats_type = C
******************** Model structure: ******************************************
PDVC(
(base_encoder): BaseEncoder(
(pos_embed): PositionEmbeddingSine(
(duration_embed_layer): Linear(in_features=256, out_features=256, bias=True)
)
(input_proj): ModuleList(
(0): Sequential(
(0): Conv1d(768, 512, kernel_size=(1,), stride=(1,))
(1): GroupNorm(32, 512, eps=1e-05, affine=True)
)
(1): Sequential(
(0): Conv1d(768, 512, kernel_size=(3,), stride=(2,), padding=(1,))
(1): GroupNorm(32, 512, eps=1e-05, affine=True)
)
(2): Sequential(
(0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), padding=(1,))
(1): GroupNorm(32, 512, eps=1e-05, affine=True)
)
(3): Sequential(
(0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), padding=(1,))
(1): GroupNorm(32, 512, eps=1e-05, affine=True)
)
)
)
(transformer): DeformableTransformer(
(encoder): DeformableTransformerEncoder(
(layers): ModuleList(
(0): DeformableTransformerEncoderLayer(
(self_attn): MSDeformAttn(
(sampling_offsets): Linear(in_features=512, out_features=128, bias=True)
(attention_weights): Linear(in_features=512, out_features=128, bias=True)
(value_proj): Linear(in_features=512, out_features=512, bias=True)
(output_proj): Linear(in_features=512, out_features=512, bias=True)
)
(dropout1): Dropout(p=0.1, inplace=False)
(norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(linear1): Linear(in_features=512, out_features=512, bias=True)
(dropout2): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=512, out_features=512, bias=True)
(dropout3): Dropout(p=0.1, inplace=False)
(norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(1): DeformableTransformerEncoderLayer(
(self_attn): MSDeformAttn(
(sampling_offsets): Linear(in_features=512, out_features=128, bias=True)
(attention_weights): Linear(in_features=512, out_features=128, bias=True)
(value_proj): Linear(in_features=512, out_features=512, bias=True)
(output_proj): Linear(in_features=512, out_features=512, bias=True)
)
(dropout1): Dropout(p=0.1, inplace=False)
(norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(linear1): Linear(in_features=512, out_features=512, bias=True)
(dropout2): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=512, out_features=512, bias=True)
(dropout3): Dropout(p=0.1, inplace=False)
(norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
)
)
(decoder): DeformableTransformerDecoder(
(layers): ModuleList(
(0): DeformableTransformerDecoderLayer(
(cross_attn): MSDeformAttn(
(sampling_offsets): Linear(in_features=512, out_features=128, bias=True)
(attention_weights): Linear(in_features=512, out_features=128, bias=True)
(value_proj): Linear(in_features=512, out_features=512, bias=True)
(output_proj): Linear(in_features=512, out_features=512, bias=True)
)
(dropout1): Dropout(p=0.1, inplace=False)
(norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
)
(dropout2): Dropout(p=0.1, inplace=False)
(norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(linear1): Linear(in_features=512, out_features=512, bias=True)
(dropout3): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=512, out_features=512, bias=True)
(dropout4): Dropout(p=0.1, inplace=False)
(norm3): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(1): DeformableTransformerDecoderLayer(
(cross_attn): MSDeformAttn(
(sampling_offsets): Linear(in_features=512, out_features=128, bias=True)
(attention_weights): Linear(in_features=512, out_features=128, bias=True)
(value_proj): Linear(in_features=512, out_features=512, bias=True)
(output_proj): Linear(in_features=512, out_features=512, bias=True)
)
(dropout1): Dropout(p=0.1, inplace=False)
(norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(self_attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
)
(dropout2): Dropout(p=0.1, inplace=False)
(norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(linear1): Linear(in_features=512, out_features=512, bias=True)
(dropout3): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=512, out_features=512, bias=True)
(dropout4): Dropout(p=0.1, inplace=False)
(norm3): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
)
(bbox_head): ModuleList(
(0): MLP(
(layers): ModuleList(
(0): Linear(in_features=512, out_features=512, bias=True)
(1): Linear(in_features=512, out_features=512, bias=True)
(2): Linear(in_features=512, out_features=2, bias=True)
)
)
(1): MLP(
(layers): ModuleList(
(0): Linear(in_features=512, out_features=512, bias=True)
(1): Linear(in_features=512, out_features=512, bias=True)
(2): Linear(in_features=512, out_features=2, bias=True)
)
)
)
)
(pos_trans): Linear(in_features=512, out_features=1024, bias=True)
(pos_trans_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(reference_points): Linear(in_features=512, out_features=1, bias=True)
)
(caption_head): ModuleList(
(0): LSTMDSACaptioner(
(embed): Embedding(14539, 512)
(logit): Linear(in_features=512, out_features=14539, bias=True)
(dropout): Dropout(p=0.5, inplace=False)
(core): ShowAttendTellCore(
(rnn): LSTM(1536, 512, bias=False, dropout=0.5)
(att_drop): Dropout(p=0.5, inplace=False)
(deformable_att): MSDeformAttnCap(
(sampling_offsets): Linear(in_features=1024, out_features=16, bias=True)
(attention_weights): Linear(in_features=1024, out_features=16, bias=True)
(value_proj): Linear(in_features=512, out_features=512, bias=True)
(output_proj): Linear(in_features=512, out_features=512, bias=True)
)
(ctx2att): Linear(in_features=512, out_features=512, bias=True)
(h2att): Linear(in_features=512, out_features=512, bias=True)
(alpha_net): Linear(in_features=512, out_features=1, bias=True)
)
)
(1): LSTMDSACaptioner(
(embed): Embedding(14539, 512)
(logit): Linear(in_features=512, out_features=14539, bias=True)
(dropout): Dropout(p=0.5, inplace=False)
(core): ShowAttendTellCore(
(rnn): LSTM(1536, 512, bias=False, dropout=0.5)
(att_drop): Dropout(p=0.5, inplace=False)
(deformable_att): MSDeformAttnCap(
(sampling_offsets): Linear(in_features=1024, out_features=16, bias=True)
(attention_weights): Linear(in_features=1024, out_features=16, bias=True)
(value_proj): Linear(in_features=512, out_features=512, bias=True)
(output_proj): Linear(in_features=512, out_features=512, bias=True)
)
(ctx2att): Linear(in_features=512, out_features=512, bias=True)
(h2att): Linear(in_features=512, out_features=512, bias=True)
(alpha_net): Linear(in_features=512, out_features=1, bias=True)
)
)
)
(query_embed): Embedding(100, 1024)
(class_head): ModuleList(
(0): Linear(in_features=512, out_features=1, bias=True)
(1): Linear(in_features=512, out_features=1, bias=True)
)
(class_refine_head): ModuleList(
(0): Linear(in_features=512, out_features=1, bias=True)
(1): Linear(in_features=512, out_features=1, bias=True)
)
(count_head): ModuleList(
(0): Linear(in_features=512, out_features=21, bias=True)
(1): Linear(in_features=512, out_features=21, bias=True)
)
(bbox_head): ModuleList(
(0): MLP(
(layers): ModuleList(
(0): Linear(in_features=512, out_features=512, bias=True)
(1): Linear(in_features=512, out_features=512, bias=True)
(2): Linear(in_features=512, out_features=2, bias=True)
)
)
(1): MLP(
(layers): ModuleList(
(0): Linear(in_features=512, out_features=512, bias=True)
(1): Linear(in_features=512, out_features=512, bias=True)
(2): Linear(in_features=512, out_features=2, bias=True)
)
)
)
(contrastive_projection_event): ModuleList(
(0): Identity()
(1): Identity()
)
(contrastive_projection_text): ModuleList(
(0): Identity()
(1): Identity()
)
)
******************** Strat training ! ******************************************
loss type: dict_keys(['loss_ce', 'loss_bbox', 'loss_giou', 'loss_counter', 'loss_caption', 'contrastive_loss', 'loss_ce_0', 'loss_bbox_0', 'loss_giou_0', 'loss_counter_0', 'loss_caption_0', 'contrastive_loss_0'])
loss weights: dict_values([2, 0, 4, 0.5, 2, 0.0, 2, 0, 4, 0.5, 2, 0.0])
ID seq2-ft(mix)-gt_percent-1.0 iter 133 (epoch 0),
loss = OrderedDict([('loss_ce', 0.336), ('loss_counter', 0.129), ('loss_bbox', 0.039), ('loss_giou', 0.368), ('loss_self_iou', 0.028), ('cardinality_error', 7.797), ('loss_ce_0', 0.337), ('loss_counter_0', 0.13), ('loss_bbox_0', 0.041), ('loss_giou_0', 0.381), ('loss_self_iou_0', 0.03), ('cardinality_error_0', 7.797), ('loss_caption_0', 2.755), ('loss_caption', 2.681), ('total_loss', 15.341)]),
time/iter = 0.172, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 266 (epoch 0),
loss = OrderedDict([('loss_ce', 0.324), ('loss_counter', 0.129), ('loss_bbox', 0.036), ('loss_giou', 0.369), ('loss_self_iou', 0.018), ('cardinality_error', 7.812), ('loss_ce_0', 0.341), ('loss_counter_0', 0.132), ('loss_bbox_0', 0.039), ('loss_giou_0', 0.38), ('loss_self_iou_0', 0.019), ('cardinality_error_0', 7.812), ('loss_caption_0', 2.803), ('loss_caption', 2.638), ('total_loss', 15.341)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 399 (epoch 0),
loss = OrderedDict([('loss_ce', 0.312), ('loss_counter', 0.13), ('loss_bbox', 0.039), ('loss_giou', 0.375), ('loss_self_iou', 0.02), ('cardinality_error', 7.835), ('loss_ce_0', 0.324), ('loss_counter_0', 0.132), ('loss_bbox_0', 0.043), ('loss_giou_0', 0.395), ('loss_self_iou_0', 0.021), ('cardinality_error_0', 7.835), ('loss_caption_0', 2.81), ('loss_caption', 2.676), ('total_loss', 15.459)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 532 (epoch 0),
loss = OrderedDict([('loss_ce', 0.307), ('loss_counter', 0.133), ('loss_bbox', 0.044), ('loss_giou', 0.394), ('loss_self_iou', 0.02), ('cardinality_error', 7.902), ('loss_ce_0', 0.319), ('loss_counter_0', 0.133), ('loss_bbox_0', 0.05), ('loss_giou_0', 0.421), ('loss_self_iou_0', 0.026), ('cardinality_error_0', 7.902), ('loss_caption_0', 2.817), ('loss_caption', 2.654), ('total_loss', 15.588)]),
time/iter = 0.167, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 665 (epoch 0),
loss = OrderedDict([('loss_ce', 0.312), ('loss_counter', 0.135), ('loss_bbox', 0.034), ('loss_giou', 0.345), ('loss_self_iou', 0.017), ('cardinality_error', 7.805), ('loss_ce_0', 0.319), ('loss_counter_0', 0.131), ('loss_bbox_0', 0.038), ('loss_giou_0', 0.372), ('loss_self_iou_0', 0.019), ('cardinality_error_0', 7.805), ('loss_caption_0', 2.758), ('loss_caption', 2.635), ('total_loss', 15.049)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 798 (epoch 0),
loss = OrderedDict([('loss_ce', 0.321), ('loss_counter', 0.125), ('loss_bbox', 0.03), ('loss_giou', 0.319), ('loss_self_iou', 0.015), ('cardinality_error', 7.774), ('loss_ce_0', 0.331), ('loss_counter_0', 0.124), ('loss_bbox_0', 0.032), ('loss_giou_0', 0.344), ('loss_self_iou_0', 0.015), ('cardinality_error_0', 7.774), ('loss_caption_0', 2.66), ('loss_caption', 2.559), ('total_loss', 14.519)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 931 (epoch 0),
loss = OrderedDict([('loss_ce', 0.327), ('loss_counter', 0.122), ('loss_bbox', 0.027), ('loss_giou', 0.306), ('loss_self_iou', 0.011), ('cardinality_error', 7.865), ('loss_ce_0', 0.346), ('loss_counter_0', 0.123), ('loss_bbox_0', 0.029), ('loss_giou_0', 0.327), ('loss_self_iou_0', 0.012), ('cardinality_error_0', 7.865), ('loss_caption_0', 2.54), ('loss_caption', 2.468), ('total_loss', 14.017)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 1064 (epoch 0),
loss = OrderedDict([('loss_ce', 0.331), ('loss_counter', 0.121), ('loss_bbox', 0.027), ('loss_giou', 0.292), ('loss_self_iou', 0.01), ('cardinality_error', 7.579), ('loss_ce_0', 0.345), ('loss_counter_0', 0.127), ('loss_bbox_0', 0.028), ('loss_giou_0', 0.311), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.579), ('loss_caption_0', 2.639), ('loss_caption', 2.626), ('total_loss', 14.419)]),
time/iter = 0.163, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 1197 (epoch 0),
loss = OrderedDict([('loss_ce', 0.325), ('loss_counter', 0.118), ('loss_bbox', 0.026), ('loss_giou', 0.296), ('loss_self_iou', 0.011), ('cardinality_error', 7.241), ('loss_ce_0', 0.339), ('loss_counter_0', 0.121), ('loss_bbox_0', 0.028), ('loss_giou_0', 0.317), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.241), ('loss_caption_0', 2.501), ('loss_caption', 2.496), ('total_loss', 13.892)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 1330 (epoch 0),
loss = OrderedDict([('loss_ce', 0.327), ('loss_counter', 0.126), ('loss_bbox', 0.026), ('loss_giou', 0.304), ('loss_self_iou', 0.011), ('cardinality_error', 7.94), ('loss_ce_0', 0.334), ('loss_counter_0', 0.127), ('loss_bbox_0', 0.029), ('loss_giou_0', 0.332), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.94), ('loss_caption_0', 2.635), ('loss_caption', 2.619), ('total_loss', 14.504)]),
time/iter = 0.158, bad_vid = 0.000
Validation results of iter 1333:
Bleu_1:0.16894357888730638
Bleu_2:0.09902176620134434
Bleu_3:0.05312286436412136
Bleu_4:0.026212861867102137
METEOR:0.0791142699299577
ROUGE_L:0.15563765109454591
CIDEr:0.4087091055845523
Recall:0.1991554685892762
Precision:0.40083793546594454
soda_c:0.05642652494419026
para_Bleu_1:0.28013834967939705
para_Bleu_2:0.16393959632782257
para_Bleu_3:0.09809744775628881
para_Bleu_4:0.060378126412557326
para_METEOR:0.1286956339033507
para_ROUGE_L:0.29903071052996405
para_CIDEr:0.14675303603221324
overall score of iter 1333: 0.3358267963481213
Save model at iter 1333 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 1333 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 1463 (epoch 1),
loss = OrderedDict([('loss_ce', 0.322), ('loss_counter', 0.128), ('loss_bbox', 0.026), ('loss_giou', 0.301), ('loss_self_iou', 0.011), ('cardinality_error', 7.699), ('loss_ce_0', 0.335), ('loss_counter_0', 0.129), ('loss_bbox_0', 0.026), ('loss_giou_0', 0.316), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.699), ('loss_caption_0', 2.448), ('loss_caption', 2.462), ('total_loss', 13.729)]),
time/iter = 0.660, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 1596 (epoch 1),
loss = OrderedDict([('loss_ce', 0.311), ('loss_counter', 0.126), ('loss_bbox', 0.022), ('loss_giou', 0.284), ('loss_self_iou', 0.01), ('cardinality_error', 8.233), ('loss_ce_0', 0.322), ('loss_counter_0', 0.123), ('loss_bbox_0', 0.024), ('loss_giou_0', 0.31), ('loss_self_iou_0', 0.01), ('cardinality_error_0', 8.233), ('loss_caption_0', 2.348), ('loss_caption', 2.348), ('total_loss', 13.16)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 1729 (epoch 1),
loss = OrderedDict([('loss_ce', 0.311), ('loss_counter', 0.124), ('loss_bbox', 0.023), ('loss_giou', 0.273), ('loss_self_iou', 0.01), ('cardinality_error', 7.632), ('loss_ce_0', 0.32), ('loss_counter_0', 0.124), ('loss_bbox_0', 0.026), ('loss_giou_0', 0.307), ('loss_self_iou_0', 0.012), ('cardinality_error_0', 7.632), ('loss_caption_0', 2.363), ('loss_caption', 2.353), ('total_loss', 13.14)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 1862 (epoch 1),
loss = OrderedDict([('loss_ce', 0.316), ('loss_counter', 0.12), ('loss_bbox', 0.023), ('loss_giou', 0.268), ('loss_self_iou', 0.01), ('cardinality_error', 7.609), ('loss_ce_0', 0.32), ('loss_counter_0', 0.119), ('loss_bbox_0', 0.025), ('loss_giou_0', 0.29), ('loss_self_iou_0', 0.01), ('cardinality_error_0', 7.609), ('loss_caption_0', 2.439), ('loss_caption', 2.419), ('total_loss', 13.343)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 1995 (epoch 1),
loss = OrderedDict([('loss_ce', 0.314), ('loss_counter', 0.122), ('loss_bbox', 0.022), ('loss_giou', 0.281), ('loss_self_iou', 0.009), ('cardinality_error', 7.541), ('loss_ce_0', 0.322), ('loss_counter_0', 0.122), ('loss_bbox_0', 0.025), ('loss_giou_0', 0.309), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.541), ('loss_caption_0', 2.503), ('loss_caption', 2.503), ('total_loss', 13.766)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 2128 (epoch 1),
loss = OrderedDict([('loss_ce', 0.316), ('loss_counter', 0.126), ('loss_bbox', 0.024), ('loss_giou', 0.284), ('loss_self_iou', 0.009), ('cardinality_error', 7.789), ('loss_ce_0', 0.324), ('loss_counter_0', 0.125), ('loss_bbox_0', 0.026), ('loss_giou_0', 0.301), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.789), ('loss_caption_0', 2.5), ('loss_caption', 2.493), ('total_loss', 13.73)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 2261 (epoch 1),
loss = OrderedDict([('loss_ce', 0.31), ('loss_counter', 0.122), ('loss_bbox', 0.023), ('loss_giou', 0.285), ('loss_self_iou', 0.012), ('cardinality_error', 7.902), ('loss_ce_0', 0.316), ('loss_counter_0', 0.12), ('loss_bbox_0', 0.025), ('loss_giou_0', 0.304), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.902), ('loss_caption_0', 2.425), ('loss_caption', 2.424), ('total_loss', 13.426)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 2394 (epoch 1),
loss = OrderedDict([('loss_ce', 0.315), ('loss_counter', 0.126), ('loss_bbox', 0.025), ('loss_giou', 0.29), ('loss_self_iou', 0.011), ('cardinality_error', 7.534), ('loss_ce_0', 0.323), ('loss_counter_0', 0.125), ('loss_bbox_0', 0.026), ('loss_giou_0', 0.308), ('loss_self_iou_0', 0.01), ('cardinality_error_0', 7.534), ('loss_caption_0', 2.439), ('loss_caption', 2.435), ('total_loss', 13.54)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 2527 (epoch 1),
loss = OrderedDict([('loss_ce', 0.313), ('loss_counter', 0.125), ('loss_bbox', 0.023), ('loss_giou', 0.276), ('loss_self_iou', 0.009), ('cardinality_error', 7.647), ('loss_ce_0', 0.319), ('loss_counter_0', 0.123), ('loss_bbox_0', 0.025), ('loss_giou_0', 0.296), ('loss_self_iou_0', 0.01), ('cardinality_error_0', 7.647), ('loss_caption_0', 2.454), ('loss_caption', 2.455), ('total_loss', 13.492)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 2660 (epoch 1),
loss = OrderedDict([('loss_ce', 0.313), ('loss_counter', 0.131), ('loss_bbox', 0.023), ('loss_giou', 0.273), ('loss_self_iou', 0.01), ('cardinality_error', 8.0), ('loss_ce_0', 0.317), ('loss_counter_0', 0.128), ('loss_bbox_0', 0.026), ('loss_giou_0', 0.294), ('loss_self_iou_0', 0.01), ('cardinality_error_0', 8.0), ('loss_caption_0', 2.464), ('loss_caption', 2.451), ('total_loss', 13.487)]),
time/iter = 0.167, bad_vid = 0.000
Validation results of iter 2666:
Bleu_1:0.18247710374533507
Bleu_2:0.10433126216854799
Bleu_3:0.05471515540980739
Bleu_4:0.025315544998990337
METEOR:0.08392673175891194
ROUGE_L:0.16810710582244187
CIDEr:0.48711946137609907
Recall:0.23104975652842194
Precision:0.4442690424090867
soda_c:0.06454827356060923
para_Bleu_1:0.27953804293947354
para_Bleu_2:0.1635778619591909
para_Bleu_3:0.09761782578266559
para_Bleu_4:0.060085255296605154
para_METEOR:0.13134445752685775
para_ROUGE_L:0.3040652157082556
para_CIDEr:0.15701615141849948
overall score of iter 2666: 0.34844586424196233
Save model at iter 2666 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 2666 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 2793 (epoch 2),
loss = OrderedDict([('loss_ce', 0.309), ('loss_counter', 0.119), ('loss_bbox', 0.021), ('loss_giou', 0.26), ('loss_self_iou', 0.01), ('cardinality_error', 7.556), ('loss_ce_0', 0.312), ('loss_counter_0', 0.118), ('loss_bbox_0', 0.024), ('loss_giou_0', 0.285), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.556), ('loss_caption_0', 2.27), ('loss_caption', 2.276), ('total_loss', 12.632)]),
time/iter = 0.666, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 2926 (epoch 2),
loss = OrderedDict([('loss_ce', 0.313), ('loss_counter', 0.121), ('loss_bbox', 0.023), ('loss_giou', 0.266), ('loss_self_iou', 0.008), ('cardinality_error', 7.444), ('loss_ce_0', 0.317), ('loss_counter_0', 0.118), ('loss_bbox_0', 0.025), ('loss_giou_0', 0.287), ('loss_self_iou_0', 0.01), ('cardinality_error_0', 7.444), ('loss_caption_0', 2.276), ('loss_caption', 2.291), ('total_loss', 12.726)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 3059 (epoch 2),
loss = OrderedDict([('loss_ce', 0.298), ('loss_counter', 0.127), ('loss_bbox', 0.02), ('loss_giou', 0.272), ('loss_self_iou', 0.008), ('cardinality_error', 8.135), ('loss_ce_0', 0.302), ('loss_counter_0', 0.125), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.296), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 8.135), ('loss_caption_0', 2.364), ('loss_caption', 2.364), ('total_loss', 13.057)]),
time/iter = 0.165, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 3192 (epoch 2),
loss = OrderedDict([('loss_ce', 0.301), ('loss_counter', 0.122), ('loss_bbox', 0.022), ('loss_giou', 0.266), ('loss_self_iou', 0.008), ('cardinality_error', 7.699), ('loss_ce_0', 0.306), ('loss_counter_0', 0.121), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.286), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.699), ('loss_caption_0', 2.367), ('loss_caption', 2.381), ('total_loss', 13.038)]),
time/iter = 0.177, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 3325 (epoch 2),
loss = OrderedDict([('loss_ce', 0.3), ('loss_counter', 0.123), ('loss_bbox', 0.021), ('loss_giou', 0.274), ('loss_self_iou', 0.009), ('cardinality_error', 7.932), ('loss_ce_0', 0.3), ('loss_counter_0', 0.121), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.291), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.932), ('loss_caption_0', 2.323), ('loss_caption', 2.33), ('total_loss', 12.887)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 3458 (epoch 2),
loss = OrderedDict([('loss_ce', 0.31), ('loss_counter', 0.124), ('loss_bbox', 0.021), ('loss_giou', 0.277), ('loss_self_iou', 0.01), ('cardinality_error', 7.865), ('loss_ce_0', 0.31), ('loss_counter_0', 0.123), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.295), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.865), ('loss_caption_0', 2.351), ('loss_caption', 2.341), ('total_loss', 13.038)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 3591 (epoch 2),
loss = OrderedDict([('loss_ce', 0.306), ('loss_counter', 0.114), ('loss_bbox', 0.022), ('loss_giou', 0.263), ('loss_self_iou', 0.009), ('cardinality_error', 7.586), ('loss_ce_0', 0.308), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.024), ('loss_giou_0', 0.285), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.586), ('loss_caption_0', 2.222), ('loss_caption', 2.223), ('total_loss', 12.425)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 3724 (epoch 2),
loss = OrderedDict([('loss_ce', 0.305), ('loss_counter', 0.123), ('loss_bbox', 0.023), ('loss_giou', 0.265), ('loss_self_iou', 0.009), ('cardinality_error', 7.624), ('loss_ce_0', 0.307), ('loss_counter_0', 0.121), ('loss_bbox_0', 0.024), ('loss_giou_0', 0.279), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.624), ('loss_caption_0', 2.38), ('loss_caption', 2.368), ('total_loss', 13.014)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 3857 (epoch 2),
loss = OrderedDict([('loss_ce', 0.306), ('loss_counter', 0.115), ('loss_bbox', 0.021), ('loss_giou', 0.264), ('loss_self_iou', 0.009), ('cardinality_error', 7.489), ('loss_ce_0', 0.312), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.279), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.489), ('loss_caption_0', 2.343), ('loss_caption', 2.344), ('total_loss', 12.897)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 3990 (epoch 2),
loss = OrderedDict([('loss_ce', 0.299), ('loss_counter', 0.134), ('loss_bbox', 0.02), ('loss_giou', 0.268), ('loss_self_iou', 0.012), ('cardinality_error', 8.301), ('loss_ce_0', 0.299), ('loss_counter_0', 0.131), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.289), ('loss_self_iou_0', 0.013), ('cardinality_error_0', 8.301), ('loss_caption_0', 2.327), ('loss_caption', 2.346), ('total_loss', 12.9)]),
time/iter = 0.154, bad_vid = 0.000
Validation results of iter 3999:
Bleu_1:0.18812761655735627
Bleu_2:0.11394688266117041
Bleu_3:0.06350983100569632
Bleu_4:0.03295035253718016
METEOR:0.08673497362280043
ROUGE_L:0.17099683701262633
CIDEr:0.534654554166069
Recall:0.2545535313519452
Precision:0.4357073390990242
soda_c:0.06940030844072555
para_Bleu_1:0.31911536052560924
para_Bleu_2:0.19074275606485158
para_Bleu_3:0.11503629156908896
para_Bleu_4:0.07096292455051724
para_METEOR:0.14141970569772275
para_ROUGE_L:0.3133292457236414
para_CIDEr:0.18756071216976763
overall score of iter 3999: 0.3999433424180076
Save model at iter 3999 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 3999 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 4123 (epoch 3),
loss = OrderedDict([('loss_ce', 0.305), ('loss_counter', 0.129), ('loss_bbox', 0.021), ('loss_giou', 0.256), ('loss_self_iou', 0.008), ('cardinality_error', 7.925), ('loss_ce_0', 0.307), ('loss_counter_0', 0.126), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.275), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.925), ('loss_caption_0', 2.272), ('loss_caption', 2.28), ('total_loss', 12.579)]),
time/iter = 0.678, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 4256 (epoch 3),
loss = OrderedDict([('loss_ce', 0.308), ('loss_counter', 0.121), ('loss_bbox', 0.02), ('loss_giou', 0.256), ('loss_self_iou', 0.008), ('cardinality_error', 7.632), ('loss_ce_0', 0.31), ('loss_counter_0', 0.12), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.276), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.632), ('loss_caption_0', 2.247), ('loss_caption', 2.252), ('total_loss', 12.484)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 4389 (epoch 3),
loss = OrderedDict([('loss_ce', 0.305), ('loss_counter', 0.12), ('loss_bbox', 0.021), ('loss_giou', 0.26), ('loss_self_iou', 0.011), ('cardinality_error', 7.526), ('loss_ce_0', 0.309), ('loss_counter_0', 0.119), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.272), ('loss_self_iou_0', 0.01), ('cardinality_error_0', 7.526), ('loss_caption_0', 2.194), ('loss_caption', 2.205), ('total_loss', 12.273)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 4522 (epoch 3),
loss = OrderedDict([('loss_ce', 0.305), ('loss_counter', 0.115), ('loss_bbox', 0.019), ('loss_giou', 0.248), ('loss_self_iou', 0.007), ('cardinality_error', 7.519), ('loss_ce_0', 0.303), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.262), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.519), ('loss_caption_0', 2.335), ('loss_caption', 2.326), ('total_loss', 12.689)]),
time/iter = 0.170, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 4655 (epoch 3),
loss = OrderedDict([('loss_ce', 0.297), ('loss_counter', 0.122), ('loss_bbox', 0.02), ('loss_giou', 0.263), ('loss_self_iou', 0.008), ('cardinality_error', 7.97), ('loss_ce_0', 0.298), ('loss_counter_0', 0.121), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.285), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.97), ('loss_caption_0', 2.254), ('loss_caption', 2.267), ('total_loss', 12.545)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 4788 (epoch 3),
loss = OrderedDict([('loss_ce', 0.308), ('loss_counter', 0.118), ('loss_bbox', 0.021), ('loss_giou', 0.253), ('loss_self_iou', 0.008), ('cardinality_error', 7.481), ('loss_ce_0', 0.308), ('loss_counter_0', 0.118), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.268), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.481), ('loss_caption_0', 2.208), ('loss_caption', 2.195), ('total_loss', 12.24)]),
time/iter = 0.151, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 4921 (epoch 3),
loss = OrderedDict([('loss_ce', 0.306), ('loss_counter', 0.12), ('loss_bbox', 0.019), ('loss_giou', 0.262), ('loss_self_iou', 0.01), ('cardinality_error', 7.842), ('loss_ce_0', 0.305), ('loss_counter_0', 0.119), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.284), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.842), ('loss_caption_0', 2.186), ('loss_caption', 2.196), ('total_loss', 12.289)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 5054 (epoch 3),
loss = OrderedDict([('loss_ce', 0.303), ('loss_counter', 0.121), ('loss_bbox', 0.022), ('loss_giou', 0.26), ('loss_self_iou', 0.009), ('cardinality_error', 7.887), ('loss_ce_0', 0.305), ('loss_counter_0', 0.12), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.271), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.887), ('loss_caption_0', 2.242), ('loss_caption', 2.239), ('total_loss', 12.422)]),
time/iter = 0.170, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 5187 (epoch 3),
loss = OrderedDict([('loss_ce', 0.303), ('loss_counter', 0.124), ('loss_bbox', 0.021), ('loss_giou', 0.262), ('loss_self_iou', 0.009), ('cardinality_error', 7.932), ('loss_ce_0', 0.305), ('loss_counter_0', 0.123), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.277), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.932), ('loss_caption_0', 2.25), ('loss_caption', 2.246), ('total_loss', 12.483)]),
time/iter = 0.166, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 5320 (epoch 3),
loss = OrderedDict([('loss_ce', 0.299), ('loss_counter', 0.12), ('loss_bbox', 0.022), ('loss_giou', 0.26), ('loss_self_iou', 0.006), ('cardinality_error', 7.729), ('loss_ce_0', 0.298), ('loss_counter_0', 0.12), ('loss_bbox_0', 0.024), ('loss_giou_0', 0.279), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.729), ('loss_caption_0', 2.287), ('loss_caption', 2.298), ('total_loss', 12.64)]),
time/iter = 0.161, bad_vid = 0.000
Validation results of iter 5332:
Bleu_1:0.19536023703614988
Bleu_2:0.11676341716851109
Bleu_3:0.06337153157323498
Bleu_4:0.031788948303475714
METEOR:0.09287502887069582
ROUGE_L:0.18168372139225142
CIDEr:0.5345089450528974
Recall:0.26186565000159123
Precision:0.4578470702650138
soda_c:0.06891495599002981
para_Bleu_1:0.3645537642333956
para_Bleu_2:0.21504928179111618
para_Bleu_3:0.1297486406737134
para_Bleu_4:0.08010111193897063
para_METEOR:0.1518569517959942
para_ROUGE_L:0.3241825281759821
para_CIDEr:0.22211083978975357
overall score of iter 5332: 0.4540689035247184
Save model at iter 5332 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 5332 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 5453 (epoch 4),
loss = OrderedDict([('loss_ce', 0.301), ('loss_counter', 0.113), ('loss_bbox', 0.022), ('loss_giou', 0.25), ('loss_self_iou', 0.011), ('cardinality_error', 7.519), ('loss_ce_0', 0.298), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.269), ('loss_self_iou_0', 0.011), ('cardinality_error_0', 7.519), ('loss_caption_0', 2.175), ('loss_caption', 2.176), ('total_loss', 12.088)]),
time/iter = 0.716, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 5586 (epoch 4),
loss = OrderedDict([('loss_ce', 0.294), ('loss_counter', 0.12), ('loss_bbox', 0.018), ('loss_giou', 0.252), ('loss_self_iou', 0.007), ('cardinality_error', 7.662), ('loss_ce_0', 0.292), ('loss_counter_0', 0.119), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.274), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.662), ('loss_caption_0', 2.16), ('loss_caption', 2.132), ('total_loss', 11.979)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 5719 (epoch 4),
loss = OrderedDict([('loss_ce', 0.305), ('loss_counter', 0.13), ('loss_bbox', 0.02), ('loss_giou', 0.255), ('loss_self_iou', 0.008), ('cardinality_error', 8.451), ('loss_ce_0', 0.302), ('loss_counter_0', 0.127), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.273), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 8.451), ('loss_caption_0', 2.166), ('loss_caption', 2.164), ('total_loss', 12.113)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 5852 (epoch 4),
loss = OrderedDict([('loss_ce', 0.301), ('loss_counter', 0.12), ('loss_bbox', 0.019), ('loss_giou', 0.246), ('loss_self_iou', 0.007), ('cardinality_error', 7.835), ('loss_ce_0', 0.302), ('loss_counter_0', 0.12), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.267), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.835), ('loss_caption_0', 2.122), ('loss_caption', 2.111), ('total_loss', 11.841)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 5985 (epoch 4),
loss = OrderedDict([('loss_ce', 0.304), ('loss_counter', 0.122), ('loss_bbox', 0.02), ('loss_giou', 0.243), ('loss_self_iou', 0.009), ('cardinality_error', 7.474), ('loss_ce_0', 0.298), ('loss_counter_0', 0.12), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.263), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.474), ('loss_caption_0', 2.149), ('loss_caption', 2.14), ('total_loss', 11.926)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 6118 (epoch 4),
loss = OrderedDict([('loss_ce', 0.3), ('loss_counter', 0.113), ('loss_bbox', 0.018), ('loss_giou', 0.241), ('loss_self_iou', 0.008), ('cardinality_error', 7.639), ('loss_ce_0', 0.302), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.259), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.639), ('loss_caption_0', 2.235), ('loss_caption', 2.215), ('total_loss', 12.218)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 6251 (epoch 4),
loss = OrderedDict([('loss_ce', 0.301), ('loss_counter', 0.125), ('loss_bbox', 0.02), ('loss_giou', 0.251), ('loss_self_iou', 0.007), ('cardinality_error', 7.857), ('loss_ce_0', 0.301), ('loss_counter_0', 0.125), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.268), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.857), ('loss_caption_0', 2.235), ('loss_caption', 2.226), ('total_loss', 12.328)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 6384 (epoch 4),
loss = OrderedDict([('loss_ce', 0.302), ('loss_counter', 0.124), ('loss_bbox', 0.02), ('loss_giou', 0.246), ('loss_self_iou', 0.006), ('cardinality_error', 7.82), ('loss_ce_0', 0.301), ('loss_counter_0', 0.124), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.265), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.82), ('loss_caption_0', 2.208), ('loss_caption', 2.183), ('total_loss', 12.157)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 6517 (epoch 4),
loss = OrderedDict([('loss_ce', 0.297), ('loss_counter', 0.12), ('loss_bbox', 0.02), ('loss_giou', 0.256), ('loss_self_iou', 0.008), ('cardinality_error', 7.872), ('loss_ce_0', 0.295), ('loss_counter_0', 0.118), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.271), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.872), ('loss_caption_0', 2.135), ('loss_caption', 2.155), ('total_loss', 11.99)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 6650 (epoch 4),
loss = OrderedDict([('loss_ce', 0.297), ('loss_counter', 0.112), ('loss_bbox', 0.021), ('loss_giou', 0.244), ('loss_self_iou', 0.008), ('cardinality_error', 7.398), ('loss_ce_0', 0.297), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.26), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.398), ('loss_caption_0', 2.205), ('loss_caption', 2.202), ('total_loss', 12.127)]),
time/iter = 0.156, bad_vid = 0.000
Validation results of iter 6665:
Bleu_1:0.19366491706119263
Bleu_2:0.1161802397372496
Bleu_3:0.06381908710297783
Bleu_4:0.0310996008751752
METEOR:0.0900086447067842
ROUGE_L:0.1772625018945245
CIDEr:0.5329339889166991
Recall:0.27822837264850414
Precision:0.4414053002674447
soda_c:0.0725148309247326
para_Bleu_1:0.36779729697992286
para_Bleu_2:0.2189609464261768
para_Bleu_3:0.13170237886801614
para_Bleu_4:0.08102932652379062
para_METEOR:0.15287168689015676
para_ROUGE_L:0.32609559286330886
para_CIDEr:0.24981796796266917
overall score of iter 6665: 0.48371898137661656
Save model at iter 6665 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 6665 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 6783 (epoch 5),
loss = OrderedDict([('loss_ce', 0.29), ('loss_counter', 0.117), ('loss_bbox', 0.019), ('loss_giou', 0.24), ('loss_self_iou', 0.007), ('cardinality_error', 7.586), ('loss_ce_0', 0.29), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.257), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.586), ('loss_caption_0', 2.02), ('loss_caption', 2.014), ('total_loss', 11.332)]),
time/iter = 0.689, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 6916 (epoch 5),
loss = OrderedDict([('loss_ce', 0.301), ('loss_counter', 0.118), ('loss_bbox', 0.021), ('loss_giou', 0.249), ('loss_self_iou', 0.008), ('cardinality_error', 7.519), ('loss_ce_0', 0.302), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.023), ('loss_giou_0', 0.264), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.519), ('loss_caption_0', 2.118), ('loss_caption', 2.101), ('total_loss', 11.817)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 7049 (epoch 5),
loss = OrderedDict([('loss_ce', 0.294), ('loss_counter', 0.119), ('loss_bbox', 0.019), ('loss_giou', 0.25), ('loss_self_iou', 0.007), ('cardinality_error', 7.699), ('loss_ce_0', 0.292), ('loss_counter_0', 0.118), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.265), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.699), ('loss_caption_0', 2.105), ('loss_caption', 2.111), ('total_loss', 11.78)]),
time/iter = 0.172, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 7182 (epoch 5),
loss = OrderedDict([('loss_ce', 0.29), ('loss_counter', 0.115), ('loss_bbox', 0.021), ('loss_giou', 0.242), ('loss_self_iou', 0.008), ('cardinality_error', 7.594), ('loss_ce_0', 0.288), ('loss_counter_0', 0.115), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.257), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 7.594), ('loss_caption_0', 2.194), ('loss_caption', 2.195), ('total_loss', 12.045)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 7315 (epoch 5),
loss = OrderedDict([('loss_ce', 0.29), ('loss_counter', 0.123), ('loss_bbox', 0.02), ('loss_giou', 0.254), ('loss_self_iou', 0.009), ('cardinality_error', 8.301), ('loss_ce_0', 0.291), ('loss_counter_0', 0.123), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.268), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 8.301), ('loss_caption_0', 2.096), ('loss_caption', 2.09), ('total_loss', 11.741)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 7448 (epoch 5),
loss = OrderedDict([('loss_ce', 0.296), ('loss_counter', 0.12), ('loss_bbox', 0.019), ('loss_giou', 0.234), ('loss_self_iou', 0.006), ('cardinality_error', 7.677), ('loss_ce_0', 0.292), ('loss_counter_0', 0.12), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.251), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.677), ('loss_caption_0', 2.076), ('loss_caption', 2.063), ('total_loss', 11.513)]),
time/iter = 0.152, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 7581 (epoch 5),
loss = OrderedDict([('loss_ce', 0.298), ('loss_counter', 0.116), ('loss_bbox', 0.019), ('loss_giou', 0.238), ('loss_self_iou', 0.008), ('cardinality_error', 7.534), ('loss_ce_0', 0.295), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.253), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.534), ('loss_caption_0', 2.114), ('loss_caption', 2.112), ('total_loss', 11.718)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 7714 (epoch 5),
loss = OrderedDict([('loss_ce', 0.295), ('loss_counter', 0.117), ('loss_bbox', 0.018), ('loss_giou', 0.235), ('loss_self_iou', 0.008), ('cardinality_error', 7.677), ('loss_ce_0', 0.291), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.253), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.677), ('loss_caption_0', 2.167), ('loss_caption', 2.179), ('total_loss', 11.932)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 7847 (epoch 5),
loss = OrderedDict([('loss_ce', 0.293), ('loss_counter', 0.118), ('loss_bbox', 0.019), ('loss_giou', 0.252), ('loss_self_iou', 0.009), ('cardinality_error', 8.053), ('loss_ce_0', 0.289), ('loss_counter_0', 0.117), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.269), ('loss_self_iou_0', 0.009), ('cardinality_error_0', 8.053), ('loss_caption_0', 2.106), ('loss_caption', 2.115), ('total_loss', 11.804)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 7980 (epoch 5),
loss = OrderedDict([('loss_ce', 0.3), ('loss_counter', 0.118), ('loss_bbox', 0.019), ('loss_giou', 0.249), ('loss_self_iou', 0.007), ('cardinality_error', 7.902), ('loss_ce_0', 0.295), ('loss_counter_0', 0.117), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.268), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.902), ('loss_caption_0', 2.151), ('loss_caption', 2.153), ('total_loss', 11.979)]),
time/iter = 0.158, bad_vid = 0.000
Validation results of iter 7998:
Bleu_1:0.19874944106127662
Bleu_2:0.12266046915797622
Bleu_3:0.07150852984916518
Bleu_4:0.036185181004552064
METEOR:0.09274687098087099
ROUGE_L:0.18413336093424784
CIDEr:0.5727051685734265
Recall:0.259037909270404
Precision:0.451289465457956
soda_c:0.07263494732248185
para_Bleu_1:0.32307562783294125
para_Bleu_2:0.1944214796418441
para_Bleu_3:0.11901149393254483
para_Bleu_4:0.07454555120453704
para_METEOR:0.14324209261218024
para_ROUGE_L:0.31918573126228
para_CIDEr:0.23096832321460165
overall score of iter 7998: 0.4487559670313189
Save model at iter 7998 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 8113 (epoch 6),
loss = OrderedDict([('loss_ce', 0.297), ('loss_counter', 0.114), ('loss_bbox', 0.019), ('loss_giou', 0.236), ('loss_self_iou', 0.008), ('cardinality_error', 7.617), ('loss_ce_0', 0.295), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.257), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.617), ('loss_caption_0', 2.036), ('loss_caption', 2.044), ('total_loss', 11.427)]),
time/iter = 0.677, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 8246 (epoch 6),
loss = OrderedDict([('loss_ce', 0.286), ('loss_counter', 0.119), ('loss_bbox', 0.019), ('loss_giou', 0.237), ('loss_self_iou', 0.006), ('cardinality_error', 7.827), ('loss_ce_0', 0.283), ('loss_counter_0', 0.119), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.257), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.827), ('loss_caption_0', 2.055), ('loss_caption', 2.057), ('total_loss', 11.458)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 8379 (epoch 6),
loss = OrderedDict([('loss_ce', 0.29), ('loss_counter', 0.118), ('loss_bbox', 0.018), ('loss_giou', 0.225), ('loss_self_iou', 0.005), ('cardinality_error', 7.82), ('loss_ce_0', 0.286), ('loss_counter_0', 0.117), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.246), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.82), ('loss_caption_0', 2.046), ('loss_caption', 2.041), ('total_loss', 11.331)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 8512 (epoch 6),
loss = OrderedDict([('loss_ce', 0.286), ('loss_counter', 0.114), ('loss_bbox', 0.018), ('loss_giou', 0.228), ('loss_self_iou', 0.006), ('cardinality_error', 7.654), ('loss_ce_0', 0.283), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.245), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.654), ('loss_caption_0', 1.991), ('loss_caption', 1.997), ('total_loss', 11.118)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 8645 (epoch 6),
loss = OrderedDict([('loss_ce', 0.29), ('loss_counter', 0.115), ('loss_bbox', 0.02), ('loss_giou', 0.251), ('loss_self_iou', 0.007), ('cardinality_error', 8.068), ('loss_ce_0', 0.287), ('loss_counter_0', 0.115), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.265), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 8.068), ('loss_caption_0', 2.094), ('loss_caption', 2.097), ('total_loss', 11.714)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 8778 (epoch 6),
loss = OrderedDict([('loss_ce', 0.288), ('loss_counter', 0.121), ('loss_bbox', 0.019), ('loss_giou', 0.24), ('loss_self_iou', 0.008), ('cardinality_error', 8.008), ('loss_ce_0', 0.286), ('loss_counter_0', 0.121), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.258), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 8.008), ('loss_caption_0', 2.092), ('loss_caption', 2.092), ('total_loss', 11.63)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 8911 (epoch 6),
loss = OrderedDict([('loss_ce', 0.298), ('loss_counter', 0.114), ('loss_bbox', 0.019), ('loss_giou', 0.235), ('loss_self_iou', 0.008), ('cardinality_error', 7.338), ('loss_ce_0', 0.297), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.248), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.338), ('loss_caption_0', 2.051), ('loss_caption', 2.054), ('total_loss', 11.446)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 9044 (epoch 6),
loss = OrderedDict([('loss_ce', 0.292), ('loss_counter', 0.105), ('loss_bbox', 0.02), ('loss_giou', 0.227), ('loss_self_iou', 0.008), ('cardinality_error', 7.226), ('loss_ce_0', 0.292), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.243), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.226), ('loss_caption_0', 2.08), ('loss_caption', 2.084), ('total_loss', 11.478)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 9177 (epoch 6),
loss = OrderedDict([('loss_ce', 0.291), ('loss_counter', 0.12), ('loss_bbox', 0.019), ('loss_giou', 0.254), ('loss_self_iou', 0.007), ('cardinality_error', 7.977), ('loss_ce_0', 0.288), ('loss_counter_0', 0.119), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.275), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.977), ('loss_caption_0', 2.046), ('loss_caption', 2.031), ('total_loss', 11.546)]),
time/iter = 0.158, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 9310 (epoch 6),
loss = OrderedDict([('loss_ce', 0.28), ('loss_counter', 0.117), ('loss_bbox', 0.018), ('loss_giou', 0.236), ('loss_self_iou', 0.006), ('cardinality_error', 7.97), ('loss_ce_0', 0.281), ('loss_counter_0', 0.118), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.252), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.97), ('loss_caption_0', 1.986), ('loss_caption', 1.995), ('total_loss', 11.157)]),
time/iter = 0.152, bad_vid = 0.000
Validation results of iter 9331:
Bleu_1:0.2003309018825777
Bleu_2:0.1225756065112458
Bleu_3:0.06724461390362559
Bleu_4:0.033684328156599955
METEOR:0.0938288297360794
ROUGE_L:0.1832565856913202
CIDEr:0.5805494889367487
Recall:0.28578288505804933
Precision:0.4570872842207636
soda_c:0.07457933387713374
para_Bleu_1:0.3713316702717572
para_Bleu_2:0.22391267992808692
para_Bleu_3:0.1360620228892395
para_Bleu_4:0.08475146307949002
para_METEOR:0.15553928732702577
para_ROUGE_L:0.3279787647771023
para_CIDEr:0.24807495620487915
overall score of iter 9331: 0.4883657066113949
Save model at iter 9331 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 9331 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 9443 (epoch 7),
loss = OrderedDict([('loss_ce', 0.294), ('loss_counter', 0.114), ('loss_bbox', 0.018), ('loss_giou', 0.226), ('loss_self_iou', 0.006), ('cardinality_error', 7.617), ('loss_ce_0', 0.292), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.239), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.617), ('loss_caption_0', 2.065), ('loss_caption', 2.061), ('total_loss', 11.394)]),
time/iter = 0.717, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 9576 (epoch 7),
loss = OrderedDict([('loss_ce', 0.285), ('loss_counter', 0.119), ('loss_bbox', 0.02), ('loss_giou', 0.231), ('loss_self_iou', 0.006), ('cardinality_error', 7.917), ('loss_ce_0', 0.284), ('loss_counter_0', 0.119), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.252), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.917), ('loss_caption_0', 1.977), ('loss_caption', 1.974), ('total_loss', 11.093)]),
time/iter = 0.165, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 9709 (epoch 7),
loss = OrderedDict([('loss_ce', 0.291), ('loss_counter', 0.117), ('loss_bbox', 0.016), ('loss_giou', 0.224), ('loss_self_iou', 0.006), ('cardinality_error', 8.098), ('loss_ce_0', 0.29), ('loss_counter_0', 0.117), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.242), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 8.098), ('loss_caption_0', 2.051), ('loss_caption', 2.063), ('total_loss', 11.373)]),
time/iter = 0.170, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 9842 (epoch 7),
loss = OrderedDict([('loss_ce', 0.288), ('loss_counter', 0.11), ('loss_bbox', 0.018), ('loss_giou', 0.242), ('loss_self_iou', 0.007), ('cardinality_error', 7.662), ('loss_ce_0', 0.286), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.262), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.662), ('loss_caption_0', 1.939), ('loss_caption', 1.953), ('total_loss', 11.058)]),
time/iter = 0.169, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 9975 (epoch 7),
loss = OrderedDict([('loss_ce', 0.28), ('loss_counter', 0.116), ('loss_bbox', 0.017), ('loss_giou', 0.238), ('loss_self_iou', 0.006), ('cardinality_error', 8.233), ('loss_ce_0', 0.281), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.255), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 8.233), ('loss_caption_0', 2.024), ('loss_caption', 2.026), ('total_loss', 11.31)]),
time/iter = 0.167, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 10108 (epoch 7),
loss = OrderedDict([('loss_ce', 0.283), ('loss_counter', 0.111), ('loss_bbox', 0.018), ('loss_giou', 0.232), ('loss_self_iou', 0.006), ('cardinality_error', 7.466), ('loss_ce_0', 0.279), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.246), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.466), ('loss_caption_0', 1.878), ('loss_caption', 1.882), ('total_loss', 10.667)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 10241 (epoch 7),
loss = OrderedDict([('loss_ce', 0.285), ('loss_counter', 0.119), ('loss_bbox', 0.018), ('loss_giou', 0.24), ('loss_self_iou', 0.007), ('cardinality_error', 7.722), ('loss_ce_0', 0.282), ('loss_counter_0', 0.119), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.253), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.722), ('loss_caption_0', 1.984), ('loss_caption', 1.988), ('total_loss', 11.165)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 10374 (epoch 7),
loss = OrderedDict([('loss_ce', 0.292), ('loss_counter', 0.113), ('loss_bbox', 0.017), ('loss_giou', 0.225), ('loss_self_iou', 0.007), ('cardinality_error', 7.692), ('loss_ce_0', 0.285), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.241), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.692), ('loss_caption_0', 2.089), ('loss_caption', 2.094), ('total_loss', 11.498)]),
time/iter = 0.164, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 10507 (epoch 7),
loss = OrderedDict([('loss_ce', 0.287), ('loss_counter', 0.113), ('loss_bbox', 0.019), ('loss_giou', 0.22), ('loss_self_iou', 0.007), ('cardinality_error', 7.564), ('loss_ce_0', 0.283), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.021), ('loss_giou_0', 0.241), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.564), ('loss_caption_0', 1.936), ('loss_caption', 1.935), ('total_loss', 10.84)]),
time/iter = 0.165, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 10640 (epoch 7),
loss = OrderedDict([('loss_ce', 0.281), ('loss_counter', 0.115), ('loss_bbox', 0.02), ('loss_giou', 0.232), ('loss_self_iou', 0.008), ('cardinality_error', 7.549), ('loss_ce_0', 0.278), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.022), ('loss_giou_0', 0.249), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.549), ('loss_caption_0', 2.041), ('loss_caption', 2.042), ('total_loss', 11.323)]),
time/iter = 0.178, bad_vid = 0.000
Validation results of iter 10664:
Bleu_1:0.19584871429233122
Bleu_2:0.1203954133477019
Bleu_3:0.06765236989260215
Bleu_4:0.03515047236439923
METEOR:0.09347581038898298
ROUGE_L:0.18336361365161372
CIDEr:0.5642570328531701
Recall:0.287053410514844
Precision:0.4506790316418327
soda_c:0.07315525040409161
para_Bleu_1:0.39595219023577966
para_Bleu_2:0.23717913606151478
para_Bleu_3:0.14480681642134902
para_Bleu_4:0.0901695364250172
para_METEOR:0.16127903027678414
para_ROUGE_L:0.3324403291093838
para_CIDEr:0.23804687234043756
overall score of iter 10664: 0.48949543904223886
Save model at iter 10664 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 10664 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 10773 (epoch 8),
loss = OrderedDict([('loss_ce', 0.279), ('loss_counter', 0.114), ('loss_bbox', 0.017), ('loss_giou', 0.235), ('loss_self_iou', 0.006), ('cardinality_error', 7.94), ('loss_ce_0', 0.278), ('loss_counter_0', 0.115), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.253), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.94), ('loss_caption_0', 1.851), ('loss_caption', 1.84), ('total_loss', 10.561)]),
time/iter = 0.724, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 10906 (epoch 8),
loss = OrderedDict([('loss_ce', 0.279), ('loss_counter', 0.109), ('loss_bbox', 0.017), ('loss_giou', 0.215), ('loss_self_iou', 0.006), ('cardinality_error', 7.218), ('loss_ce_0', 0.278), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.231), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.218), ('loss_caption_0', 1.945), ('loss_caption', 1.948), ('total_loss', 10.791)]),
time/iter = 0.165, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 11039 (epoch 8),
loss = OrderedDict([('loss_ce', 0.288), ('loss_counter', 0.108), ('loss_bbox', 0.017), ('loss_giou', 0.207), ('loss_self_iou', 0.006), ('cardinality_error', 7.579), ('loss_ce_0', 0.283), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.223), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.579), ('loss_caption_0', 1.92), ('loss_caption', 1.927), ('total_loss', 10.664)]),
time/iter = 0.165, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 11172 (epoch 8),
loss = OrderedDict([('loss_ce', 0.28), ('loss_counter', 0.11), ('loss_bbox', 0.018), ('loss_giou', 0.215), ('loss_self_iou', 0.006), ('cardinality_error', 7.451), ('loss_ce_0', 0.279), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.231), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.451), ('loss_caption_0', 1.91), ('loss_caption', 1.9), ('total_loss', 10.635)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 11305 (epoch 8),
loss = OrderedDict([('loss_ce', 0.278), ('loss_counter', 0.125), ('loss_bbox', 0.017), ('loss_giou', 0.233), ('loss_self_iou', 0.006), ('cardinality_error', 8.09), ('loss_ce_0', 0.276), ('loss_counter_0', 0.126), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.244), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 8.09), ('loss_caption_0', 1.876), ('loss_caption', 1.877), ('total_loss', 10.648)]),
time/iter = 0.152, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 11438 (epoch 8),
loss = OrderedDict([('loss_ce', 0.273), ('loss_counter', 0.113), ('loss_bbox', 0.016), ('loss_giou', 0.211), ('loss_self_iou', 0.005), ('cardinality_error', 7.744), ('loss_ce_0', 0.269), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.231), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.744), ('loss_caption_0', 1.981), ('loss_caption', 1.968), ('total_loss', 10.865)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 11571 (epoch 8),
loss = OrderedDict([('loss_ce', 0.281), ('loss_counter', 0.114), ('loss_bbox', 0.018), ('loss_giou', 0.225), ('loss_self_iou', 0.006), ('cardinality_error', 7.699), ('loss_ce_0', 0.277), ('loss_counter_0', 0.115), ('loss_bbox_0', 0.02), ('loss_giou_0', 0.243), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.699), ('loss_caption_0', 1.833), ('loss_caption', 1.846), ('total_loss', 10.461)]),
time/iter = 0.149, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 11704 (epoch 8),
loss = OrderedDict([('loss_ce', 0.28), ('loss_counter', 0.115), ('loss_bbox', 0.017), ('loss_giou', 0.21), ('loss_self_iou', 0.006), ('cardinality_error', 7.82), ('loss_ce_0', 0.278), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.226), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.82), ('loss_caption_0', 1.91), ('loss_caption', 1.915), ('total_loss', 10.628)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 11837 (epoch 8),
loss = OrderedDict([('loss_ce', 0.271), ('loss_counter', 0.111), ('loss_bbox', 0.016), ('loss_giou', 0.215), ('loss_self_iou', 0.007), ('cardinality_error', 8.0), ('loss_ce_0', 0.273), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.23), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 8.0), ('loss_caption_0', 1.936), ('loss_caption', 1.939), ('total_loss', 10.726)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 11970 (epoch 8),
loss = OrderedDict([('loss_ce', 0.272), ('loss_counter', 0.115), ('loss_bbox', 0.017), ('loss_giou', 0.22), ('loss_self_iou', 0.006), ('cardinality_error', 8.158), ('loss_ce_0', 0.27), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.242), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 8.158), ('loss_caption_0', 1.953), ('loss_caption', 1.962), ('total_loss', 10.881)]),
time/iter = 0.168, bad_vid = 0.000
Validation results of iter 11997:
Bleu_1:0.19696025394358163
Bleu_2:0.12042554867022627
Bleu_3:0.06805715701089529
Bleu_4:0.034063345644385214
METEOR:0.09208296372249718
ROUGE_L:0.1803782633150628
CIDEr:0.5812603125344058
Recall:0.29169024735901117
Precision:0.44299129936438486
soda_c:0.07606608300691252
para_Bleu_1:0.383549187276652
para_Bleu_2:0.23192713278728125
para_Bleu_3:0.14217181061136971
para_Bleu_4:0.0892715976218228
para_METEOR:0.16074434603101373
para_ROUGE_L:0.3336567463040183
para_CIDEr:0.2859809872200661
overall score of iter 11997: 0.5359969308729027
Save model at iter 11997 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 11997 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 12103 (epoch 9),
loss = OrderedDict([('loss_ce', 0.275), ('loss_counter', 0.111), ('loss_bbox', 0.016), ('loss_giou', 0.216), ('loss_self_iou', 0.006), ('cardinality_error', 8.038), ('loss_ce_0', 0.274), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.231), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 8.038), ('loss_caption_0', 1.832), ('loss_caption', 1.845), ('total_loss', 10.35)]),
time/iter = 0.705, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 12236 (epoch 9),
loss = OrderedDict([('loss_ce', 0.272), ('loss_counter', 0.111), ('loss_bbox', 0.016), ('loss_giou', 0.206), ('loss_self_iou', 0.005), ('cardinality_error', 7.812), ('loss_ce_0', 0.266), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.223), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.812), ('loss_caption_0', 1.968), ('loss_caption', 1.959), ('total_loss', 10.757)]),
time/iter = 0.166, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 12369 (epoch 9),
loss = OrderedDict([('loss_ce', 0.273), ('loss_counter', 0.118), ('loss_bbox', 0.016), ('loss_giou', 0.21), ('loss_self_iou', 0.005), ('cardinality_error', 7.827), ('loss_ce_0', 0.27), ('loss_counter_0', 0.118), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.226), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.827), ('loss_caption_0', 1.89), ('loss_caption', 1.903), ('total_loss', 10.534)]),
time/iter = 0.158, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 12502 (epoch 9),
loss = OrderedDict([('loss_ce', 0.27), ('loss_counter', 0.108), ('loss_bbox', 0.016), ('loss_giou', 0.205), ('loss_self_iou', 0.006), ('cardinality_error', 7.684), ('loss_ce_0', 0.268), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.224), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.684), ('loss_caption_0', 1.903), ('loss_caption', 1.905), ('total_loss', 10.519)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 12635 (epoch 9),
loss = OrderedDict([('loss_ce', 0.27), ('loss_counter', 0.111), ('loss_bbox', 0.015), ('loss_giou', 0.218), ('loss_self_iou', 0.005), ('cardinality_error', 7.947), ('loss_ce_0', 0.269), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.232), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.947), ('loss_caption_0', 1.822), ('loss_caption', 1.826), ('total_loss', 10.284)]),
time/iter = 0.158, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 12768 (epoch 9),
loss = OrderedDict([('loss_ce', 0.277), ('loss_counter', 0.111), ('loss_bbox', 0.017), ('loss_giou', 0.219), ('loss_self_iou', 0.008), ('cardinality_error', 7.669), ('loss_ce_0', 0.276), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.235), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.669), ('loss_caption_0', 1.905), ('loss_caption', 1.909), ('total_loss', 10.662)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 12901 (epoch 9),
loss = OrderedDict([('loss_ce', 0.269), ('loss_counter', 0.106), ('loss_bbox', 0.015), ('loss_giou', 0.208), ('loss_self_iou', 0.005), ('cardinality_error', 7.639), ('loss_ce_0', 0.267), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.224), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.639), ('loss_caption_0', 1.856), ('loss_caption', 1.863), ('total_loss', 10.344)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 13034 (epoch 9),
loss = OrderedDict([('loss_ce', 0.273), ('loss_counter', 0.112), ('loss_bbox', 0.015), ('loss_giou', 0.216), ('loss_self_iou', 0.005), ('cardinality_error', 7.85), ('loss_ce_0', 0.274), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.231), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.85), ('loss_caption_0', 1.841), ('loss_caption', 1.841), ('total_loss', 10.356)]),
time/iter = 0.149, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 13167 (epoch 9),
loss = OrderedDict([('loss_ce', 0.275), ('loss_counter', 0.109), ('loss_bbox', 0.018), ('loss_giou', 0.21), ('loss_self_iou', 0.005), ('cardinality_error', 7.406), ('loss_ce_0', 0.273), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.226), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.406), ('loss_caption_0', 1.931), ('loss_caption', 1.927), ('total_loss', 10.663)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 13300 (epoch 9),
loss = OrderedDict([('loss_ce', 0.274), ('loss_counter', 0.113), ('loss_bbox', 0.017), ('loss_giou', 0.212), ('loss_self_iou', 0.005), ('cardinality_error', 7.737), ('loss_ce_0', 0.272), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.019), ('loss_giou_0', 0.23), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.737), ('loss_caption_0', 1.853), ('loss_caption', 1.849), ('total_loss', 10.379)]),
time/iter = 0.154, bad_vid = 0.000
Validation results of iter 13330:
Bleu_1:0.20446290018298774
Bleu_2:0.12418412895577716
Bleu_3:0.06899010124646034
Bleu_4:0.03428116460131532
METEOR:0.09595521703655657
ROUGE_L:0.1876517650928566
CIDEr:0.5887832993219201
Recall:0.3017153873964599
Precision:0.4588439095550697
soda_c:0.07875391677883807
para_Bleu_1:0.3953706124668704
para_Bleu_2:0.24043007714841402
para_Bleu_3:0.14833197751929023
para_Bleu_4:0.09386644902900565
para_METEOR:0.16476396966168239
para_ROUGE_L:0.33760319454244797
para_CIDEr:0.31194480042956774
overall score of iter 13330: 0.5705752191202558
Save model at iter 13330 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 13330 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 13433 (epoch 10),
loss = OrderedDict([('loss_ce', 0.271), ('loss_counter', 0.112), ('loss_bbox', 0.017), ('loss_giou', 0.217), ('loss_self_iou', 0.006), ('cardinality_error', 7.835), ('loss_ce_0', 0.267), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.235), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.835), ('loss_caption_0', 1.804), ('loss_caption', 1.811), ('total_loss', 10.223)]),
time/iter = 0.700, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 13566 (epoch 10),
loss = OrderedDict([('loss_ce', 0.266), ('loss_counter', 0.116), ('loss_bbox', 0.015), ('loss_giou', 0.204), ('loss_self_iou', 0.005), ('cardinality_error', 7.774), ('loss_ce_0', 0.266), ('loss_counter_0', 0.116), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.221), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.774), ('loss_caption_0', 1.884), ('loss_caption', 1.887), ('total_loss', 10.42)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 13699 (epoch 10),
loss = OrderedDict([('loss_ce', 0.261), ('loss_counter', 0.111), ('loss_bbox', 0.016), ('loss_giou', 0.201), ('loss_self_iou', 0.006), ('cardinality_error', 7.729), ('loss_ce_0', 0.259), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.218), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.729), ('loss_caption_0', 1.823), ('loss_caption', 1.806), ('total_loss', 10.083)]),
time/iter = 0.158, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 13832 (epoch 10),
loss = OrderedDict([('loss_ce', 0.269), ('loss_counter', 0.111), ('loss_bbox', 0.016), ('loss_giou', 0.211), ('loss_self_iou', 0.005), ('cardinality_error', 7.699), ('loss_ce_0', 0.271), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.228), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.699), ('loss_caption_0', 1.855), ('loss_caption', 1.857), ('total_loss', 10.374)]),
time/iter = 0.164, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 13965 (epoch 10),
loss = OrderedDict([('loss_ce', 0.275), ('loss_counter', 0.105), ('loss_bbox', 0.016), ('loss_giou', 0.196), ('loss_self_iou', 0.006), ('cardinality_error', 7.128), ('loss_ce_0', 0.271), ('loss_counter_0', 0.106), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.214), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.128), ('loss_caption_0', 1.809), ('loss_caption', 1.8), ('total_loss', 10.055)]),
time/iter = 0.151, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 14098 (epoch 10),
loss = OrderedDict([('loss_ce', 0.273), ('loss_counter', 0.112), ('loss_bbox', 0.016), ('loss_giou', 0.213), ('loss_self_iou', 0.007), ('cardinality_error', 7.925), ('loss_ce_0', 0.273), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.23), ('loss_self_iou_0', 0.008), ('cardinality_error_0', 7.925), ('loss_caption_0', 1.863), ('loss_caption', 1.863), ('total_loss', 10.433)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 14231 (epoch 10),
loss = OrderedDict([('loss_ce', 0.261), ('loss_counter', 0.113), ('loss_bbox', 0.017), ('loss_giou', 0.212), ('loss_self_iou', 0.007), ('cardinality_error', 7.82), ('loss_ce_0', 0.262), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.018), ('loss_giou_0', 0.222), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.82), ('loss_caption_0', 1.936), ('loss_caption', 1.929), ('total_loss', 10.624)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 14364 (epoch 10),
loss = OrderedDict([('loss_ce', 0.263), ('loss_counter', 0.104), ('loss_bbox', 0.015), ('loss_giou', 0.216), ('loss_self_iou', 0.005), ('cardinality_error', 7.744), ('loss_ce_0', 0.263), ('loss_counter_0', 0.104), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.227), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.744), ('loss_caption_0', 1.757), ('loss_caption', 1.754), ('total_loss', 9.948)]),
time/iter = 0.185, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 14497 (epoch 10),
loss = OrderedDict([('loss_ce', 0.266), ('loss_counter', 0.11), ('loss_bbox', 0.015), ('loss_giou', 0.2), ('loss_self_iou', 0.005), ('cardinality_error', 7.827), ('loss_ce_0', 0.265), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.214), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.827), ('loss_caption_0', 1.896), ('loss_caption', 1.894), ('total_loss', 10.407)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 14630 (epoch 10),
loss = OrderedDict([('loss_ce', 0.263), ('loss_counter', 0.113), ('loss_bbox', 0.015), ('loss_giou', 0.208), ('loss_self_iou', 0.005), ('cardinality_error', 7.925), ('loss_ce_0', 0.261), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.224), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.925), ('loss_caption_0', 1.84), ('loss_caption', 1.842), ('total_loss', 10.253)]),
time/iter = 0.158, bad_vid = 0.000
Validation results of iter 14663:
Bleu_1:0.19267153393038786
Bleu_2:0.11732781330402656
Bleu_3:0.06746115616325608
Bleu_4:0.03425583839334337
METEOR:0.08963300348041837
ROUGE_L:0.17480207136309905
CIDEr:0.575137603362526
Recall:0.30432682743951917
Precision:0.4353044354138446
soda_c:0.07762847290423684
para_Bleu_1:0.393384019586376
para_Bleu_2:0.23835405770332685
para_Bleu_3:0.14545808678454117
para_Bleu_4:0.09085202435904723
para_METEOR:0.16354570345255123
para_ROUGE_L:0.3343729651839732
para_CIDEr:0.27098453497923136
overall score of iter 14663: 0.5253822627908299
Save model at iter 14663 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 14763 (epoch 11),
loss = OrderedDict([('loss_ce', 0.266), ('loss_counter', 0.111), ('loss_bbox', 0.015), ('loss_giou', 0.208), ('loss_self_iou', 0.006), ('cardinality_error', 7.85), ('loss_ce_0', 0.264), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.225), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.85), ('loss_caption_0', 1.87), ('loss_caption', 1.877), ('total_loss', 10.398)]),
time/iter = 0.690, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 14896 (epoch 11),
loss = OrderedDict([('loss_ce', 0.26), ('loss_counter', 0.112), ('loss_bbox', 0.015), ('loss_giou', 0.2), ('loss_self_iou', 0.005), ('cardinality_error', 7.692), ('loss_ce_0', 0.259), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.217), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.692), ('loss_caption_0', 1.796), ('loss_caption', 1.784), ('total_loss', 9.979)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 15029 (epoch 11),
loss = OrderedDict([('loss_ce', 0.264), ('loss_counter', 0.103), ('loss_bbox', 0.015), ('loss_giou', 0.195), ('loss_self_iou', 0.006), ('cardinality_error', 7.414), ('loss_ce_0', 0.264), ('loss_counter_0', 0.104), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.211), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.414), ('loss_caption_0', 1.763), ('loss_caption', 1.767), ('total_loss', 9.842)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 15162 (epoch 11),
loss = OrderedDict([('loss_ce', 0.263), ('loss_counter', 0.113), ('loss_bbox', 0.015), ('loss_giou', 0.196), ('loss_self_iou', 0.004), ('cardinality_error', 7.767), ('loss_ce_0', 0.262), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.211), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.767), ('loss_caption_0', 1.781), ('loss_caption', 1.781), ('total_loss', 9.916)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 15295 (epoch 11),
loss = OrderedDict([('loss_ce', 0.257), ('loss_counter', 0.106), ('loss_bbox', 0.015), ('loss_giou', 0.2), ('loss_self_iou', 0.005), ('cardinality_error', 7.662), ('loss_ce_0', 0.255), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.214), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.662), ('loss_caption_0', 1.735), ('loss_caption', 1.75), ('total_loss', 9.755)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 15428 (epoch 11),
loss = OrderedDict([('loss_ce', 0.258), ('loss_counter', 0.114), ('loss_bbox', 0.015), ('loss_giou', 0.21), ('loss_self_iou', 0.005), ('cardinality_error', 7.992), ('loss_ce_0', 0.261), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.22), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.992), ('loss_caption_0', 1.852), ('loss_caption', 1.86), ('total_loss', 10.298)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 15561 (epoch 11),
loss = OrderedDict([('loss_ce', 0.256), ('loss_counter', 0.112), ('loss_bbox', 0.015), ('loss_giou', 0.204), ('loss_self_iou', 0.006), ('cardinality_error', 8.068), ('loss_ce_0', 0.257), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.218), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 8.068), ('loss_caption_0', 1.878), ('loss_caption', 1.866), ('total_loss', 10.314)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 15694 (epoch 11),
loss = OrderedDict([('loss_ce', 0.256), ('loss_counter', 0.106), ('loss_bbox', 0.015), ('loss_giou', 0.202), ('loss_self_iou', 0.004), ('cardinality_error', 7.647), ('loss_ce_0', 0.257), ('loss_counter_0', 0.107), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.215), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.647), ('loss_caption_0', 1.7), ('loss_caption', 1.684), ('total_loss', 9.569)]),
time/iter = 0.152, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 15827 (epoch 11),
loss = OrderedDict([('loss_ce', 0.259), ('loss_counter', 0.104), ('loss_bbox', 0.016), ('loss_giou', 0.194), ('loss_self_iou', 0.005), ('cardinality_error', 7.722), ('loss_ce_0', 0.257), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.209), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.722), ('loss_caption_0', 1.848), ('loss_caption', 1.839), ('total_loss', 10.119)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 15960 (epoch 11),
loss = OrderedDict([('loss_ce', 0.26), ('loss_counter', 0.107), ('loss_bbox', 0.015), ('loss_giou', 0.197), ('loss_self_iou', 0.004), ('cardinality_error', 7.609), ('loss_ce_0', 0.257), ('loss_counter_0', 0.107), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.214), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.609), ('loss_caption_0', 1.847), ('loss_caption', 1.858), ('total_loss', 10.198)]),
time/iter = 0.161, bad_vid = 0.000
Validation results of iter 15996:
Bleu_1:0.1989422607268001
Bleu_2:0.12223038556953512
Bleu_3:0.06835990671747892
Bleu_4:0.03486159828438583
METEOR:0.09408978838449876
ROUGE_L:0.18200142867223945
CIDEr:0.593480700759431
Recall:0.30795469953703025
Precision:0.4513424333993264
soda_c:0.0796861065455984
para_Bleu_1:0.39594509057043764
para_Bleu_2:0.24087109399513515
para_Bleu_3:0.14790262814870953
para_Bleu_4:0.09321042711819619
para_METEOR:0.1655617051143519
para_ROUGE_L:0.3391051008488012
para_CIDEr:0.32807196750555834
overall score of iter 15996: 0.5868440997381064
Save model at iter 15996 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save Best-model at iter 15996 to checkpoint file.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 16093 (epoch 12),
loss = OrderedDict([('loss_ce', 0.257), ('loss_counter', 0.109), ('loss_bbox', 0.015), ('loss_giou', 0.19), ('loss_self_iou', 0.004), ('cardinality_error', 7.992), ('loss_ce_0', 0.258), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.202), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.992), ('loss_caption_0', 1.773), ('loss_caption', 1.769), ('total_loss', 9.789)]),
time/iter = 0.727, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 16226 (epoch 12),
loss = OrderedDict([('loss_ce', 0.26), ('loss_counter', 0.11), ('loss_bbox', 0.014), ('loss_giou', 0.198), ('loss_self_iou', 0.004), ('cardinality_error', 7.805), ('loss_ce_0', 0.259), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.215), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.805), ('loss_caption_0', 1.743), ('loss_caption', 1.749), ('total_loss', 9.786)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 16359 (epoch 12),
loss = OrderedDict([('loss_ce', 0.265), ('loss_counter', 0.116), ('loss_bbox', 0.014), ('loss_giou', 0.198), ('loss_self_iou', 0.005), ('cardinality_error', 7.85), ('loss_ce_0', 0.264), ('loss_counter_0', 0.115), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.214), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.85), ('loss_caption_0', 1.797), ('loss_caption', 1.778), ('total_loss', 9.972)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 16492 (epoch 12),
loss = OrderedDict([('loss_ce', 0.254), ('loss_counter', 0.105), ('loss_bbox', 0.015), ('loss_giou', 0.189), ('loss_self_iou', 0.004), ('cardinality_error', 7.383), ('loss_ce_0', 0.257), ('loss_counter_0', 0.106), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.202), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.383), ('loss_caption_0', 1.796), ('loss_caption', 1.808), ('total_loss', 9.899)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 16625 (epoch 12),
loss = OrderedDict([('loss_ce', 0.258), ('loss_counter', 0.109), ('loss_bbox', 0.014), ('loss_giou', 0.186), ('loss_self_iou', 0.005), ('cardinality_error', 7.782), ('loss_ce_0', 0.256), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.203), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.782), ('loss_caption_0', 1.78), ('loss_caption', 1.779), ('total_loss', 9.812)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 16758 (epoch 12),
loss = OrderedDict([('loss_ce', 0.252), ('loss_counter', 0.106), ('loss_bbox', 0.014), ('loss_giou', 0.196), ('loss_self_iou', 0.005), ('cardinality_error', 7.962), ('loss_ce_0', 0.252), ('loss_counter_0', 0.107), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.211), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.962), ('loss_caption_0', 1.795), ('loss_caption', 1.806), ('total_loss', 9.948)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 16891 (epoch 12),
loss = OrderedDict([('loss_ce', 0.258), ('loss_counter', 0.109), ('loss_bbox', 0.016), ('loss_giou', 0.199), ('loss_self_iou', 0.005), ('cardinality_error', 7.797), ('loss_ce_0', 0.255), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.211), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.797), ('loss_caption_0', 1.788), ('loss_caption', 1.782), ('total_loss', 9.914)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 17024 (epoch 12),
loss = OrderedDict([('loss_ce', 0.262), ('loss_counter', 0.11), ('loss_bbox', 0.014), ('loss_giou', 0.198), ('loss_self_iou', 0.005), ('cardinality_error', 7.511), ('loss_ce_0', 0.26), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.211), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.511), ('loss_caption_0', 1.717), ('loss_caption', 1.72), ('total_loss', 9.666)]),
time/iter = 0.170, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 17157 (epoch 12),
loss = OrderedDict([('loss_ce', 0.25), ('loss_counter', 0.104), ('loss_bbox', 0.015), ('loss_giou', 0.189), ('loss_self_iou', 0.004), ('cardinality_error', 7.692), ('loss_ce_0', 0.252), ('loss_counter_0', 0.106), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.2), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.692), ('loss_caption_0', 1.738), ('loss_caption', 1.749), ('total_loss', 9.638)]),
time/iter = 0.182, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 17290 (epoch 12),
loss = OrderedDict([('loss_ce', 0.254), ('loss_counter', 0.109), ('loss_bbox', 0.015), ('loss_giou', 0.198), ('loss_self_iou', 0.005), ('cardinality_error', 7.932), ('loss_ce_0', 0.254), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.214), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.932), ('loss_caption_0', 1.815), ('loss_caption', 1.83), ('total_loss', 10.067)]),
time/iter = 0.163, bad_vid = 0.000
Validation results of iter 17329:
Bleu_1:0.19294534256446427
Bleu_2:0.11789730285267924
Bleu_3:0.06601509377472357
Bleu_4:0.03274421971508606
METEOR:0.0906445074413136
ROUGE_L:0.17678145420382357
CIDEr:0.5750907875125135
Recall:0.3073352674556176
Precision:0.4434536834427428
soda_c:0.07896521325127955
para_Bleu_1:0.39483511792471604
para_Bleu_2:0.23988438429479647
para_Bleu_3:0.1464330354033768
para_Bleu_4:0.09122283851671699
para_METEOR:0.16480200992253577
para_ROUGE_L:0.33317486176302236
para_CIDEr:0.29080350784714515
overall score of iter 17329: 0.5468283562863979
Save model at iter 17329 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 17423 (epoch 13),
loss = OrderedDict([('loss_ce', 0.262), ('loss_counter', 0.111), ('loss_bbox', 0.015), ('loss_giou', 0.195), ('loss_self_iou', 0.007), ('cardinality_error', 7.692), ('loss_ce_0', 0.259), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.208), ('loss_self_iou_0', 0.007), ('cardinality_error_0', 7.692), ('loss_caption_0', 1.865), ('loss_caption', 1.881), ('total_loss', 10.261)]),
time/iter = 0.713, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 17556 (epoch 13),
loss = OrderedDict([('loss_ce', 0.252), ('loss_counter', 0.111), ('loss_bbox', 0.014), ('loss_giou', 0.193), ('loss_self_iou', 0.004), ('cardinality_error', 7.737), ('loss_ce_0', 0.253), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.209), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.737), ('loss_caption_0', 1.744), ('loss_caption', 1.743), ('total_loss', 9.707)]),
time/iter = 0.168, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 17689 (epoch 13),
loss = OrderedDict([('loss_ce', 0.259), ('loss_counter', 0.109), ('loss_bbox', 0.015), ('loss_giou', 0.199), ('loss_self_iou', 0.006), ('cardinality_error', 7.602), ('loss_ce_0', 0.262), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.211), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.602), ('loss_caption_0', 1.835), ('loss_caption', 1.819), ('total_loss', 10.1)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 17822 (epoch 13),
loss = OrderedDict([('loss_ce', 0.25), ('loss_counter', 0.108), ('loss_bbox', 0.014), ('loss_giou', 0.191), ('loss_self_iou', 0.005), ('cardinality_error', 7.526), ('loss_ce_0', 0.249), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.206), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.526), ('loss_caption_0', 1.681), ('loss_caption', 1.67), ('total_loss', 9.397)]),
time/iter = 0.152, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 17955 (epoch 13),
loss = OrderedDict([('loss_ce', 0.255), ('loss_counter', 0.102), ('loss_bbox', 0.014), ('loss_giou', 0.184), ('loss_self_iou', 0.005), ('cardinality_error', 7.526), ('loss_ce_0', 0.252), ('loss_counter_0', 0.104), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.2), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.526), ('loss_caption_0', 1.757), ('loss_caption', 1.745), ('total_loss', 9.658)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 18088 (epoch 13),
loss = OrderedDict([('loss_ce', 0.251), ('loss_counter', 0.106), ('loss_bbox', 0.014), ('loss_giou', 0.177), ('loss_self_iou', 0.004), ('cardinality_error', 7.534), ('loss_ce_0', 0.251), ('loss_counter_0', 0.107), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.191), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.534), ('loss_caption_0', 1.703), ('loss_caption', 1.701), ('total_loss', 9.39)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 18221 (epoch 13),
loss = OrderedDict([('loss_ce', 0.252), ('loss_counter', 0.111), ('loss_bbox', 0.014), ('loss_giou', 0.201), ('loss_self_iou', 0.005), ('cardinality_error', 8.211), ('loss_ce_0', 0.252), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.213), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 8.211), ('loss_caption_0', 1.824), ('loss_caption', 1.816), ('total_loss', 10.053)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 18354 (epoch 13),
loss = OrderedDict([('loss_ce', 0.253), ('loss_counter', 0.104), ('loss_bbox', 0.015), ('loss_giou', 0.195), ('loss_self_iou', 0.004), ('cardinality_error', 7.789), ('loss_ce_0', 0.249), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.21), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.789), ('loss_caption_0', 1.792), ('loss_caption', 1.779), ('total_loss', 9.874)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 18487 (epoch 13),
loss = OrderedDict([('loss_ce', 0.255), ('loss_counter', 0.111), ('loss_bbox', 0.013), ('loss_giou', 0.19), ('loss_self_iou', 0.004), ('cardinality_error', 7.992), ('loss_ce_0', 0.251), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.205), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.992), ('loss_caption_0', 1.826), ('loss_caption', 1.81), ('total_loss', 9.979)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 18620 (epoch 13),
loss = OrderedDict([('loss_ce', 0.251), ('loss_counter', 0.109), ('loss_bbox', 0.015), ('loss_giou', 0.193), ('loss_self_iou', 0.003), ('cardinality_error', 7.737), ('loss_ce_0', 0.251), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.206), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.737), ('loss_caption_0', 1.767), ('loss_caption', 1.771), ('total_loss', 9.784)]),
time/iter = 0.153, bad_vid = 0.000
Validation results of iter 18662:
Bleu_1:0.1916652028982354
Bleu_2:0.11864819375256218
Bleu_3:0.06801290454817709
Bleu_4:0.03421778123301331
METEOR:0.08890100804282676
ROUGE_L:0.17229926562968575
CIDEr:0.5719694906113042
Recall:0.3115151404333572
Precision:0.42734448265082836
soda_c:0.07979305036983636
para_Bleu_1:0.3972508455506424
para_Bleu_2:0.24317507500304622
para_Bleu_3:0.1497047997976745
para_Bleu_4:0.09437727320664267
para_METEOR:0.16651343432042678
para_ROUGE_L:0.33875534436877147
para_CIDEr:0.29220356232363026
overall score of iter 18662: 0.5530942698506998
Save model at iter 18662 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 18753 (epoch 14),
loss = OrderedDict([('loss_ce', 0.247), ('loss_counter', 0.115), ('loss_bbox', 0.013), ('loss_giou', 0.195), ('loss_self_iou', 0.004), ('cardinality_error', 8.241), ('loss_ce_0', 0.251), ('loss_counter_0', 0.115), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.207), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 8.241), ('loss_caption_0', 1.758), ('loss_caption', 1.759), ('total_loss', 9.756)]),
time/iter = 0.731, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 18886 (epoch 14),
loss = OrderedDict([('loss_ce', 0.247), ('loss_counter', 0.103), ('loss_bbox', 0.015), ('loss_giou', 0.182), ('loss_self_iou', 0.004), ('cardinality_error', 7.436), ('loss_ce_0', 0.245), ('loss_counter_0', 0.104), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.194), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.436), ('loss_caption_0', 1.696), ('loss_caption', 1.692), ('total_loss', 9.366)]),
time/iter = 0.163, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 19019 (epoch 14),
loss = OrderedDict([('loss_ce', 0.243), ('loss_counter', 0.104), ('loss_bbox', 0.013), ('loss_giou', 0.181), ('loss_self_iou', 0.003), ('cardinality_error', 7.692), ('loss_ce_0', 0.242), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.194), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.692), ('loss_caption_0', 1.73), ('loss_caption', 1.729), ('total_loss', 9.496)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 19152 (epoch 14),
loss = OrderedDict([('loss_ce', 0.251), ('loss_counter', 0.112), ('loss_bbox', 0.014), ('loss_giou', 0.181), ('loss_self_iou', 0.006), ('cardinality_error', 7.82), ('loss_ce_0', 0.251), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.194), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.82), ('loss_caption_0', 1.779), ('loss_caption', 1.771), ('total_loss', 9.714)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 19285 (epoch 14),
loss = OrderedDict([('loss_ce', 0.25), ('loss_counter', 0.105), ('loss_bbox', 0.014), ('loss_giou', 0.194), ('loss_self_iou', 0.004), ('cardinality_error', 7.669), ('loss_ce_0', 0.25), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.204), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.669), ('loss_caption_0', 1.76), ('loss_caption', 1.772), ('total_loss', 9.759)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 19418 (epoch 14),
loss = OrderedDict([('loss_ce', 0.244), ('loss_counter', 0.115), ('loss_bbox', 0.013), ('loss_giou', 0.197), ('loss_self_iou', 0.004), ('cardinality_error', 8.256), ('loss_ce_0', 0.245), ('loss_counter_0', 0.117), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.211), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 8.256), ('loss_caption_0', 1.754), ('loss_caption', 1.758), ('total_loss', 9.747)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 19551 (epoch 14),
loss = OrderedDict([('loss_ce', 0.249), ('loss_counter', 0.109), ('loss_bbox', 0.013), ('loss_giou', 0.175), ('loss_self_iou', 0.004), ('cardinality_error', 7.865), ('loss_ce_0', 0.253), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.187), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.865), ('loss_caption_0', 1.68), ('loss_caption', 1.689), ('total_loss', 9.3)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 19684 (epoch 14),
loss = OrderedDict([('loss_ce', 0.263), ('loss_counter', 0.104), ('loss_bbox', 0.015), ('loss_giou', 0.187), ('loss_self_iou', 0.005), ('cardinality_error', 7.474), ('loss_ce_0', 0.262), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.199), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.474), ('loss_caption_0', 1.81), ('loss_caption', 1.803), ('total_loss', 9.923)]),
time/iter = 0.165, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 19817 (epoch 14),
loss = OrderedDict([('loss_ce', 0.246), ('loss_counter', 0.106), ('loss_bbox', 0.014), ('loss_giou', 0.183), ('loss_self_iou', 0.005), ('cardinality_error', 7.526), ('loss_ce_0', 0.247), ('loss_counter_0', 0.107), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.197), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.526), ('loss_caption_0', 1.769), ('loss_caption', 1.765), ('total_loss', 9.677)]),
time/iter = 0.164, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 19950 (epoch 14),
loss = OrderedDict([('loss_ce', 0.254), ('loss_counter', 0.108), ('loss_bbox', 0.013), ('loss_giou', 0.19), ('loss_self_iou', 0.005), ('cardinality_error', 7.797), ('loss_ce_0', 0.253), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.202), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.797), ('loss_caption_0', 1.736), ('loss_caption', 1.748), ('total_loss', 9.654)]),
time/iter = 0.153, bad_vid = 0.000
Validation results of iter 19995:
Bleu_1:0.19012877786294885
Bleu_2:0.11743680046097797
Bleu_3:0.06623934110461578
Bleu_4:0.03314975306654321
METEOR:0.08857227272587216
ROUGE_L:0.17208518718096077
CIDEr:0.5689998070546577
Recall:0.3090681299310951
Precision:0.43095498593310433
soda_c:0.08081534748318767
para_Bleu_1:0.3949292262433903
para_Bleu_2:0.24183495416706074
para_Bleu_3:0.1493168425692173
para_Bleu_4:0.0941904023418332
para_METEOR:0.16661877157717606
para_ROUGE_L:0.3391544295873436
para_CIDEr:0.3057631644012313
overall score of iter 19995: 0.5665723383202406
Save model at iter 19995 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 20083 (epoch 15),
loss = OrderedDict([('loss_ce', 0.254), ('loss_counter', 0.102), ('loss_bbox', 0.014), ('loss_giou', 0.186), ('loss_self_iou', 0.004), ('cardinality_error', 7.519), ('loss_ce_0', 0.257), ('loss_counter_0', 0.103), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.197), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.519), ('loss_caption_0', 1.743), ('loss_caption', 1.756), ('total_loss', 9.655)]),
time/iter = 0.703, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 20216 (epoch 15),
loss = OrderedDict([('loss_ce', 0.244), ('loss_counter', 0.105), ('loss_bbox', 0.013), ('loss_giou', 0.179), ('loss_self_iou', 0.003), ('cardinality_error', 7.759), ('loss_ce_0', 0.244), ('loss_counter_0', 0.106), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.193), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.759), ('loss_caption_0', 1.79), ('loss_caption', 1.781), ('total_loss', 9.713)]),
time/iter = 0.168, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 20349 (epoch 15),
loss = OrderedDict([('loss_ce', 0.246), ('loss_counter', 0.11), ('loss_bbox', 0.013), ('loss_giou', 0.19), ('loss_self_iou', 0.004), ('cardinality_error', 7.992), ('loss_ce_0', 0.245), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.202), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.992), ('loss_caption_0', 1.749), ('loss_caption', 1.759), ('total_loss', 9.675)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 20482 (epoch 15),
loss = OrderedDict([('loss_ce', 0.244), ('loss_counter', 0.109), ('loss_bbox', 0.015), ('loss_giou', 0.193), ('loss_self_iou', 0.005), ('cardinality_error', 7.94), ('loss_ce_0', 0.244), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.207), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.94), ('loss_caption_0', 1.694), ('loss_caption', 1.715), ('total_loss', 9.502)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 20615 (epoch 15),
loss = OrderedDict([('loss_ce', 0.257), ('loss_counter', 0.107), ('loss_bbox', 0.014), ('loss_giou', 0.188), ('loss_self_iou', 0.005), ('cardinality_error', 7.368), ('loss_ce_0', 0.257), ('loss_counter_0', 0.107), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.202), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.368), ('loss_caption_0', 1.77), ('loss_caption', 1.771), ('total_loss', 9.775)]),
time/iter = 0.156, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 20748 (epoch 15),
loss = OrderedDict([('loss_ce', 0.247), ('loss_counter', 0.107), ('loss_bbox', 0.013), ('loss_giou', 0.178), ('loss_self_iou', 0.004), ('cardinality_error', 7.857), ('loss_ce_0', 0.247), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.192), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.857), ('loss_caption_0', 1.786), ('loss_caption', 1.773), ('total_loss', 9.695)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 20881 (epoch 15),
loss = OrderedDict([('loss_ce', 0.243), ('loss_counter', 0.103), ('loss_bbox', 0.013), ('loss_giou', 0.178), ('loss_self_iou', 0.003), ('cardinality_error', 7.594), ('loss_ce_0', 0.242), ('loss_counter_0', 0.103), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.192), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.594), ('loss_caption_0', 1.746), ('loss_caption', 1.748), ('total_loss', 9.541)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 21014 (epoch 15),
loss = OrderedDict([('loss_ce', 0.249), ('loss_counter', 0.108), ('loss_bbox', 0.015), ('loss_giou', 0.19), ('loss_self_iou', 0.005), ('cardinality_error', 8.09), ('loss_ce_0', 0.249), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.202), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 8.09), ('loss_caption_0', 1.709), ('loss_caption', 1.698), ('total_loss', 9.49)]),
time/iter = 0.149, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 21147 (epoch 15),
loss = OrderedDict([('loss_ce', 0.246), ('loss_counter', 0.115), ('loss_bbox', 0.014), ('loss_giou', 0.186), ('loss_self_iou', 0.004), ('cardinality_error', 7.812), ('loss_ce_0', 0.248), ('loss_counter_0', 0.114), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.198), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.812), ('loss_caption_0', 1.733), ('loss_caption', 1.732), ('total_loss', 9.57)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 21280 (epoch 15),
loss = OrderedDict([('loss_ce', 0.246), ('loss_counter', 0.104), ('loss_bbox', 0.014), ('loss_giou', 0.187), ('loss_self_iou', 0.004), ('cardinality_error', 7.632), ('loss_ce_0', 0.245), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.197), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.632), ('loss_caption_0', 1.646), ('loss_caption', 1.658), ('total_loss', 9.233)]),
time/iter = 0.152, bad_vid = 0.000
Validation results of iter 21328:
Bleu_1:0.1927355202990476
Bleu_2:0.11755729236198051
Bleu_3:0.06532950485231373
Bleu_4:0.0318670348131602
METEOR:0.08966953019840175
ROUGE_L:0.17549405824640266
CIDEr:0.5708533801009449
Recall:0.31055728552993345
Precision:0.4412863394810881
soda_c:0.08079399116249976
para_Bleu_1:0.3847850395827542
para_Bleu_2:0.23591168028694995
para_Bleu_3:0.14500000021146267
para_Bleu_4:0.09097906463153684
para_METEOR:0.1633729521776342
para_ROUGE_L:0.33764324525807
para_CIDEr:0.3225522700715415
overall score of iter 21328: 0.5769042868807126
Save model at iter 21328 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 21413 (epoch 16),
loss = OrderedDict([('loss_ce', 0.24), ('loss_counter', 0.107), ('loss_bbox', 0.014), ('loss_giou', 0.175), ('loss_self_iou', 0.004), ('cardinality_error', 7.541), ('loss_ce_0', 0.239), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.19), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.541), ('loss_caption_0', 1.637), ('loss_caption', 1.633), ('total_loss', 9.069)]),
time/iter = 0.698, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 21546 (epoch 16),
loss = OrderedDict([('loss_ce', 0.245), ('loss_counter', 0.102), ('loss_bbox', 0.013), ('loss_giou', 0.172), ('loss_self_iou', 0.004), ('cardinality_error', 7.624), ('loss_ce_0', 0.243), ('loss_counter_0', 0.103), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.185), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.624), ('loss_caption_0', 1.773), ('loss_caption', 1.784), ('total_loss', 9.621)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 21679 (epoch 16),
loss = OrderedDict([('loss_ce', 0.239), ('loss_counter', 0.107), ('loss_bbox', 0.014), ('loss_giou', 0.181), ('loss_self_iou', 0.004), ('cardinality_error', 7.992), ('loss_ce_0', 0.238), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.194), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.992), ('loss_caption_0', 1.809), ('loss_caption', 1.805), ('total_loss', 9.791)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 21812 (epoch 16),
loss = OrderedDict([('loss_ce', 0.246), ('loss_counter', 0.107), ('loss_bbox', 0.013), ('loss_giou', 0.179), ('loss_self_iou', 0.003), ('cardinality_error', 7.677), ('loss_ce_0', 0.25), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.013), ('loss_giou_0', 0.19), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.677), ('loss_caption_0', 1.674), ('loss_caption', 1.676), ('total_loss', 9.277)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 21945 (epoch 16),
loss = OrderedDict([('loss_ce', 0.244), ('loss_counter', 0.108), ('loss_bbox', 0.014), ('loss_giou', 0.192), ('loss_self_iou', 0.004), ('cardinality_error', 7.865), ('loss_ce_0', 0.244), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.206), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.865), ('loss_caption_0', 1.713), ('loss_caption', 1.714), ('total_loss', 9.531)]),
time/iter = 0.155, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 22078 (epoch 16),
loss = OrderedDict([('loss_ce', 0.251), ('loss_counter', 0.11), ('loss_bbox', 0.014), ('loss_giou', 0.19), ('loss_self_iou', 0.005), ('cardinality_error', 7.707), ('loss_ce_0', 0.247), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.202), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.707), ('loss_caption_0', 1.772), ('loss_caption', 1.758), ('total_loss', 9.738)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 22211 (epoch 16),
loss = OrderedDict([('loss_ce', 0.249), ('loss_counter', 0.101), ('loss_bbox', 0.013), ('loss_giou', 0.18), ('loss_self_iou', 0.005), ('cardinality_error', 7.541), ('loss_ce_0', 0.249), ('loss_counter_0', 0.101), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.193), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.541), ('loss_caption_0', 1.665), ('loss_caption', 1.66), ('total_loss', 9.243)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 22344 (epoch 16),
loss = OrderedDict([('loss_ce', 0.246), ('loss_counter', 0.113), ('loss_bbox', 0.015), ('loss_giou', 0.187), ('loss_self_iou', 0.004), ('cardinality_error', 8.008), ('loss_ce_0', 0.248), ('loss_counter_0', 0.115), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.202), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 8.008), ('loss_caption_0', 1.799), ('loss_caption', 1.784), ('total_loss', 9.823)]),
time/iter = 0.163, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 22477 (epoch 16),
loss = OrderedDict([('loss_ce', 0.246), ('loss_counter', 0.102), ('loss_bbox', 0.013), ('loss_giou', 0.184), ('loss_self_iou', 0.004), ('cardinality_error', 7.699), ('loss_ce_0', 0.247), ('loss_counter_0', 0.104), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.197), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.699), ('loss_caption_0', 1.722), ('loss_caption', 1.733), ('total_loss', 9.525)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 22610 (epoch 16),
loss = OrderedDict([('loss_ce', 0.243), ('loss_counter', 0.106), ('loss_bbox', 0.014), ('loss_giou', 0.188), ('loss_self_iou', 0.004), ('cardinality_error', 7.729), ('loss_ce_0', 0.245), ('loss_counter_0', 0.106), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.2), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.729), ('loss_caption_0', 1.664), ('loss_caption', 1.667), ('total_loss', 9.297)]),
time/iter = 0.154, bad_vid = 0.000
Validation results of iter 22661:
Bleu_1:0.1905629005997804
Bleu_2:0.11689699082903934
Bleu_3:0.06544029555928756
Bleu_4:0.03330988693345351
METEOR:0.08938496175202132
ROUGE_L:0.17298359351524648
CIDEr:0.5732307929342625
Recall:0.309604513071417
Precision:0.43046524955715343
soda_c:0.08056479007503722
para_Bleu_1:0.3975304274857351
para_Bleu_2:0.24253918136446623
para_Bleu_3:0.14848895422464012
para_Bleu_4:0.09337330751749118
para_METEOR:0.16677196164785574
para_ROUGE_L:0.33750187221117683
para_CIDEr:0.31278894258081524
overall score of iter 22661: 0.5729342117461622
Save model at iter 22661 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 22743 (epoch 17),
loss = OrderedDict([('loss_ce', 0.244), ('loss_counter', 0.108), ('loss_bbox', 0.015), ('loss_giou', 0.196), ('loss_self_iou', 0.005), ('cardinality_error', 7.714), ('loss_ce_0', 0.244), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.017), ('loss_giou_0', 0.21), ('loss_self_iou_0', 0.006), ('cardinality_error_0', 7.714), ('loss_caption_0', 1.773), ('loss_caption', 1.775), ('total_loss', 9.803)]),
time/iter = 0.714, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 22876 (epoch 17),
loss = OrderedDict([('loss_ce', 0.245), ('loss_counter', 0.11), ('loss_bbox', 0.013), ('loss_giou', 0.181), ('loss_self_iou', 0.004), ('cardinality_error', 7.774), ('loss_ce_0', 0.249), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.192), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.774), ('loss_caption_0', 1.76), ('loss_caption', 1.759), ('total_loss', 9.631)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 23009 (epoch 17),
loss = OrderedDict([('loss_ce', 0.237), ('loss_counter', 0.105), ('loss_bbox', 0.012), ('loss_giou', 0.171), ('loss_self_iou', 0.003), ('cardinality_error', 7.872), ('loss_ce_0', 0.237), ('loss_counter_0', 0.106), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.183), ('loss_self_iou_0', 0.003), ('cardinality_error_0', 7.872), ('loss_caption_0', 1.69), ('loss_caption', 1.688), ('total_loss', 9.229)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 23142 (epoch 17),
loss = OrderedDict([('loss_ce', 0.242), ('loss_counter', 0.098), ('loss_bbox', 0.013), ('loss_giou', 0.177), ('loss_self_iou', 0.004), ('cardinality_error', 7.744), ('loss_ce_0', 0.239), ('loss_counter_0', 0.1), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.19), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.744), ('loss_caption_0', 1.66), ('loss_caption', 1.663), ('total_loss', 9.173)]),
time/iter = 0.157, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 23275 (epoch 17),
loss = OrderedDict([('loss_ce', 0.242), ('loss_counter', 0.108), ('loss_bbox', 0.014), ('loss_giou', 0.183), ('loss_self_iou', 0.004), ('cardinality_error', 7.82), ('loss_ce_0', 0.242), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.197), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.82), ('loss_caption_0', 1.727), ('loss_caption', 1.741), ('total_loss', 9.535)]),
time/iter = 0.160, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 23408 (epoch 17),
loss = OrderedDict([('loss_ce', 0.235), ('loss_counter', 0.104), ('loss_bbox', 0.014), ('loss_giou', 0.173), ('loss_self_iou', 0.004), ('cardinality_error', 7.083), ('loss_ce_0', 0.235), ('loss_counter_0', 0.104), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.182), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.083), ('loss_caption_0', 1.678), ('loss_caption', 1.68), ('total_loss', 9.181)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 23541 (epoch 17),
loss = OrderedDict([('loss_ce', 0.25), ('loss_counter', 0.112), ('loss_bbox', 0.013), ('loss_giou', 0.185), ('loss_self_iou', 0.003), ('cardinality_error', 7.782), ('loss_ce_0', 0.253), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.197), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.782), ('loss_caption_0', 1.686), ('loss_caption', 1.674), ('total_loss', 9.361)]),
time/iter = 0.158, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 23674 (epoch 17),
loss = OrderedDict([('loss_ce', 0.242), ('loss_counter', 0.104), ('loss_bbox', 0.013), ('loss_giou', 0.175), ('loss_self_iou', 0.004), ('cardinality_error', 7.699), ('loss_ce_0', 0.242), ('loss_counter_0', 0.106), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.188), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.699), ('loss_caption_0', 1.734), ('loss_caption', 1.755), ('total_loss', 9.502)]),
time/iter = 0.169, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 23807 (epoch 17),
loss = OrderedDict([('loss_ce', 0.247), ('loss_counter', 0.109), ('loss_bbox', 0.013), ('loss_giou', 0.188), ('loss_self_iou', 0.004), ('cardinality_error', 8.023), ('loss_ce_0', 0.248), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.199), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 8.023), ('loss_caption_0', 1.838), ('loss_caption', 1.842), ('total_loss', 10.01)]),
time/iter = 0.176, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 23940 (epoch 17),
loss = OrderedDict([('loss_ce', 0.242), ('loss_counter', 0.107), ('loss_bbox', 0.013), ('loss_giou', 0.178), ('loss_self_iou', 0.004), ('cardinality_error', 7.789), ('loss_ce_0', 0.246), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.189), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.789), ('loss_caption_0', 1.661), ('loss_caption', 1.655), ('total_loss', 9.188)]),
time/iter = 0.168, bad_vid = 0.000
Validation results of iter 23994:
Bleu_1:0.19099469488969467
Bleu_2:0.11646897839764006
Bleu_3:0.06451308365995856
Bleu_4:0.032200079484133
METEOR:0.08912416771202449
ROUGE_L:0.1730757893125124
CIDEr:0.5693051160396969
Recall:0.3097042977992106
Precision:0.43274547601681085
soda_c:0.08084297498321232
para_Bleu_1:0.3924031546442418
para_Bleu_2:0.23911474626028398
para_Bleu_3:0.14600811918196227
para_Bleu_4:0.09107950853175292
para_METEOR:0.16594454181978452
para_ROUGE_L:0.33729101832099057
para_CIDEr:0.30892642009784
overall score of iter 23994: 0.5659504704493774
Save model at iter 23994 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 24073 (epoch 18),
loss = OrderedDict([('loss_ce', 0.244), ('loss_counter', 0.11), ('loss_bbox', 0.012), ('loss_giou', 0.178), ('loss_self_iou', 0.003), ('cardinality_error', 7.97), ('loss_ce_0', 0.246), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.013), ('loss_giou_0', 0.191), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.97), ('loss_caption_0', 1.689), ('loss_caption', 1.683), ('total_loss', 9.309)]),
time/iter = 0.720, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 24206 (epoch 18),
loss = OrderedDict([('loss_ce', 0.237), ('loss_counter', 0.118), ('loss_bbox', 0.013), ('loss_giou', 0.183), ('loss_self_iou', 0.005), ('cardinality_error', 8.286), ('loss_ce_0', 0.236), ('loss_counter_0', 0.118), ('loss_bbox_0', 0.013), ('loss_giou_0', 0.195), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 8.286), ('loss_caption_0', 1.712), ('loss_caption', 1.715), ('total_loss', 9.432)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 24339 (epoch 18),
loss = OrderedDict([('loss_ce', 0.245), ('loss_counter', 0.098), ('loss_bbox', 0.012), ('loss_giou', 0.167), ('loss_self_iou', 0.003), ('cardinality_error', 7.316), ('loss_ce_0', 0.247), ('loss_counter_0', 0.099), ('loss_bbox_0', 0.013), ('loss_giou_0', 0.179), ('loss_self_iou_0', 0.003), ('cardinality_error_0', 7.316), ('loss_caption_0', 1.695), ('loss_caption', 1.701), ('total_loss', 9.257)]),
time/iter = 0.159, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 24472 (epoch 18),
loss = OrderedDict([('loss_ce', 0.243), ('loss_counter', 0.108), ('loss_bbox', 0.013), ('loss_giou', 0.176), ('loss_self_iou', 0.003), ('cardinality_error', 7.459), ('loss_ce_0', 0.248), ('loss_counter_0', 0.109), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.187), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.459), ('loss_caption_0', 1.699), ('loss_caption', 1.699), ('total_loss', 9.337)]),
time/iter = 0.158, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 24605 (epoch 18),
loss = OrderedDict([('loss_ce', 0.242), ('loss_counter', 0.103), ('loss_bbox', 0.014), ('loss_giou', 0.18), ('loss_self_iou', 0.004), ('cardinality_error', 7.812), ('loss_ce_0', 0.243), ('loss_counter_0', 0.104), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.189), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.812), ('loss_caption_0', 1.775), ('loss_caption', 1.773), ('total_loss', 9.644)]),
time/iter = 0.163, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 24738 (epoch 18),
loss = OrderedDict([('loss_ce', 0.243), ('loss_counter', 0.101), ('loss_bbox', 0.016), ('loss_giou', 0.187), ('loss_self_iou', 0.004), ('cardinality_error', 7.556), ('loss_ce_0', 0.246), ('loss_counter_0', 0.103), ('loss_bbox_0', 0.016), ('loss_giou_0', 0.196), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.556), ('loss_caption_0', 1.727), ('loss_caption', 1.73), ('total_loss', 9.525)]),
time/iter = 0.166, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 24871 (epoch 18),
loss = OrderedDict([('loss_ce', 0.239), ('loss_counter', 0.104), ('loss_bbox', 0.013), ('loss_giou', 0.181), ('loss_self_iou', 0.004), ('cardinality_error', 7.692), ('loss_ce_0', 0.241), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.192), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.692), ('loss_caption_0', 1.77), ('loss_caption', 1.773), ('total_loss', 9.641)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 25004 (epoch 18),
loss = OrderedDict([('loss_ce', 0.246), ('loss_counter', 0.109), ('loss_bbox', 0.013), ('loss_giou', 0.186), ('loss_self_iou', 0.004), ('cardinality_error', 8.143), ('loss_ce_0', 0.247), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.197), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 8.143), ('loss_caption_0', 1.692), ('loss_caption', 1.684), ('total_loss', 9.379)]),
time/iter = 0.151, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 25137 (epoch 18),
loss = OrderedDict([('loss_ce', 0.245), ('loss_counter', 0.111), ('loss_bbox', 0.014), ('loss_giou', 0.179), ('loss_self_iou', 0.004), ('cardinality_error', 7.88), ('loss_ce_0', 0.245), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.192), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.88), ('loss_caption_0', 1.691), ('loss_caption', 1.696), ('total_loss', 9.347)]),
time/iter = 0.154, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 25270 (epoch 18),
loss = OrderedDict([('loss_ce', 0.237), ('loss_counter', 0.103), ('loss_bbox', 0.014), ('loss_giou', 0.185), ('loss_self_iou', 0.004), ('cardinality_error', 7.767), ('loss_ce_0', 0.238), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.196), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.767), ('loss_caption_0', 1.687), ('loss_caption', 1.694), ('total_loss', 9.34)]),
time/iter = 0.146, bad_vid = 0.000
Validation results of iter 25327:
Bleu_1:0.19191750615066444
Bleu_2:0.11783589874301872
Bleu_3:0.06597231596326529
Bleu_4:0.03167603834812624
METEOR:0.08996609888818348
ROUGE_L:0.1746391859525846
CIDEr:0.5689023016363987
Recall:0.31503357525649683
Precision:0.4376628112951966
soda_c:0.08097707611185051
para_Bleu_1:0.3977375551078834
para_Bleu_2:0.24323062675170298
para_Bleu_3:0.1488548587270082
para_Bleu_4:0.09292110149283073
para_METEOR:0.16716298804356167
para_ROUGE_L:0.33781551083855066
para_CIDEr:0.31014493696748857
overall score of iter 25327: 0.570229026503881
Save model at iter 25327 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
ID seq2-ft(mix)-gt_percent-1.0 iter 25403 (epoch 19),
loss = OrderedDict([('loss_ce', 0.247), ('loss_counter', 0.102), ('loss_bbox', 0.013), ('loss_giou', 0.176), ('loss_self_iou', 0.005), ('cardinality_error', 7.429), ('loss_ce_0', 0.248), ('loss_counter_0', 0.105), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.186), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.429), ('loss_caption_0', 1.705), ('loss_caption', 1.695), ('total_loss', 9.343)]),
time/iter = 0.723, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 25536 (epoch 19),
loss = OrderedDict([('loss_ce', 0.241), ('loss_counter', 0.107), ('loss_bbox', 0.013), ('loss_giou', 0.189), ('loss_self_iou', 0.003), ('cardinality_error', 7.887), ('loss_ce_0', 0.246), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.196), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.887), ('loss_caption_0', 1.717), ('loss_caption', 1.729), ('total_loss', 9.517)]),
time/iter = 0.163, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 25669 (epoch 19),
loss = OrderedDict([('loss_ce', 0.239), ('loss_counter', 0.111), ('loss_bbox', 0.014), ('loss_giou', 0.177), ('loss_self_iou', 0.004), ('cardinality_error', 7.707), ('loss_ce_0', 0.243), ('loss_counter_0', 0.111), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.186), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.707), ('loss_caption_0', 1.718), ('loss_caption', 1.711), ('total_loss', 9.385)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 25802 (epoch 19),
loss = OrderedDict([('loss_ce', 0.24), ('loss_counter', 0.111), ('loss_bbox', 0.013), ('loss_giou', 0.183), ('loss_self_iou', 0.004), ('cardinality_error', 8.173), ('loss_ce_0', 0.242), ('loss_counter_0', 0.113), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.193), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 8.173), ('loss_caption_0', 1.732), ('loss_caption', 1.735), ('total_loss', 9.515)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 25935 (epoch 19),
loss = OrderedDict([('loss_ce', 0.241), ('loss_counter', 0.105), ('loss_bbox', 0.013), ('loss_giou', 0.179), ('loss_self_iou', 0.005), ('cardinality_error', 7.82), ('loss_ce_0', 0.241), ('loss_counter_0', 0.107), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.192), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.82), ('loss_caption_0', 1.626), ('loss_caption', 1.628), ('total_loss', 9.063)]),
time/iter = 0.153, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 26068 (epoch 19),
loss = OrderedDict([('loss_ce', 0.24), ('loss_counter', 0.102), ('loss_bbox', 0.014), ('loss_giou', 0.182), ('loss_self_iou', 0.005), ('cardinality_error', 7.444), ('loss_ce_0', 0.243), ('loss_counter_0', 0.103), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.19), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.444), ('loss_caption_0', 1.697), ('loss_caption', 1.701), ('total_loss', 9.35)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 26201 (epoch 19),
loss = OrderedDict([('loss_ce', 0.239), ('loss_counter', 0.097), ('loss_bbox', 0.014), ('loss_giou', 0.168), ('loss_self_iou', 0.005), ('cardinality_error', 7.301), ('loss_ce_0', 0.237), ('loss_counter_0', 0.099), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.181), ('loss_self_iou_0', 0.005), ('cardinality_error_0', 7.301), ('loss_caption_0', 1.702), ('loss_caption', 1.703), ('total_loss', 9.254)]),
time/iter = 0.161, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 26334 (epoch 19),
loss = OrderedDict([('loss_ce', 0.238), ('loss_counter', 0.112), ('loss_bbox', 0.013), ('loss_giou', 0.174), ('loss_self_iou', 0.003), ('cardinality_error', 7.827), ('loss_ce_0', 0.242), ('loss_counter_0', 0.112), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.188), ('loss_self_iou_0', 0.003), ('cardinality_error_0', 7.827), ('loss_caption_0', 1.729), ('loss_caption', 1.725), ('total_loss', 9.424)]),
time/iter = 0.164, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 26467 (epoch 19),
loss = OrderedDict([('loss_ce', 0.247), ('loss_counter', 0.109), ('loss_bbox', 0.014), ('loss_giou', 0.181), ('loss_self_iou', 0.003), ('cardinality_error', 8.023), ('loss_ce_0', 0.245), ('loss_counter_0', 0.11), ('loss_bbox_0', 0.015), ('loss_giou_0', 0.195), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 8.023), ('loss_caption_0', 1.751), ('loss_caption', 1.746), ('total_loss', 9.586)]),
time/iter = 0.162, bad_vid = 0.000
ID seq2-ft(mix)-gt_percent-1.0 iter 26600 (epoch 19),
loss = OrderedDict([('loss_ce', 0.242), ('loss_counter', 0.108), ('loss_bbox', 0.014), ('loss_giou', 0.186), ('loss_self_iou', 0.004), ('cardinality_error', 7.902), ('loss_ce_0', 0.242), ('loss_counter_0', 0.108), ('loss_bbox_0', 0.014), ('loss_giou_0', 0.196), ('loss_self_iou_0', 0.004), ('cardinality_error_0', 7.902), ('loss_caption_0', 1.727), ('loss_caption', 1.737), ('total_loss', 9.533)]),
time/iter = 0.156, bad_vid = 0.000
Validation results of iter 26660:
Bleu_1:0.1908811984292725
Bleu_2:0.11664270449592412
Bleu_3:0.06546844271584715
Bleu_4:0.03266470081303028
METEOR:0.08981101020496235
ROUGE_L:0.17382953846907112
CIDEr:0.5716745559959934
Recall:0.31292035599338697
Precision:0.4345220728699943
soda_c:0.08127095018359767
para_Bleu_1:0.40170065588267356
para_Bleu_2:0.2447870245859959
para_Bleu_3:0.14990588787772124
para_Bleu_4:0.09419227635900729
para_METEOR:0.16780671784283924
para_ROUGE_L:0.33845945539662686
para_CIDEr:0.3198675630646056
overall score of iter 26660: 0.5818665572664521
Save model at iter 26660 to /mnt/data/pjlab-3090-sport/wuhao/logs/dibs/howto-yc2_yc2_ori_pbox(similarity_op_order_v2)_Uni/similarity_op_order_v2_topf25_iter3_r1_th1_refine_aug(8,0.02)_top3_2stage_ins_cap_topk_mil_coef0_noFocal_seq2-ft(mix)-gt_percent-1.0_1/model-last.pth.
Save info to info.json
Best epoch: 11
Best Model Performance:
Bleu_1:0.1989422607268001
Bleu_2:0.12223038556953512
Bleu_3:0.06835990671747892
Bleu_4:0.03486159828438583
METEOR:0.09408978838449876
ROUGE_L:0.18200142867223945
CIDEr:0.593480700759431
Recall:0.30795469953703025
Precision:0.4513424333993264
soda_c:0.0796861065455984
para_Bleu_1:0.39594509057043764
para_Bleu_2:0.24087109399513515
para_Bleu_3:0.14790262814870953
para_Bleu_4:0.09321042711819619
para_METEOR:0.1655617051143519
para_ROUGE_L:0.3391051008488012
para_CIDEr:0.32807196750555834
avg_proposal_number:-1
Best Overall Score epoch11: 1.5265537286258848
|