2024-08-16,03:25:52 | INFO | Running with a single process. Device cuda:0.
2024-08-16,03:25:52 | INFO | Loaded Align-fMRI-Encoder-small model config.
2024-08-16,03:25:54 | INFO | Model:
2024-08-16,03:25:54 | INFO | CustomTextCLIP(
  (visual): VisionTransformer(
    (conv1): Conv1d(1, 768, kernel_size=(32,), stride=(32,), bias=False)
    (patch_dropout): Identity()
    (ln_pre): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (transformer): Transformer(
      (resblocks): ModuleList(
        (0-11): 12 x ResidualAttentionBlock(
          (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          (attn): MultiheadAttention(
            (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
          )
          (ls_1): Identity()
          (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          (mlp): Sequential(
            (c_fc): Linear(in_features=768, out_features=3072, bias=True)
            (gelu): GELU(approximate='none')
            (c_proj): Linear(in_features=3072, out_features=768, bias=True)
          )
          (ls_2): Identity()
        )
      )
    )
    (ln_post): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (text): HFTextEncoder(
    (transformer): RobertaModel(
      (embeddings): RobertaEmbeddings(
        (word_embeddings): Embedding(50265, 768, padding_idx=1)
        (position_embeddings): Embedding(514, 768, padding_idx=1)
        (token_type_embeddings): Embedding(1, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): RobertaEncoder(
        (layer): ModuleList(
          (0-11): 12 x RobertaLayer(
            (attention): RobertaAttention(
              (self): RobertaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): RobertaSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): RobertaIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): RobertaOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
    (pooler): MeanPooler()
    (proj): Sequential(
      (0): Linear(in_features=768, out_features=640, bias=False)
      (1): GELU(approximate='none')
      (2): Linear(in_features=640, out_features=512, bias=False)
    )
  )
)
2024-08-16,03:25:54 | INFO | Params:
2024-08-16,03:25:54 | INFO | accum_freq: 1
2024-08-16,03:25:54 | INFO | aug_cfg: {}
2024-08-16,03:25:54 | INFO | batch_size: 256
2024-08-16,03:25:54 | INFO | beta1: 0.9
2024-08-16,03:25:54 | INFO | beta2: 0.999
2024-08-16,03:25:54 | INFO | checkpoint_path: ./logs/2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp/checkpoints
2024-08-16,03:25:54 | INFO | coca_caption_loss_weight: 2.0
2024-08-16,03:25:54 | INFO | coca_contrastive_loss_weight: 1.0
2024-08-16,03:25:54 | INFO | copy_codebase: False
2024-08-16,03:25:54 | INFO | csv_caption_key: title
2024-08-16,03:25:54 | INFO | csv_img_key: filepath
2024-08-16,03:25:54 | INFO | csv_separator: ,
2024-08-16,03:25:54 | INFO | dataset_resampled: False
2024-08-16,03:25:54 | INFO | dataset_type: auto
2024-08-16,03:25:54 | INFO | ddp_static_graph: False
2024-08-16,03:25:54 | INFO | debug: False
2024-08-16,03:25:54 | INFO | delete_previous_checkpoint: False
2024-08-16,03:25:54 | INFO | device: cuda:0
2024-08-16,03:25:54 | INFO | dist_backend: nccl
2024-08-16,03:25:54 | INFO | dist_url: env://
2024-08-16,03:25:54 | INFO | distill: False
2024-08-16,03:25:54 | INFO | distill_model: None
2024-08-16,03:25:54 | INFO | distill_pretrained: None
2024-08-16,03:25:54 | INFO | distributed: False
2024-08-16,03:25:54 | INFO | epochs: 100
2024-08-16,03:25:54 | INFO | epochs_cooldown: None
2024-08-16,03:25:54 | INFO | eps: 1e-08
2024-08-16,03:25:54 | INFO | force_custom_text: False
2024-08-16,03:25:54 | INFO | force_image_size: None
2024-08-16,03:25:54 | INFO | force_patch_dropout: None
2024-08-16,03:25:54 | INFO | force_quick_gelu: False
2024-08-16,03:25:54 | INFO | gather_with_grad: False
2024-08-16,03:25:54 | INFO | grad_checkpointing: False
2024-08-16,03:25:54 | INFO | grad_clip_norm: None
2024-08-16,03:25:54 | INFO | horovod: False
2024-08-16,03:25:54 | INFO | image_interpolation: None
2024-08-16,03:25:54 | INFO | image_mean: None
2024-08-16,03:25:54 | INFO | image_resize_mode: None
2024-08-16,03:25:54 | INFO | image_std: None
2024-08-16,03:25:54 | INFO | imagenet_v2: None
2024-08-16,03:25:54 | INFO | imagenet_val: None
2024-08-16,03:25:54 | INFO | local_loss: False
2024-08-16,03:25:54 | INFO | local_rank: 0
2024-08-16,03:25:54 | INFO | lock_image: False
2024-08-16,03:25:54 | INFO | lock_image_freeze_bn_stats: False
2024-08-16,03:25:54 | INFO | lock_image_unlocked_groups: 0
2024-08-16,03:25:54 | INFO | lock_text: True
2024-08-16,03:25:54 | INFO | lock_text_freeze_layer_norm: False
2024-08-16,03:25:54 | INFO | lock_text_unlocked_layers: 0
2024-08-16,03:25:54 | INFO | log_every_n_steps: 100
2024-08-16,03:25:54 | INFO | log_level: 20
2024-08-16,03:25:54 | INFO | log_local: False
2024-08-16,03:25:54 | INFO | log_path: ./logs/2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp/out.log
2024-08-16,03:25:54 | INFO | logs: ./logs/
2024-08-16,03:25:54 | INFO | lr: 0.0005
2024-08-16,03:25:54 | INFO | lr_cooldown_end: 0.0
2024-08-16,03:25:54 | INFO | lr_cooldown_power: 1.0
2024-08-16,03:25:54 | INFO | lr_scheduler: cosine
2024-08-16,03:25:54 | INFO | model: Align-fMRI-Encoder-small
2024-08-16,03:25:54 | INFO | name: 2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp
2024-08-16,03:25:54 | INFO | no_set_device_rank: False
2024-08-16,03:25:54 | INFO | precision: amp
2024-08-16,03:25:54 | INFO | pretrained:
2024-08-16,03:25:54 | INFO | pretrained_image: False
2024-08-16,03:25:54 | INFO | rank: 0
2024-08-16,03:25:54 | INFO | remote_sync: None
2024-08-16,03:25:54 | INFO | remote_sync_frequency: 300
2024-08-16,03:25:54 | INFO | remote_sync_protocol: s3
2024-08-16,03:25:54 | INFO | report_to:
2024-08-16,03:25:54 | INFO | resume: None
2024-08-16,03:25:54 | INFO | save_frequency: 1
2024-08-16,03:25:54 | INFO | save_most_recent: False
2024-08-16,03:25:54 | INFO | seed: 0
2024-08-16,03:25:54 | INFO | siglip: False
2024-08-16,03:25:54 | INFO | skip_scheduler: False
2024-08-16,03:25:54 | INFO | tensorboard: False
2024-08-16,03:25:54 | INFO | tensorboard_path:
2024-08-16,03:25:54 | INFO | torchcompile: False
2024-08-16,03:25:54 | INFO | torchscript: False
2024-08-16,03:25:54 | INFO | trace: False
2024-08-16,03:25:54 | INFO | train_data: /root/autodl-tmp/.autodl/Projects/fMRI2TextAligner/notebooks/train.csv
2024-08-16,03:25:54 | INFO | train_data_upsampling_factors: None
2024-08-16,03:25:54 | INFO | train_num_samples: None
2024-08-16,03:25:54 | INFO | use_bn_sync: False
2024-08-16,03:25:54 | INFO | use_bnb_linear: None
2024-08-16,03:25:54 | INFO | val_data: /root/autodl-tmp/.autodl/Projects/fMRI2TextAligner/notebooks/val.csv
2024-08-16,03:25:54 | INFO | val_frequency: 1
2024-08-16,03:25:54 | INFO | val_num_samples: None
2024-08-16,03:25:54 | INFO | wandb: False
2024-08-16,03:25:54 | INFO | wandb_notes:
2024-08-16,03:25:54 | INFO | wandb_project_name: open-clip
2024-08-16,03:25:54 | INFO | warmup: 10000
2024-08-16,03:25:54 | INFO | wd: 0.2
2024-08-16,03:25:54 | INFO | workers: 4
2024-08-16,03:25:54 | INFO | world_size: 1
2024-08-16,03:25:54 | INFO | zeroshot_frequency: 2
2024-08-16,03:25:58 | INFO | Start epoch 0
2024-08-16,03:26:01 | INFO | Train Epoch: 0 [ 256/27000 (1%)] Data (t): 1.639 Batch (t): 3.538, 72.3607/s, 72.3607/s/gpu LR: 0.000000 Logit Scale: 14.286 Contrastive_loss: 5.5485 (5.5485) Loss: 5.5485 (5.5485)
2024-08-16,03:28:06 | INFO | Train Epoch: 0 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.520/s, 204.520/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5459 (5.5472) Loss: 5.5459 (5.5472)
2024-08-16,03:28:11 | INFO | Train Epoch: 0 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.359/s, 204.359/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5479 (5.5474) Loss: 5.5479 (5.5474)
2024-08-16,03:28:13 | INFO | Eval Epoch: 1 [256 / 3000] Clip Loss: 5.542412
2024-08-16,03:28:18 | INFO | Eval Epoch: 1 image_to_text_mean_rank: 1451.4443 image_to_text_median_rank: 1410.0000 image_to_text_R@1: 0.0003 image_to_text_R@5: 0.0023 image_to_text_R@10: 0.0060 text_to_image_mean_rank: 1439.4327 text_to_image_median_rank: 1409.0000 text_to_image_R@1: 0.0007 text_to_image_R@5: 0.0020 text_to_image_R@10: 0.0043 clip_val_loss: 5.5223 epoch: 1.0000 num_samples: 3000.0000
2024-08-16,03:28:19 | INFO | Start epoch 1
2024-08-16,03:28:22 | INFO | Train Epoch: 1 [ 256/27000 (1%)] Data (t): 1.447 Batch (t): 2.695, 95.0061/s, 95.0061/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5397 (5.5397) Loss: 5.5397 (5.5397)
2024-08-16,03:30:27 | INFO | Train Epoch: 1 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.426/s, 204.426/s/gpu LR: 0.000010 Logit Scale: 14.290 Contrastive_loss: 5.4991 (5.5194) Loss: 5.4991 (5.5194)
2024-08-16,03:30:32 | INFO | Train Epoch: 1 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.419/s, 204.419/s/gpu LR: 0.000010 Logit Scale: 14.290 Contrastive_loss: 5.4470 (5.4953) Loss: 5.4470 (5.4953)
2024-08-16,03:30:34 | INFO | Eval Epoch: 2 [256 / 3000] Clip Loss: 5.452873
2024-08-16,03:30:38 | INFO | Eval Epoch: 2 image_to_text_mean_rank: 1193.8437 image_to_text_median_rank: 1062.0000 image_to_text_R@1: 0.0013 image_to_text_R@5: 0.0037 image_to_text_R@10: 0.0067 text_to_image_mean_rank: 1196.7497 text_to_image_median_rank: 1078.0000 text_to_image_R@1: 0.0013 text_to_image_R@5: 0.0060 text_to_image_R@10: 0.0090 clip_val_loss: 5.4537 epoch: 2.0000 num_samples: 3000.0000
2024-08-16,03:30:40 | INFO | Start epoch 2
2024-08-16,03:30:42 | INFO | Train Epoch: 2 [ 256/27000 (1%)] Data (t): 1.420 Batch (t): 2.666, 96.0263/s, 96.0263/s/gpu LR: 0.000011 Logit Scale: 14.290 Contrastive_loss: 5.4566 (5.4566) Loss: 5.4566 (5.4566)
2024-08-16,03:32:48 | INFO | Train Epoch: 2 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.495/s, 204.495/s/gpu LR: 0.000016 Logit Scale: 14.324 Contrastive_loss: 5.0180 (5.2373) Loss: 5.0180 (5.2373)
2024-08-16,03:32:53 | INFO | Train Epoch: 2 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.358/s, 204.358/s/gpu LR: 0.000016 Logit Scale: 14.325 Contrastive_loss: 5.1190 (5.1979) Loss: 5.1190 (5.1979)
2024-08-16,03:32:54 | INFO | Eval Epoch: 3 [256 / 3000] Clip Loss: 5.063042
2024-08-16,03:32:59 | INFO | Eval Epoch: 3 image_to_text_mean_rank: 782.7933 image_to_text_median_rank: 592.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0083 image_to_text_R@10: 0.0157 text_to_image_mean_rank: 727.7393 text_to_image_median_rank: 536.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0117 text_to_image_R@10: 0.0230 clip_val_loss: 5.0752 epoch: 3.0000 num_samples: 3000.0000
2024-08-16,03:33:00 | INFO | Start epoch 3
2024-08-16,03:33:03 | INFO | Train Epoch: 3 [ 256/27000 (1%)] Data (t): 1.597 Batch (t): 2.842, 90.0810/s, 90.0810/s/gpu LR: 0.000016 Logit Scale: 14.326 Contrastive_loss: 5.1011 (5.1011) Loss: 5.1011 (5.1011)
2024-08-16,03:35:09 | INFO | Train Epoch: 3 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.253, 204.414/s, 204.414/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.9292 (5.0151) Loss: 4.9292 (5.0151)
2024-08-16,03:35:14 | INFO | Train Epoch: 3 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.467/s, 204.467/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.8818 (4.9707) Loss: 4.8818 (4.9707)
2024-08-16,03:35:15 | INFO | Eval Epoch: 4 [256 / 3000] Clip Loss: 4.868340
2024-08-16,03:35:20 | INFO | Eval Epoch: 4 image_to_text_mean_rank: 683.8327 image_to_text_median_rank: 457.0000 image_to_text_R@1: 0.0043 image_to_text_R@5: 0.0123 image_to_text_R@10: 0.0230 text_to_image_mean_rank: 612.9693 text_to_image_median_rank: 408.0000 text_to_image_R@1: 0.0033 text_to_image_R@5: 0.0143 text_to_image_R@10: 0.0283 clip_val_loss: 4.9190 epoch: 4.0000 num_samples: 3000.0000
2024-08-16,03:35:21 | INFO | Start epoch 4
2024-08-16,03:35:24 | INFO | Train Epoch: 4 [ 256/27000 (1%)] Data (t): 1.489 Batch (t): 2.735, 93.5960/s, 93.5960/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.6774 (4.6774) Loss: 4.6774 (4.6774)
2024-08-16,03:37:29 | INFO | Train Epoch: 4 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.427/s, 204.427/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.6595 (4.6684) Loss: 4.6595 (4.6684)
2024-08-16,03:37:34 | INFO | Train Epoch: 4 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.444/s, 204.444/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.7835 (4.7068) Loss: 4.7835 (4.7068)
2024-08-16,03:37:36 | INFO | Eval Epoch: 5 [256 / 3000] Clip Loss: 4.786659
2024-08-16,03:37:41 | INFO | Eval Epoch: 5 image_to_text_mean_rank: 620.0710 image_to_text_median_rank: 405.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0150 image_to_text_R@10: 0.0250 text_to_image_mean_rank: 564.3297 text_to_image_median_rank: 358.0000 text_to_image_R@1: 0.0047 text_to_image_R@5: 0.0173 text_to_image_R@10: 0.0367 clip_val_loss: 4.8233 epoch: 5.0000 num_samples: 3000.0000
2024-08-16,03:37:42 | INFO | Start epoch 5
2024-08-16,03:37:45 | INFO | Train Epoch: 5 [ 256/27000 (1%)] Data (t): 1.495 Batch (t): 2.740, 93.4321/s, 93.4321/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.5845 (4.5845) Loss: 4.5845 (4.5845)
2024-08-16,03:39:50 | INFO | Train Epoch: 5 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.440/s, 204.440/s/gpu LR: 0.000031 Logit Scale: 14.392 Contrastive_loss: 4.5224 (4.5534) Loss: 4.5224 (4.5534)
2024-08-16,03:39:55 | INFO | Train Epoch: 5 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.301/s, 204.301/s/gpu LR: 0.000031 Logit Scale: 14.393 Contrastive_loss: 4.4956 (4.5342) Loss: 4.4956 (4.5342)
2024-08-16,03:39:56 | INFO | Eval Epoch: 6 [256 / 3000] Clip Loss: 4.666817
2024-08-16,03:40:01 | INFO | Eval Epoch: 6 image_to_text_mean_rank: 560.2250 image_to_text_median_rank: 363.0000 image_to_text_R@1: 0.0037 image_to_text_R@5: 0.0190 image_to_text_R@10: 0.0360 text_to_image_mean_rank: 524.7307 text_to_image_median_rank: 326.0000 text_to_image_R@1: 0.0037 text_to_image_R@5: 0.0223 text_to_image_R@10: 0.0403 clip_val_loss: 4.7456 epoch: 6.0000 num_samples: 3000.0000
2024-08-16,03:40:03 | INFO | Start epoch 6
2024-08-16,03:40:05 | INFO | Train Epoch: 6 [ 256/27000 (1%)] Data (t): 1.443 Batch (t): 2.691, 95.1276/s, 95.1276/s/gpu LR: 0.000032 Logit Scale: 14.394 Contrastive_loss: 4.1144 (4.1144) Loss: 4.1144 (4.1144)
2024-08-16,03:42:10 | INFO | Train Epoch: 6 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.240/s, 204.240/s/gpu LR: 0.000037 Logit Scale: 14.484 Contrastive_loss: 4.3783 (4.2463) Loss: 4.3783 (4.2463)
2024-08-16,03:42:15 | INFO | Train Epoch: 6 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.514/s, 204.514/s/gpu LR: 0.000037 Logit Scale: 14.486 Contrastive_loss: 4.4021 (4.2983) Loss: 4.4021 (4.2983)
2024-08-16,03:42:17 | INFO | Eval Epoch: 7 [256 / 3000] Clip Loss: 4.655993
2024-08-16,03:42:22 | INFO | Eval Epoch: 7 image_to_text_mean_rank: 563.7200 image_to_text_median_rank: 352.0000 image_to_text_R@1: 0.0037 image_to_text_R@5: 0.0177 image_to_text_R@10: 0.0317 text_to_image_mean_rank: 515.4990 text_to_image_median_rank: 306.0000 text_to_image_R@1: 0.0067 text_to_image_R@5: 0.0250 text_to_image_R@10: 0.0453 clip_val_loss: 4.7377 epoch: 7.0000 num_samples: 3000.0000
2024-08-16,03:42:23 | INFO | Start epoch 7
2024-08-16,03:42:26 | INFO | Train Epoch: 7 [ 256/27000 (1%)] Data (t): 1.452 Batch (t): 2.698, 94.8848/s, 94.8848/s/gpu LR: 0.000037 Logit Scale: 14.487 Contrastive_loss: 3.9120 (3.9120) Loss: 3.9120 (3.9120)
2024-08-16,03:44:31 | INFO | Train Epoch: 7 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.415/s, 204.415/s/gpu LR: 0.000042 Logit Scale: 14.608 Contrastive_loss: 3.9964 (3.9542) Loss: 3.9964 (3.9542)
2024-08-16,03:44:36 | INFO | Train Epoch: 7 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.414/s, 204.414/s/gpu LR: 0.000042 Logit Scale: 14.612 Contrastive_loss: 3.9585 (3.9556) Loss: 3.9585 (3.9556)
2024-08-16,03:44:38 | INFO | Eval Epoch: 8 [256 / 3000] Clip Loss: 4.634455
2024-08-16,03:44:42 | INFO | Eval Epoch: 8 image_to_text_mean_rank: 551.6537 image_to_text_median_rank: 340.0000 image_to_text_R@1: 0.0050 image_to_text_R@5: 0.0223 image_to_text_R@10: 0.0390 text_to_image_mean_rank: 516.1967 text_to_image_median_rank: 309.0000 text_to_image_R@1: 0.0057 text_to_image_R@5: 0.0260 text_to_image_R@10: 0.0487 clip_val_loss: 4.7750 epoch: 8.0000 num_samples: 3000.0000
2024-08-16,03:44:44 | INFO | Start epoch 8
2024-08-16,03:44:47 | INFO | Train Epoch: 8 [ 256/27000 (1%)] Data (t): 1.500 Batch (t): 2.746, 93.2370/s, 93.2370/s/gpu LR: 0.000042 Logit Scale: 14.613 Contrastive_loss: 3.6235 (3.6235) Loss: 3.6235 (3.6235)
2024-08-16,03:46:52 | INFO | Train Epoch: 8 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.478/s, 204.478/s/gpu LR: 0.000047 Logit Scale: 14.756 Contrastive_loss: 3.9351 (3.7793) Loss: 3.9351 (3.7793)
2024-08-16,03:46:57 | INFO | Train Epoch: 8 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.524/s, 204.524/s/gpu LR: 0.000047 Logit Scale: 14.760 Contrastive_loss: 3.8363 (3.7983) Loss: 3.8363 (3.7983)
2024-08-16,03:46:58 | INFO | Eval Epoch: 9 [256 / 3000] Clip Loss: 4.764020
2024-08-16,03:47:03 | INFO | Eval Epoch: 9 image_to_text_mean_rank: 587.4340 image_to_text_median_rank: 353.0000 image_to_text_R@1: 0.0040 image_to_text_R@5: 0.0187 image_to_text_R@10: 0.0360 text_to_image_mean_rank: 546.2733 text_to_image_median_rank: 318.0000 text_to_image_R@1: 0.0053 text_to_image_R@5: 0.0230 text_to_image_R@10: 0.0460 clip_val_loss: 4.8689 epoch: 9.0000 num_samples: 3000.0000
2024-08-16,03:47:04 | INFO | Start epoch 9
2024-08-16,03:47:07 | INFO | Train Epoch: 9 [ 256/27000 (1%)] Data (t): 1.432 Batch (t): 2.678, 95.6108/s, 95.6108/s/gpu LR: 0.000047 Logit Scale: 14.761 Contrastive_loss: 3.1292 (3.1292) Loss: 3.1292 (3.1292)
2024-08-16,03:49:12 | INFO | Train Epoch: 9 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.417/s, 204.417/s/gpu LR: 0.000052 Logit Scale: 14.924 Contrastive_loss: 3.3670 (3.2481) Loss: 3.3670 (3.2481)
2024-08-16,03:49:17 | INFO | Train Epoch: 9 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.451/s, 204.451/s/gpu LR: 0.000053 Logit Scale: 14.929 Contrastive_loss: 3.4293 (3.3085) Loss: 3.4293 (3.3085)
2024-08-16,03:49:19 | INFO | Eval Epoch: 10 [256 / 3000] Clip Loss: 4.830852
2024-08-16,03:49:24 | INFO | Eval Epoch: 10 image_to_text_mean_rank: 604.2760 image_to_text_median_rank: 366.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0187 image_to_text_R@10: 0.0343 text_to_image_mean_rank: 571.8533 text_to_image_median_rank: 339.0000 text_to_image_R@1: 0.0050 text_to_image_R@5: 0.0233 text_to_image_R@10: 0.0443 clip_val_loss: 4.9887 epoch: 10.0000 num_samples: 3000.0000
2024-08-16,03:49:25 | INFO | Start epoch 10
2024-08-16,03:49:28 | INFO | Train Epoch: 10 [ 256/27000 (1%)] Data (t): 1.458 Batch (t): 2.703, 94.7107/s, 94.7107/s/gpu LR: 0.000053 Logit Scale: 14.930 Contrastive_loss: 2.6423 (2.6423) Loss: 2.6423 (2.6423)
2024-08-16,03:51:33 | INFO | Train Epoch: 10 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.377/s, 204.377/s/gpu LR: 0.000058 Logit Scale: 15.109 Contrastive_loss: 2.7973 (2.7198) Loss: 2.7973 (2.7198)
2024-08-16,03:51:38 | INFO | Train Epoch: 10 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.584/s, 204.584/s/gpu LR: 0.000058 Logit Scale: 15.115 Contrastive_loss: 3.1476 (2.8624) Loss: 3.1476 (2.8624)
2024-08-16,03:51:40 | INFO | Eval Epoch: 11 [256 / 3000] Clip Loss: 4.916827
2024-08-16,03:51:44 | INFO | Eval Epoch: 11 image_to_text_mean_rank: 645.3330 image_to_text_median_rank: 392.0000 image_to_text_R@1: 0.0053 image_to_text_R@5: 0.0183 image_to_text_R@10: 0.0343 text_to_image_mean_rank: 606.6477 text_to_image_median_rank: 378.0000 text_to_image_R@1: 0.0057 text_to_image_R@5: 0.0220 text_to_image_R@10: 0.0393 clip_val_loss: 5.1492 epoch: 11.0000 num_samples: 3000.0000
2024-08-16,03:51:46 | INFO | Start epoch 11
2024-08-16,03:51:48 | INFO | Train Epoch: 11 [ 256/27000 (1%)] Data (t): 1.468 Batch (t): 2.714, 94.3351/s, 94.3351/s/gpu LR: 0.000058 Logit Scale: 15.116 Contrastive_loss: 2.1677 (2.1677) Loss: 2.1677 (2.1677)
2024-08-16,03:53:54 | INFO | Train Epoch: 11 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.580/s, 204.580/s/gpu LR: 0.000063 Logit Scale: 15.304 Contrastive_loss: 2.1280 (2.1479) Loss: 2.1280 (2.1479)
2024-08-16,03:53:59 | INFO | Train Epoch: 11 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.443/s, 204.443/s/gpu LR: 0.000063 Logit Scale: 15.311 Contrastive_loss: 2.1967 (2.1642) Loss: 2.1967 (2.1642)
2024-08-16,03:54:00 | INFO | Eval Epoch: 12 [256 / 3000] Clip Loss: 5.227501
2024-08-16,03:54:05 | INFO | Eval Epoch: 12 image_to_text_mean_rank: 701.0917 image_to_text_median_rank: 446.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0133 image_to_text_R@10: 0.0277 text_to_image_mean_rank: 672.1147 text_to_image_median_rank: 418.0000 text_to_image_R@1: 0.0060 text_to_image_R@5: 0.0197 text_to_image_R@10: 0.0397 clip_val_loss: 5.4125 epoch: 12.0000 num_samples: 3000.0000
2024-08-16,03:54:06 | INFO | Start epoch 12
2024-08-16,03:54:09 | INFO | Train Epoch: 12 [ 256/27000 (1%)] Data (t): 1.458 Batch (t): 2.704, 94.6843/s, 94.6843/s/gpu LR: 0.000063 Logit Scale: 15.313 Contrastive_loss: 1.4394 (1.4394) Loss: 1.4394 (1.4394)
2024-08-16,03:56:14 | INFO | Train Epoch: 12 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.444/s, 204.444/s/gpu LR: 0.000068 Logit Scale: 15.499 Contrastive_loss: 1.4908 (1.4651) Loss: 1.4908 (1.4651)
2024-08-16,03:56:19 | INFO | Train Epoch: 12 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.382/s, 204.382/s/gpu LR: 0.000068 Logit Scale: 15.506 Contrastive_loss: 1.5923 (1.5075) Loss: 1.5923 (1.5075)
2024-08-16,03:56:21 | INFO | Eval Epoch: 13 [256 / 3000] Clip Loss: 5.440553
2024-08-16,03:56:26 | INFO | Eval Epoch: 13 image_to_text_mean_rank: 746.4853 image_to_text_median_rank: 468.0000 image_to_text_R@1: 0.0023 image_to_text_R@5: 0.0130 image_to_text_R@10: 0.0290 text_to_image_mean_rank: 718.4413 text_to_image_median_rank: 455.0000 text_to_image_R@1: 0.0043 text_to_image_R@5: 0.0197 text_to_image_R@10: 0.0323 clip_val_loss: 5.6023 epoch: 13.0000 num_samples: 3000.0000
2024-08-16,03:56:27 | INFO | Start epoch 13
2024-08-16,03:56:30 | INFO | Train Epoch: 13 [ 256/27000 (1%)] Data (t): 1.452 Batch (t): 2.698, 94.8883/s, 94.8883/s/gpu LR: 0.000068 Logit Scale: 15.507 Contrastive_loss: 0.92882 (0.92882) Loss: 0.92882 (0.92882)
2024-08-16,03:58:35 | INFO | Train Epoch: 13 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.386/s, 204.386/s/gpu LR: 0.000073 Logit Scale: 15.674 Contrastive_loss: 0.86548 (0.89715) Loss: 0.86548 (0.89715)
2024-08-16,03:58:40 | INFO | Train Epoch: 13 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.431/s, 204.431/s/gpu LR: 0.000073 Logit Scale: 15.681 Contrastive_loss: 0.90084 (0.89838) Loss: 0.90084 (0.89838)
2024-08-16,03:58:42 | INFO | Eval Epoch: 14 [256 / 3000] Clip Loss: 5.552146
2024-08-16,03:58:46 | INFO | Eval Epoch: 14 image_to_text_mean_rank: 811.2757 image_to_text_median_rank: 553.0000 image_to_text_R@1: 0.0027 image_to_text_R@5: 0.0127 image_to_text_R@10: 0.0207 text_to_image_mean_rank: 788.4850 text_to_image_median_rank: 516.0000 text_to_image_R@1: 0.0047 text_to_image_R@5: 0.0197 text_to_image_R@10: 0.0333 clip_val_loss: 5.8483 epoch: 14.0000 num_samples: 3000.0000
2024-08-16,03:58:48 | INFO | Start epoch 14
2024-08-16,03:58:50 | INFO | Train Epoch: 14 [ 256/27000 (1%)] Data (t): 1.521 Batch (t): 2.767, 92.5317/s, 92.5317/s/gpu LR: 0.000074 Logit Scale: 15.682 Contrastive_loss: 0.52879 (0.52879) Loss: 0.52879 (0.52879)
2024-08-16,04:00:56 | INFO | Train Epoch: 14 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.407/s, 204.407/s/gpu LR: 0.000079 Logit Scale: 15.819 Contrastive_loss: 0.53437 (0.53158) Loss: 0.53437 (0.53158)
2024-08-16,04:01:01 | INFO | Train Epoch: 14 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.393/s, 204.393/s/gpu LR: 0.000079 Logit Scale: 15.825 Contrastive_loss: 0.52500 (0.52939) Loss: 0.52500 (0.52939)
2024-08-16,04:01:02 | INFO | Eval Epoch: 15 [256 / 3000] Clip Loss: 5.805412
2024-08-16,04:01:07 | INFO | Eval Epoch: 15 image_to_text_mean_rank: 865.4823 image_to_text_median_rank: 598.0000 image_to_text_R@1: 0.0050 image_to_text_R@5: 0.0143 image_to_text_R@10: 0.0270 text_to_image_mean_rank: 846.2273 text_to_image_median_rank: 575.0000 text_to_image_R@1: 0.0033 text_to_image_R@5: 0.0163 text_to_image_R@10: 0.0297 clip_val_loss: 6.0065 epoch: 15.0000 num_samples: 3000.0000
2024-08-16,04:01:08 | INFO | Start epoch 15
2024-08-16,04:01:11 | INFO | Train Epoch: 15 [ 256/27000 (1%)] Data (t): 1.513 Batch (t): 2.760, 92.7525/s, 92.7525/s/gpu LR: 0.000079 Logit Scale: 15.826 Contrastive_loss: 0.34025 (0.34025) Loss: 0.34025 (0.34025)
2024-08-16,04:03:16 | INFO | Train Epoch: 15 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.496/s, 204.496/s/gpu LR: 0.000084 Logit Scale: 15.934 Contrastive_loss: 0.29246 (0.31636) Loss: 0.29246 (0.31636)
2024-08-16,04:03:21 | INFO | Train Epoch: 15 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.414/s, 204.414/s/gpu LR: 0.000084 Logit Scale: 15.938 Contrastive_loss: 0.26719 (0.29997) Loss: 0.26719 (0.29997)
2024-08-16,04:03:23 | INFO | Eval Epoch: 16 [256 / 3000] Clip Loss: 6.014488
2024-08-16,04:03:28 | INFO | Eval Epoch: 16 image_to_text_mean_rank: 887.4880 image_to_text_median_rank: 627.0000 image_to_text_R@1: 0.0030 image_to_text_R@5: 0.0140 image_to_text_R@10: 0.0260 text_to_image_mean_rank: 880.3730 text_to_image_median_rank: 619.0000 text_to_image_R@1: 0.0037 text_to_image_R@5: 0.0183 text_to_image_R@10: 0.0320 clip_val_loss: 6.1342 epoch: 16.0000 num_samples: 3000.0000
2024-08-16,04:03:29 | INFO | Start epoch 16
2024-08-16,04:03:32 | INFO | Train Epoch: 16 [ 256/27000 (1%)] Data (t): 1.600 Batch (t): 2.846, 89.9529/s, 89.9529/s/gpu LR: 0.000084 Logit Scale: 15.939 Contrastive_loss: 0.20845 (0.20845) Loss: 0.20845 (0.20845)
2024-08-16,04:05:37 | INFO | Train Epoch: 16 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.440/s, 204.440/s/gpu LR: 0.000089 Logit Scale: 16.026 Contrastive_loss: 0.21264 (0.21055) Loss: 0.21264 (0.21055)
2024-08-16,04:05:42 | INFO | Train Epoch: 16 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.576/s, 204.576/s/gpu LR: 0.000089 Logit Scale: 16.030 Contrastive_loss: 0.18559 (0.20223) Loss: 0.18559 (0.20223)
2024-08-16,04:05:44 | INFO | Eval Epoch: 17 [256 / 3000] Clip Loss: 6.084400
2024-08-16,04:05:49 | INFO | Eval Epoch: 17 image_to_text_mean_rank: 935.7397 image_to_text_median_rank: 700.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0137 image_to_text_R@10: 0.0237 text_to_image_mean_rank: 929.2380 text_to_image_median_rank: 691.0000 text_to_image_R@1: 0.0040 text_to_image_R@5: 0.0160 text_to_image_R@10: 0.0290 clip_val_loss: 6.2851 epoch: 17.0000 num_samples: 3000.0000
2024-08-16,04:05:50 | INFO | Start epoch 17
2024-08-16,04:05:53 | INFO | Train Epoch: 17 [ 256/27000 (1%)] Data (t): 1.469 Batch (t): 2.715, 94.3037/s, 94.3037/s/gpu LR: 0.000089 Logit Scale: 16.031 Contrastive_loss: 0.13997 (0.13997) Loss: 0.13997 (0.13997)
2024-08-16,04:07:58 | INFO | Train Epoch: 17 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.280/s, 204.280/s/gpu LR: 0.000094 Logit Scale: 16.107 Contrastive_loss: 0.14680 (0.14339) Loss: 0.14680 (0.14339)
2024-08-16,04:08:03 | INFO | Train Epoch: 17 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.427/s, 204.427/s/gpu LR: 0.000095 Logit Scale: 16.110 Contrastive_loss: 0.14784 (0.14487) Loss: 0.14784 (0.14487)
2024-08-16,04:08:05 | INFO | Eval Epoch: 18 [256 / 3000] Clip Loss: 6.179710
2024-08-16,04:08:09 | INFO | Eval Epoch: 18 image_to_text_mean_rank: 959.2040 image_to_text_median_rank: 740.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0143 image_to_text_R@10: 0.0240 text_to_image_mean_rank: 952.1377 text_to_image_median_rank: 737.0000 text_to_image_R@1: 0.0017 text_to_image_R@5: 0.0143 text_to_image_R@10: 0.0293 clip_val_loss: 6.3885 epoch: 18.0000 num_samples: 3000.0000
2024-08-16,04:08:11 | INFO | Start epoch 18
2024-08-16,04:08:13 | INFO | Train Epoch: 18 [ 256/27000 (1%)] Data (t): 1.515 Batch (t): 2.760, 92.7391/s, 92.7391/s/gpu LR: 0.000095 Logit Scale: 16.111 Contrastive_loss: 0.10571 (0.10571) Loss: 0.10571 (0.10571)
2024-08-16,04:10:19 | INFO | Train Epoch: 18 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.597/s, 204.597/s/gpu LR: 0.000100 Logit Scale: 16.180 Contrastive_loss: 0.11256 (0.10913) Loss: 0.11256 (0.10913)
2024-08-16,04:10:24 | INFO | Train Epoch: 18 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.521/s, 204.521/s/gpu LR: 0.000100 Logit Scale: 16.183 Contrastive_loss: 0.10602 (0.10809) Loss: 0.10602 (0.10809)
2024-08-16,04:10:25 | INFO | Eval Epoch: 19 [256 / 3000] Clip Loss: 6.244460
2024-08-16,04:10:30 | INFO | Eval Epoch: 19 image_to_text_mean_rank: 984.8337 image_to_text_median_rank: 751.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0107 image_to_text_R@10: 0.0230 text_to_image_mean_rank: 974.3037 text_to_image_median_rank: 724.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0130 text_to_image_R@10: 0.0267 clip_val_loss: 6.4495 epoch: 19.0000 num_samples: 3000.0000
2024-08-16,04:10:31 | INFO | Start epoch 19
2024-08-16,04:10:34 | INFO | Train Epoch: 19 [ 256/27000 (1%)] Data (t): 1.482 Batch (t): 2.730, 93.7704/s, 93.7704/s/gpu LR: 0.000100 Logit Scale: 16.184 Contrastive_loss: 0.091815 (0.091815) Loss: 0.091815 (0.091815)
2024-08-16,04:12:39 | INFO | Train Epoch: 19 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.414/s, 204.414/s/gpu LR: 0.000105 Logit Scale: 16.250 Contrastive_loss: 0.10051 (0.096161) Loss: 0.10051 (0.096161)
2024-08-16,04:12:44 | INFO | Train Epoch: 19 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.567/s, 204.567/s/gpu LR: 0.000105 Logit Scale: 16.253 Contrastive_loss: 0.083744 (0.092022) Loss: 0.083744 (0.092022)
2024-08-16,04:12:46 | INFO | Eval Epoch: 20 [256 / 3000] Clip Loss: 6.373804
2024-08-16,04:12:51 | INFO | Eval Epoch: 20 image_to_text_mean_rank: 998.9283 image_to_text_median_rank: 775.0000 image_to_text_R@1: 0.0023 image_to_text_R@5: 0.0120 image_to_text_R@10: 0.0223 text_to_image_mean_rank: 995.4970 text_to_image_median_rank: 768.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0143 text_to_image_R@10: 0.0240 clip_val_loss: 6.5515 epoch: 20.0000 num_samples: 3000.0000
2024-08-16,04:12:52 | INFO | Start epoch 20
2024-08-16,04:12:55 | INFO | Train Epoch: 20 [ 256/27000 (1%)] Data (t): 1.470 Batch (t): 2.715, 94.2863/s, 94.2863/s/gpu LR: 0.000105 Logit Scale: 16.254 Contrastive_loss: 0.082728 (0.082728) Loss: 0.082728 (0.082728)
2024-08-16,04:15:00 | INFO | Train Epoch: 20 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.462/s, 204.462/s/gpu LR: 0.000110 Logit Scale: 16.322 Contrastive_loss: 0.092753 (0.087741) Loss: 0.092753 (0.087741)
2024-08-16,04:15:05 | INFO | Train Epoch: 20 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.462/s, 204.462/s/gpu LR: 0.000110 Logit Scale: 16.325 Contrastive_loss: 0.10030 (0.091928) Loss: 0.10030 (0.091928)
2024-08-16,04:15:07 | INFO | Eval Epoch: 21 [256 / 3000] Clip Loss: 6.443340
2024-08-16,04:15:11 | INFO | Eval Epoch: 21 image_to_text_mean_rank: 1025.5073 image_to_text_median_rank: 807.0000 image_to_text_R@1: 0.0023 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0207 text_to_image_mean_rank: 1019.2447 text_to_image_median_rank: 826.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0127 text_to_image_R@10: 0.0250 clip_val_loss: 6.6143 epoch: 21.0000 num_samples: 3000.0000
2024-08-16,04:15:13 | INFO | Start epoch 21
2024-08-16,04:15:15 | INFO | Train Epoch: 21 [ 256/27000 (1%)] Data (t): 1.476 Batch (t): 2.722, 94.0350/s, 94.0350/s/gpu LR: 0.000110 Logit Scale: 16.325 Contrastive_loss: 0.067669 (0.067669) Loss: 0.067669 (0.067669)
2024-08-16,04:17:21 | INFO | Train Epoch: 21 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.508/s, 204.508/s/gpu LR: 0.000115 Logit Scale: 16.396 Contrastive_loss: 0.081561 (0.074615) Loss: 0.081561 (0.074615)
2024-08-16,04:17:26 | INFO | Train Epoch: 21 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.474/s, 204.474/s/gpu LR: 0.000116 Logit Scale: 16.399 Contrastive_loss: 0.089710 (0.079647) Loss: 0.089710 (0.079647)
2024-08-16,04:17:27 | INFO | Eval Epoch: 22 [256 / 3000] Clip Loss: 6.464468
2024-08-16,04:17:32 | INFO | Eval Epoch: 22 image_to_text_mean_rank: 1013.4487 image_to_text_median_rank: 788.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0203 text_to_image_mean_rank: 1005.7700 text_to_image_median_rank: 764.0000 text_to_image_R@1: 0.0017 text_to_image_R@5: 0.0140 text_to_image_R@10: 0.0250 clip_val_loss: 6.6335 epoch: 22.0000 num_samples: 3000.0000
2024-08-16,04:17:33 | INFO | Start epoch 22
2024-08-16,04:17:36 | INFO | Train Epoch: 22 [ 256/27000 (1%)] Data (t): 1.477 Batch (t): 2.724, 93.9871/s, 93.9871/s/gpu LR: 0.000116 Logit Scale: 16.400 Contrastive_loss: 0.075971 (0.075971) Loss: 0.075971 (0.075971)
2024-08-16,04:19:41 | INFO | Train Epoch: 22 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.527/s, 204.527/s/gpu LR: 0.000121 Logit Scale: 16.478 Contrastive_loss: 0.094408 (0.085189) Loss: 0.094408 (0.085189)
2024-08-16,04:19:46 | INFO | Train Epoch: 22 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.417/s, 204.417/s/gpu LR: 0.000121 Logit Scale: 16.482 Contrastive_loss: 0.098228 (0.089536) Loss: 0.098228 (0.089536)
2024-08-16,04:19:48 | INFO | Eval Epoch: 23 [256 / 3000] Clip Loss: 6.571661
2024-08-16,04:19:53 | INFO | Eval Epoch: 23 image_to_text_mean_rank: 996.3893 image_to_text_median_rank: 789.0000 image_to_text_R@1: 0.0037 image_to_text_R@5: 0.0113 image_to_text_R@10: 0.0207 text_to_image_mean_rank: 992.0480 text_to_image_median_rank: 786.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0147 text_to_image_R@10: 0.0270 clip_val_loss: 6.6228 epoch: 23.0000 num_samples: 3000.0000
2024-08-16,04:19:54 | INFO | Start epoch 23
2024-08-16,04:19:57 | INFO | Train Epoch: 23 [ 256/27000 (1%)] Data (t): 1.547 Batch (t): 2.793, 91.6730/s, 91.6730/s/gpu LR: 0.000121 Logit Scale: 16.483 Contrastive_loss: 0.078356 (0.078356) Loss: 0.078356 (0.078356)
2024-08-16,04:22:02 | INFO | Train Epoch: 23 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.462/s, 204.462/s/gpu LR: 0.000126 Logit Scale: 16.572 Contrastive_loss: 0.090556 (0.084456) Loss: 0.090556 (0.084456)
2024-08-16,04:22:07 | INFO | Train Epoch: 23 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.327/s, 204.327/s/gpu LR: 0.000126 Logit Scale: 16.576 Contrastive_loss: 0.086313 (0.085075) Loss: 0.086313 (0.085075)
2024-08-16,04:22:09 | INFO | Eval Epoch: 24 [256 / 3000] Clip Loss: 6.592369
2024-08-16,04:22:14 | INFO | Eval Epoch: 24 image_to_text_mean_rank: 1030.9110 image_to_text_median_rank: 812.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0127 image_to_text_R@10: 0.0210 text_to_image_mean_rank: 1022.6720 text_to_image_median_rank: 805.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0163 text_to_image_R@10: 0.0290 clip_val_loss: 6.7181 epoch: 24.0000 num_samples: 3000.0000
2024-08-16,04:22:15 | INFO | Start epoch 24
2024-08-16,04:22:18 | INFO | Train Epoch: 24 [ 256/27000 (1%)] Data (t): 1.778 Batch (t): 3.023, 84.6777/s, 84.6777/s/gpu LR: 0.000126 Logit Scale: 16.577 Contrastive_loss: 0.079637 (0.079637) Loss: 0.079637 (0.079637)
2024-08-16,04:24:23 | INFO | Train Epoch: 24 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.535/s, 204.535/s/gpu LR: 0.000131 Logit Scale: 16.685 Contrastive_loss: 0.11149 (0.095563) Loss: 0.11149 (0.095563)
2024-08-16,04:24:28 | INFO | Train Epoch: 24 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.539/s, 204.539/s/gpu LR: 0.000131 Logit Scale: 16.690 Contrastive_loss: 0.10690 (0.099341) Loss: 0.10690 (0.099341)
2024-08-16,04:24:30 | INFO | Eval Epoch: 25 [256 / 3000] Clip Loss: 6.576798
2024-08-16,04:24:35 | INFO | Eval Epoch: 25 image_to_text_mean_rank: 1011.1870 image_to_text_median_rank: 780.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0180 text_to_image_mean_rank: 994.8833 text_to_image_median_rank: 768.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0147 text_to_image_R@10: 0.0243 clip_val_loss: 6.6493 epoch: 25.0000 num_samples: 3000.0000
2024-08-16,04:24:36 | INFO | Start epoch 25
2024-08-16,04:24:39 | INFO | Train Epoch: 25 [ 256/27000 (1%)] Data (t): 1.483 Batch (t): 2.726, 93.8938/s, 93.8938/s/gpu LR: 0.000131 Logit Scale: 16.691 Contrastive_loss: 0.094896 (0.094896) Loss: 0.094896 (0.094896)
2024-08-16,04:26:44 | INFO | Train Epoch: 25 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.456/s, 204.456/s/gpu LR: 0.000136 Logit Scale: 16.831 Contrastive_loss: 0.24732 (0.17111) Loss: 0.24732 (0.17111)
2024-08-16,04:26:49 | INFO | Train Epoch: 25 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.546/s, 204.546/s/gpu LR: 0.000137 Logit Scale: 16.840 Contrastive_loss: 0.42561 (0.25594) Loss: 0.42561 (0.25594)
2024-08-16,04:26:50 | INFO | Eval Epoch: 26 [256 / 3000] Clip Loss: 6.370559
2024-08-16,04:26:55 | INFO | Eval Epoch: 26 image_to_text_mean_rank: 1114.0630 image_to_text_median_rank: 930.0000 image_to_text_R@1: 0.0010 image_to_text_R@5: 0.0053 image_to_text_R@10: 0.0103 text_to_image_mean_rank: 1001.3303 text_to_image_median_rank: 768.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0113 text_to_image_R@10: 0.0217 clip_val_loss: 6.6953 epoch: 26.0000 num_samples: 3000.0000
2024-08-16,04:26:57 | INFO | Start epoch 26
2024-08-16,04:26:59 | INFO | Train Epoch: 26 [ 256/27000 (1%)] Data (t): 1.514 Batch (t): 2.756, 92.8774/s, 92.8774/s/gpu LR: 0.000137 Logit Scale: 16.842 Contrastive_loss: 0.68641 (0.68641) Loss: 0.68641 (0.68641)
2024-08-16,04:29:04 | INFO | Train Epoch: 26 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.423/s, 204.423/s/gpu LR: 0.000142 Logit Scale: 16.877 Contrastive_loss: 2.6772 (1.6818) Loss: 2.6772 (1.6818)
2024-08-16,04:29:09 | INFO | Train Epoch: 26 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.620/s, 204.620/s/gpu LR: 0.000142 Logit Scale: 16.883 Contrastive_loss: 2.6573 (2.0070) Loss: 2.6573 (2.0070)
2024-08-16,04:29:11 | INFO | Eval Epoch: 27 [256 / 3000] Clip Loss: 5.865312
2024-08-16,04:29:16 | INFO | Eval Epoch: 27 image_to_text_mean_rank: 782.6180 image_to_text_median_rank: 546.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0223 text_to_image_mean_rank: 727.9463 text_to_image_median_rank: 478.0000 text_to_image_R@1: 0.0047 text_to_image_R@5: 0.0193 text_to_image_R@10: 0.0297 clip_val_loss: 5.9258 epoch: 27.0000 num_samples: 3000.0000
2024-08-16,04:29:17 | INFO | Start epoch 27
2024-08-16,04:29:20 | INFO | Train Epoch: 27 [ 256/27000 (1%)] Data (t): 1.459 Batch (t): 2.705, 94.6292/s, 94.6292/s/gpu LR: 0.000142 Logit Scale: 16.884 Contrastive_loss: 1.7112 (1.7112) Loss: 1.7112 (1.7112)
2024-08-16,04:31:25 | INFO | Train Epoch: 27 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.360/s, 204.360/s/gpu LR: 0.000147 Logit Scale: 17.268 Contrastive_loss: 0.84065 (1.2759) Loss: 0.84065 (1.2759)
2024-08-16,04:31:30 | INFO | Train Epoch: 27 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.422/s, 204.422/s/gpu LR: 0.000147 Logit Scale: 17.283 Contrastive_loss: 0.70991 (1.0872) Loss: 0.70991 (1.0872)
2024-08-16,04:31:32 | INFO | Eval Epoch: 28 [256 / 3000] Clip Loss: 6.629393
2024-08-16,04:31:37 | INFO | Eval Epoch: 28 image_to_text_mean_rank: 896.8010 image_to_text_median_rank: 667.0000 image_to_text_R@1: 0.0013 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0167 text_to_image_mean_rank: 876.8693 text_to_image_median_rank: 625.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0127 text_to_image_R@10: 0.0240 clip_val_loss: 6.5428 epoch: 28.0000 num_samples: 3000.0000
2024-08-16,04:31:38 | INFO | Start epoch 28
2024-08-16,04:31:41 | INFO | Train Epoch: 28 [ 256/27000 (1%)] Data (t): 1.474 Batch (t): 2.719, 94.1445/s, 94.1445/s/gpu LR: 0.000147 Logit Scale: 17.287 Contrastive_loss: 0.30942 (0.30942) Loss: 0.30942 (0.30942)
2024-08-16,04:33:46 | INFO | Train Epoch: 28 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.480/s, 204.480/s/gpu LR: 0.000152 Logit Scale: 17.513 Contrastive_loss: 0.14982 (0.22962) Loss: 0.14982 (0.22962)
2024-08-16,04:33:51 | INFO | Train Epoch: 28 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.602/s, 204.602/s/gpu LR: 0.000152 Logit Scale: 17.520 Contrastive_loss: 0.17185 (0.21036) Loss: 0.17185 (0.21036)
2024-08-16,04:33:53 | INFO | Eval Epoch: 29 [256 / 3000] Clip Loss: 6.860646
2024-08-16,04:33:57 | INFO | Eval Epoch: 29 image_to_text_mean_rank: 948.3863 image_to_text_median_rank: 727.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0110 image_to_text_R@10: 0.0230 text_to_image_mean_rank: 941.6530 text_to_image_median_rank: 707.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0147 text_to_image_R@10: 0.0250 clip_val_loss: 6.7876 epoch: 29.0000 num_samples: 3000.0000
2024-08-16,04:33:59 | INFO | Start epoch 29
2024-08-16,04:34:01 | INFO | Train Epoch: 29 [ 256/27000 (1%)] Data (t): 1.484 Batch (t): 2.730, 93.7852/s, 93.7852/s/gpu LR: 0.000152 Logit Scale: 17.521 Contrastive_loss: 0.073008 (0.073008) Loss: 0.073008 (0.073008)
2024-08-16,04:36:06 | INFO | Train Epoch: 29 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.650/s, 204.650/s/gpu LR: 0.000157 Logit Scale: 17.624 Contrastive_loss: 0.061296 (0.067152) Loss: 0.061296 (0.067152)
2024-08-16,04:36:11 | INFO | Train Epoch: 29 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.599/s, 204.599/s/gpu LR: 0.000158 Logit Scale: 17.627 Contrastive_loss: 0.069361 (0.067888) Loss: 0.069361 (0.067888)
2024-08-16,04:36:13 | INFO | Eval Epoch: 30 [256 / 3000] Clip Loss: 7.037732
2024-08-16,04:36:18 | INFO | Eval Epoch: 30 image_to_text_mean_rank: 987.8290 image_to_text_median_rank: 769.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0083 image_to_text_R@10: 0.0157 text_to_image_mean_rank: 982.1127 text_to_image_median_rank: 774.0000 text_to_image_R@1: 0.0037 text_to_image_R@5: 0.0110 text_to_image_R@10: 0.0200 clip_val_loss: 6.9645 epoch: 30.0000 num_samples: 3000.0000
2024-08-16,04:36:19 | INFO | Start epoch 30
2024-08-16,04:36:22 | INFO | Train Epoch: 30 [ 256/27000 (1%)] Data (t): 1.468 Batch (t): 2.714, 94.3415/s, 94.3415/s/gpu LR: 0.000158 Logit Scale: 17.628 Contrastive_loss: 0.040057 (0.040057) Loss: 0.040057 (0.040057)
2024-08-16,04:38:27 | INFO | Train Epoch: 30 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.454/s, 204.454/s/gpu LR: 0.000163 Logit Scale: 17.701 Contrastive_loss: 0.035407 (0.037732) Loss: 0.035407 (0.037732)
2024-08-16,04:38:32 | INFO | Train Epoch: 30 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.606/s, 204.606/s/gpu LR: 0.000163 Logit Scale: 17.704 Contrastive_loss: 0.046378 (0.040614) Loss: 0.046378 (0.040614)
2024-08-16,04:38:34 | INFO | Eval Epoch: 31 [256 / 3000] Clip Loss: 7.103376
2024-08-16,04:38:38 | INFO | Eval Epoch: 31 image_to_text_mean_rank: 1028.7160 image_to_text_median_rank: 827.0000 image_to_text_R@1: 0.0030 image_to_text_R@5: 0.0087 image_to_text_R@10: 0.0193 text_to_image_mean_rank: 1023.7647 text_to_image_median_rank: 810.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0117 text_to_image_R@10: 0.0193 clip_val_loss: 7.0932 epoch: 31.0000 num_samples: 3000.0000
2024-08-16,04:38:40 | INFO | Start epoch 31
2024-08-16,04:38:42 | INFO | Train Epoch: 31 [ 256/27000 (1%)] Data (t): 1.446 Batch (t): 2.692, 95.1004/s, 95.1004/s/gpu LR: 0.000163 Logit Scale: 17.704 Contrastive_loss: 0.045176 (0.045176) Loss: 0.045176 (0.045176)
2024-08-16,04:40:47 | INFO | Train Epoch: 31 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.631/s, 204.631/s/gpu LR: 0.000168 Logit Scale: 17.770 Contrastive_loss: 0.063451 (0.054314) Loss: 0.063451 (0.054314)
2024-08-16,04:40:53 | INFO | Train Epoch: 31 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.576/s, 204.576/s/gpu LR: 0.000168 Logit Scale: 17.772 Contrastive_loss: 0.025344 (0.044657) Loss: 0.025344 (0.044657)
2024-08-16,04:40:54 | INFO | Eval Epoch: 32 [256 / 3000] Clip Loss: 7.116255
2024-08-16,04:40:59 | INFO | Eval Epoch: 32 image_to_text_mean_rank: 1041.3660 image_to_text_median_rank: 826.0000 image_to_text_R@1: 0.0013 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0180 text_to_image_mean_rank: 1038.6310 text_to_image_median_rank: 819.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0107 text_to_image_R@10: 0.0223 clip_val_loss: 7.1470 epoch: 32.0000 num_samples: 3000.0000
2024-08-16,04:41:00 | INFO | Start epoch 32
2024-08-16,04:41:03 | INFO | Train Epoch: 32 [ 256/27000 (1%)] Data (t): 1.481 Batch (t): 2.728, 93.8307/s, 93.8307/s/gpu LR: 0.000168 Logit Scale: 17.773 Contrastive_loss: 0.024506 (0.024506) Loss: 0.024506 (0.024506)
2024-08-16,04:43:08 | INFO | Train Epoch: 32 [25856/27000 (96%)] Data (t): 0.001 Batch (t): 1.251, 204.585/s, 204.585/s/gpu LR: 0.000173 Logit Scale: 17.837 Contrastive_loss: 0.042150 (0.033328) Loss: 0.042150 (0.033328)
2024-08-16,04:43:13 | INFO | Train Epoch: 32 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.710/s, 204.710/s/gpu LR: 0.000173 Logit Scale: 17.840 Contrastive_loss: 0.021264 (0.029307) Loss: 0.021264 (0.029307)
2024-08-16,04:43:15 | INFO | Eval Epoch: 33 [256 / 3000] Clip Loss: 7.302838
2024-08-16,04:43:20 | INFO | Eval Epoch: 33 image_to_text_mean_rank: 1051.0060 image_to_text_median_rank: 844.0000 image_to_text_R@1: 0.0010 image_to_text_R@5: 0.0083 image_to_text_R@10: 0.0143 text_to_image_mean_rank: 1047.7423 text_to_image_median_rank: 858.0000 text_to_image_R@1: 0.0017 text_to_image_R@5: 0.0107 text_to_image_R@10: 0.0167 clip_val_loss: 7.2055 epoch: 33.0000 num_samples: 3000.0000
2024-08-16,04:43:21 | INFO | Start epoch 33
2024-08-16,04:43:24 | INFO | Train Epoch: 33 [ 256/27000 (1%)] Data (t): 1.471 Batch (t): 2.714, 94.3115/s, 94.3115/s/gpu LR: 0.000173 Logit Scale: 17.840 Contrastive_loss: 0.036071 (0.036071) Loss: 0.036071 (0.036071)
2024-08-16,04:45:29 | INFO | Train Epoch: 33 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.581/s, 204.581/s/gpu LR: 0.000178 Logit Scale: 17.908 Contrastive_loss: 0.032550 (0.034310) Loss: 0.032550 (0.034310)
2024-08-16,04:45:34 | INFO | Train Epoch: 33 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.538/s, 204.538/s/gpu LR: 0.000179 Logit Scale: 17.911 Contrastive_loss: 0.045624 (0.038082) Loss: 0.045624 (0.038082)
2024-08-16,04:45:36 | INFO | Eval Epoch: 34 [256 / 3000] Clip Loss: 7.142300
2024-08-16,04:45:40 | INFO | Eval Epoch: 34 image_to_text_mean_rank: 1043.7917 image_to_text_median_rank: 850.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0183 text_to_image_mean_rank: 1041.0867 text_to_image_median_rank: 850.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0113 text_to_image_R@10: 0.0200 clip_val_loss: 7.1728 epoch: 34.0000 num_samples: 3000.0000
2024-08-16,04:45:42 | INFO | Start epoch 34
2024-08-16,04:45:44 | INFO | Train Epoch: 34 [ 256/27000 (1%)] Data (t): 1.475 Batch (t): 2.721, 94.0990/s, 94.0990/s/gpu LR: 0.000179 Logit Scale: 17.911 Contrastive_loss: 0.030825 (0.030825) Loss: 0.030825 (0.030825)
2024-08-16,04:47:49 | INFO | Train Epoch: 34 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.590/s, 204.590/s/gpu LR: 0.000184 Logit Scale: 17.981 Contrastive_loss: 0.046251 (0.038538) Loss: 0.046251 (0.038538)
2024-08-16,04:47:54 | INFO | Train Epoch: 34 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.829/s, 204.829/s/gpu LR: 0.000184 Logit Scale: 17.984 Contrastive_loss: 0.041009 (0.039362) Loss: 0.041009 (0.039362)
2024-08-16,04:47:56 | INFO | Eval Epoch: 35 [256 / 3000] Clip Loss: 7.147564
2024-08-16,04:48:01 | INFO | Eval Epoch: 35 image_to_text_mean_rank: 1042.3890 image_to_text_median_rank: 852.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0090 image_to_text_R@10: 0.0173 text_to_image_mean_rank: 1035.9400 text_to_image_median_rank: 837.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0110 text_to_image_R@10: 0.0207 clip_val_loss: 7.1603 epoch: 35.0000 num_samples: 3000.0000
2024-08-16,04:48:02 | INFO | Start epoch 35
2024-08-16,04:48:05 | INFO | Train Epoch: 35 [ 256/27000 (1%)] Data (t): 1.466 Batch (t): 2.711, 94.4327/s, 94.4327/s/gpu LR: 0.000184 Logit Scale: 17.985 Contrastive_loss: 0.032819 (0.032819) Loss: 0.032819 (0.032819)
2024-08-16,04:50:10 | INFO | Train Epoch: 35 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.250, 204.457/s, 204.457/s/gpu LR: 0.000189 Logit Scale: 18.066 Contrastive_loss: 0.047232 (0.040025) Loss: 0.047232 (0.040025)
2024-08-16,04:50:15 | INFO | Train Epoch: 35 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.589/s, 204.589/s/gpu LR: 0.000189 Logit Scale: 18.069 Contrastive_loss: 0.056291 (0.045447) Loss: 0.056291 (0.045447)
2024-08-16,04:50:17 | INFO | Eval Epoch: 36 [256 / 3000] Clip Loss: 7.278274
2024-08-16,04:50:22 | INFO | Eval Epoch: 36 image_to_text_mean_rank: 1058.5333 image_to_text_median_rank: 847.0000 image_to_text_R@1: 0.0030 image_to_text_R@5: 0.0147 image_to_text_R@10: 0.0223 text_to_image_mean_rank: 1052.5207 text_to_image_median_rank: 836.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0133 text_to_image_R@10: 0.0233 clip_val_loss: 7.2311 epoch: 36.0000 num_samples: 3000.0000
2024-08-16,04:50:23 | INFO | Start epoch 36
2024-08-16,04:50:26 | INFO | Train Epoch: 36 [ 256/27000 (1%)] Data (t): 1.454 Batch (t): 2.699, 94.8579/s, 94.8579/s/gpu LR: 0.000189 Logit Scale: 18.070 Contrastive_loss: 0.030523 (0.030523) Loss: 0.030523 (0.030523)
2024-08-16,04:52:31 | INFO | Train Epoch: 36 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.636/s, 204.636/s/gpu LR: 0.000194 Logit Scale: 18.156 Contrastive_loss: 0.043999 (0.037261) Loss: 0.043999 (0.037261)
2024-08-16,04:52:36 | INFO | Train Epoch: 36 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.744/s, 204.744/s/gpu LR: 0.000194 Logit Scale: 18.160 Contrastive_loss: 0.037191 (0.037238) Loss: 0.037191 (0.037238)
2024-08-16,04:52:38 | INFO | Eval Epoch: 37 [256 / 3000] Clip Loss: 7.176402
2024-08-16,04:52:42 | INFO | Eval Epoch: 37 image_to_text_mean_rank: 1054.3230 image_to_text_median_rank: 875.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0090 image_to_text_R@10: 0.0180 text_to_image_mean_rank: 1051.4680 text_to_image_median_rank: 860.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0127 text_to_image_R@10: 0.0197 clip_val_loss: 7.2518 epoch: 37.0000 num_samples: 3000.0000
2024-08-16,04:52:44 | INFO | Start epoch 37
2024-08-16,04:52:46 | INFO | Train Epoch: 37 [ 256/27000 (1%)] Data (t): 1.471 Batch (t): 2.715, 94.2910/s, 94.2910/s/gpu LR: 0.000194 Logit Scale: 18.161 Contrastive_loss: 0.049741 (0.049741) Loss: 0.049741 (0.049741)
2024-08-16,04:54:51 | INFO | Train Epoch: 37 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.555/s, 204.555/s/gpu LR: 0.000199 Logit Scale: 18.264 Contrastive_loss: 0.054978 (0.052359) Loss: 0.054978 (0.052359)
2024-08-16,04:54:56 | INFO | Train Epoch: 37 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.389/s, 204.389/s/gpu LR: 0.000199 Logit Scale: 18.269 Contrastive_loss: 0.049099 (0.051273) Loss: 0.049099 (0.051273)
2024-08-16,04:54:58 | INFO | Eval Epoch: 38 [256 / 3000] Clip Loss: 7.144962
2024-08-16,04:55:03 | INFO | Eval Epoch: 38 image_to_text_mean_rank: 1057.0257 image_to_text_median_rank: 858.0000 image_to_text_R@1: 0.0023 image_to_text_R@5: 0.0100 image_to_text_R@10: 0.0217 text_to_image_mean_rank: 1049.7160 text_to_image_median_rank: 842.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0123 text_to_image_R@10: 0.0223 clip_val_loss: 7.2849 epoch: 38.0000 num_samples: 3000.0000
2024-08-16,04:55:04 | INFO | Start epoch 38
2024-08-16,04:55:07 | INFO | Train Epoch: 38 [ 256/27000 (1%)] Data (t): 1.472 Batch (t): 2.718, 94.1733/s, 94.1733/s/gpu LR: 0.000200 Logit Scale: 18.270 Contrastive_loss: 0.066994 (0.066994) Loss: 0.066994 (0.066994)
2024-08-16,04:57:12 | INFO | Train Epoch: 38 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.448/s, 204.448/s/gpu LR: 0.000205 Logit Scale: 18.393 Contrastive_loss: 0.075224 (0.071109) Loss: 0.075224 (0.071109)
2024-08-16,04:57:17 | INFO | Train Epoch: 38 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.568/s, 204.568/s/gpu LR: 0.000205 Logit Scale: 18.399 Contrastive_loss: 0.057282 (0.066500) Loss: 0.057282 (0.066500)
2024-08-16,04:57:19 | INFO | Eval Epoch: 39 [256 / 3000] Clip Loss: 7.157355
2024-08-16,04:57:24 | INFO | Eval Epoch: 39 image_to_text_mean_rank: 1028.1930 image_to_text_median_rank: 819.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0100 image_to_text_R@10: 0.0190 text_to_image_mean_rank: 1020.0573 text_to_image_median_rank: 827.0000 text_to_image_R@1: 0.0043 text_to_image_R@5: 0.0140 text_to_image_R@10: 0.0230 clip_val_loss: 7.2089 epoch: 39.0000 num_samples: 3000.0000
2024-08-16,04:57:25 | INFO | Start epoch 39
2024-08-16,04:57:28 | INFO | Train Epoch: 39 [ 256/27000 (1%)] Data (t): 1.482 Batch (t): 2.727, 93.8673/s, 93.8673/s/gpu LR: 0.000205 Logit Scale: 18.400 Contrastive_loss: 0.056646 (0.056646) Loss: 0.056646 (0.056646)
2024-08-16,04:59:33 | INFO | Train Epoch: 39 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.481/s, 204.481/s/gpu LR: 0.000210 Logit Scale: 18.561 Contrastive_loss: 0.055771 (0.056209) Loss: 0.055771 (0.056209)
2024-08-16,04:59:38 | INFO | Train Epoch: 39 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.646/s, 204.646/s/gpu LR: 0.000210 Logit Scale: 18.567 Contrastive_loss: 0.057177 (0.056531) Loss: 0.057177 (0.056531)
2024-08-16,04:59:39 | INFO | Eval Epoch: 40 [256 / 3000] Clip Loss: 7.104994
2024-08-16,04:59:44 | INFO | Eval Epoch: 40 image_to_text_mean_rank: 1018.7757 image_to_text_median_rank: 782.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0183 text_to_image_mean_rank: 1008.3393 text_to_image_median_rank: 782.0000 text_to_image_R@1: 0.0017 text_to_image_R@5: 0.0150 text_to_image_R@10: 0.0247 clip_val_loss: 7.2402 epoch: 40.0000 num_samples: 3000.0000
2024-08-16,04:59:46 | INFO | Start epoch 40
2024-08-16,04:59:48 | INFO | Train Epoch: 40 [ 256/27000 (1%)] Data (t): 1.463 Batch (t): 2.709, 94.4969/s, 94.4969/s/gpu LR: 0.000210 Logit Scale: 18.569 Contrastive_loss: 0.090862 (0.090862) Loss: 0.090862 (0.090862)
2024-08-16,05:01:54 | INFO | Train Epoch: 40 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.537/s, 204.537/s/gpu LR: 0.000215 Logit Scale: 18.769 Contrastive_loss: 0.094119 (0.092491) Loss: 0.094119 (0.092491)
2024-08-16,05:01:59 | INFO | Train Epoch: 40 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.528/s, 204.528/s/gpu LR: 0.000215 Logit Scale: 18.779 Contrastive_loss: 0.10408 (0.096353) Loss: 0.10408 (0.096353)
2024-08-16,05:02:00 | INFO | Eval Epoch: 41 [256 / 3000] Clip Loss: 7.132548
2024-08-16,05:02:05 | INFO | Eval Epoch: 41 image_to_text_mean_rank: 1011.8893 image_to_text_median_rank: 803.0000 image_to_text_R@1: 0.0030 image_to_text_R@5: 0.0137 image_to_text_R@10: 0.0220 text_to_image_mean_rank: 999.8167 text_to_image_median_rank: 784.0000 text_to_image_R@1: 0.0033 text_to_image_R@5: 0.0133 text_to_image_R@10: 0.0220 clip_val_loss: 7.2351 epoch: 41.0000 num_samples: 3000.0000
2024-08-16,05:02:06 | INFO | Start epoch 41
2024-08-16,05:02:09 | INFO | Train Epoch: 41 [ 256/27000 (1%)] Data (t): 1.459 Batch (t): 2.706, 94.6002/s, 94.6002/s/gpu LR: 0.000215 Logit Scale: 18.782 Contrastive_loss: 0.085834 (0.085834) Loss: 0.085834 (0.085834)
2024-08-16,05:04:14 | INFO | Train Epoch: 41 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.523/s, 204.523/s/gpu LR: 0.000220 Logit Scale: 19.049 Contrastive_loss: 4.7974 (2.4416) Loss: 4.7974 (2.4416)
2024-08-16,05:04:19 | INFO | Train Epoch: 41 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.451/s, 204.451/s/gpu LR: 0.000221 Logit Scale: 19.044 Contrastive_loss: 4.6688 (3.1840) Loss: 4.6688 (3.1840)
2024-08-16,05:04:21 | INFO | Eval Epoch: 42 [256 / 3000] Clip Loss: 4.982571
2024-08-16,05:04:26 | INFO | Eval Epoch: 42 image_to_text_mean_rank: 754.0190 image_to_text_median_rank: 534.0000 image_to_text_R@1: 0.0013 image_to_text_R@5: 0.0113 image_to_text_R@10: 0.0213 text_to_image_mean_rank: 667.4670 text_to_image_median_rank: 441.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0110 text_to_image_R@10: 0.0227 clip_val_loss: 5.0287 epoch: 42.0000 num_samples: 3000.0000
2024-08-16,05:04:27 | INFO | Start epoch 42
2024-08-16,05:04:30 | INFO | Train Epoch: 42 [ 256/27000 (1%)] Data (t): 1.476 Batch (t): 2.722, 94.0346/s, 94.0346/s/gpu LR: 0.000221 Logit Scale: 19.043 Contrastive_loss: 4.5297 (4.5297) Loss: 4.5297 (4.5297)
2024-08-16,05:06:35 | INFO | Train Epoch: 42 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.332/s, 204.332/s/gpu LR: 0.000226 Logit Scale: 19.079 Contrastive_loss: 2.9309 (3.7303) Loss: 2.9309 (3.7303)
2024-08-16,05:06:40 | INFO | Train Epoch: 42 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.256/s, 204.256/s/gpu LR: 0.000226 Logit Scale: 19.079 Contrastive_loss: 3.1745 (3.5451) Loss: 3.1745 (3.5451)
2024-08-16,05:06:42 | INFO | Eval Epoch: 43 [256 / 3000] Clip Loss: 5.689608
2024-08-16,05:06:47 | INFO | Eval Epoch: 43 image_to_text_mean_rank: 761.4207 image_to_text_median_rank: 525.0000 image_to_text_R@1: 0.0027 image_to_text_R@5: 0.0130 image_to_text_R@10: 0.0227 text_to_image_mean_rank: 737.0500 text_to_image_median_rank: 503.0000 text_to_image_R@1: 0.0033 text_to_image_R@5: 0.0167 text_to_image_R@10: 0.0263 clip_val_loss: 5.9352 epoch: 43.0000 num_samples: 3000.0000
2024-08-16,05:06:48 | INFO | Start epoch 43
2024-08-16,05:06:51 | INFO | Train Epoch: 43 [ 256/27000 (1%)] Data (t): 1.538 Batch (t): 2.784, 91.9524/s, 91.9524/s/gpu LR: 0.000226 Logit Scale: 19.080 Contrastive_loss: 1.8070 (1.8070) Loss: 1.8070 (1.8070)
2024-08-16,05:08:56 | INFO | Train Epoch: 43 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.467/s, 204.467/s/gpu LR: 0.000231 Logit Scale: 19.650 Contrastive_loss: 1.5848 (1.6959) Loss: 1.5848 (1.6959)
2024-08-16,05:09:01 | INFO | Train Epoch: 43 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.317/s, 204.317/s/gpu LR: 0.000231 Logit Scale: 19.674 Contrastive_loss: 1.6317 (1.6745) Loss: 1.6317 (1.6745)
2024-08-16,05:09:03 | INFO | Eval Epoch: 44 [256 / 3000] Clip Loss: 6.758216
2024-08-16,05:09:07 | INFO | Eval Epoch: 44 image_to_text_mean_rank: 887.7003 image_to_text_median_rank: 669.0000 image_to_text_R@1: 0.0010 image_to_text_R@5: 0.0093 image_to_text_R@10: 0.0173 text_to_image_mean_rank: 871.1403 text_to_image_median_rank: 657.0000 text_to_image_R@1: 0.0043 text_to_image_R@5: 0.0137 text_to_image_R@10: 0.0223 clip_val_loss: 6.9089 epoch: 44.0000 num_samples: 3000.0000
2024-08-16,05:09:09 | INFO | Start epoch 44
2024-08-16,05:09:11 | INFO | Train Epoch: 44 [ 256/27000 (1%)] Data (t): 1.472 Batch (t): 2.719, 94.1646/s, 94.1646/s/gpu LR: 0.000231 Logit Scale: 19.680 Contrastive_loss: 0.64078 (0.64078) Loss: 0.64078 (0.64078)
2024-08-16,05:11:17 | INFO | Train Epoch: 44 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.727/s, 204.727/s/gpu LR: 0.000236 Logit Scale: 20.250 Contrastive_loss: 0.27891 (0.45985) Loss: 0.27891 (0.45985)
2024-08-16,05:11:22 | INFO | Train Epoch: 44 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.648/s, 204.648/s/gpu LR: 0.000236 Logit Scale: 20.270 Contrastive_loss: 0.23861 (0.38610) Loss: 0.23861 (0.38610)
2024-08-16,05:11:23 | INFO | Eval Epoch: 45 [256 / 3000] Clip Loss: 7.229187
2024-08-16,05:11:28 | INFO | Eval Epoch: 45 image_to_text_mean_rank: 923.1860 image_to_text_median_rank: 682.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0190 text_to_image_mean_rank: 912.5870 text_to_image_median_rank: 674.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0107 text_to_image_R@10: 0.0200 clip_val_loss: 7.3445 epoch: 45.0000 num_samples: 3000.0000
2024-08-16,05:11:29 | INFO | Start epoch 45
2024-08-16,05:11:32 | INFO | Train Epoch: 45 [ 256/27000 (1%)] Data (t): 1.496 Batch (t): 2.740, 93.4189/s, 93.4189/s/gpu LR: 0.000236 Logit Scale: 20.275 Contrastive_loss: 0.11670 (0.11670) Loss: 0.11670 (0.11670)
2024-08-16,05:13:37 | INFO | Train Epoch: 45 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.578/s, 204.578/s/gpu LR: 0.000241 Logit Scale: 20.541 Contrastive_loss: 0.10041 (0.10856) Loss: 0.10041 (0.10856)
2024-08-16,05:13:42 | INFO | Train Epoch: 45 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.456/s, 204.456/s/gpu LR: 0.000242 Logit Scale: 20.549 Contrastive_loss: 0.039388 (0.085500) Loss: 0.039388 (0.085500)
2024-08-16,05:13:44 | INFO | Eval Epoch: 46 [256 / 3000] Clip Loss: 7.590382
2024-08-16,05:13:49 | INFO | Eval Epoch: 46 image_to_text_mean_rank: 955.1153 image_to_text_median_rank: 728.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0107 image_to_text_R@10: 0.0187 text_to_image_mean_rank: 954.1097 text_to_image_median_rank: 736.0000 text_to_image_R@1: 0.0013 text_to_image_R@5: 0.0130 text_to_image_R@10: 0.0230 clip_val_loss: 7.6622 epoch: 46.0000 num_samples: 3000.0000
2024-08-16,05:13:50 | INFO | Start epoch 46
2024-08-16,05:13:53 | INFO | Train Epoch: 46 [ 256/27000 (1%)] Data (t): 1.480 Batch (t): 2.723, 94.0089/s, 94.0089/s/gpu LR: 0.000242 Logit Scale: 20.551 Contrastive_loss: 0.056549 (0.056549) Loss: 0.056549 (0.056549)
2024-08-16,05:15:58 | INFO | Train Epoch: 46 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.542/s, 204.542/s/gpu LR: 0.000247 Logit Scale: 20.695 Contrastive_loss: 0.031992 (0.044271) Loss: 0.031992 (0.044271)
2024-08-16,05:16:03 | INFO | Train Epoch: 46 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.581/s, 204.581/s/gpu LR: 0.000247 Logit Scale: 20.701 Contrastive_loss: 0.046248 (0.044930) Loss: 0.046248 (0.044930)
2024-08-16,05:16:05 | INFO | Eval Epoch: 47 [256 / 3000] Clip Loss: 7.512671
2024-08-16,05:16:10 | INFO | Eval Epoch: 47 image_to_text_mean_rank: 973.5813 image_to_text_median_rank: 747.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0177 text_to_image_mean_rank: 973.0503 text_to_image_median_rank: 743.0000 text_to_image_R@1: 0.0010 text_to_image_R@5: 0.0100 text_to_image_R@10: 0.0210 clip_val_loss: 7.7592 epoch: 47.0000 num_samples: 3000.0000
2024-08-16,05:16:11 | INFO | Start epoch 47
2024-08-16,05:16:14 | INFO | Train Epoch: 47 [ 256/27000 (1%)] Data (t): 1.489 Batch (t): 2.734, 93.6261/s, 93.6261/s/gpu LR: 0.000247 Logit Scale: 20.702 Contrastive_loss: 0.042419 (0.042419) Loss: 0.042419 (0.042419)
2024-08-16,05:18:19 | INFO | Train Epoch: 47 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.663/s, 204.663/s/gpu LR: 0.000252 Logit Scale: 20.817 Contrastive_loss: 0.037442 (0.039930) Loss: 0.037442 (0.039930)
2024-08-16,05:18:24 | INFO | Train Epoch: 47 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.630/s, 204.630/s/gpu LR: 0.000252 Logit Scale: 20.821 Contrastive_loss: 0.057514 (0.045792) Loss: 0.057514 (0.045792)
2024-08-16,05:18:25 | INFO | Eval Epoch: 48 [256 / 3000] Clip Loss: 7.678425
2024-08-16,05:18:30 | INFO | Eval Epoch: 48 image_to_text_mean_rank: 995.8417 image_to_text_median_rank: 768.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0120 image_to_text_R@10: 0.0183 text_to_image_mean_rank: 996.3487 text_to_image_median_rank: 770.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0130 text_to_image_R@10: 0.0197 clip_val_loss: 7.8886 epoch: 48.0000 num_samples: 3000.0000
2024-08-16,05:18:32 | INFO | Start epoch 48
2024-08-16,05:18:34 | INFO | Train Epoch: 48 [ 256/27000 (1%)] Data (t): 1.486 Batch (t): 2.731, 93.7329/s, 93.7329/s/gpu LR: 0.000252 Logit Scale: 20.822 Contrastive_loss: 0.037203 (0.037203) Loss: 0.037203 (0.037203)
2024-08-16,05:20:39 | INFO | Train Epoch: 48 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.250, 204.601/s, 204.601/s/gpu LR: 0.000257 Logit Scale: 20.930 Contrastive_loss: 0.030419 (0.033811) Loss: 0.030419 (0.033811)
2024-08-16,05:20:44 | INFO | Train Epoch: 48 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.401/s, 204.401/s/gpu LR: 0.000257 Logit Scale: 20.934 Contrastive_loss: 0.017202 (0.028274) Loss: 0.017202 (0.028274)
2024-08-16,05:20:46 | INFO | Eval Epoch: 49 [256 / 3000] Clip Loss: 7.609719
2024-08-16,05:20:51 | INFO | Eval Epoch: 49 image_to_text_mean_rank: 1004.3947 image_to_text_median_rank: 827.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0107 image_to_text_R@10: 0.0193 text_to_image_mean_rank: 1001.3020 text_to_image_median_rank: 814.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0107 text_to_image_R@10: 0.0200 clip_val_loss: 7.9246 epoch: 49.0000 num_samples: 3000.0000
2024-08-16,05:20:52 | INFO | Start epoch 49
2024-08-16,05:20:55 | INFO | Train Epoch: 49 [ 256/27000 (1%)] Data (t): 1.470 Batch (t): 2.716, 94.2600/s, 94.2600/s/gpu LR: 0.000257 Logit Scale: 20.936 Contrastive_loss: 0.031021 (0.031021) Loss: 0.031021 (0.031021)