2024-08-16,03:25:52 | INFO | Running with a single process. Device cuda:0.
2024-08-16,03:25:52 | INFO | Loaded Align-fMRI-Encoder-small model config.
2024-08-16,03:25:54 | INFO | Model:
2024-08-16,03:25:54 | INFO | CustomTextCLIP(
  (visual): VisionTransformer(
    (conv1): Conv1d(1, 768, kernel_size=(32,), stride=(32,), bias=False)
    (patch_dropout): Identity()
    (ln_pre): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (transformer): Transformer(
      (resblocks): ModuleList(
        (0-11): 12 x ResidualAttentionBlock(
          (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          (attn): MultiheadAttention(
            (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
          )
          (ls_1): Identity()
          (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          (mlp): Sequential(
            (c_fc): Linear(in_features=768, out_features=3072, bias=True)
            (gelu): GELU(approximate='none')
            (c_proj): Linear(in_features=3072, out_features=768, bias=True)
          )
          (ls_2): Identity()
        )
      )
    )
    (ln_post): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (text): HFTextEncoder(
    (transformer): RobertaModel(
      (embeddings): RobertaEmbeddings(
        (word_embeddings): Embedding(50265, 768, padding_idx=1)
        (position_embeddings): Embedding(514, 768, padding_idx=1)
        (token_type_embeddings): Embedding(1, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): RobertaEncoder(
        (layer): ModuleList(
          (0-11): 12 x RobertaLayer(
            (attention): RobertaAttention(
              (self): RobertaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): RobertaSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): RobertaIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): RobertaOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
    (pooler): MeanPooler()
    (proj): Sequential(
      (0): Linear(in_features=768, out_features=640, bias=False)
      (1): GELU(approximate='none')
      (2): Linear(in_features=640, out_features=512, bias=False)
    )
  )
)
2024-08-16,03:25:54 | INFO | Params:
2024-08-16,03:25:54 | INFO |   accum_freq: 1
2024-08-16,03:25:54 | INFO |   aug_cfg: {}
2024-08-16,03:25:54 | INFO |   batch_size: 256
2024-08-16,03:25:54 | INFO |   beta1: 0.9
2024-08-16,03:25:54 | INFO |   beta2: 0.999
2024-08-16,03:25:54 | INFO |   checkpoint_path: ./logs/2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp/checkpoints
2024-08-16,03:25:54 | INFO |   coca_caption_loss_weight: 2.0
2024-08-16,03:25:54 | INFO |   coca_contrastive_loss_weight: 1.0
2024-08-16,03:25:54 | INFO |   copy_codebase: False
2024-08-16,03:25:54 | INFO |   csv_caption_key: title
2024-08-16,03:25:54 | INFO |   csv_img_key: filepath
2024-08-16,03:25:54 | INFO |   csv_separator: ,
2024-08-16,03:25:54 | INFO |   dataset_resampled: False
2024-08-16,03:25:54 | INFO |   dataset_type: auto
2024-08-16,03:25:54 | INFO |   ddp_static_graph: False
2024-08-16,03:25:54 | INFO |   debug: False
2024-08-16,03:25:54 | INFO |   delete_previous_checkpoint: False
2024-08-16,03:25:54 | INFO |   device: cuda:0
2024-08-16,03:25:54 | INFO |   dist_backend: nccl
2024-08-16,03:25:54 | INFO |   dist_url: env://
2024-08-16,03:25:54 | INFO |   distill: False
2024-08-16,03:25:54 | INFO |   distill_model: None
2024-08-16,03:25:54 | INFO |   distill_pretrained: None
2024-08-16,03:25:54 | INFO |   distributed: False
2024-08-16,03:25:54 | INFO |   epochs: 100
2024-08-16,03:25:54 | INFO |   epochs_cooldown: None
2024-08-16,03:25:54 | INFO |   eps: 1e-08
2024-08-16,03:25:54 | INFO |   force_custom_text: False
2024-08-16,03:25:54 | INFO |   force_image_size: None
2024-08-16,03:25:54 | INFO |   force_patch_dropout: None
2024-08-16,03:25:54 | INFO |   force_quick_gelu: False
2024-08-16,03:25:54 | INFO |   gather_with_grad: False
2024-08-16,03:25:54 | INFO |   grad_checkpointing: False
2024-08-16,03:25:54 | INFO |   grad_clip_norm: None
2024-08-16,03:25:54 | INFO |   horovod: False
2024-08-16,03:25:54 | INFO |   image_interpolation: None
2024-08-16,03:25:54 | INFO |   image_mean: None
2024-08-16,03:25:54 | INFO |   image_resize_mode: None
2024-08-16,03:25:54 | INFO |   image_std: None
2024-08-16,03:25:54 | INFO |   imagenet_v2: None
2024-08-16,03:25:54 | INFO |   imagenet_val: None
2024-08-16,03:25:54 | INFO |   local_loss: False
2024-08-16,03:25:54 | INFO |   local_rank: 0
2024-08-16,03:25:54 | INFO |   lock_image: False
2024-08-16,03:25:54 | INFO |   lock_image_freeze_bn_stats: False
2024-08-16,03:25:54 | INFO |   lock_image_unlocked_groups: 0
2024-08-16,03:25:54 | INFO |   lock_text: True
2024-08-16,03:25:54 | INFO |   lock_text_freeze_layer_norm: False
2024-08-16,03:25:54 | INFO |   lock_text_unlocked_layers: 0
2024-08-16,03:25:54 | INFO |   log_every_n_steps: 100
2024-08-16,03:25:54 | INFO |   log_level: 20
2024-08-16,03:25:54 | INFO |   log_local: False
2024-08-16,03:25:54 | INFO |   log_path: ./logs/2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp/out.log
2024-08-16,03:25:54 | INFO |   logs: ./logs/
2024-08-16,03:25:54 | INFO |   lr: 0.0005
2024-08-16,03:25:54 | INFO |   lr_cooldown_end: 0.0
2024-08-16,03:25:54 | INFO |   lr_cooldown_power: 1.0
2024-08-16,03:25:54 | INFO |   lr_scheduler: cosine
2024-08-16,03:25:54 | INFO |   model: Align-fMRI-Encoder-small
2024-08-16,03:25:54 | INFO |   name: 2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp
2024-08-16,03:25:54 | INFO |   no_set_device_rank: False
2024-08-16,03:25:54 | INFO |   precision: amp
2024-08-16,03:25:54 | INFO |   pretrained: 
2024-08-16,03:25:54 | INFO |   pretrained_image: False
2024-08-16,03:25:54 | INFO |   rank: 0
2024-08-16,03:25:54 | INFO |   remote_sync: None
2024-08-16,03:25:54 | INFO |   remote_sync_frequency: 300
2024-08-16,03:25:54 | INFO |   remote_sync_protocol: s3
2024-08-16,03:25:54 | INFO |   report_to: 
2024-08-16,03:25:54 | INFO |   resume: None
2024-08-16,03:25:54 | INFO |   save_frequency: 1
2024-08-16,03:25:54 | INFO |   save_most_recent: False
2024-08-16,03:25:54 | INFO |   seed: 0
2024-08-16,03:25:54 | INFO |   siglip: False
2024-08-16,03:25:54 | INFO |   skip_scheduler: False
2024-08-16,03:25:54 | INFO |   tensorboard: False
2024-08-16,03:25:54 | INFO |   tensorboard_path: 
2024-08-16,03:25:54 | INFO |   torchcompile: False
2024-08-16,03:25:54 | INFO |   torchscript: False
2024-08-16,03:25:54 | INFO |   trace: False
2024-08-16,03:25:54 | INFO |   train_data: /root/autodl-tmp/.autodl/Projects/fMRI2TextAligner/notebooks/train.csv
2024-08-16,03:25:54 | INFO |   train_data_upsampling_factors: None
2024-08-16,03:25:54 | INFO |   train_num_samples: None
2024-08-16,03:25:54 | INFO |   use_bn_sync: False
2024-08-16,03:25:54 | INFO |   use_bnb_linear: None
2024-08-16,03:25:54 | INFO |   val_data: /root/autodl-tmp/.autodl/Projects/fMRI2TextAligner/notebooks/val.csv
2024-08-16,03:25:54 | INFO |   val_frequency: 1
2024-08-16,03:25:54 | INFO |   val_num_samples: None
2024-08-16,03:25:54 | INFO |   wandb: False
2024-08-16,03:25:54 | INFO |   wandb_notes: 
2024-08-16,03:25:54 | INFO |   wandb_project_name: open-clip
2024-08-16,03:25:54 | INFO |   warmup: 10000
2024-08-16,03:25:54 | INFO |   wd: 0.2
2024-08-16,03:25:54 | INFO |   workers: 4
2024-08-16,03:25:54 | INFO |   world_size: 1
2024-08-16,03:25:54 | INFO |   zeroshot_frequency: 2
2024-08-16,03:25:58 | INFO | Start epoch 0
2024-08-16,03:26:01 | INFO | Train Epoch: 0 [  256/27000 (1%)] Data (t): 1.639 Batch (t): 3.538, 72.3607/s, 72.3607/s/gpu LR: 0.000000 Logit Scale: 14.286 Contrastive_loss: 5.5485 (5.5485) Loss: 5.5485 (5.5485)
2024-08-16,03:28:06 | INFO | Train Epoch: 0 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.520/s, 204.520/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5459 (5.5472) Loss: 5.5459 (5.5472)
2024-08-16,03:28:11 | INFO | Train Epoch: 0 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.359/s, 204.359/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5479 (5.5474) Loss: 5.5479 (5.5474)
2024-08-16,03:28:13 | INFO | Eval Epoch: 1 [256 / 3000]	Clip Loss: 5.542412	
2024-08-16,03:28:18 | INFO | Eval Epoch: 1 image_to_text_mean_rank: 1451.4443	image_to_text_median_rank: 1410.0000	image_to_text_R@1: 0.0003	image_to_text_R@5: 0.0023	image_to_text_R@10: 0.0060	text_to_image_mean_rank: 1439.4327	text_to_image_median_rank: 1409.0000	text_to_image_R@1: 0.0007	text_to_image_R@5: 0.0020	text_to_image_R@10: 0.0043	clip_val_loss: 5.5223	epoch: 1.0000	num_samples: 3000.0000
2024-08-16,03:28:19 | INFO | Start epoch 1
2024-08-16,03:28:22 | INFO | Train Epoch: 1 [  256/27000 (1%)] Data (t): 1.447 Batch (t): 2.695, 95.0061/s, 95.0061/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5397 (5.5397) Loss: 5.5397 (5.5397)
2024-08-16,03:30:27 | INFO | Train Epoch: 1 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.426/s, 204.426/s/gpu LR: 0.000010 Logit Scale: 14.290 Contrastive_loss: 5.4991 (5.5194) Loss: 5.4991 (5.5194)
2024-08-16,03:30:32 | INFO | Train Epoch: 1 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.419/s, 204.419/s/gpu LR: 0.000010 Logit Scale: 14.290 Contrastive_loss: 5.4470 (5.4953) Loss: 5.4470 (5.4953)
2024-08-16,03:30:34 | INFO | Eval Epoch: 2 [256 / 3000]	Clip Loss: 5.452873	
2024-08-16,03:30:38 | INFO | Eval Epoch: 2 image_to_text_mean_rank: 1193.8437	image_to_text_median_rank: 1062.0000	image_to_text_R@1: 0.0013	image_to_text_R@5: 0.0037	image_to_text_R@10: 0.0067	text_to_image_mean_rank: 1196.7497	text_to_image_median_rank: 1078.0000	text_to_image_R@1: 0.0013	text_to_image_R@5: 0.0060	text_to_image_R@10: 0.0090	clip_val_loss: 5.4537	epoch: 2.0000	num_samples: 3000.0000
2024-08-16,03:30:40 | INFO | Start epoch 2
2024-08-16,03:30:42 | INFO | Train Epoch: 2 [  256/27000 (1%)] Data (t): 1.420 Batch (t): 2.666, 96.0263/s, 96.0263/s/gpu LR: 0.000011 Logit Scale: 14.290 Contrastive_loss: 5.4566 (5.4566) Loss: 5.4566 (5.4566)
2024-08-16,03:32:48 | INFO | Train Epoch: 2 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.495/s, 204.495/s/gpu LR: 0.000016 Logit Scale: 14.324 Contrastive_loss: 5.0180 (5.2373) Loss: 5.0180 (5.2373)
2024-08-16,03:32:53 | INFO | Train Epoch: 2 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.358/s, 204.358/s/gpu LR: 0.000016 Logit Scale: 14.325 Contrastive_loss: 5.1190 (5.1979) Loss: 5.1190 (5.1979)
2024-08-16,03:32:54 | INFO | Eval Epoch: 3 [256 / 3000]	Clip Loss: 5.063042	
2024-08-16,03:32:59 | INFO | Eval Epoch: 3 image_to_text_mean_rank: 782.7933	image_to_text_median_rank: 592.0000	image_to_text_R@1: 0.0017	image_to_text_R@5: 0.0083	image_to_text_R@10: 0.0157	text_to_image_mean_rank: 727.7393	text_to_image_median_rank: 536.0000	text_to_image_R@1: 0.0030	text_to_image_R@5: 0.0117	text_to_image_R@10: 0.0230	clip_val_loss: 5.0752	epoch: 3.0000	num_samples: 3000.0000
2024-08-16,03:33:00 | INFO | Start epoch 3
2024-08-16,03:33:03 | INFO | Train Epoch: 3 [  256/27000 (1%)] Data (t): 1.597 Batch (t): 2.842, 90.0810/s, 90.0810/s/gpu LR: 0.000016 Logit Scale: 14.326 Contrastive_loss: 5.1011 (5.1011) Loss: 5.1011 (5.1011)
2024-08-16,03:35:09 | INFO | Train Epoch: 3 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.253, 204.414/s, 204.414/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.9292 (5.0151) Loss: 4.9292 (5.0151)
2024-08-16,03:35:14 | INFO | Train Epoch: 3 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.467/s, 204.467/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.8818 (4.9707) Loss: 4.8818 (4.9707)
2024-08-16,03:35:15 | INFO | Eval Epoch: 4 [256 / 3000]	Clip Loss: 4.868340	
2024-08-16,03:35:20 | INFO | Eval Epoch: 4 image_to_text_mean_rank: 683.8327	image_to_text_median_rank: 457.0000	image_to_text_R@1: 0.0043	image_to_text_R@5: 0.0123	image_to_text_R@10: 0.0230	text_to_image_mean_rank: 612.9693	text_to_image_median_rank: 408.0000	text_to_image_R@1: 0.0033	text_to_image_R@5: 0.0143	text_to_image_R@10: 0.0283	clip_val_loss: 4.9190	epoch: 4.0000	num_samples: 3000.0000
2024-08-16,03:35:21 | INFO | Start epoch 4
2024-08-16,03:35:24 | INFO | Train Epoch: 4 [  256/27000 (1%)] Data (t): 1.489 Batch (t): 2.735, 93.5960/s, 93.5960/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.6774 (4.6774) Loss: 4.6774 (4.6774)
2024-08-16,03:37:29 | INFO | Train Epoch: 4 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.427/s, 204.427/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.6595 (4.6684) Loss: 4.6595 (4.6684)
2024-08-16,03:37:34 | INFO | Train Epoch: 4 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.444/s, 204.444/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.7835 (4.7068) Loss: 4.7835 (4.7068)
2024-08-16,03:37:36 | INFO | Eval Epoch: 5 [256 / 3000]	Clip Loss: 4.786659	
2024-08-16,03:37:41 | INFO | Eval Epoch: 5 image_to_text_mean_rank: 620.0710	image_to_text_median_rank: 405.0000	image_to_text_R@1: 0.0033	image_to_text_R@5: 0.0150	image_to_text_R@10: 0.0250	text_to_image_mean_rank: 564.3297	text_to_image_median_rank: 358.0000	text_to_image_R@1: 0.0047	text_to_image_R@5: 0.0173	text_to_image_R@10: 0.0367	clip_val_loss: 4.8233	epoch: 5.0000	num_samples: 3000.0000
2024-08-16,03:37:42 | INFO | Start epoch 5
2024-08-16,03:37:45 | INFO | Train Epoch: 5 [  256/27000 (1%)] Data (t): 1.495 Batch (t): 2.740, 93.4321/s, 93.4321/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.5845 (4.5845) Loss: 4.5845 (4.5845)
2024-08-16,03:39:50 | INFO | Train Epoch: 5 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.440/s, 204.440/s/gpu LR: 0.000031 Logit Scale: 14.392 Contrastive_loss: 4.5224 (4.5534) Loss: 4.5224 (4.5534)
2024-08-16,03:39:55 | INFO | Train Epoch: 5 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.301/s, 204.301/s/gpu LR: 0.000031 Logit Scale: 14.393 Contrastive_loss: 4.4956 (4.5342) Loss: 4.4956 (4.5342)
2024-08-16,03:39:56 | INFO | Eval Epoch: 6 [256 / 3000]	Clip Loss: 4.666817	
2024-08-16,03:40:01 | INFO | Eval Epoch: 6 image_to_text_mean_rank: 560.2250	image_to_text_median_rank: 363.0000	image_to_text_R@1: 0.0037	image_to_text_R@5: 0.0190	image_to_text_R@10: 0.0360	text_to_image_mean_rank: 524.7307	text_to_image_median_rank: 326.0000	text_to_image_R@1: 0.0037	text_to_image_R@5: 0.0223	text_to_image_R@10: 0.0403	clip_val_loss: 4.7456	epoch: 6.0000	num_samples: 3000.0000
2024-08-16,03:40:03 | INFO | Start epoch 6
2024-08-16,03:40:05 | INFO | Train Epoch: 6 [  256/27000 (1%)] Data (t): 1.443 Batch (t): 2.691, 95.1276/s, 95.1276/s/gpu LR: 0.000032 Logit Scale: 14.394 Contrastive_loss: 4.1144 (4.1144) Loss: 4.1144 (4.1144)
2024-08-16,03:42:10 | INFO | Train Epoch: 6 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.240/s, 204.240/s/gpu LR: 0.000037 Logit Scale: 14.484 Contrastive_loss: 4.3783 (4.2463) Loss: 4.3783 (4.2463)
2024-08-16,03:42:15 | INFO | Train Epoch: 6 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.514/s, 204.514/s/gpu LR: 0.000037 Logit Scale: 14.486 Contrastive_loss: 4.4021 (4.2983) Loss: 4.4021 (4.2983)
2024-08-16,03:42:17 | INFO | Eval Epoch: 7 [256 / 3000]	Clip Loss: 4.655993	
2024-08-16,03:42:22 | INFO | Eval Epoch: 7 image_to_text_mean_rank: 563.7200	image_to_text_median_rank: 352.0000	image_to_text_R@1: 0.0037	image_to_text_R@5: 0.0177	image_to_text_R@10: 0.0317	text_to_image_mean_rank: 515.4990	text_to_image_median_rank: 306.0000	text_to_image_R@1: 0.0067	text_to_image_R@5: 0.0250	text_to_image_R@10: 0.0453	clip_val_loss: 4.7377	epoch: 7.0000	num_samples: 3000.0000
2024-08-16,03:42:23 | INFO | Start epoch 7
2024-08-16,03:42:26 | INFO | Train Epoch: 7 [  256/27000 (1%)] Data (t): 1.452 Batch (t): 2.698, 94.8848/s, 94.8848/s/gpu LR: 0.000037 Logit Scale: 14.487 Contrastive_loss: 3.9120 (3.9120) Loss: 3.9120 (3.9120)
2024-08-16,03:44:31 | INFO | Train Epoch: 7 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.415/s, 204.415/s/gpu LR: 0.000042 Logit Scale: 14.608 Contrastive_loss: 3.9964 (3.9542) Loss: 3.9964 (3.9542)
2024-08-16,03:44:36 | INFO | Train Epoch: 7 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.414/s, 204.414/s/gpu LR: 0.000042 Logit Scale: 14.612 Contrastive_loss: 3.9585 (3.9556) Loss: 3.9585 (3.9556)
2024-08-16,03:44:38 | INFO | Eval Epoch: 8 [256 / 3000]	Clip Loss: 4.634455	
2024-08-16,03:44:42 | INFO | Eval Epoch: 8 image_to_text_mean_rank: 551.6537	image_to_text_median_rank: 340.0000	image_to_text_R@1: 0.0050	image_to_text_R@5: 0.0223	image_to_text_R@10: 0.0390	text_to_image_mean_rank: 516.1967	text_to_image_median_rank: 309.0000	text_to_image_R@1: 0.0057	text_to_image_R@5: 0.0260	text_to_image_R@10: 0.0487	clip_val_loss: 4.7750	epoch: 8.0000	num_samples: 3000.0000
2024-08-16,03:44:44 | INFO | Start epoch 8
2024-08-16,03:44:47 | INFO | Train Epoch: 8 [  256/27000 (1%)] Data (t): 1.500 Batch (t): 2.746, 93.2370/s, 93.2370/s/gpu LR: 0.000042 Logit Scale: 14.613 Contrastive_loss: 3.6235 (3.6235) Loss: 3.6235 (3.6235)
2024-08-16,03:46:52 | INFO | Train Epoch: 8 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.478/s, 204.478/s/gpu LR: 0.000047 Logit Scale: 14.756 Contrastive_loss: 3.9351 (3.7793) Loss: 3.9351 (3.7793)
2024-08-16,03:46:57 | INFO | Train Epoch: 8 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.524/s, 204.524/s/gpu LR: 0.000047 Logit Scale: 14.760 Contrastive_loss: 3.8363 (3.7983) Loss: 3.8363 (3.7983)
2024-08-16,03:46:58 | INFO | Eval Epoch: 9 [256 / 3000]	Clip Loss: 4.764020	
2024-08-16,03:47:03 | INFO | Eval Epoch: 9 image_to_text_mean_rank: 587.4340	image_to_text_median_rank: 353.0000	image_to_text_R@1: 0.0040	image_to_text_R@5: 0.0187	image_to_text_R@10: 0.0360	text_to_image_mean_rank: 546.2733	text_to_image_median_rank: 318.0000	text_to_image_R@1: 0.0053	text_to_image_R@5: 0.0230	text_to_image_R@10: 0.0460	clip_val_loss: 4.8689	epoch: 9.0000	num_samples: 3000.0000
2024-08-16,03:47:04 | INFO | Start epoch 9
2024-08-16,03:47:07 | INFO | Train Epoch: 9 [  256/27000 (1%)] Data (t): 1.432 Batch (t): 2.678, 95.6108/s, 95.6108/s/gpu LR: 0.000047 Logit Scale: 14.761 Contrastive_loss: 3.1292 (3.1292) Loss: 3.1292 (3.1292)
2024-08-16,03:49:12 | INFO | Train Epoch: 9 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.417/s, 204.417/s/gpu LR: 0.000052 Logit Scale: 14.924 Contrastive_loss: 3.3670 (3.2481) Loss: 3.3670 (3.2481)
2024-08-16,03:49:17 | INFO | Train Epoch: 9 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.451/s, 204.451/s/gpu LR: 0.000053 Logit Scale: 14.929 Contrastive_loss: 3.4293 (3.3085) Loss: 3.4293 (3.3085)
2024-08-16,03:49:19 | INFO | Eval Epoch: 10 [256 / 3000]	Clip Loss: 4.830852	
2024-08-16,03:49:24 | INFO | Eval Epoch: 10 image_to_text_mean_rank: 604.2760	image_to_text_median_rank: 366.0000	image_to_text_R@1: 0.0033	image_to_text_R@5: 0.0187	image_to_text_R@10: 0.0343	text_to_image_mean_rank: 571.8533	text_to_image_median_rank: 339.0000	text_to_image_R@1: 0.0050	text_to_image_R@5: 0.0233	text_to_image_R@10: 0.0443	clip_val_loss: 4.9887	epoch: 10.0000	num_samples: 3000.0000
2024-08-16,03:49:25 | INFO | Start epoch 10
2024-08-16,03:49:28 | INFO | Train Epoch: 10 [  256/27000 (1%)] Data (t): 1.458 Batch (t): 2.703, 94.7107/s, 94.7107/s/gpu LR: 0.000053 Logit Scale: 14.930 Contrastive_loss: 2.6423 (2.6423) Loss: 2.6423 (2.6423)
2024-08-16,03:51:33 | INFO | Train Epoch: 10 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.377/s, 204.377/s/gpu LR: 0.000058 Logit Scale: 15.109 Contrastive_loss: 2.7973 (2.7198) Loss: 2.7973 (2.7198)
2024-08-16,03:51:38 | INFO | Train Epoch: 10 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.584/s, 204.584/s/gpu LR: 0.000058 Logit Scale: 15.115 Contrastive_loss: 3.1476 (2.8624) Loss: 3.1476 (2.8624)
2024-08-16,03:51:40 | INFO | Eval Epoch: 11 [256 / 3000]	Clip Loss: 4.916827	
2024-08-16,03:51:44 | INFO | Eval Epoch: 11 image_to_text_mean_rank: 645.3330	image_to_text_median_rank: 392.0000	image_to_text_R@1: 0.0053	image_to_text_R@5: 0.0183	image_to_text_R@10: 0.0343	text_to_image_mean_rank: 606.6477	text_to_image_median_rank: 378.0000	text_to_image_R@1: 0.0057	text_to_image_R@5: 0.0220	text_to_image_R@10: 0.0393	clip_val_loss: 5.1492	epoch: 11.0000	num_samples: 3000.0000
2024-08-16,03:51:46 | INFO | Start epoch 11
2024-08-16,03:51:48 | INFO | Train Epoch: 11 [  256/27000 (1%)] Data (t): 1.468 Batch (t): 2.714, 94.3351/s, 94.3351/s/gpu LR: 0.000058 Logit Scale: 15.116 Contrastive_loss: 2.1677 (2.1677) Loss: 2.1677 (2.1677)
2024-08-16,03:53:54 | INFO | Train Epoch: 11 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.580/s, 204.580/s/gpu LR: 0.000063 Logit Scale: 15.304 Contrastive_loss: 2.1280 (2.1479) Loss: 2.1280 (2.1479)
2024-08-16,03:53:59 | INFO | Train Epoch: 11 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.443/s, 204.443/s/gpu LR: 0.000063 Logit Scale: 15.311 Contrastive_loss: 2.1967 (2.1642) Loss: 2.1967 (2.1642)
2024-08-16,03:54:00 | INFO | Eval Epoch: 12 [256 / 3000]	Clip Loss: 5.227501	
2024-08-16,03:54:05 | INFO | Eval Epoch: 12 image_to_text_mean_rank: 701.0917	image_to_text_median_rank: 446.0000	image_to_text_R@1: 0.0033	image_to_text_R@5: 0.0133	image_to_text_R@10: 0.0277	text_to_image_mean_rank: 672.1147	text_to_image_median_rank: 418.0000	text_to_image_R@1: 0.0060	text_to_image_R@5: 0.0197	text_to_image_R@10: 0.0397	clip_val_loss: 5.4125	epoch: 12.0000	num_samples: 3000.0000
2024-08-16,03:54:06 | INFO | Start epoch 12
2024-08-16,03:54:09 | INFO | Train Epoch: 12 [  256/27000 (1%)] Data (t): 1.458 Batch (t): 2.704, 94.6843/s, 94.6843/s/gpu LR: 0.000063 Logit Scale: 15.313 Contrastive_loss: 1.4394 (1.4394) Loss: 1.4394 (1.4394)
2024-08-16,03:56:14 | INFO | Train Epoch: 12 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.444/s, 204.444/s/gpu LR: 0.000068 Logit Scale: 15.499 Contrastive_loss: 1.4908 (1.4651) Loss: 1.4908 (1.4651)
2024-08-16,03:56:19 | INFO | Train Epoch: 12 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.382/s, 204.382/s/gpu LR: 0.000068 Logit Scale: 15.506 Contrastive_loss: 1.5923 (1.5075) Loss: 1.5923 (1.5075)
2024-08-16,03:56:21 | INFO | Eval Epoch: 13 [256 / 3000]	Clip Loss: 5.440553	
2024-08-16,03:56:26 | INFO | Eval Epoch: 13 image_to_text_mean_rank: 746.4853	image_to_text_median_rank: 468.0000	image_to_text_R@1: 0.0023	image_to_text_R@5: 0.0130	image_to_text_R@10: 0.0290	text_to_image_mean_rank: 718.4413	text_to_image_median_rank: 455.0000	text_to_image_R@1: 0.0043	text_to_image_R@5: 0.0197	text_to_image_R@10: 0.0323	clip_val_loss: 5.6023	epoch: 13.0000	num_samples: 3000.0000
2024-08-16,03:56:27 | INFO | Start epoch 13
2024-08-16,03:56:30 | INFO | Train Epoch: 13 [  256/27000 (1%)] Data (t): 1.452 Batch (t): 2.698, 94.8883/s, 94.8883/s/gpu LR: 0.000068 Logit Scale: 15.507 Contrastive_loss: 0.92882 (0.92882) Loss: 0.92882 (0.92882)
2024-08-16,03:58:35 | INFO | Train Epoch: 13 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.386/s, 204.386/s/gpu LR: 0.000073 Logit Scale: 15.674 Contrastive_loss: 0.86548 (0.89715) Loss: 0.86548 (0.89715)
2024-08-16,03:58:40 | INFO | Train Epoch: 13 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.431/s, 204.431/s/gpu LR: 0.000073 Logit Scale: 15.681 Contrastive_loss: 0.90084 (0.89838) Loss: 0.90084 (0.89838)
2024-08-16,03:58:42 | INFO | Eval Epoch: 14 [256 / 3000]	Clip Loss: 5.552146	
2024-08-16,03:58:46 | INFO | Eval Epoch: 14 image_to_text_mean_rank: 811.2757	image_to_text_median_rank: 553.0000	image_to_text_R@1: 0.0027	image_to_text_R@5: 0.0127	image_to_text_R@10: 0.0207	text_to_image_mean_rank: 788.4850	text_to_image_median_rank: 516.0000	text_to_image_R@1: 0.0047	text_to_image_R@5: 0.0197	text_to_image_R@10: 0.0333	clip_val_loss: 5.8483	epoch: 14.0000	num_samples: 3000.0000
2024-08-16,03:58:48 | INFO | Start epoch 14
2024-08-16,03:58:50 | INFO | Train Epoch: 14 [  256/27000 (1%)] Data (t): 1.521 Batch (t): 2.767, 92.5317/s, 92.5317/s/gpu LR: 0.000074 Logit Scale: 15.682 Contrastive_loss: 0.52879 (0.52879) Loss: 0.52879 (0.52879)
2024-08-16,04:00:56 | INFO | Train Epoch: 14 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.407/s, 204.407/s/gpu LR: 0.000079 Logit Scale: 15.819 Contrastive_loss: 0.53437 (0.53158) Loss: 0.53437 (0.53158)
2024-08-16,04:01:01 | INFO | Train Epoch: 14 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.393/s, 204.393/s/gpu LR: 0.000079 Logit Scale: 15.825 Contrastive_loss: 0.52500 (0.52939) Loss: 0.52500 (0.52939)
2024-08-16,04:01:02 | INFO | Eval Epoch: 15 [256 / 3000]	Clip Loss: 5.805412	
2024-08-16,04:01:07 | INFO | Eval Epoch: 15 image_to_text_mean_rank: 865.4823	image_to_text_median_rank: 598.0000	image_to_text_R@1: 0.0050	image_to_text_R@5: 0.0143	image_to_text_R@10: 0.0270	text_to_image_mean_rank: 846.2273	text_to_image_median_rank: 575.0000	text_to_image_R@1: 0.0033	text_to_image_R@5: 0.0163	text_to_image_R@10: 0.0297	clip_val_loss: 6.0065	epoch: 15.0000	num_samples: 3000.0000
2024-08-16,04:01:08 | INFO | Start epoch 15
2024-08-16,04:01:11 | INFO | Train Epoch: 15 [  256/27000 (1%)] Data (t): 1.513 Batch (t): 2.760, 92.7525/s, 92.7525/s/gpu LR: 0.000079 Logit Scale: 15.826 Contrastive_loss: 0.34025 (0.34025) Loss: 0.34025 (0.34025)
2024-08-16,04:03:16 | INFO | Train Epoch: 15 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.496/s, 204.496/s/gpu LR: 0.000084 Logit Scale: 15.934 Contrastive_loss: 0.29246 (0.31636) Loss: 0.29246 (0.31636)
2024-08-16,04:03:21 | INFO | Train Epoch: 15 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.414/s, 204.414/s/gpu LR: 0.000084 Logit Scale: 15.938 Contrastive_loss: 0.26719 (0.29997) Loss: 0.26719 (0.29997)
2024-08-16,04:03:23 | INFO | Eval Epoch: 16 [256 / 3000]	Clip Loss: 6.014488	
2024-08-16,04:03:28 | INFO | Eval Epoch: 16 image_to_text_mean_rank: 887.4880	image_to_text_median_rank: 627.0000	image_to_text_R@1: 0.0030	image_to_text_R@5: 0.0140	image_to_text_R@10: 0.0260	text_to_image_mean_rank: 880.3730	text_to_image_median_rank: 619.0000	text_to_image_R@1: 0.0037	text_to_image_R@5: 0.0183	text_to_image_R@10: 0.0320	clip_val_loss: 6.1342	epoch: 16.0000	num_samples: 3000.0000
2024-08-16,04:03:29 | INFO | Start epoch 16
2024-08-16,04:03:32 | INFO | Train Epoch: 16 [  256/27000 (1%)] Data (t): 1.600 Batch (t): 2.846, 89.9529/s, 89.9529/s/gpu LR: 0.000084 Logit Scale: 15.939 Contrastive_loss: 0.20845 (0.20845) Loss: 0.20845 (0.20845)
2024-08-16,04:05:37 | INFO | Train Epoch: 16 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.440/s, 204.440/s/gpu LR: 0.000089 Logit Scale: 16.026 Contrastive_loss: 0.21264 (0.21055) Loss: 0.21264 (0.21055)
2024-08-16,04:05:42 | INFO | Train Epoch: 16 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.576/s, 204.576/s/gpu LR: 0.000089 Logit Scale: 16.030 Contrastive_loss: 0.18559 (0.20223) Loss: 0.18559 (0.20223)
2024-08-16,04:05:44 | INFO | Eval Epoch: 17 [256 / 3000]	Clip Loss: 6.084400	
2024-08-16,04:05:49 | INFO | Eval Epoch: 17 image_to_text_mean_rank: 935.7397	image_to_text_median_rank: 700.0000	image_to_text_R@1: 0.0033	image_to_text_R@5: 0.0137	image_to_text_R@10: 0.0237	text_to_image_mean_rank: 929.2380	text_to_image_median_rank: 691.0000	text_to_image_R@1: 0.0040	text_to_image_R@5: 0.0160	text_to_image_R@10: 0.0290	clip_val_loss: 6.2851	epoch: 17.0000	num_samples: 3000.0000
2024-08-16,04:05:50 | INFO | Start epoch 17
2024-08-16,04:05:53 | INFO | Train Epoch: 17 [  256/27000 (1%)] Data (t): 1.469 Batch (t): 2.715, 94.3037/s, 94.3037/s/gpu LR: 0.000089 Logit Scale: 16.031 Contrastive_loss: 0.13997 (0.13997) Loss: 0.13997 (0.13997)
2024-08-16,04:07:58 | INFO | Train Epoch: 17 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.280/s, 204.280/s/gpu LR: 0.000094 Logit Scale: 16.107 Contrastive_loss: 0.14680 (0.14339) Loss: 0.14680 (0.14339)
2024-08-16,04:08:03 | INFO | Train Epoch: 17 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.427/s, 204.427/s/gpu LR: 0.000095 Logit Scale: 16.110 Contrastive_loss: 0.14784 (0.14487) Loss: 0.14784 (0.14487)
2024-08-16,04:08:05 | INFO | Eval Epoch: 18 [256 / 3000]	Clip Loss: 6.179710	
2024-08-16,04:08:09 | INFO | Eval Epoch: 18 image_to_text_mean_rank: 959.2040	image_to_text_median_rank: 740.0000	image_to_text_R@1: 0.0017	image_to_text_R@5: 0.0143	image_to_text_R@10: 0.0240	text_to_image_mean_rank: 952.1377	text_to_image_median_rank: 737.0000	text_to_image_R@1: 0.0017	text_to_image_R@5: 0.0143	text_to_image_R@10: 0.0293	clip_val_loss: 6.3885	epoch: 18.0000	num_samples: 3000.0000
2024-08-16,04:08:11 | INFO | Start epoch 18
2024-08-16,04:08:13 | INFO | Train Epoch: 18 [  256/27000 (1%)] Data (t): 1.515 Batch (t): 2.760, 92.7391/s, 92.7391/s/gpu LR: 0.000095 Logit Scale: 16.111 Contrastive_loss: 0.10571 (0.10571) Loss: 0.10571 (0.10571)
2024-08-16,04:10:19 | INFO | Train Epoch: 18 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.597/s, 204.597/s/gpu LR: 0.000100 Logit Scale: 16.180 Contrastive_loss: 0.11256 (0.10913) Loss: 0.11256 (0.10913)
2024-08-16,04:10:24 | INFO | Train Epoch: 18 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.521/s, 204.521/s/gpu LR: 0.000100 Logit Scale: 16.183 Contrastive_loss: 0.10602 (0.10809) Loss: 0.10602 (0.10809)
2024-08-16,04:10:25 | INFO | Eval Epoch: 19 [256 / 3000]	Clip Loss: 6.244460	
2024-08-16,04:10:30 | INFO | Eval Epoch: 19 image_to_text_mean_rank: 984.8337	image_to_text_median_rank: 751.0000	image_to_text_R@1: 0.0033	image_to_text_R@5: 0.0107	image_to_text_R@10: 0.0230	text_to_image_mean_rank: 974.3037	text_to_image_median_rank: 724.0000	text_to_image_R@1: 0.0030	text_to_image_R@5: 0.0130	text_to_image_R@10: 0.0267	clip_val_loss: 6.4495	epoch: 19.0000	num_samples: 3000.0000
2024-08-16,04:10:31 | INFO | Start epoch 19
2024-08-16,04:10:34 | INFO | Train Epoch: 19 [  256/27000 (1%)] Data (t): 1.482 Batch (t): 2.730, 93.7704/s, 93.7704/s/gpu LR: 0.000100 Logit Scale: 16.184 Contrastive_loss: 0.091815 (0.091815) Loss: 0.091815 (0.091815)
2024-08-16,04:12:39 | INFO | Train Epoch: 19 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.414/s, 204.414/s/gpu LR: 0.000105 Logit Scale: 16.250 Contrastive_loss: 0.10051 (0.096161) Loss: 0.10051 (0.096161)
2024-08-16,04:12:44 | INFO | Train Epoch: 19 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.567/s, 204.567/s/gpu LR: 0.000105 Logit Scale: 16.253 Contrastive_loss: 0.083744 (0.092022) Loss: 0.083744 (0.092022)
2024-08-16,04:12:46 | INFO | Eval Epoch: 20 [256 / 3000]	Clip Loss: 6.373804	
2024-08-16,04:12:51 | INFO | Eval Epoch: 20 image_to_text_mean_rank: 998.9283	image_to_text_median_rank: 775.0000	image_to_text_R@1: 0.0023	image_to_text_R@5: 0.0120	image_to_text_R@10: 0.0223	text_to_image_mean_rank: 995.4970	text_to_image_median_rank: 768.0000	text_to_image_R@1: 0.0023	text_to_image_R@5: 0.0143	text_to_image_R@10: 0.0240	clip_val_loss: 6.5515	epoch: 20.0000	num_samples: 3000.0000
2024-08-16,04:12:52 | INFO | Start epoch 20
2024-08-16,04:12:55 | INFO | Train Epoch: 20 [  256/27000 (1%)] Data (t): 1.470 Batch (t): 2.715, 94.2863/s, 94.2863/s/gpu LR: 0.000105 Logit Scale: 16.254 Contrastive_loss: 0.082728 (0.082728) Loss: 0.082728 (0.082728)
2024-08-16,04:15:00 | INFO | Train Epoch: 20 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.462/s, 204.462/s/gpu LR: 0.000110 Logit Scale: 16.322 Contrastive_loss: 0.092753 (0.087741) Loss: 0.092753 (0.087741)
2024-08-16,04:15:05 | INFO | Train Epoch: 20 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.462/s, 204.462/s/gpu LR: 0.000110 Logit Scale: 16.325 Contrastive_loss: 0.10030 (0.091928) Loss: 0.10030 (0.091928)
2024-08-16,04:15:07 | INFO | Eval Epoch: 21 [256 / 3000]	Clip Loss: 6.443340	
2024-08-16,04:15:11 | INFO | Eval Epoch: 21 image_to_text_mean_rank: 1025.5073	image_to_text_median_rank: 807.0000	image_to_text_R@1: 0.0023	image_to_text_R@5: 0.0103	image_to_text_R@10: 0.0207	text_to_image_mean_rank: 1019.2447	text_to_image_median_rank: 826.0000	text_to_image_R@1: 0.0027	text_to_image_R@5: 0.0127	text_to_image_R@10: 0.0250	clip_val_loss: 6.6143	epoch: 21.0000	num_samples: 3000.0000
2024-08-16,04:15:13 | INFO | Start epoch 21
2024-08-16,04:15:15 | INFO | Train Epoch: 21 [  256/27000 (1%)] Data (t): 1.476 Batch (t): 2.722, 94.0350/s, 94.0350/s/gpu LR: 0.000110 Logit Scale: 16.325 Contrastive_loss: 0.067669 (0.067669) Loss: 0.067669 (0.067669)
2024-08-16,04:17:21 | INFO | Train Epoch: 21 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.508/s, 204.508/s/gpu LR: 0.000115 Logit Scale: 16.396 Contrastive_loss: 0.081561 (0.074615) Loss: 0.081561 (0.074615)
2024-08-16,04:17:26 | INFO | Train Epoch: 21 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.474/s, 204.474/s/gpu LR: 0.000116 Logit Scale: 16.399 Contrastive_loss: 0.089710 (0.079647) Loss: 0.089710 (0.079647)
2024-08-16,04:17:27 | INFO | Eval Epoch: 22 [256 / 3000]	Clip Loss: 6.464468	
2024-08-16,04:17:32 | INFO | Eval Epoch: 22 image_to_text_mean_rank: 1013.4487	image_to_text_median_rank: 788.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0103	image_to_text_R@10: 0.0203	text_to_image_mean_rank: 1005.7700	text_to_image_median_rank: 764.0000	text_to_image_R@1: 0.0017	text_to_image_R@5: 0.0140	text_to_image_R@10: 0.0250	clip_val_loss: 6.6335	epoch: 22.0000	num_samples: 3000.0000
2024-08-16,04:17:33 | INFO | Start epoch 22
2024-08-16,04:17:36 | INFO | Train Epoch: 22 [  256/27000 (1%)] Data (t): 1.477 Batch (t): 2.724, 93.9871/s, 93.9871/s/gpu LR: 0.000116 Logit Scale: 16.400 Contrastive_loss: 0.075971 (0.075971) Loss: 0.075971 (0.075971)
2024-08-16,04:19:41 | INFO | Train Epoch: 22 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.527/s, 204.527/s/gpu LR: 0.000121 Logit Scale: 16.478 Contrastive_loss: 0.094408 (0.085189) Loss: 0.094408 (0.085189)
2024-08-16,04:19:46 | INFO | Train Epoch: 22 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.417/s, 204.417/s/gpu LR: 0.000121 Logit Scale: 16.482 Contrastive_loss: 0.098228 (0.089536) Loss: 0.098228 (0.089536)
2024-08-16,04:19:48 | INFO | Eval Epoch: 23 [256 / 3000]	Clip Loss: 6.571661	
2024-08-16,04:19:53 | INFO | Eval Epoch: 23 image_to_text_mean_rank: 996.3893	image_to_text_median_rank: 789.0000	image_to_text_R@1: 0.0037	image_to_text_R@5: 0.0113	image_to_text_R@10: 0.0207	text_to_image_mean_rank: 992.0480	text_to_image_median_rank: 786.0000	text_to_image_R@1: 0.0030	text_to_image_R@5: 0.0147	text_to_image_R@10: 0.0270	clip_val_loss: 6.6228	epoch: 23.0000	num_samples: 3000.0000
2024-08-16,04:19:54 | INFO | Start epoch 23
2024-08-16,04:19:57 | INFO | Train Epoch: 23 [  256/27000 (1%)] Data (t): 1.547 Batch (t): 2.793, 91.6730/s, 91.6730/s/gpu LR: 0.000121 Logit Scale: 16.483 Contrastive_loss: 0.078356 (0.078356) Loss: 0.078356 (0.078356)
2024-08-16,04:22:02 | INFO | Train Epoch: 23 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.462/s, 204.462/s/gpu LR: 0.000126 Logit Scale: 16.572 Contrastive_loss: 0.090556 (0.084456) Loss: 0.090556 (0.084456)
2024-08-16,04:22:07 | INFO | Train Epoch: 23 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.327/s, 204.327/s/gpu LR: 0.000126 Logit Scale: 16.576 Contrastive_loss: 0.086313 (0.085075) Loss: 0.086313 (0.085075)
2024-08-16,04:22:09 | INFO | Eval Epoch: 24 [256 / 3000]	Clip Loss: 6.592369	
2024-08-16,04:22:14 | INFO | Eval Epoch: 24 image_to_text_mean_rank: 1030.9110	image_to_text_median_rank: 812.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0127	image_to_text_R@10: 0.0210	text_to_image_mean_rank: 1022.6720	text_to_image_median_rank: 805.0000	text_to_image_R@1: 0.0030	text_to_image_R@5: 0.0163	text_to_image_R@10: 0.0290	clip_val_loss: 6.7181	epoch: 24.0000	num_samples: 3000.0000
2024-08-16,04:22:15 | INFO | Start epoch 24
2024-08-16,04:22:18 | INFO | Train Epoch: 24 [  256/27000 (1%)] Data (t): 1.778 Batch (t): 3.023, 84.6777/s, 84.6777/s/gpu LR: 0.000126 Logit Scale: 16.577 Contrastive_loss: 0.079637 (0.079637) Loss: 0.079637 (0.079637)
2024-08-16,04:24:23 | INFO | Train Epoch: 24 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.535/s, 204.535/s/gpu LR: 0.000131 Logit Scale: 16.685 Contrastive_loss: 0.11149 (0.095563) Loss: 0.11149 (0.095563)
2024-08-16,04:24:28 | INFO | Train Epoch: 24 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.539/s, 204.539/s/gpu LR: 0.000131 Logit Scale: 16.690 Contrastive_loss: 0.10690 (0.099341) Loss: 0.10690 (0.099341)
2024-08-16,04:24:30 | INFO | Eval Epoch: 25 [256 / 3000]	Clip Loss: 6.576798	
2024-08-16,04:24:35 | INFO | Eval Epoch: 25 image_to_text_mean_rank: 1011.1870	image_to_text_median_rank: 780.0000	image_to_text_R@1: 0.0033	image_to_text_R@5: 0.0097	image_to_text_R@10: 0.0180	text_to_image_mean_rank: 994.8833	text_to_image_median_rank: 768.0000	text_to_image_R@1: 0.0023	text_to_image_R@5: 0.0147	text_to_image_R@10: 0.0243	clip_val_loss: 6.6493	epoch: 25.0000	num_samples: 3000.0000
2024-08-16,04:24:36 | INFO | Start epoch 25
2024-08-16,04:24:39 | INFO | Train Epoch: 25 [  256/27000 (1%)] Data (t): 1.483 Batch (t): 2.726, 93.8938/s, 93.8938/s/gpu LR: 0.000131 Logit Scale: 16.691 Contrastive_loss: 0.094896 (0.094896) Loss: 0.094896 (0.094896)
2024-08-16,04:26:44 | INFO | Train Epoch: 25 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.456/s, 204.456/s/gpu LR: 0.000136 Logit Scale: 16.831 Contrastive_loss: 0.24732 (0.17111) Loss: 0.24732 (0.17111)
2024-08-16,04:26:49 | INFO | Train Epoch: 25 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.546/s, 204.546/s/gpu LR: 0.000137 Logit Scale: 16.840 Contrastive_loss: 0.42561 (0.25594) Loss: 0.42561 (0.25594)
2024-08-16,04:26:50 | INFO | Eval Epoch: 26 [256 / 3000]	Clip Loss: 6.370559	
2024-08-16,04:26:55 | INFO | Eval Epoch: 26 image_to_text_mean_rank: 1114.0630	image_to_text_median_rank: 930.0000	image_to_text_R@1: 0.0010	image_to_text_R@5: 0.0053	image_to_text_R@10: 0.0103	text_to_image_mean_rank: 1001.3303	text_to_image_median_rank: 768.0000	text_to_image_R@1: 0.0023	text_to_image_R@5: 0.0113	text_to_image_R@10: 0.0217	clip_val_loss: 6.6953	epoch: 26.0000	num_samples: 3000.0000
2024-08-16,04:26:57 | INFO | Start epoch 26
2024-08-16,04:26:59 | INFO | Train Epoch: 26 [  256/27000 (1%)] Data (t): 1.514 Batch (t): 2.756, 92.8774/s, 92.8774/s/gpu LR: 0.000137 Logit Scale: 16.842 Contrastive_loss: 0.68641 (0.68641) Loss: 0.68641 (0.68641)
2024-08-16,04:29:04 | INFO | Train Epoch: 26 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.423/s, 204.423/s/gpu LR: 0.000142 Logit Scale: 16.877 Contrastive_loss: 2.6772 (1.6818) Loss: 2.6772 (1.6818)
2024-08-16,04:29:09 | INFO | Train Epoch: 26 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.620/s, 204.620/s/gpu LR: 0.000142 Logit Scale: 16.883 Contrastive_loss: 2.6573 (2.0070) Loss: 2.6573 (2.0070)
2024-08-16,04:29:11 | INFO | Eval Epoch: 27 [256 / 3000]	Clip Loss: 5.865312	
2024-08-16,04:29:16 | INFO | Eval Epoch: 27 image_to_text_mean_rank: 782.6180	image_to_text_median_rank: 546.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0103	image_to_text_R@10: 0.0223	text_to_image_mean_rank: 727.9463	text_to_image_median_rank: 478.0000	text_to_image_R@1: 0.0047	text_to_image_R@5: 0.0193	text_to_image_R@10: 0.0297	clip_val_loss: 5.9258	epoch: 27.0000	num_samples: 3000.0000
2024-08-16,04:29:17 | INFO | Start epoch 27
2024-08-16,04:29:20 | INFO | Train Epoch: 27 [  256/27000 (1%)] Data (t): 1.459 Batch (t): 2.705, 94.6292/s, 94.6292/s/gpu LR: 0.000142 Logit Scale: 16.884 Contrastive_loss: 1.7112 (1.7112) Loss: 1.7112 (1.7112)
2024-08-16,04:31:25 | INFO | Train Epoch: 27 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.360/s, 204.360/s/gpu LR: 0.000147 Logit Scale: 17.268 Contrastive_loss: 0.84065 (1.2759) Loss: 0.84065 (1.2759)
2024-08-16,04:31:30 | INFO | Train Epoch: 27 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.422/s, 204.422/s/gpu LR: 0.000147 Logit Scale: 17.283 Contrastive_loss: 0.70991 (1.0872) Loss: 0.70991 (1.0872)
2024-08-16,04:31:32 | INFO | Eval Epoch: 28 [256 / 3000]	Clip Loss: 6.629393	
2024-08-16,04:31:37 | INFO | Eval Epoch: 28 image_to_text_mean_rank: 896.8010	image_to_text_median_rank: 667.0000	image_to_text_R@1: 0.0013	image_to_text_R@5: 0.0097	image_to_text_R@10: 0.0167	text_to_image_mean_rank: 876.8693	text_to_image_median_rank: 625.0000	text_to_image_R@1: 0.0020	text_to_image_R@5: 0.0127	text_to_image_R@10: 0.0240	clip_val_loss: 6.5428	epoch: 28.0000	num_samples: 3000.0000
2024-08-16,04:31:38 | INFO | Start epoch 28
2024-08-16,04:31:41 | INFO | Train Epoch: 28 [  256/27000 (1%)] Data (t): 1.474 Batch (t): 2.719, 94.1445/s, 94.1445/s/gpu LR: 0.000147 Logit Scale: 17.287 Contrastive_loss: 0.30942 (0.30942) Loss: 0.30942 (0.30942)
2024-08-16,04:33:46 | INFO | Train Epoch: 28 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.480/s, 204.480/s/gpu LR: 0.000152 Logit Scale: 17.513 Contrastive_loss: 0.14982 (0.22962) Loss: 0.14982 (0.22962)
2024-08-16,04:33:51 | INFO | Train Epoch: 28 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.602/s, 204.602/s/gpu LR: 0.000152 Logit Scale: 17.520 Contrastive_loss: 0.17185 (0.21036) Loss: 0.17185 (0.21036)
2024-08-16,04:33:53 | INFO | Eval Epoch: 29 [256 / 3000]	Clip Loss: 6.860646	
2024-08-16,04:33:57 | INFO | Eval Epoch: 29 image_to_text_mean_rank: 948.3863	image_to_text_median_rank: 727.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0110	image_to_text_R@10: 0.0230	text_to_image_mean_rank: 941.6530	text_to_image_median_rank: 707.0000	text_to_image_R@1: 0.0023	text_to_image_R@5: 0.0147	text_to_image_R@10: 0.0250	clip_val_loss: 6.7876	epoch: 29.0000	num_samples: 3000.0000
2024-08-16,04:33:59 | INFO | Start epoch 29
2024-08-16,04:34:01 | INFO | Train Epoch: 29 [  256/27000 (1%)] Data (t): 1.484 Batch (t): 2.730, 93.7852/s, 93.7852/s/gpu LR: 0.000152 Logit Scale: 17.521 Contrastive_loss: 0.073008 (0.073008) Loss: 0.073008 (0.073008)
2024-08-16,04:36:06 | INFO | Train Epoch: 29 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.650/s, 204.650/s/gpu LR: 0.000157 Logit Scale: 17.624 Contrastive_loss: 0.061296 (0.067152) Loss: 0.061296 (0.067152)
2024-08-16,04:36:11 | INFO | Train Epoch: 29 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.599/s, 204.599/s/gpu LR: 0.000158 Logit Scale: 17.627 Contrastive_loss: 0.069361 (0.067888) Loss: 0.069361 (0.067888)
2024-08-16,04:36:13 | INFO | Eval Epoch: 30 [256 / 3000]	Clip Loss: 7.037732	
2024-08-16,04:36:18 | INFO | Eval Epoch: 30 image_to_text_mean_rank: 987.8290	image_to_text_median_rank: 769.0000	image_to_text_R@1: 0.0017	image_to_text_R@5: 0.0083	image_to_text_R@10: 0.0157	text_to_image_mean_rank: 982.1127	text_to_image_median_rank: 774.0000	text_to_image_R@1: 0.0037	text_to_image_R@5: 0.0110	text_to_image_R@10: 0.0200	clip_val_loss: 6.9645	epoch: 30.0000	num_samples: 3000.0000
2024-08-16,04:36:19 | INFO | Start epoch 30
2024-08-16,04:36:22 | INFO | Train Epoch: 30 [  256/27000 (1%)] Data (t): 1.468 Batch (t): 2.714, 94.3415/s, 94.3415/s/gpu LR: 0.000158 Logit Scale: 17.628 Contrastive_loss: 0.040057 (0.040057) Loss: 0.040057 (0.040057)
2024-08-16,04:38:27 | INFO | Train Epoch: 30 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.454/s, 204.454/s/gpu LR: 0.000163 Logit Scale: 17.701 Contrastive_loss: 0.035407 (0.037732) Loss: 0.035407 (0.037732)
2024-08-16,04:38:32 | INFO | Train Epoch: 30 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.606/s, 204.606/s/gpu LR: 0.000163 Logit Scale: 17.704 Contrastive_loss: 0.046378 (0.040614) Loss: 0.046378 (0.040614)
2024-08-16,04:38:34 | INFO | Eval Epoch: 31 [256 / 3000]	Clip Loss: 7.103376	
2024-08-16,04:38:38 | INFO | Eval Epoch: 31 image_to_text_mean_rank: 1028.7160	image_to_text_median_rank: 827.0000	image_to_text_R@1: 0.0030	image_to_text_R@5: 0.0087	image_to_text_R@10: 0.0193	text_to_image_mean_rank: 1023.7647	text_to_image_median_rank: 810.0000	text_to_image_R@1: 0.0020	text_to_image_R@5: 0.0117	text_to_image_R@10: 0.0193	clip_val_loss: 7.0932	epoch: 31.0000	num_samples: 3000.0000
2024-08-16,04:38:40 | INFO | Start epoch 31
2024-08-16,04:38:42 | INFO | Train Epoch: 31 [  256/27000 (1%)] Data (t): 1.446 Batch (t): 2.692, 95.1004/s, 95.1004/s/gpu LR: 0.000163 Logit Scale: 17.704 Contrastive_loss: 0.045176 (0.045176) Loss: 0.045176 (0.045176)
2024-08-16,04:40:47 | INFO | Train Epoch: 31 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.631/s, 204.631/s/gpu LR: 0.000168 Logit Scale: 17.770 Contrastive_loss: 0.063451 (0.054314) Loss: 0.063451 (0.054314)
2024-08-16,04:40:53 | INFO | Train Epoch: 31 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.576/s, 204.576/s/gpu LR: 0.000168 Logit Scale: 17.772 Contrastive_loss: 0.025344 (0.044657) Loss: 0.025344 (0.044657)
2024-08-16,04:40:54 | INFO | Eval Epoch: 32 [256 / 3000]	Clip Loss: 7.116255	
2024-08-16,04:40:59 | INFO | Eval Epoch: 32 image_to_text_mean_rank: 1041.3660	image_to_text_median_rank: 826.0000	image_to_text_R@1: 0.0013	image_to_text_R@5: 0.0103	image_to_text_R@10: 0.0180	text_to_image_mean_rank: 1038.6310	text_to_image_median_rank: 819.0000	text_to_image_R@1: 0.0020	text_to_image_R@5: 0.0107	text_to_image_R@10: 0.0223	clip_val_loss: 7.1470	epoch: 32.0000	num_samples: 3000.0000
2024-08-16,04:41:00 | INFO | Start epoch 32
2024-08-16,04:41:03 | INFO | Train Epoch: 32 [  256/27000 (1%)] Data (t): 1.481 Batch (t): 2.728, 93.8307/s, 93.8307/s/gpu LR: 0.000168 Logit Scale: 17.773 Contrastive_loss: 0.024506 (0.024506) Loss: 0.024506 (0.024506)
2024-08-16,04:43:08 | INFO | Train Epoch: 32 [25856/27000 (96%)] Data (t): 0.001 Batch (t): 1.251, 204.585/s, 204.585/s/gpu LR: 0.000173 Logit Scale: 17.837 Contrastive_loss: 0.042150 (0.033328) Loss: 0.042150 (0.033328)
2024-08-16,04:43:13 | INFO | Train Epoch: 32 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.710/s, 204.710/s/gpu LR: 0.000173 Logit Scale: 17.840 Contrastive_loss: 0.021264 (0.029307) Loss: 0.021264 (0.029307)
2024-08-16,04:43:15 | INFO | Eval Epoch: 33 [256 / 3000]	Clip Loss: 7.302838	
2024-08-16,04:43:20 | INFO | Eval Epoch: 33 image_to_text_mean_rank: 1051.0060	image_to_text_median_rank: 844.0000	image_to_text_R@1: 0.0010	image_to_text_R@5: 0.0083	image_to_text_R@10: 0.0143	text_to_image_mean_rank: 1047.7423	text_to_image_median_rank: 858.0000	text_to_image_R@1: 0.0017	text_to_image_R@5: 0.0107	text_to_image_R@10: 0.0167	clip_val_loss: 7.2055	epoch: 33.0000	num_samples: 3000.0000
2024-08-16,04:43:21 | INFO | Start epoch 33
2024-08-16,04:43:24 | INFO | Train Epoch: 33 [  256/27000 (1%)] Data (t): 1.471 Batch (t): 2.714, 94.3115/s, 94.3115/s/gpu LR: 0.000173 Logit Scale: 17.840 Contrastive_loss: 0.036071 (0.036071) Loss: 0.036071 (0.036071)
2024-08-16,04:45:29 | INFO | Train Epoch: 33 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.581/s, 204.581/s/gpu LR: 0.000178 Logit Scale: 17.908 Contrastive_loss: 0.032550 (0.034310) Loss: 0.032550 (0.034310)
2024-08-16,04:45:34 | INFO | Train Epoch: 33 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.538/s, 204.538/s/gpu LR: 0.000179 Logit Scale: 17.911 Contrastive_loss: 0.045624 (0.038082) Loss: 0.045624 (0.038082)
2024-08-16,04:45:36 | INFO | Eval Epoch: 34 [256 / 3000]	Clip Loss: 7.142300	
2024-08-16,04:45:40 | INFO | Eval Epoch: 34 image_to_text_mean_rank: 1043.7917	image_to_text_median_rank: 850.0000	image_to_text_R@1: 0.0017	image_to_text_R@5: 0.0097	image_to_text_R@10: 0.0183	text_to_image_mean_rank: 1041.0867	text_to_image_median_rank: 850.0000	text_to_image_R@1: 0.0027	text_to_image_R@5: 0.0113	text_to_image_R@10: 0.0200	clip_val_loss: 7.1728	epoch: 34.0000	num_samples: 3000.0000
2024-08-16,04:45:42 | INFO | Start epoch 34
2024-08-16,04:45:44 | INFO | Train Epoch: 34 [  256/27000 (1%)] Data (t): 1.475 Batch (t): 2.721, 94.0990/s, 94.0990/s/gpu LR: 0.000179 Logit Scale: 17.911 Contrastive_loss: 0.030825 (0.030825) Loss: 0.030825 (0.030825)
2024-08-16,04:47:49 | INFO | Train Epoch: 34 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.590/s, 204.590/s/gpu LR: 0.000184 Logit Scale: 17.981 Contrastive_loss: 0.046251 (0.038538) Loss: 0.046251 (0.038538)
2024-08-16,04:47:54 | INFO | Train Epoch: 34 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.829/s, 204.829/s/gpu LR: 0.000184 Logit Scale: 17.984 Contrastive_loss: 0.041009 (0.039362) Loss: 0.041009 (0.039362)
2024-08-16,04:47:56 | INFO | Eval Epoch: 35 [256 / 3000]	Clip Loss: 7.147564	
2024-08-16,04:48:01 | INFO | Eval Epoch: 35 image_to_text_mean_rank: 1042.3890	image_to_text_median_rank: 852.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0090	image_to_text_R@10: 0.0173	text_to_image_mean_rank: 1035.9400	text_to_image_median_rank: 837.0000	text_to_image_R@1: 0.0020	text_to_image_R@5: 0.0110	text_to_image_R@10: 0.0207	clip_val_loss: 7.1603	epoch: 35.0000	num_samples: 3000.0000
2024-08-16,04:48:02 | INFO | Start epoch 35
2024-08-16,04:48:05 | INFO | Train Epoch: 35 [  256/27000 (1%)] Data (t): 1.466 Batch (t): 2.711, 94.4327/s, 94.4327/s/gpu LR: 0.000184 Logit Scale: 17.985 Contrastive_loss: 0.032819 (0.032819) Loss: 0.032819 (0.032819)
2024-08-16,04:50:10 | INFO | Train Epoch: 35 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.250, 204.457/s, 204.457/s/gpu LR: 0.000189 Logit Scale: 18.066 Contrastive_loss: 0.047232 (0.040025) Loss: 0.047232 (0.040025)
2024-08-16,04:50:15 | INFO | Train Epoch: 35 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.589/s, 204.589/s/gpu LR: 0.000189 Logit Scale: 18.069 Contrastive_loss: 0.056291 (0.045447) Loss: 0.056291 (0.045447)
2024-08-16,04:50:17 | INFO | Eval Epoch: 36 [256 / 3000]	Clip Loss: 7.278274	
2024-08-16,04:50:22 | INFO | Eval Epoch: 36 image_to_text_mean_rank: 1058.5333	image_to_text_median_rank: 847.0000	image_to_text_R@1: 0.0030	image_to_text_R@5: 0.0147	image_to_text_R@10: 0.0223	text_to_image_mean_rank: 1052.5207	text_to_image_median_rank: 836.0000	text_to_image_R@1: 0.0023	text_to_image_R@5: 0.0133	text_to_image_R@10: 0.0233	clip_val_loss: 7.2311	epoch: 36.0000	num_samples: 3000.0000
2024-08-16,04:50:23 | INFO | Start epoch 36
2024-08-16,04:50:26 | INFO | Train Epoch: 36 [  256/27000 (1%)] Data (t): 1.454 Batch (t): 2.699, 94.8579/s, 94.8579/s/gpu LR: 0.000189 Logit Scale: 18.070 Contrastive_loss: 0.030523 (0.030523) Loss: 0.030523 (0.030523)
2024-08-16,04:52:31 | INFO | Train Epoch: 36 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.636/s, 204.636/s/gpu LR: 0.000194 Logit Scale: 18.156 Contrastive_loss: 0.043999 (0.037261) Loss: 0.043999 (0.037261)
2024-08-16,04:52:36 | INFO | Train Epoch: 36 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.744/s, 204.744/s/gpu LR: 0.000194 Logit Scale: 18.160 Contrastive_loss: 0.037191 (0.037238) Loss: 0.037191 (0.037238)
2024-08-16,04:52:38 | INFO | Eval Epoch: 37 [256 / 3000]	Clip Loss: 7.176402	
2024-08-16,04:52:42 | INFO | Eval Epoch: 37 image_to_text_mean_rank: 1054.3230	image_to_text_median_rank: 875.0000	image_to_text_R@1: 0.0017	image_to_text_R@5: 0.0090	image_to_text_R@10: 0.0180	text_to_image_mean_rank: 1051.4680	text_to_image_median_rank: 860.0000	text_to_image_R@1: 0.0027	text_to_image_R@5: 0.0127	text_to_image_R@10: 0.0197	clip_val_loss: 7.2518	epoch: 37.0000	num_samples: 3000.0000
2024-08-16,04:52:44 | INFO | Start epoch 37
2024-08-16,04:52:46 | INFO | Train Epoch: 37 [  256/27000 (1%)] Data (t): 1.471 Batch (t): 2.715, 94.2910/s, 94.2910/s/gpu LR: 0.000194 Logit Scale: 18.161 Contrastive_loss: 0.049741 (0.049741) Loss: 0.049741 (0.049741)
2024-08-16,04:54:51 | INFO | Train Epoch: 37 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.555/s, 204.555/s/gpu LR: 0.000199 Logit Scale: 18.264 Contrastive_loss: 0.054978 (0.052359) Loss: 0.054978 (0.052359)
2024-08-16,04:54:56 | INFO | Train Epoch: 37 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.389/s, 204.389/s/gpu LR: 0.000199 Logit Scale: 18.269 Contrastive_loss: 0.049099 (0.051273) Loss: 0.049099 (0.051273)
2024-08-16,04:54:58 | INFO | Eval Epoch: 38 [256 / 3000]	Clip Loss: 7.144962	
2024-08-16,04:55:03 | INFO | Eval Epoch: 38 image_to_text_mean_rank: 1057.0257	image_to_text_median_rank: 858.0000	image_to_text_R@1: 0.0023	image_to_text_R@5: 0.0100	image_to_text_R@10: 0.0217	text_to_image_mean_rank: 1049.7160	text_to_image_median_rank: 842.0000	text_to_image_R@1: 0.0027	text_to_image_R@5: 0.0123	text_to_image_R@10: 0.0223	clip_val_loss: 7.2849	epoch: 38.0000	num_samples: 3000.0000
2024-08-16,04:55:04 | INFO | Start epoch 38
2024-08-16,04:55:07 | INFO | Train Epoch: 38 [  256/27000 (1%)] Data (t): 1.472 Batch (t): 2.718, 94.1733/s, 94.1733/s/gpu LR: 0.000200 Logit Scale: 18.270 Contrastive_loss: 0.066994 (0.066994) Loss: 0.066994 (0.066994)
2024-08-16,04:57:12 | INFO | Train Epoch: 38 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.448/s, 204.448/s/gpu LR: 0.000205 Logit Scale: 18.393 Contrastive_loss: 0.075224 (0.071109) Loss: 0.075224 (0.071109)
2024-08-16,04:57:17 | INFO | Train Epoch: 38 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.568/s, 204.568/s/gpu LR: 0.000205 Logit Scale: 18.399 Contrastive_loss: 0.057282 (0.066500) Loss: 0.057282 (0.066500)
2024-08-16,04:57:19 | INFO | Eval Epoch: 39 [256 / 3000]	Clip Loss: 7.157355	
2024-08-16,04:57:24 | INFO | Eval Epoch: 39 image_to_text_mean_rank: 1028.1930	image_to_text_median_rank: 819.0000	image_to_text_R@1: 0.0033	image_to_text_R@5: 0.0100	image_to_text_R@10: 0.0190	text_to_image_mean_rank: 1020.0573	text_to_image_median_rank: 827.0000	text_to_image_R@1: 0.0043	text_to_image_R@5: 0.0140	text_to_image_R@10: 0.0230	clip_val_loss: 7.2089	epoch: 39.0000	num_samples: 3000.0000
2024-08-16,04:57:25 | INFO | Start epoch 39
2024-08-16,04:57:28 | INFO | Train Epoch: 39 [  256/27000 (1%)] Data (t): 1.482 Batch (t): 2.727, 93.8673/s, 93.8673/s/gpu LR: 0.000205 Logit Scale: 18.400 Contrastive_loss: 0.056646 (0.056646) Loss: 0.056646 (0.056646)
2024-08-16,04:59:33 | INFO | Train Epoch: 39 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.481/s, 204.481/s/gpu LR: 0.000210 Logit Scale: 18.561 Contrastive_loss: 0.055771 (0.056209) Loss: 0.055771 (0.056209)
2024-08-16,04:59:38 | INFO | Train Epoch: 39 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.646/s, 204.646/s/gpu LR: 0.000210 Logit Scale: 18.567 Contrastive_loss: 0.057177 (0.056531) Loss: 0.057177 (0.056531)
2024-08-16,04:59:39 | INFO | Eval Epoch: 40 [256 / 3000]	Clip Loss: 7.104994	
2024-08-16,04:59:44 | INFO | Eval Epoch: 40 image_to_text_mean_rank: 1018.7757	image_to_text_median_rank: 782.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0103	image_to_text_R@10: 0.0183	text_to_image_mean_rank: 1008.3393	text_to_image_median_rank: 782.0000	text_to_image_R@1: 0.0017	text_to_image_R@5: 0.0150	text_to_image_R@10: 0.0247	clip_val_loss: 7.2402	epoch: 40.0000	num_samples: 3000.0000
2024-08-16,04:59:46 | INFO | Start epoch 40
2024-08-16,04:59:48 | INFO | Train Epoch: 40 [  256/27000 (1%)] Data (t): 1.463 Batch (t): 2.709, 94.4969/s, 94.4969/s/gpu LR: 0.000210 Logit Scale: 18.569 Contrastive_loss: 0.090862 (0.090862) Loss: 0.090862 (0.090862)
2024-08-16,05:01:54 | INFO | Train Epoch: 40 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.537/s, 204.537/s/gpu LR: 0.000215 Logit Scale: 18.769 Contrastive_loss: 0.094119 (0.092491) Loss: 0.094119 (0.092491)
2024-08-16,05:01:59 | INFO | Train Epoch: 40 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.528/s, 204.528/s/gpu LR: 0.000215 Logit Scale: 18.779 Contrastive_loss: 0.10408 (0.096353) Loss: 0.10408 (0.096353)
2024-08-16,05:02:00 | INFO | Eval Epoch: 41 [256 / 3000]	Clip Loss: 7.132548	
2024-08-16,05:02:05 | INFO | Eval Epoch: 41 image_to_text_mean_rank: 1011.8893	image_to_text_median_rank: 803.0000	image_to_text_R@1: 0.0030	image_to_text_R@5: 0.0137	image_to_text_R@10: 0.0220	text_to_image_mean_rank: 999.8167	text_to_image_median_rank: 784.0000	text_to_image_R@1: 0.0033	text_to_image_R@5: 0.0133	text_to_image_R@10: 0.0220	clip_val_loss: 7.2351	epoch: 41.0000	num_samples: 3000.0000
2024-08-16,05:02:06 | INFO | Start epoch 41
2024-08-16,05:02:09 | INFO | Train Epoch: 41 [  256/27000 (1%)] Data (t): 1.459 Batch (t): 2.706, 94.6002/s, 94.6002/s/gpu LR: 0.000215 Logit Scale: 18.782 Contrastive_loss: 0.085834 (0.085834) Loss: 0.085834 (0.085834)
2024-08-16,05:04:14 | INFO | Train Epoch: 41 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.523/s, 204.523/s/gpu LR: 0.000220 Logit Scale: 19.049 Contrastive_loss: 4.7974 (2.4416) Loss: 4.7974 (2.4416)
2024-08-16,05:04:19 | INFO | Train Epoch: 41 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.451/s, 204.451/s/gpu LR: 0.000221 Logit Scale: 19.044 Contrastive_loss: 4.6688 (3.1840) Loss: 4.6688 (3.1840)
2024-08-16,05:04:21 | INFO | Eval Epoch: 42 [256 / 3000]	Clip Loss: 4.982571	
2024-08-16,05:04:26 | INFO | Eval Epoch: 42 image_to_text_mean_rank: 754.0190	image_to_text_median_rank: 534.0000	image_to_text_R@1: 0.0013	image_to_text_R@5: 0.0113	image_to_text_R@10: 0.0213	text_to_image_mean_rank: 667.4670	text_to_image_median_rank: 441.0000	text_to_image_R@1: 0.0027	text_to_image_R@5: 0.0110	text_to_image_R@10: 0.0227	clip_val_loss: 5.0287	epoch: 42.0000	num_samples: 3000.0000
2024-08-16,05:04:27 | INFO | Start epoch 42
2024-08-16,05:04:30 | INFO | Train Epoch: 42 [  256/27000 (1%)] Data (t): 1.476 Batch (t): 2.722, 94.0346/s, 94.0346/s/gpu LR: 0.000221 Logit Scale: 19.043 Contrastive_loss: 4.5297 (4.5297) Loss: 4.5297 (4.5297)
2024-08-16,05:06:35 | INFO | Train Epoch: 42 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.332/s, 204.332/s/gpu LR: 0.000226 Logit Scale: 19.079 Contrastive_loss: 2.9309 (3.7303) Loss: 2.9309 (3.7303)
2024-08-16,05:06:40 | INFO | Train Epoch: 42 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.256/s, 204.256/s/gpu LR: 0.000226 Logit Scale: 19.079 Contrastive_loss: 3.1745 (3.5451) Loss: 3.1745 (3.5451)
2024-08-16,05:06:42 | INFO | Eval Epoch: 43 [256 / 3000]	Clip Loss: 5.689608	
2024-08-16,05:06:47 | INFO | Eval Epoch: 43 image_to_text_mean_rank: 761.4207	image_to_text_median_rank: 525.0000	image_to_text_R@1: 0.0027	image_to_text_R@5: 0.0130	image_to_text_R@10: 0.0227	text_to_image_mean_rank: 737.0500	text_to_image_median_rank: 503.0000	text_to_image_R@1: 0.0033	text_to_image_R@5: 0.0167	text_to_image_R@10: 0.0263	clip_val_loss: 5.9352	epoch: 43.0000	num_samples: 3000.0000
2024-08-16,05:06:48 | INFO | Start epoch 43
2024-08-16,05:06:51 | INFO | Train Epoch: 43 [  256/27000 (1%)] Data (t): 1.538 Batch (t): 2.784, 91.9524/s, 91.9524/s/gpu LR: 0.000226 Logit Scale: 19.080 Contrastive_loss: 1.8070 (1.8070) Loss: 1.8070 (1.8070)
2024-08-16,05:08:56 | INFO | Train Epoch: 43 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.467/s, 204.467/s/gpu LR: 0.000231 Logit Scale: 19.650 Contrastive_loss: 1.5848 (1.6959) Loss: 1.5848 (1.6959)
2024-08-16,05:09:01 | INFO | Train Epoch: 43 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.317/s, 204.317/s/gpu LR: 0.000231 Logit Scale: 19.674 Contrastive_loss: 1.6317 (1.6745) Loss: 1.6317 (1.6745)
2024-08-16,05:09:03 | INFO | Eval Epoch: 44 [256 / 3000]	Clip Loss: 6.758216	
2024-08-16,05:09:07 | INFO | Eval Epoch: 44 image_to_text_mean_rank: 887.7003	image_to_text_median_rank: 669.0000	image_to_text_R@1: 0.0010	image_to_text_R@5: 0.0093	image_to_text_R@10: 0.0173	text_to_image_mean_rank: 871.1403	text_to_image_median_rank: 657.0000	text_to_image_R@1: 0.0043	text_to_image_R@5: 0.0137	text_to_image_R@10: 0.0223	clip_val_loss: 6.9089	epoch: 44.0000	num_samples: 3000.0000
2024-08-16,05:09:09 | INFO | Start epoch 44
2024-08-16,05:09:11 | INFO | Train Epoch: 44 [  256/27000 (1%)] Data (t): 1.472 Batch (t): 2.719, 94.1646/s, 94.1646/s/gpu LR: 0.000231 Logit Scale: 19.680 Contrastive_loss: 0.64078 (0.64078) Loss: 0.64078 (0.64078)
2024-08-16,05:11:17 | INFO | Train Epoch: 44 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.727/s, 204.727/s/gpu LR: 0.000236 Logit Scale: 20.250 Contrastive_loss: 0.27891 (0.45985) Loss: 0.27891 (0.45985)
2024-08-16,05:11:22 | INFO | Train Epoch: 44 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.648/s, 204.648/s/gpu LR: 0.000236 Logit Scale: 20.270 Contrastive_loss: 0.23861 (0.38610) Loss: 0.23861 (0.38610)
2024-08-16,05:11:23 | INFO | Eval Epoch: 45 [256 / 3000]	Clip Loss: 7.229187	
2024-08-16,05:11:28 | INFO | Eval Epoch: 45 image_to_text_mean_rank: 923.1860	image_to_text_median_rank: 682.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0097	image_to_text_R@10: 0.0190	text_to_image_mean_rank: 912.5870	text_to_image_median_rank: 674.0000	text_to_image_R@1: 0.0027	text_to_image_R@5: 0.0107	text_to_image_R@10: 0.0200	clip_val_loss: 7.3445	epoch: 45.0000	num_samples: 3000.0000
2024-08-16,05:11:29 | INFO | Start epoch 45
2024-08-16,05:11:32 | INFO | Train Epoch: 45 [  256/27000 (1%)] Data (t): 1.496 Batch (t): 2.740, 93.4189/s, 93.4189/s/gpu LR: 0.000236 Logit Scale: 20.275 Contrastive_loss: 0.11670 (0.11670) Loss: 0.11670 (0.11670)
2024-08-16,05:13:37 | INFO | Train Epoch: 45 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.578/s, 204.578/s/gpu LR: 0.000241 Logit Scale: 20.541 Contrastive_loss: 0.10041 (0.10856) Loss: 0.10041 (0.10856)
2024-08-16,05:13:42 | INFO | Train Epoch: 45 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.456/s, 204.456/s/gpu LR: 0.000242 Logit Scale: 20.549 Contrastive_loss: 0.039388 (0.085500) Loss: 0.039388 (0.085500)
2024-08-16,05:13:44 | INFO | Eval Epoch: 46 [256 / 3000]	Clip Loss: 7.590382	
2024-08-16,05:13:49 | INFO | Eval Epoch: 46 image_to_text_mean_rank: 955.1153	image_to_text_median_rank: 728.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0107	image_to_text_R@10: 0.0187	text_to_image_mean_rank: 954.1097	text_to_image_median_rank: 736.0000	text_to_image_R@1: 0.0013	text_to_image_R@5: 0.0130	text_to_image_R@10: 0.0230	clip_val_loss: 7.6622	epoch: 46.0000	num_samples: 3000.0000
2024-08-16,05:13:50 | INFO | Start epoch 46
2024-08-16,05:13:53 | INFO | Train Epoch: 46 [  256/27000 (1%)] Data (t): 1.480 Batch (t): 2.723, 94.0089/s, 94.0089/s/gpu LR: 0.000242 Logit Scale: 20.551 Contrastive_loss: 0.056549 (0.056549) Loss: 0.056549 (0.056549)
2024-08-16,05:15:58 | INFO | Train Epoch: 46 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.542/s, 204.542/s/gpu LR: 0.000247 Logit Scale: 20.695 Contrastive_loss: 0.031992 (0.044271) Loss: 0.031992 (0.044271)
2024-08-16,05:16:03 | INFO | Train Epoch: 46 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.581/s, 204.581/s/gpu LR: 0.000247 Logit Scale: 20.701 Contrastive_loss: 0.046248 (0.044930) Loss: 0.046248 (0.044930)
2024-08-16,05:16:05 | INFO | Eval Epoch: 47 [256 / 3000]	Clip Loss: 7.512671	
2024-08-16,05:16:10 | INFO | Eval Epoch: 47 image_to_text_mean_rank: 973.5813	image_to_text_median_rank: 747.0000	image_to_text_R@1: 0.0017	image_to_text_R@5: 0.0097	image_to_text_R@10: 0.0177	text_to_image_mean_rank: 973.0503	text_to_image_median_rank: 743.0000	text_to_image_R@1: 0.0010	text_to_image_R@5: 0.0100	text_to_image_R@10: 0.0210	clip_val_loss: 7.7592	epoch: 47.0000	num_samples: 3000.0000
2024-08-16,05:16:11 | INFO | Start epoch 47
2024-08-16,05:16:14 | INFO | Train Epoch: 47 [  256/27000 (1%)] Data (t): 1.489 Batch (t): 2.734, 93.6261/s, 93.6261/s/gpu LR: 0.000247 Logit Scale: 20.702 Contrastive_loss: 0.042419 (0.042419) Loss: 0.042419 (0.042419)
2024-08-16,05:18:19 | INFO | Train Epoch: 47 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.663/s, 204.663/s/gpu LR: 0.000252 Logit Scale: 20.817 Contrastive_loss: 0.037442 (0.039930) Loss: 0.037442 (0.039930)
2024-08-16,05:18:24 | INFO | Train Epoch: 47 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.630/s, 204.630/s/gpu LR: 0.000252 Logit Scale: 20.821 Contrastive_loss: 0.057514 (0.045792) Loss: 0.057514 (0.045792)
2024-08-16,05:18:25 | INFO | Eval Epoch: 48 [256 / 3000]	Clip Loss: 7.678425	
2024-08-16,05:18:30 | INFO | Eval Epoch: 48 image_to_text_mean_rank: 995.8417	image_to_text_median_rank: 768.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0120	image_to_text_R@10: 0.0183	text_to_image_mean_rank: 996.3487	text_to_image_median_rank: 770.0000	text_to_image_R@1: 0.0030	text_to_image_R@5: 0.0130	text_to_image_R@10: 0.0197	clip_val_loss: 7.8886	epoch: 48.0000	num_samples: 3000.0000
2024-08-16,05:18:32 | INFO | Start epoch 48
2024-08-16,05:18:34 | INFO | Train Epoch: 48 [  256/27000 (1%)] Data (t): 1.486 Batch (t): 2.731, 93.7329/s, 93.7329/s/gpu LR: 0.000252 Logit Scale: 20.822 Contrastive_loss: 0.037203 (0.037203) Loss: 0.037203 (0.037203)
2024-08-16,05:20:39 | INFO | Train Epoch: 48 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.250, 204.601/s, 204.601/s/gpu LR: 0.000257 Logit Scale: 20.930 Contrastive_loss: 0.030419 (0.033811) Loss: 0.030419 (0.033811)
2024-08-16,05:20:44 | INFO | Train Epoch: 48 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.401/s, 204.401/s/gpu LR: 0.000257 Logit Scale: 20.934 Contrastive_loss: 0.017202 (0.028274) Loss: 0.017202 (0.028274)
2024-08-16,05:20:46 | INFO | Eval Epoch: 49 [256 / 3000]	Clip Loss: 7.609719	
2024-08-16,05:20:51 | INFO | Eval Epoch: 49 image_to_text_mean_rank: 1004.3947	image_to_text_median_rank: 827.0000	image_to_text_R@1: 0.0020	image_to_text_R@5: 0.0107	image_to_text_R@10: 0.0193	text_to_image_mean_rank: 1001.3020	text_to_image_median_rank: 814.0000	text_to_image_R@1: 0.0020	text_to_image_R@5: 0.0107	text_to_image_R@10: 0.0200	clip_val_loss: 7.9246	epoch: 49.0000	num_samples: 3000.0000
2024-08-16,05:20:52 | INFO | Start epoch 49
2024-08-16,05:20:55 | INFO | Train Epoch: 49 [  256/27000 (1%)] Data (t): 1.470 Batch (t): 2.716, 94.2600/s, 94.2600/s/gpu LR: 0.000257 Logit Scale: 20.936 Contrastive_loss: 0.031021 (0.031021) Loss: 0.031021 (0.031021)