Commit 56879e9 by RickC1999 (verified) · 1 Parent(s): cf4d5a0
checkpoint/test00/out.log ADDED
@@ -0,0 +1,467 @@
+ 2024-08-16,03:25:52 | INFO | Running with a single process. Device cuda:0.
+ 2024-08-16,03:25:52 | INFO | Loaded Align-fMRI-Encoder-small model config.
+ 2024-08-16,03:25:54 | INFO | Model:
+ 2024-08-16,03:25:54 | INFO | CustomTextCLIP(
+   (visual): VisionTransformer(
+     (conv1): Conv1d(1, 768, kernel_size=(32,), stride=(32,), bias=False)
+     (patch_dropout): Identity()
+     (ln_pre): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+     (transformer): Transformer(
+       (resblocks): ModuleList(
+         (0-11): 12 x ResidualAttentionBlock(
+           (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+           (attn): MultiheadAttention(
+             (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
+           )
+           (ls_1): Identity()
+           (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+           (mlp): Sequential(
+             (c_fc): Linear(in_features=768, out_features=3072, bias=True)
+             (gelu): GELU(approximate='none')
+             (c_proj): Linear(in_features=3072, out_features=768, bias=True)
+           )
+           (ls_2): Identity()
+         )
+       )
+     )
+     (ln_post): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+   )
+   (text): HFTextEncoder(
+     (transformer): RobertaModel(
+       (embeddings): RobertaEmbeddings(
+         (word_embeddings): Embedding(50265, 768, padding_idx=1)
+         (position_embeddings): Embedding(514, 768, padding_idx=1)
+         (token_type_embeddings): Embedding(1, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): RobertaEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x RobertaLayer(
+             (attention): RobertaAttention(
+               (self): RobertaSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): RobertaSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): RobertaIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): RobertaOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+     (pooler): MeanPooler()
+     (proj): Sequential(
+       (0): Linear(in_features=768, out_features=640, bias=False)
+       (1): GELU(approximate='none')
+       (2): Linear(in_features=640, out_features=512, bias=False)
+     )
+   )
+ )
+ 2024-08-16,03:25:54 | INFO | Params:
+ 2024-08-16,03:25:54 | INFO | accum_freq: 1
+ 2024-08-16,03:25:54 | INFO | aug_cfg: {}
+ 2024-08-16,03:25:54 | INFO | batch_size: 256
+ 2024-08-16,03:25:54 | INFO | beta1: 0.9
+ 2024-08-16,03:25:54 | INFO | beta2: 0.999
+ 2024-08-16,03:25:54 | INFO | checkpoint_path: ./logs/2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp/checkpoints
+ 2024-08-16,03:25:54 | INFO | coca_caption_loss_weight: 2.0
+ 2024-08-16,03:25:54 | INFO | coca_contrastive_loss_weight: 1.0
+ 2024-08-16,03:25:54 | INFO | copy_codebase: False
+ 2024-08-16,03:25:54 | INFO | csv_caption_key: title
+ 2024-08-16,03:25:54 | INFO | csv_img_key: filepath
+ 2024-08-16,03:25:54 | INFO | csv_separator: ,
+ 2024-08-16,03:25:54 | INFO | dataset_resampled: False
+ 2024-08-16,03:25:54 | INFO | dataset_type: auto
+ 2024-08-16,03:25:54 | INFO | ddp_static_graph: False
+ 2024-08-16,03:25:54 | INFO | debug: False
+ 2024-08-16,03:25:54 | INFO | delete_previous_checkpoint: False
+ 2024-08-16,03:25:54 | INFO | device: cuda:0
+ 2024-08-16,03:25:54 | INFO | dist_backend: nccl
+ 2024-08-16,03:25:54 | INFO | dist_url: env://
+ 2024-08-16,03:25:54 | INFO | distill: False
+ 2024-08-16,03:25:54 | INFO | distill_model: None
+ 2024-08-16,03:25:54 | INFO | distill_pretrained: None
+ 2024-08-16,03:25:54 | INFO | distributed: False
+ 2024-08-16,03:25:54 | INFO | epochs: 100
+ 2024-08-16,03:25:54 | INFO | epochs_cooldown: None
+ 2024-08-16,03:25:54 | INFO | eps: 1e-08
+ 2024-08-16,03:25:54 | INFO | force_custom_text: False
+ 2024-08-16,03:25:54 | INFO | force_image_size: None
+ 2024-08-16,03:25:54 | INFO | force_patch_dropout: None
+ 2024-08-16,03:25:54 | INFO | force_quick_gelu: False
+ 2024-08-16,03:25:54 | INFO | gather_with_grad: False
+ 2024-08-16,03:25:54 | INFO | grad_checkpointing: False
+ 2024-08-16,03:25:54 | INFO | grad_clip_norm: None
+ 2024-08-16,03:25:54 | INFO | horovod: False
+ 2024-08-16,03:25:54 | INFO | image_interpolation: None
+ 2024-08-16,03:25:54 | INFO | image_mean: None
+ 2024-08-16,03:25:54 | INFO | image_resize_mode: None
+ 2024-08-16,03:25:54 | INFO | image_std: None
+ 2024-08-16,03:25:54 | INFO | imagenet_v2: None
+ 2024-08-16,03:25:54 | INFO | imagenet_val: None
+ 2024-08-16,03:25:54 | INFO | local_loss: False
+ 2024-08-16,03:25:54 | INFO | local_rank: 0
+ 2024-08-16,03:25:54 | INFO | lock_image: False
+ 2024-08-16,03:25:54 | INFO | lock_image_freeze_bn_stats: False
+ 2024-08-16,03:25:54 | INFO | lock_image_unlocked_groups: 0
+ 2024-08-16,03:25:54 | INFO | lock_text: True
+ 2024-08-16,03:25:54 | INFO | lock_text_freeze_layer_norm: False
+ 2024-08-16,03:25:54 | INFO | lock_text_unlocked_layers: 0
+ 2024-08-16,03:25:54 | INFO | log_every_n_steps: 100
+ 2024-08-16,03:25:54 | INFO | log_level: 20
+ 2024-08-16,03:25:54 | INFO | log_local: False
+ 2024-08-16,03:25:54 | INFO | log_path: ./logs/2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp/out.log
+ 2024-08-16,03:25:54 | INFO | logs: ./logs/
+ 2024-08-16,03:25:54 | INFO | lr: 0.0005
+ 2024-08-16,03:25:54 | INFO | lr_cooldown_end: 0.0
+ 2024-08-16,03:25:54 | INFO | lr_cooldown_power: 1.0
+ 2024-08-16,03:25:54 | INFO | lr_scheduler: cosine
+ 2024-08-16,03:25:54 | INFO | model: Align-fMRI-Encoder-small
+ 2024-08-16,03:25:54 | INFO | name: 2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp
+ 2024-08-16,03:25:54 | INFO | no_set_device_rank: False
+ 2024-08-16,03:25:54 | INFO | precision: amp
+ 2024-08-16,03:25:54 | INFO | pretrained:
+ 2024-08-16,03:25:54 | INFO | pretrained_image: False
+ 2024-08-16,03:25:54 | INFO | rank: 0
+ 2024-08-16,03:25:54 | INFO | remote_sync: None
+ 2024-08-16,03:25:54 | INFO | remote_sync_frequency: 300
+ 2024-08-16,03:25:54 | INFO | remote_sync_protocol: s3
+ 2024-08-16,03:25:54 | INFO | report_to:
+ 2024-08-16,03:25:54 | INFO | resume: None
+ 2024-08-16,03:25:54 | INFO | save_frequency: 1
+ 2024-08-16,03:25:54 | INFO | save_most_recent: False
+ 2024-08-16,03:25:54 | INFO | seed: 0
+ 2024-08-16,03:25:54 | INFO | siglip: False
+ 2024-08-16,03:25:54 | INFO | skip_scheduler: False
+ 2024-08-16,03:25:54 | INFO | tensorboard: False
+ 2024-08-16,03:25:54 | INFO | tensorboard_path:
+ 2024-08-16,03:25:54 | INFO | torchcompile: False
+ 2024-08-16,03:25:54 | INFO | torchscript: False
+ 2024-08-16,03:25:54 | INFO | trace: False
+ 2024-08-16,03:25:54 | INFO | train_data: /root/autodl-tmp/.autodl/Projects/fMRI2TextAligner/notebooks/train.csv
+ 2024-08-16,03:25:54 | INFO | train_data_upsampling_factors: None
+ 2024-08-16,03:25:54 | INFO | train_num_samples: None
+ 2024-08-16,03:25:54 | INFO | use_bn_sync: False
+ 2024-08-16,03:25:54 | INFO | use_bnb_linear: None
+ 2024-08-16,03:25:54 | INFO | val_data: /root/autodl-tmp/.autodl/Projects/fMRI2TextAligner/notebooks/val.csv
+ 2024-08-16,03:25:54 | INFO | val_frequency: 1
+ 2024-08-16,03:25:54 | INFO | val_num_samples: None
+ 2024-08-16,03:25:54 | INFO | wandb: False
+ 2024-08-16,03:25:54 | INFO | wandb_notes:
+ 2024-08-16,03:25:54 | INFO | wandb_project_name: open-clip
+ 2024-08-16,03:25:54 | INFO | warmup: 10000
+ 2024-08-16,03:25:54 | INFO | wd: 0.2
+ 2024-08-16,03:25:54 | INFO | workers: 4
+ 2024-08-16,03:25:54 | INFO | world_size: 1
+ 2024-08-16,03:25:54 | INFO | zeroshot_frequency: 2
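Reviewer note on the `lr`, `warmup`, and `batch_size` values above: the `LR:` column in the training lines of this log is consistent with a plain linear warmup. The sketch below is an assumption about the schedule, not code taken from the training script; `warmup_lr` is a hypothetical helper name, and the step count per epoch assumes the last partial batch is dropped (27000 // 256).

```python
def warmup_lr(step, base_lr=0.0005, warmup=10000):
    """Linear warmup (assumed form): ramp toward base_lr over `warmup` steps."""
    return base_lr * (step + 1) / warmup

# Assumption: incomplete final batch is dropped, giving 105 steps per epoch.
steps_per_epoch = 27000 // 256

# (epoch, zero-based batch index) pairs that correspond to logged lines:
# batch 0 is "[256/27000]", batch 100 is "[25856/27000]".
for epoch, batch in [(0, 0), (0, 100), (1, 100), (20, 100)]:
    step = epoch * steps_per_epoch + batch
    print(f"epoch {epoch:2d} batch {batch:3d} LR: {warmup_lr(step):.6f}")
```

Under these assumptions the printed values match the logged 0.000000, 0.000005, 0.000010, and 0.000110. With 105 steps per epoch, the 10000 warmup steps would span roughly 95 of the 100 configured epochs.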
+ 2024-08-16,03:25:58 | INFO | Start epoch 0
+ 2024-08-16,03:26:01 | INFO | Train Epoch: 0 [ 256/27000 (1%)] Data (t): 1.639 Batch (t): 3.538, 72.3607/s, 72.3607/s/gpu LR: 0.000000 Logit Scale: 14.286 Contrastive_loss: 5.5485 (5.5485) Loss: 5.5485 (5.5485)
+ 2024-08-16,03:28:06 | INFO | Train Epoch: 0 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.520/s, 204.520/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5459 (5.5472) Loss: 5.5459 (5.5472)
+ 2024-08-16,03:28:11 | INFO | Train Epoch: 0 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.359/s, 204.359/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5479 (5.5474) Loss: 5.5479 (5.5474)
+ 2024-08-16,03:28:13 | INFO | Eval Epoch: 1 [256 / 3000] Clip Loss: 5.542412
+ 2024-08-16,03:28:18 | INFO | Eval Epoch: 1 image_to_text_mean_rank: 1451.4443 image_to_text_median_rank: 1410.0000 image_to_text_R@1: 0.0003 image_to_text_R@5: 0.0023 image_to_text_R@10: 0.0060 text_to_image_mean_rank: 1439.4327 text_to_image_median_rank: 1409.0000 text_to_image_R@1: 0.0007 text_to_image_R@5: 0.0020 text_to_image_R@10: 0.0043 clip_val_loss: 5.5223 epoch: 1.0000 num_samples: 3000.0000
+ 2024-08-16,03:28:19 | INFO | Start epoch 1
+ 2024-08-16,03:28:22 | INFO | Train Epoch: 1 [ 256/27000 (1%)] Data (t): 1.447 Batch (t): 2.695, 95.0061/s, 95.0061/s/gpu LR: 0.000005 Logit Scale: 14.285 Contrastive_loss: 5.5397 (5.5397) Loss: 5.5397 (5.5397)
+ 2024-08-16,03:30:27 | INFO | Train Epoch: 1 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.426/s, 204.426/s/gpu LR: 0.000010 Logit Scale: 14.290 Contrastive_loss: 5.4991 (5.5194) Loss: 5.4991 (5.5194)
+ 2024-08-16,03:30:32 | INFO | Train Epoch: 1 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.419/s, 204.419/s/gpu LR: 0.000010 Logit Scale: 14.290 Contrastive_loss: 5.4470 (5.4953) Loss: 5.4470 (5.4953)
+ 2024-08-16,03:30:34 | INFO | Eval Epoch: 2 [256 / 3000] Clip Loss: 5.452873
+ 2024-08-16,03:30:38 | INFO | Eval Epoch: 2 image_to_text_mean_rank: 1193.8437 image_to_text_median_rank: 1062.0000 image_to_text_R@1: 0.0013 image_to_text_R@5: 0.0037 image_to_text_R@10: 0.0067 text_to_image_mean_rank: 1196.7497 text_to_image_median_rank: 1078.0000 text_to_image_R@1: 0.0013 text_to_image_R@5: 0.0060 text_to_image_R@10: 0.0090 clip_val_loss: 5.4537 epoch: 2.0000 num_samples: 3000.0000
+ 2024-08-16,03:30:40 | INFO | Start epoch 2
+ 2024-08-16,03:30:42 | INFO | Train Epoch: 2 [ 256/27000 (1%)] Data (t): 1.420 Batch (t): 2.666, 96.0263/s, 96.0263/s/gpu LR: 0.000011 Logit Scale: 14.290 Contrastive_loss: 5.4566 (5.4566) Loss: 5.4566 (5.4566)
+ 2024-08-16,03:32:48 | INFO | Train Epoch: 2 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.495/s, 204.495/s/gpu LR: 0.000016 Logit Scale: 14.324 Contrastive_loss: 5.0180 (5.2373) Loss: 5.0180 (5.2373)
+ 2024-08-16,03:32:53 | INFO | Train Epoch: 2 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.358/s, 204.358/s/gpu LR: 0.000016 Logit Scale: 14.325 Contrastive_loss: 5.1190 (5.1979) Loss: 5.1190 (5.1979)
+ 2024-08-16,03:32:54 | INFO | Eval Epoch: 3 [256 / 3000] Clip Loss: 5.063042
+ 2024-08-16,03:32:59 | INFO | Eval Epoch: 3 image_to_text_mean_rank: 782.7933 image_to_text_median_rank: 592.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0083 image_to_text_R@10: 0.0157 text_to_image_mean_rank: 727.7393 text_to_image_median_rank: 536.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0117 text_to_image_R@10: 0.0230 clip_val_loss: 5.0752 epoch: 3.0000 num_samples: 3000.0000
+ 2024-08-16,03:33:00 | INFO | Start epoch 3
+ 2024-08-16,03:33:03 | INFO | Train Epoch: 3 [ 256/27000 (1%)] Data (t): 1.597 Batch (t): 2.842, 90.0810/s, 90.0810/s/gpu LR: 0.000016 Logit Scale: 14.326 Contrastive_loss: 5.1011 (5.1011) Loss: 5.1011 (5.1011)
+ 2024-08-16,03:35:09 | INFO | Train Epoch: 3 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.253, 204.414/s, 204.414/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.9292 (5.0151) Loss: 4.9292 (5.0151)
+ 2024-08-16,03:35:14 | INFO | Train Epoch: 3 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.467/s, 204.467/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.8818 (4.9707) Loss: 4.8818 (4.9707)
+ 2024-08-16,03:35:15 | INFO | Eval Epoch: 4 [256 / 3000] Clip Loss: 4.868340
+ 2024-08-16,03:35:20 | INFO | Eval Epoch: 4 image_to_text_mean_rank: 683.8327 image_to_text_median_rank: 457.0000 image_to_text_R@1: 0.0043 image_to_text_R@5: 0.0123 image_to_text_R@10: 0.0230 text_to_image_mean_rank: 612.9693 text_to_image_median_rank: 408.0000 text_to_image_R@1: 0.0033 text_to_image_R@5: 0.0143 text_to_image_R@10: 0.0283 clip_val_loss: 4.9190 epoch: 4.0000 num_samples: 3000.0000
+ 2024-08-16,03:35:21 | INFO | Start epoch 4
+ 2024-08-16,03:35:24 | INFO | Train Epoch: 4 [ 256/27000 (1%)] Data (t): 1.489 Batch (t): 2.735, 93.5960/s, 93.5960/s/gpu LR: 0.000021 Logit Scale: 14.339 Contrastive_loss: 4.6774 (4.6774) Loss: 4.6774 (4.6774)
+ 2024-08-16,03:37:29 | INFO | Train Epoch: 4 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.427/s, 204.427/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.6595 (4.6684) Loss: 4.6595 (4.6684)
+ 2024-08-16,03:37:34 | INFO | Train Epoch: 4 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.444/s, 204.444/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.7835 (4.7068) Loss: 4.7835 (4.7068)
+ 2024-08-16,03:37:36 | INFO | Eval Epoch: 5 [256 / 3000] Clip Loss: 4.786659
+ 2024-08-16,03:37:41 | INFO | Eval Epoch: 5 image_to_text_mean_rank: 620.0710 image_to_text_median_rank: 405.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0150 image_to_text_R@10: 0.0250 text_to_image_mean_rank: 564.3297 text_to_image_median_rank: 358.0000 text_to_image_R@1: 0.0047 text_to_image_R@5: 0.0173 text_to_image_R@10: 0.0367 clip_val_loss: 4.8233 epoch: 5.0000 num_samples: 3000.0000
+ 2024-08-16,03:37:42 | INFO | Start epoch 5
+ 2024-08-16,03:37:45 | INFO | Train Epoch: 5 [ 256/27000 (1%)] Data (t): 1.495 Batch (t): 2.740, 93.4321/s, 93.4321/s/gpu LR: 0.000026 Logit Scale: 14.352 Contrastive_loss: 4.5845 (4.5845) Loss: 4.5845 (4.5845)
+ 2024-08-16,03:39:50 | INFO | Train Epoch: 5 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.440/s, 204.440/s/gpu LR: 0.000031 Logit Scale: 14.392 Contrastive_loss: 4.5224 (4.5534) Loss: 4.5224 (4.5534)
+ 2024-08-16,03:39:55 | INFO | Train Epoch: 5 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.301/s, 204.301/s/gpu LR: 0.000031 Logit Scale: 14.393 Contrastive_loss: 4.4956 (4.5342) Loss: 4.4956 (4.5342)
+ 2024-08-16,03:39:56 | INFO | Eval Epoch: 6 [256 / 3000] Clip Loss: 4.666817
+ 2024-08-16,03:40:01 | INFO | Eval Epoch: 6 image_to_text_mean_rank: 560.2250 image_to_text_median_rank: 363.0000 image_to_text_R@1: 0.0037 image_to_text_R@5: 0.0190 image_to_text_R@10: 0.0360 text_to_image_mean_rank: 524.7307 text_to_image_median_rank: 326.0000 text_to_image_R@1: 0.0037 text_to_image_R@5: 0.0223 text_to_image_R@10: 0.0403 clip_val_loss: 4.7456 epoch: 6.0000 num_samples: 3000.0000
+ 2024-08-16,03:40:03 | INFO | Start epoch 6
+ 2024-08-16,03:40:05 | INFO | Train Epoch: 6 [ 256/27000 (1%)] Data (t): 1.443 Batch (t): 2.691, 95.1276/s, 95.1276/s/gpu LR: 0.000032 Logit Scale: 14.394 Contrastive_loss: 4.1144 (4.1144) Loss: 4.1144 (4.1144)
+ 2024-08-16,03:42:10 | INFO | Train Epoch: 6 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.240/s, 204.240/s/gpu LR: 0.000037 Logit Scale: 14.484 Contrastive_loss: 4.3783 (4.2463) Loss: 4.3783 (4.2463)
+ 2024-08-16,03:42:15 | INFO | Train Epoch: 6 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.514/s, 204.514/s/gpu LR: 0.000037 Logit Scale: 14.486 Contrastive_loss: 4.4021 (4.2983) Loss: 4.4021 (4.2983)
+ 2024-08-16,03:42:17 | INFO | Eval Epoch: 7 [256 / 3000] Clip Loss: 4.655993
+ 2024-08-16,03:42:22 | INFO | Eval Epoch: 7 image_to_text_mean_rank: 563.7200 image_to_text_median_rank: 352.0000 image_to_text_R@1: 0.0037 image_to_text_R@5: 0.0177 image_to_text_R@10: 0.0317 text_to_image_mean_rank: 515.4990 text_to_image_median_rank: 306.0000 text_to_image_R@1: 0.0067 text_to_image_R@5: 0.0250 text_to_image_R@10: 0.0453 clip_val_loss: 4.7377 epoch: 7.0000 num_samples: 3000.0000
+ 2024-08-16,03:42:23 | INFO | Start epoch 7
+ 2024-08-16,03:42:26 | INFO | Train Epoch: 7 [ 256/27000 (1%)] Data (t): 1.452 Batch (t): 2.698, 94.8848/s, 94.8848/s/gpu LR: 0.000037 Logit Scale: 14.487 Contrastive_loss: 3.9120 (3.9120) Loss: 3.9120 (3.9120)
+ 2024-08-16,03:44:31 | INFO | Train Epoch: 7 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.415/s, 204.415/s/gpu LR: 0.000042 Logit Scale: 14.608 Contrastive_loss: 3.9964 (3.9542) Loss: 3.9964 (3.9542)
+ 2024-08-16,03:44:36 | INFO | Train Epoch: 7 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.414/s, 204.414/s/gpu LR: 0.000042 Logit Scale: 14.612 Contrastive_loss: 3.9585 (3.9556) Loss: 3.9585 (3.9556)
+ 2024-08-16,03:44:38 | INFO | Eval Epoch: 8 [256 / 3000] Clip Loss: 4.634455
+ 2024-08-16,03:44:42 | INFO | Eval Epoch: 8 image_to_text_mean_rank: 551.6537 image_to_text_median_rank: 340.0000 image_to_text_R@1: 0.0050 image_to_text_R@5: 0.0223 image_to_text_R@10: 0.0390 text_to_image_mean_rank: 516.1967 text_to_image_median_rank: 309.0000 text_to_image_R@1: 0.0057 text_to_image_R@5: 0.0260 text_to_image_R@10: 0.0487 clip_val_loss: 4.7750 epoch: 8.0000 num_samples: 3000.0000
+ 2024-08-16,03:44:44 | INFO | Start epoch 8
+ 2024-08-16,03:44:47 | INFO | Train Epoch: 8 [ 256/27000 (1%)] Data (t): 1.500 Batch (t): 2.746, 93.2370/s, 93.2370/s/gpu LR: 0.000042 Logit Scale: 14.613 Contrastive_loss: 3.6235 (3.6235) Loss: 3.6235 (3.6235)
+ 2024-08-16,03:46:52 | INFO | Train Epoch: 8 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.478/s, 204.478/s/gpu LR: 0.000047 Logit Scale: 14.756 Contrastive_loss: 3.9351 (3.7793) Loss: 3.9351 (3.7793)
+ 2024-08-16,03:46:57 | INFO | Train Epoch: 8 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.524/s, 204.524/s/gpu LR: 0.000047 Logit Scale: 14.760 Contrastive_loss: 3.8363 (3.7983) Loss: 3.8363 (3.7983)
+ 2024-08-16,03:46:58 | INFO | Eval Epoch: 9 [256 / 3000] Clip Loss: 4.764020
+ 2024-08-16,03:47:03 | INFO | Eval Epoch: 9 image_to_text_mean_rank: 587.4340 image_to_text_median_rank: 353.0000 image_to_text_R@1: 0.0040 image_to_text_R@5: 0.0187 image_to_text_R@10: 0.0360 text_to_image_mean_rank: 546.2733 text_to_image_median_rank: 318.0000 text_to_image_R@1: 0.0053 text_to_image_R@5: 0.0230 text_to_image_R@10: 0.0460 clip_val_loss: 4.8689 epoch: 9.0000 num_samples: 3000.0000
+ 2024-08-16,03:47:04 | INFO | Start epoch 9
+ 2024-08-16,03:47:07 | INFO | Train Epoch: 9 [ 256/27000 (1%)] Data (t): 1.432 Batch (t): 2.678, 95.6108/s, 95.6108/s/gpu LR: 0.000047 Logit Scale: 14.761 Contrastive_loss: 3.1292 (3.1292) Loss: 3.1292 (3.1292)
+ 2024-08-16,03:49:12 | INFO | Train Epoch: 9 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.417/s, 204.417/s/gpu LR: 0.000052 Logit Scale: 14.924 Contrastive_loss: 3.3670 (3.2481) Loss: 3.3670 (3.2481)
+ 2024-08-16,03:49:17 | INFO | Train Epoch: 9 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.451/s, 204.451/s/gpu LR: 0.000053 Logit Scale: 14.929 Contrastive_loss: 3.4293 (3.3085) Loss: 3.4293 (3.3085)
+ 2024-08-16,03:49:19 | INFO | Eval Epoch: 10 [256 / 3000] Clip Loss: 4.830852
+ 2024-08-16,03:49:24 | INFO | Eval Epoch: 10 image_to_text_mean_rank: 604.2760 image_to_text_median_rank: 366.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0187 image_to_text_R@10: 0.0343 text_to_image_mean_rank: 571.8533 text_to_image_median_rank: 339.0000 text_to_image_R@1: 0.0050 text_to_image_R@5: 0.0233 text_to_image_R@10: 0.0443 clip_val_loss: 4.9887 epoch: 10.0000 num_samples: 3000.0000
+ 2024-08-16,03:49:25 | INFO | Start epoch 10
+ 2024-08-16,03:49:28 | INFO | Train Epoch: 10 [ 256/27000 (1%)] Data (t): 1.458 Batch (t): 2.703, 94.7107/s, 94.7107/s/gpu LR: 0.000053 Logit Scale: 14.930 Contrastive_loss: 2.6423 (2.6423) Loss: 2.6423 (2.6423)
+ 2024-08-16,03:51:33 | INFO | Train Epoch: 10 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.377/s, 204.377/s/gpu LR: 0.000058 Logit Scale: 15.109 Contrastive_loss: 2.7973 (2.7198) Loss: 2.7973 (2.7198)
+ 2024-08-16,03:51:38 | INFO | Train Epoch: 10 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.584/s, 204.584/s/gpu LR: 0.000058 Logit Scale: 15.115 Contrastive_loss: 3.1476 (2.8624) Loss: 3.1476 (2.8624)
+ 2024-08-16,03:51:40 | INFO | Eval Epoch: 11 [256 / 3000] Clip Loss: 4.916827
+ 2024-08-16,03:51:44 | INFO | Eval Epoch: 11 image_to_text_mean_rank: 645.3330 image_to_text_median_rank: 392.0000 image_to_text_R@1: 0.0053 image_to_text_R@5: 0.0183 image_to_text_R@10: 0.0343 text_to_image_mean_rank: 606.6477 text_to_image_median_rank: 378.0000 text_to_image_R@1: 0.0057 text_to_image_R@5: 0.0220 text_to_image_R@10: 0.0393 clip_val_loss: 5.1492 epoch: 11.0000 num_samples: 3000.0000
+ 2024-08-16,03:51:46 | INFO | Start epoch 11
+ 2024-08-16,03:51:48 | INFO | Train Epoch: 11 [ 256/27000 (1%)] Data (t): 1.468 Batch (t): 2.714, 94.3351/s, 94.3351/s/gpu LR: 0.000058 Logit Scale: 15.116 Contrastive_loss: 2.1677 (2.1677) Loss: 2.1677 (2.1677)
+ 2024-08-16,03:53:54 | INFO | Train Epoch: 11 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.580/s, 204.580/s/gpu LR: 0.000063 Logit Scale: 15.304 Contrastive_loss: 2.1280 (2.1479) Loss: 2.1280 (2.1479)
+ 2024-08-16,03:53:59 | INFO | Train Epoch: 11 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.443/s, 204.443/s/gpu LR: 0.000063 Logit Scale: 15.311 Contrastive_loss: 2.1967 (2.1642) Loss: 2.1967 (2.1642)
+ 2024-08-16,03:54:00 | INFO | Eval Epoch: 12 [256 / 3000] Clip Loss: 5.227501
+ 2024-08-16,03:54:05 | INFO | Eval Epoch: 12 image_to_text_mean_rank: 701.0917 image_to_text_median_rank: 446.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0133 image_to_text_R@10: 0.0277 text_to_image_mean_rank: 672.1147 text_to_image_median_rank: 418.0000 text_to_image_R@1: 0.0060 text_to_image_R@5: 0.0197 text_to_image_R@10: 0.0397 clip_val_loss: 5.4125 epoch: 12.0000 num_samples: 3000.0000
+ 2024-08-16,03:54:06 | INFO | Start epoch 12
+ 2024-08-16,03:54:09 | INFO | Train Epoch: 12 [ 256/27000 (1%)] Data (t): 1.458 Batch (t): 2.704, 94.6843/s, 94.6843/s/gpu LR: 0.000063 Logit Scale: 15.313 Contrastive_loss: 1.4394 (1.4394) Loss: 1.4394 (1.4394)
+ 2024-08-16,03:56:14 | INFO | Train Epoch: 12 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.444/s, 204.444/s/gpu LR: 0.000068 Logit Scale: 15.499 Contrastive_loss: 1.4908 (1.4651) Loss: 1.4908 (1.4651)
+ 2024-08-16,03:56:19 | INFO | Train Epoch: 12 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.382/s, 204.382/s/gpu LR: 0.000068 Logit Scale: 15.506 Contrastive_loss: 1.5923 (1.5075) Loss: 1.5923 (1.5075)
+ 2024-08-16,03:56:21 | INFO | Eval Epoch: 13 [256 / 3000] Clip Loss: 5.440553
+ 2024-08-16,03:56:26 | INFO | Eval Epoch: 13 image_to_text_mean_rank: 746.4853 image_to_text_median_rank: 468.0000 image_to_text_R@1: 0.0023 image_to_text_R@5: 0.0130 image_to_text_R@10: 0.0290 text_to_image_mean_rank: 718.4413 text_to_image_median_rank: 455.0000 text_to_image_R@1: 0.0043 text_to_image_R@5: 0.0197 text_to_image_R@10: 0.0323 clip_val_loss: 5.6023 epoch: 13.0000 num_samples: 3000.0000
+ 2024-08-16,03:56:27 | INFO | Start epoch 13
+ 2024-08-16,03:56:30 | INFO | Train Epoch: 13 [ 256/27000 (1%)] Data (t): 1.452 Batch (t): 2.698, 94.8883/s, 94.8883/s/gpu LR: 0.000068 Logit Scale: 15.507 Contrastive_loss: 0.92882 (0.92882) Loss: 0.92882 (0.92882)
+ 2024-08-16,03:58:35 | INFO | Train Epoch: 13 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.386/s, 204.386/s/gpu LR: 0.000073 Logit Scale: 15.674 Contrastive_loss: 0.86548 (0.89715) Loss: 0.86548 (0.89715)
+ 2024-08-16,03:58:40 | INFO | Train Epoch: 13 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.431/s, 204.431/s/gpu LR: 0.000073 Logit Scale: 15.681 Contrastive_loss: 0.90084 (0.89838) Loss: 0.90084 (0.89838)
+ 2024-08-16,03:58:42 | INFO | Eval Epoch: 14 [256 / 3000] Clip Loss: 5.552146
+ 2024-08-16,03:58:46 | INFO | Eval Epoch: 14 image_to_text_mean_rank: 811.2757 image_to_text_median_rank: 553.0000 image_to_text_R@1: 0.0027 image_to_text_R@5: 0.0127 image_to_text_R@10: 0.0207 text_to_image_mean_rank: 788.4850 text_to_image_median_rank: 516.0000 text_to_image_R@1: 0.0047 text_to_image_R@5: 0.0197 text_to_image_R@10: 0.0333 clip_val_loss: 5.8483 epoch: 14.0000 num_samples: 3000.0000
+ 2024-08-16,03:58:48 | INFO | Start epoch 14
+ 2024-08-16,03:58:50 | INFO | Train Epoch: 14 [ 256/27000 (1%)] Data (t): 1.521 Batch (t): 2.767, 92.5317/s, 92.5317/s/gpu LR: 0.000074 Logit Scale: 15.682 Contrastive_loss: 0.52879 (0.52879) Loss: 0.52879 (0.52879)
+ 2024-08-16,04:00:56 | INFO | Train Epoch: 14 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.407/s, 204.407/s/gpu LR: 0.000079 Logit Scale: 15.819 Contrastive_loss: 0.53437 (0.53158) Loss: 0.53437 (0.53158)
+ 2024-08-16,04:01:01 | INFO | Train Epoch: 14 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.393/s, 204.393/s/gpu LR: 0.000079 Logit Scale: 15.825 Contrastive_loss: 0.52500 (0.52939) Loss: 0.52500 (0.52939)
+ 2024-08-16,04:01:02 | INFO | Eval Epoch: 15 [256 / 3000] Clip Loss: 5.805412
+ 2024-08-16,04:01:07 | INFO | Eval Epoch: 15 image_to_text_mean_rank: 865.4823 image_to_text_median_rank: 598.0000 image_to_text_R@1: 0.0050 image_to_text_R@5: 0.0143 image_to_text_R@10: 0.0270 text_to_image_mean_rank: 846.2273 text_to_image_median_rank: 575.0000 text_to_image_R@1: 0.0033 text_to_image_R@5: 0.0163 text_to_image_R@10: 0.0297 clip_val_loss: 6.0065 epoch: 15.0000 num_samples: 3000.0000
+ 2024-08-16,04:01:08 | INFO | Start epoch 15
+ 2024-08-16,04:01:11 | INFO | Train Epoch: 15 [ 256/27000 (1%)] Data (t): 1.513 Batch (t): 2.760, 92.7525/s, 92.7525/s/gpu LR: 0.000079 Logit Scale: 15.826 Contrastive_loss: 0.34025 (0.34025) Loss: 0.34025 (0.34025)
+ 2024-08-16,04:03:16 | INFO | Train Epoch: 15 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.496/s, 204.496/s/gpu LR: 0.000084 Logit Scale: 15.934 Contrastive_loss: 0.29246 (0.31636) Loss: 0.29246 (0.31636)
+ 2024-08-16,04:03:21 | INFO | Train Epoch: 15 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.414/s, 204.414/s/gpu LR: 0.000084 Logit Scale: 15.938 Contrastive_loss: 0.26719 (0.29997) Loss: 0.26719 (0.29997)
+ 2024-08-16,04:03:23 | INFO | Eval Epoch: 16 [256 / 3000] Clip Loss: 6.014488
+ 2024-08-16,04:03:28 | INFO | Eval Epoch: 16 image_to_text_mean_rank: 887.4880 image_to_text_median_rank: 627.0000 image_to_text_R@1: 0.0030 image_to_text_R@5: 0.0140 image_to_text_R@10: 0.0260 text_to_image_mean_rank: 880.3730 text_to_image_median_rank: 619.0000 text_to_image_R@1: 0.0037 text_to_image_R@5: 0.0183 text_to_image_R@10: 0.0320 clip_val_loss: 6.1342 epoch: 16.0000 num_samples: 3000.0000
+ 2024-08-16,04:03:29 | INFO | Start epoch 16
+ 2024-08-16,04:03:32 | INFO | Train Epoch: 16 [ 256/27000 (1%)] Data (t): 1.600 Batch (t): 2.846, 89.9529/s, 89.9529/s/gpu LR: 0.000084 Logit Scale: 15.939 Contrastive_loss: 0.20845 (0.20845) Loss: 0.20845 (0.20845)
+ 2024-08-16,04:05:37 | INFO | Train Epoch: 16 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.440/s, 204.440/s/gpu LR: 0.000089 Logit Scale: 16.026 Contrastive_loss: 0.21264 (0.21055) Loss: 0.21264 (0.21055)
+ 2024-08-16,04:05:42 | INFO | Train Epoch: 16 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.576/s, 204.576/s/gpu LR: 0.000089 Logit Scale: 16.030 Contrastive_loss: 0.18559 (0.20223) Loss: 0.18559 (0.20223)
+ 2024-08-16,04:05:44 | INFO | Eval Epoch: 17 [256 / 3000] Clip Loss: 6.084400
+ 2024-08-16,04:05:49 | INFO | Eval Epoch: 17 image_to_text_mean_rank: 935.7397 image_to_text_median_rank: 700.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0137 image_to_text_R@10: 0.0237 text_to_image_mean_rank: 929.2380 text_to_image_median_rank: 691.0000 text_to_image_R@1: 0.0040 text_to_image_R@5: 0.0160 text_to_image_R@10: 0.0290 clip_val_loss: 6.2851 epoch: 17.0000 num_samples: 3000.0000
+ 2024-08-16,04:05:50 | INFO | Start epoch 17
+ 2024-08-16,04:05:53 | INFO | Train Epoch: 17 [ 256/27000 (1%)] Data (t): 1.469 Batch (t): 2.715, 94.3037/s, 94.3037/s/gpu LR: 0.000089 Logit Scale: 16.031 Contrastive_loss: 0.13997 (0.13997) Loss: 0.13997 (0.13997)
+ 2024-08-16,04:07:58 | INFO | Train Epoch: 17 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.280/s, 204.280/s/gpu LR: 0.000094 Logit Scale: 16.107 Contrastive_loss: 0.14680 (0.14339) Loss: 0.14680 (0.14339)
+ 2024-08-16,04:08:03 | INFO | Train Epoch: 17 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.427/s, 204.427/s/gpu LR: 0.000095 Logit Scale: 16.110 Contrastive_loss: 0.14784 (0.14487) Loss: 0.14784 (0.14487)
+ 2024-08-16,04:08:05 | INFO | Eval Epoch: 18 [256 / 3000] Clip Loss: 6.179710
+ 2024-08-16,04:08:09 | INFO | Eval Epoch: 18 image_to_text_mean_rank: 959.2040 image_to_text_median_rank: 740.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0143 image_to_text_R@10: 0.0240 text_to_image_mean_rank: 952.1377 text_to_image_median_rank: 737.0000 text_to_image_R@1: 0.0017 text_to_image_R@5: 0.0143 text_to_image_R@10: 0.0293 clip_val_loss: 6.3885 epoch: 18.0000 num_samples: 3000.0000
+ 2024-08-16,04:08:11 | INFO | Start epoch 18
+ 2024-08-16,04:08:13 | INFO | Train Epoch: 18 [ 256/27000 (1%)] Data (t): 1.515 Batch (t): 2.760, 92.7391/s, 92.7391/s/gpu LR: 0.000095 Logit Scale: 16.111 Contrastive_loss: 0.10571 (0.10571) Loss: 0.10571 (0.10571)
+ 2024-08-16,04:10:19 | INFO | Train Epoch: 18 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.597/s, 204.597/s/gpu LR: 0.000100 Logit Scale: 16.180 Contrastive_loss: 0.11256 (0.10913) Loss: 0.11256 (0.10913)
+ 2024-08-16,04:10:24 | INFO | Train Epoch: 18 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.521/s, 204.521/s/gpu LR: 0.000100 Logit Scale: 16.183 Contrastive_loss: 0.10602 (0.10809) Loss: 0.10602 (0.10809)
+ 2024-08-16,04:10:25 | INFO | Eval Epoch: 19 [256 / 3000] Clip Loss: 6.244460
+ 2024-08-16,04:10:30 | INFO | Eval Epoch: 19 image_to_text_mean_rank: 984.8337 image_to_text_median_rank: 751.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0107 image_to_text_R@10: 0.0230 text_to_image_mean_rank: 974.3037 text_to_image_median_rank: 724.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0130 text_to_image_R@10: 0.0267 clip_val_loss: 6.4495 epoch: 19.0000 num_samples: 3000.0000
+ 2024-08-16,04:10:31 | INFO | Start epoch 19
+ 2024-08-16,04:10:34 | INFO | Train Epoch: 19 [ 256/27000 (1%)] Data (t): 1.482 Batch (t): 2.730, 93.7704/s, 93.7704/s/gpu LR: 0.000100 Logit Scale: 16.184 Contrastive_loss: 0.091815 (0.091815) Loss: 0.091815 (0.091815)
+ 2024-08-16,04:12:39 | INFO | Train Epoch: 19 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.414/s, 204.414/s/gpu LR: 0.000105 Logit Scale: 16.250 Contrastive_loss: 0.10051 (0.096161) Loss: 0.10051 (0.096161)
+ 2024-08-16,04:12:44 | INFO | Train Epoch: 19 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.567/s, 204.567/s/gpu LR: 0.000105 Logit Scale: 16.253 Contrastive_loss: 0.083744 (0.092022) Loss: 0.083744 (0.092022)
+ 2024-08-16,04:12:46 | INFO | Eval Epoch: 20 [256 / 3000] Clip Loss: 6.373804
+ 2024-08-16,04:12:51 | INFO | Eval Epoch: 20 image_to_text_mean_rank: 998.9283 image_to_text_median_rank: 775.0000 image_to_text_R@1: 0.0023 image_to_text_R@5: 0.0120 image_to_text_R@10: 0.0223 text_to_image_mean_rank: 995.4970 text_to_image_median_rank: 768.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0143 text_to_image_R@10: 0.0240 clip_val_loss: 6.5515 epoch: 20.0000 num_samples: 3000.0000
+ 2024-08-16,04:12:52 | INFO | Start epoch 20
+ 2024-08-16,04:12:55 | INFO | Train Epoch: 20 [ 256/27000 (1%)] Data (t): 1.470 Batch (t): 2.715, 94.2863/s, 94.2863/s/gpu LR: 0.000105 Logit Scale: 16.254 Contrastive_loss: 0.082728 (0.082728) Loss: 0.082728 (0.082728)
+ 2024-08-16,04:15:00 | INFO | Train Epoch: 20 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.462/s, 204.462/s/gpu LR: 0.000110 Logit Scale: 16.322 Contrastive_loss: 0.092753 (0.087741) Loss: 0.092753 (0.087741)
+ 2024-08-16,04:15:05 | INFO | Train Epoch: 20 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.462/s, 204.462/s/gpu LR: 0.000110 Logit Scale: 16.325 Contrastive_loss: 0.10030 (0.091928) Loss: 0.10030 (0.091928)
+ 2024-08-16,04:15:07 | INFO | Eval Epoch: 21 [256 / 3000] Clip Loss: 6.443340
+ 2024-08-16,04:15:11 | INFO | Eval Epoch: 21 image_to_text_mean_rank: 1025.5073 image_to_text_median_rank: 807.0000 image_to_text_R@1: 0.0023 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0207 text_to_image_mean_rank: 1019.2447 text_to_image_median_rank: 826.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0127 text_to_image_R@10: 0.0250 clip_val_loss: 6.6143 epoch: 21.0000 num_samples: 3000.0000
+ 2024-08-16,04:15:13 | INFO | Start epoch 21
+ 2024-08-16,04:15:15 | INFO | Train Epoch: 21 [ 256/27000 (1%)] Data (t): 1.476 Batch (t): 2.722, 94.0350/s, 94.0350/s/gpu LR: 0.000110 Logit Scale: 16.325 Contrastive_loss: 0.067669 (0.067669) Loss: 0.067669 (0.067669)
+ 2024-08-16,04:17:21 | INFO | Train Epoch: 21 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.508/s, 204.508/s/gpu LR: 0.000115 Logit Scale: 16.396 Contrastive_loss: 0.081561 (0.074615) Loss: 0.081561 (0.074615)
+ 2024-08-16,04:17:26 | INFO | Train Epoch: 21 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.474/s, 204.474/s/gpu LR: 0.000116 Logit Scale: 16.399 Contrastive_loss: 0.089710 (0.079647) Loss: 0.089710 (0.079647)
+ 2024-08-16,04:17:27 | INFO | Eval Epoch: 22 [256 / 3000] Clip Loss: 6.464468
+ 2024-08-16,04:17:32 | INFO | Eval Epoch: 22 image_to_text_mean_rank: 1013.4487 image_to_text_median_rank: 788.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0203 text_to_image_mean_rank: 1005.7700 text_to_image_median_rank: 764.0000 text_to_image_R@1: 0.0017 text_to_image_R@5: 0.0140 text_to_image_R@10: 0.0250 clip_val_loss: 6.6335 epoch: 22.0000 num_samples: 3000.0000
+ 2024-08-16,04:17:33 | INFO | Start epoch 22
+ 2024-08-16,04:17:36 | INFO | Train Epoch: 22 [ 256/27000 (1%)] Data (t): 1.477 Batch (t): 2.724, 93.9871/s, 93.9871/s/gpu LR: 0.000116 Logit Scale: 16.400 Contrastive_loss: 0.075971 (0.075971) Loss: 0.075971 (0.075971)
+ 2024-08-16,04:19:41 | INFO | Train Epoch: 22 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.527/s, 204.527/s/gpu LR: 0.000121 Logit Scale: 16.478 Contrastive_loss: 0.094408 (0.085189) Loss: 0.094408 (0.085189)
+ 2024-08-16,04:19:46 | INFO | Train Epoch: 22 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.417/s, 204.417/s/gpu LR: 0.000121 Logit Scale: 16.482 Contrastive_loss: 0.098228 (0.089536) Loss: 0.098228 (0.089536)
+ 2024-08-16,04:19:48 | INFO | Eval Epoch: 23 [256 / 3000] Clip Loss: 6.571661
+ 2024-08-16,04:19:53 | INFO | Eval Epoch: 23 image_to_text_mean_rank: 996.3893 image_to_text_median_rank: 789.0000 image_to_text_R@1: 0.0037 image_to_text_R@5: 0.0113 image_to_text_R@10: 0.0207 text_to_image_mean_rank: 992.0480 text_to_image_median_rank: 786.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0147 text_to_image_R@10: 0.0270 clip_val_loss: 6.6228 epoch: 23.0000 num_samples: 3000.0000
+ 2024-08-16,04:19:54 | INFO | Start epoch 23
+ 2024-08-16,04:19:57 | INFO | Train Epoch: 23 [ 256/27000 (1%)] Data (t): 1.547 Batch (t): 2.793, 91.6730/s, 91.6730/s/gpu LR: 0.000121 Logit Scale: 16.483 Contrastive_loss: 0.078356 (0.078356) Loss: 0.078356 (0.078356)
+ 2024-08-16,04:22:02 | INFO | Train Epoch: 23 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.462/s, 204.462/s/gpu LR: 0.000126 Logit Scale: 16.572 Contrastive_loss: 0.090556 (0.084456) Loss: 0.090556 (0.084456)
+ 2024-08-16,04:22:07 | INFO | Train Epoch: 23 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.327/s, 204.327/s/gpu LR: 0.000126 Logit Scale: 16.576 Contrastive_loss: 0.086313 (0.085075) Loss: 0.086313 (0.085075)
+ 2024-08-16,04:22:09 | INFO | Eval Epoch: 24 [256 / 3000] Clip Loss: 6.592369
+ 2024-08-16,04:22:14 | INFO | Eval Epoch: 24 image_to_text_mean_rank: 1030.9110 image_to_text_median_rank: 812.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0127 image_to_text_R@10: 0.0210 text_to_image_mean_rank: 1022.6720 text_to_image_median_rank: 805.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0163 text_to_image_R@10: 0.0290 clip_val_loss: 6.7181 epoch: 24.0000 num_samples: 3000.0000
+ 2024-08-16,04:22:15 | INFO | Start epoch 24
+ 2024-08-16,04:22:18 | INFO | Train Epoch: 24 [ 256/27000 (1%)] Data (t): 1.778 Batch (t): 3.023, 84.6777/s, 84.6777/s/gpu LR: 0.000126 Logit Scale: 16.577 Contrastive_loss: 0.079637 (0.079637) Loss: 0.079637 (0.079637)
+ 2024-08-16,04:24:23 | INFO | Train Epoch: 24 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.535/s, 204.535/s/gpu LR: 0.000131 Logit Scale: 16.685 Contrastive_loss: 0.11149 (0.095563) Loss: 0.11149 (0.095563)
+ 2024-08-16,04:24:28 | INFO | Train Epoch: 24 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.539/s, 204.539/s/gpu LR: 0.000131 Logit Scale: 16.690 Contrastive_loss: 0.10690 (0.099341) Loss: 0.10690 (0.099341)
+ 2024-08-16,04:24:30 | INFO | Eval Epoch: 25 [256 / 3000] Clip Loss: 6.576798
+ 2024-08-16,04:24:35 | INFO | Eval Epoch: 25 image_to_text_mean_rank: 1011.1870 image_to_text_median_rank: 780.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0180 text_to_image_mean_rank: 994.8833 text_to_image_median_rank: 768.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0147 text_to_image_R@10: 0.0243 clip_val_loss: 6.6493 epoch: 25.0000 num_samples: 3000.0000
+ 2024-08-16,04:24:36 | INFO | Start epoch 25
+ 2024-08-16,04:24:39 | INFO | Train Epoch: 25 [ 256/27000 (1%)] Data (t): 1.483 Batch (t): 2.726, 93.8938/s, 93.8938/s/gpu LR: 0.000131 Logit Scale: 16.691 Contrastive_loss: 0.094896 (0.094896) Loss: 0.094896 (0.094896)
+ 2024-08-16,04:26:44 | INFO | Train Epoch: 25 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.456/s, 204.456/s/gpu LR: 0.000136 Logit Scale: 16.831 Contrastive_loss: 0.24732 (0.17111) Loss: 0.24732 (0.17111)
+ 2024-08-16,04:26:49 | INFO | Train Epoch: 25 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.546/s, 204.546/s/gpu LR: 0.000137 Logit Scale: 16.840 Contrastive_loss: 0.42561 (0.25594) Loss: 0.42561 (0.25594)
+ 2024-08-16,04:26:50 | INFO | Eval Epoch: 26 [256 / 3000] Clip Loss: 6.370559
+ 2024-08-16,04:26:55 | INFO | Eval Epoch: 26 image_to_text_mean_rank: 1114.0630 image_to_text_median_rank: 930.0000 image_to_text_R@1: 0.0010 image_to_text_R@5: 0.0053 image_to_text_R@10: 0.0103 text_to_image_mean_rank: 1001.3303 text_to_image_median_rank: 768.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0113 text_to_image_R@10: 0.0217 clip_val_loss: 6.6953 epoch: 26.0000 num_samples: 3000.0000
+ 2024-08-16,04:26:57 | INFO | Start epoch 26
+ 2024-08-16,04:26:59 | INFO | Train Epoch: 26 [ 256/27000 (1%)] Data (t): 1.514 Batch (t): 2.756, 92.8774/s, 92.8774/s/gpu LR: 0.000137 Logit Scale: 16.842 Contrastive_loss: 0.68641 (0.68641) Loss: 0.68641 (0.68641)
+ 2024-08-16,04:29:04 | INFO | Train Epoch: 26 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.423/s, 204.423/s/gpu LR: 0.000142 Logit Scale: 16.877 Contrastive_loss: 2.6772 (1.6818) Loss: 2.6772 (1.6818)
+ 2024-08-16,04:29:09 | INFO | Train Epoch: 26 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.620/s, 204.620/s/gpu LR: 0.000142 Logit Scale: 16.883 Contrastive_loss: 2.6573 (2.0070) Loss: 2.6573 (2.0070)
+ 2024-08-16,04:29:11 | INFO | Eval Epoch: 27 [256 / 3000] Clip Loss: 5.865312
+ 2024-08-16,04:29:16 | INFO | Eval Epoch: 27 image_to_text_mean_rank: 782.6180 image_to_text_median_rank: 546.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0223 text_to_image_mean_rank: 727.9463 text_to_image_median_rank: 478.0000 text_to_image_R@1: 0.0047 text_to_image_R@5: 0.0193 text_to_image_R@10: 0.0297 clip_val_loss: 5.9258 epoch: 27.0000 num_samples: 3000.0000
+ 2024-08-16,04:29:17 | INFO | Start epoch 27
+ 2024-08-16,04:29:20 | INFO | Train Epoch: 27 [ 256/27000 (1%)] Data (t): 1.459 Batch (t): 2.705, 94.6292/s, 94.6292/s/gpu LR: 0.000142 Logit Scale: 16.884 Contrastive_loss: 1.7112 (1.7112) Loss: 1.7112 (1.7112)
+ 2024-08-16,04:31:25 | INFO | Train Epoch: 27 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.360/s, 204.360/s/gpu LR: 0.000147 Logit Scale: 17.268 Contrastive_loss: 0.84065 (1.2759) Loss: 0.84065 (1.2759)
+ 2024-08-16,04:31:30 | INFO | Train Epoch: 27 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.422/s, 204.422/s/gpu LR: 0.000147 Logit Scale: 17.283 Contrastive_loss: 0.70991 (1.0872) Loss: 0.70991 (1.0872)
+ 2024-08-16,04:31:32 | INFO | Eval Epoch: 28 [256 / 3000] Clip Loss: 6.629393
+ 2024-08-16,04:31:37 | INFO | Eval Epoch: 28 image_to_text_mean_rank: 896.8010 image_to_text_median_rank: 667.0000 image_to_text_R@1: 0.0013 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0167 text_to_image_mean_rank: 876.8693 text_to_image_median_rank: 625.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0127 text_to_image_R@10: 0.0240 clip_val_loss: 6.5428 epoch: 28.0000 num_samples: 3000.0000
+ 2024-08-16,04:31:38 | INFO | Start epoch 28
+ 2024-08-16,04:31:41 | INFO | Train Epoch: 28 [ 256/27000 (1%)] Data (t): 1.474 Batch (t): 2.719, 94.1445/s, 94.1445/s/gpu LR: 0.000147 Logit Scale: 17.287 Contrastive_loss: 0.30942 (0.30942) Loss: 0.30942 (0.30942)
+ 2024-08-16,04:33:46 | INFO | Train Epoch: 28 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.480/s, 204.480/s/gpu LR: 0.000152 Logit Scale: 17.513 Contrastive_loss: 0.14982 (0.22962) Loss: 0.14982 (0.22962)
+ 2024-08-16,04:33:51 | INFO | Train Epoch: 28 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.602/s, 204.602/s/gpu LR: 0.000152 Logit Scale: 17.520 Contrastive_loss: 0.17185 (0.21036) Loss: 0.17185 (0.21036)
+ 2024-08-16,04:33:53 | INFO | Eval Epoch: 29 [256 / 3000] Clip Loss: 6.860646
+ 2024-08-16,04:33:57 | INFO | Eval Epoch: 29 image_to_text_mean_rank: 948.3863 image_to_text_median_rank: 727.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0110 image_to_text_R@10: 0.0230 text_to_image_mean_rank: 941.6530 text_to_image_median_rank: 707.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0147 text_to_image_R@10: 0.0250 clip_val_loss: 6.7876 epoch: 29.0000 num_samples: 3000.0000
+ 2024-08-16,04:33:59 | INFO | Start epoch 29
+ 2024-08-16,04:34:01 | INFO | Train Epoch: 29 [ 256/27000 (1%)] Data (t): 1.484 Batch (t): 2.730, 93.7852/s, 93.7852/s/gpu LR: 0.000152 Logit Scale: 17.521 Contrastive_loss: 0.073008 (0.073008) Loss: 0.073008 (0.073008)
+ 2024-08-16,04:36:06 | INFO | Train Epoch: 29 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.650/s, 204.650/s/gpu LR: 0.000157 Logit Scale: 17.624 Contrastive_loss: 0.061296 (0.067152) Loss: 0.061296 (0.067152)
+ 2024-08-16,04:36:11 | INFO | Train Epoch: 29 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.599/s, 204.599/s/gpu LR: 0.000158 Logit Scale: 17.627 Contrastive_loss: 0.069361 (0.067888) Loss: 0.069361 (0.067888)
+ 2024-08-16,04:36:13 | INFO | Eval Epoch: 30 [256 / 3000] Clip Loss: 7.037732
+ 2024-08-16,04:36:18 | INFO | Eval Epoch: 30 image_to_text_mean_rank: 987.8290 image_to_text_median_rank: 769.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0083 image_to_text_R@10: 0.0157 text_to_image_mean_rank: 982.1127 text_to_image_median_rank: 774.0000 text_to_image_R@1: 0.0037 text_to_image_R@5: 0.0110 text_to_image_R@10: 0.0200 clip_val_loss: 6.9645 epoch: 30.0000 num_samples: 3000.0000
+ 2024-08-16,04:36:19 | INFO | Start epoch 30
+ 2024-08-16,04:36:22 | INFO | Train Epoch: 30 [ 256/27000 (1%)] Data (t): 1.468 Batch (t): 2.714, 94.3415/s, 94.3415/s/gpu LR: 0.000158 Logit Scale: 17.628 Contrastive_loss: 0.040057 (0.040057) Loss: 0.040057 (0.040057)
+ 2024-08-16,04:38:27 | INFO | Train Epoch: 30 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.454/s, 204.454/s/gpu LR: 0.000163 Logit Scale: 17.701 Contrastive_loss: 0.035407 (0.037732) Loss: 0.035407 (0.037732)
+ 2024-08-16,04:38:32 | INFO | Train Epoch: 30 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.606/s, 204.606/s/gpu LR: 0.000163 Logit Scale: 17.704 Contrastive_loss: 0.046378 (0.040614) Loss: 0.046378 (0.040614)
+ 2024-08-16,04:38:34 | INFO | Eval Epoch: 31 [256 / 3000] Clip Loss: 7.103376
+ 2024-08-16,04:38:38 | INFO | Eval Epoch: 31 image_to_text_mean_rank: 1028.7160 image_to_text_median_rank: 827.0000 image_to_text_R@1: 0.0030 image_to_text_R@5: 0.0087 image_to_text_R@10: 0.0193 text_to_image_mean_rank: 1023.7647 text_to_image_median_rank: 810.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0117 text_to_image_R@10: 0.0193 clip_val_loss: 7.0932 epoch: 31.0000 num_samples: 3000.0000
+ 2024-08-16,04:38:40 | INFO | Start epoch 31
+ 2024-08-16,04:38:42 | INFO | Train Epoch: 31 [ 256/27000 (1%)] Data (t): 1.446 Batch (t): 2.692, 95.1004/s, 95.1004/s/gpu LR: 0.000163 Logit Scale: 17.704 Contrastive_loss: 0.045176 (0.045176) Loss: 0.045176 (0.045176)
+ 2024-08-16,04:40:47 | INFO | Train Epoch: 31 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.631/s, 204.631/s/gpu LR: 0.000168 Logit Scale: 17.770 Contrastive_loss: 0.063451 (0.054314) Loss: 0.063451 (0.054314)
+ 2024-08-16,04:40:53 | INFO | Train Epoch: 31 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.576/s, 204.576/s/gpu LR: 0.000168 Logit Scale: 17.772 Contrastive_loss: 0.025344 (0.044657) Loss: 0.025344 (0.044657)
+ 2024-08-16,04:40:54 | INFO | Eval Epoch: 32 [256 / 3000] Clip Loss: 7.116255
+ 2024-08-16,04:40:59 | INFO | Eval Epoch: 32 image_to_text_mean_rank: 1041.3660 image_to_text_median_rank: 826.0000 image_to_text_R@1: 0.0013 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0180 text_to_image_mean_rank: 1038.6310 text_to_image_median_rank: 819.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0107 text_to_image_R@10: 0.0223 clip_val_loss: 7.1470 epoch: 32.0000 num_samples: 3000.0000
+ 2024-08-16,04:41:00 | INFO | Start epoch 32
+ 2024-08-16,04:41:03 | INFO | Train Epoch: 32 [ 256/27000 (1%)] Data (t): 1.481 Batch (t): 2.728, 93.8307/s, 93.8307/s/gpu LR: 0.000168 Logit Scale: 17.773 Contrastive_loss: 0.024506 (0.024506) Loss: 0.024506 (0.024506)
+ 2024-08-16,04:43:08 | INFO | Train Epoch: 32 [25856/27000 (96%)] Data (t): 0.001 Batch (t): 1.251, 204.585/s, 204.585/s/gpu LR: 0.000173 Logit Scale: 17.837 Contrastive_loss: 0.042150 (0.033328) Loss: 0.042150 (0.033328)
+ 2024-08-16,04:43:13 | INFO | Train Epoch: 32 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.710/s, 204.710/s/gpu LR: 0.000173 Logit Scale: 17.840 Contrastive_loss: 0.021264 (0.029307) Loss: 0.021264 (0.029307)
+ 2024-08-16,04:43:15 | INFO | Eval Epoch: 33 [256 / 3000] Clip Loss: 7.302838
+ 2024-08-16,04:43:20 | INFO | Eval Epoch: 33 image_to_text_mean_rank: 1051.0060 image_to_text_median_rank: 844.0000 image_to_text_R@1: 0.0010 image_to_text_R@5: 0.0083 image_to_text_R@10: 0.0143 text_to_image_mean_rank: 1047.7423 text_to_image_median_rank: 858.0000 text_to_image_R@1: 0.0017 text_to_image_R@5: 0.0107 text_to_image_R@10: 0.0167 clip_val_loss: 7.2055 epoch: 33.0000 num_samples: 3000.0000
+ 2024-08-16,04:43:21 | INFO | Start epoch 33
+ 2024-08-16,04:43:24 | INFO | Train Epoch: 33 [ 256/27000 (1%)] Data (t): 1.471 Batch (t): 2.714, 94.3115/s, 94.3115/s/gpu LR: 0.000173 Logit Scale: 17.840 Contrastive_loss: 0.036071 (0.036071) Loss: 0.036071 (0.036071)
+ 2024-08-16,04:45:29 | INFO | Train Epoch: 33 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.581/s, 204.581/s/gpu LR: 0.000178 Logit Scale: 17.908 Contrastive_loss: 0.032550 (0.034310) Loss: 0.032550 (0.034310)
+ 2024-08-16,04:45:34 | INFO | Train Epoch: 33 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.538/s, 204.538/s/gpu LR: 0.000179 Logit Scale: 17.911 Contrastive_loss: 0.045624 (0.038082) Loss: 0.045624 (0.038082)
+ 2024-08-16,04:45:36 | INFO | Eval Epoch: 34 [256 / 3000] Clip Loss: 7.142300
+ 2024-08-16,04:45:40 | INFO | Eval Epoch: 34 image_to_text_mean_rank: 1043.7917 image_to_text_median_rank: 850.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0183 text_to_image_mean_rank: 1041.0867 text_to_image_median_rank: 850.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0113 text_to_image_R@10: 0.0200 clip_val_loss: 7.1728 epoch: 34.0000 num_samples: 3000.0000
+ 2024-08-16,04:45:42 | INFO | Start epoch 34
+ 2024-08-16,04:45:44 | INFO | Train Epoch: 34 [ 256/27000 (1%)] Data (t): 1.475 Batch (t): 2.721, 94.0990/s, 94.0990/s/gpu LR: 0.000179 Logit Scale: 17.911 Contrastive_loss: 0.030825 (0.030825) Loss: 0.030825 (0.030825)
+ 2024-08-16,04:47:49 | INFO | Train Epoch: 34 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.590/s, 204.590/s/gpu LR: 0.000184 Logit Scale: 17.981 Contrastive_loss: 0.046251 (0.038538) Loss: 0.046251 (0.038538)
+ 2024-08-16,04:47:54 | INFO | Train Epoch: 34 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.829/s, 204.829/s/gpu LR: 0.000184 Logit Scale: 17.984 Contrastive_loss: 0.041009 (0.039362) Loss: 0.041009 (0.039362)
+ 2024-08-16,04:47:56 | INFO | Eval Epoch: 35 [256 / 3000] Clip Loss: 7.147564
+ 2024-08-16,04:48:01 | INFO | Eval Epoch: 35 image_to_text_mean_rank: 1042.3890 image_to_text_median_rank: 852.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0090 image_to_text_R@10: 0.0173 text_to_image_mean_rank: 1035.9400 text_to_image_median_rank: 837.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0110 text_to_image_R@10: 0.0207 clip_val_loss: 7.1603 epoch: 35.0000 num_samples: 3000.0000
+ 2024-08-16,04:48:02 | INFO | Start epoch 35
+ 2024-08-16,04:48:05 | INFO | Train Epoch: 35 [ 256/27000 (1%)] Data (t): 1.466 Batch (t): 2.711, 94.4327/s, 94.4327/s/gpu LR: 0.000184 Logit Scale: 17.985 Contrastive_loss: 0.032819 (0.032819) Loss: 0.032819 (0.032819)
+ 2024-08-16,04:50:10 | INFO | Train Epoch: 35 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.250, 204.457/s, 204.457/s/gpu LR: 0.000189 Logit Scale: 18.066 Contrastive_loss: 0.047232 (0.040025) Loss: 0.047232 (0.040025)
+ 2024-08-16,04:50:15 | INFO | Train Epoch: 35 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.589/s, 204.589/s/gpu LR: 0.000189 Logit Scale: 18.069 Contrastive_loss: 0.056291 (0.045447) Loss: 0.056291 (0.045447)
+ 2024-08-16,04:50:17 | INFO | Eval Epoch: 36 [256 / 3000] Clip Loss: 7.278274
+ 2024-08-16,04:50:22 | INFO | Eval Epoch: 36 image_to_text_mean_rank: 1058.5333 image_to_text_median_rank: 847.0000 image_to_text_R@1: 0.0030 image_to_text_R@5: 0.0147 image_to_text_R@10: 0.0223 text_to_image_mean_rank: 1052.5207 text_to_image_median_rank: 836.0000 text_to_image_R@1: 0.0023 text_to_image_R@5: 0.0133 text_to_image_R@10: 0.0233 clip_val_loss: 7.2311 epoch: 36.0000 num_samples: 3000.0000
+ 2024-08-16,04:50:23 | INFO | Start epoch 36
+ 2024-08-16,04:50:26 | INFO | Train Epoch: 36 [ 256/27000 (1%)] Data (t): 1.454 Batch (t): 2.699, 94.8579/s, 94.8579/s/gpu LR: 0.000189 Logit Scale: 18.070 Contrastive_loss: 0.030523 (0.030523) Loss: 0.030523 (0.030523)
+ 2024-08-16,04:52:31 | INFO | Train Epoch: 36 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.636/s, 204.636/s/gpu LR: 0.000194 Logit Scale: 18.156 Contrastive_loss: 0.043999 (0.037261) Loss: 0.043999 (0.037261)
+ 2024-08-16,04:52:36 | INFO | Train Epoch: 36 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.744/s, 204.744/s/gpu LR: 0.000194 Logit Scale: 18.160 Contrastive_loss: 0.037191 (0.037238) Loss: 0.037191 (0.037238)
+ 2024-08-16,04:52:38 | INFO | Eval Epoch: 37 [256 / 3000] Clip Loss: 7.176402
+ 2024-08-16,04:52:42 | INFO | Eval Epoch: 37 image_to_text_mean_rank: 1054.3230 image_to_text_median_rank: 875.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0090 image_to_text_R@10: 0.0180 text_to_image_mean_rank: 1051.4680 text_to_image_median_rank: 860.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0127 text_to_image_R@10: 0.0197 clip_val_loss: 7.2518 epoch: 37.0000 num_samples: 3000.0000
+ 2024-08-16,04:52:44 | INFO | Start epoch 37
+ 2024-08-16,04:52:46 | INFO | Train Epoch: 37 [ 256/27000 (1%)] Data (t): 1.471 Batch (t): 2.715, 94.2910/s, 94.2910/s/gpu LR: 0.000194 Logit Scale: 18.161 Contrastive_loss: 0.049741 (0.049741) Loss: 0.049741 (0.049741)
+ 2024-08-16,04:54:51 | INFO | Train Epoch: 37 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.555/s, 204.555/s/gpu LR: 0.000199 Logit Scale: 18.264 Contrastive_loss: 0.054978 (0.052359) Loss: 0.054978 (0.052359)
+ 2024-08-16,04:54:56 | INFO | Train Epoch: 37 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.389/s, 204.389/s/gpu LR: 0.000199 Logit Scale: 18.269 Contrastive_loss: 0.049099 (0.051273) Loss: 0.049099 (0.051273)
+ 2024-08-16,04:54:58 | INFO | Eval Epoch: 38 [256 / 3000] Clip Loss: 7.144962
+ 2024-08-16,04:55:03 | INFO | Eval Epoch: 38 image_to_text_mean_rank: 1057.0257 image_to_text_median_rank: 858.0000 image_to_text_R@1: 0.0023 image_to_text_R@5: 0.0100 image_to_text_R@10: 0.0217 text_to_image_mean_rank: 1049.7160 text_to_image_median_rank: 842.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0123 text_to_image_R@10: 0.0223 clip_val_loss: 7.2849 epoch: 38.0000 num_samples: 3000.0000
+ 2024-08-16,04:55:04 | INFO | Start epoch 38
+ 2024-08-16,04:55:07 | INFO | Train Epoch: 38 [ 256/27000 (1%)] Data (t): 1.472 Batch (t): 2.718, 94.1733/s, 94.1733/s/gpu LR: 0.000200 Logit Scale: 18.270 Contrastive_loss: 0.066994 (0.066994) Loss: 0.066994 (0.066994)
+ 2024-08-16,04:57:12 | INFO | Train Epoch: 38 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.448/s, 204.448/s/gpu LR: 0.000205 Logit Scale: 18.393 Contrastive_loss: 0.075224 (0.071109) Loss: 0.075224 (0.071109)
+ 2024-08-16,04:57:17 | INFO | Train Epoch: 38 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.568/s, 204.568/s/gpu LR: 0.000205 Logit Scale: 18.399 Contrastive_loss: 0.057282 (0.066500) Loss: 0.057282 (0.066500)
+ 2024-08-16,04:57:19 | INFO | Eval Epoch: 39 [256 / 3000] Clip Loss: 7.157355
+ 2024-08-16,04:57:24 | INFO | Eval Epoch: 39 image_to_text_mean_rank: 1028.1930 image_to_text_median_rank: 819.0000 image_to_text_R@1: 0.0033 image_to_text_R@5: 0.0100 image_to_text_R@10: 0.0190 text_to_image_mean_rank: 1020.0573 text_to_image_median_rank: 827.0000 text_to_image_R@1: 0.0043 text_to_image_R@5: 0.0140 text_to_image_R@10: 0.0230 clip_val_loss: 7.2089 epoch: 39.0000 num_samples: 3000.0000
+ 2024-08-16,04:57:25 | INFO | Start epoch 39
+ 2024-08-16,04:57:28 | INFO | Train Epoch: 39 [ 256/27000 (1%)] Data (t): 1.482 Batch (t): 2.727, 93.8673/s, 93.8673/s/gpu LR: 0.000205 Logit Scale: 18.400 Contrastive_loss: 0.056646 (0.056646) Loss: 0.056646 (0.056646)
+ 2024-08-16,04:59:33 | INFO | Train Epoch: 39 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.481/s, 204.481/s/gpu LR: 0.000210 Logit Scale: 18.561 Contrastive_loss: 0.055771 (0.056209) Loss: 0.055771 (0.056209)
+ 2024-08-16,04:59:38 | INFO | Train Epoch: 39 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.646/s, 204.646/s/gpu LR: 0.000210 Logit Scale: 18.567 Contrastive_loss: 0.057177 (0.056531) Loss: 0.057177 (0.056531)
+ 2024-08-16,04:59:39 | INFO | Eval Epoch: 40 [256 / 3000] Clip Loss: 7.104994
+ 2024-08-16,04:59:44 | INFO | Eval Epoch: 40 image_to_text_mean_rank: 1018.7757 image_to_text_median_rank: 782.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0103 image_to_text_R@10: 0.0183 text_to_image_mean_rank: 1008.3393 text_to_image_median_rank: 782.0000 text_to_image_R@1: 0.0017 text_to_image_R@5: 0.0150 text_to_image_R@10: 0.0247 clip_val_loss: 7.2402 epoch: 40.0000 num_samples: 3000.0000
+ 2024-08-16,04:59:46 | INFO | Start epoch 40
+ 2024-08-16,04:59:48 | INFO | Train Epoch: 40 [ 256/27000 (1%)] Data (t): 1.463 Batch (t): 2.709, 94.4969/s, 94.4969/s/gpu LR: 0.000210 Logit Scale: 18.569 Contrastive_loss: 0.090862 (0.090862) Loss: 0.090862 (0.090862)
+ 2024-08-16,05:01:54 | INFO | Train Epoch: 40 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.537/s, 204.537/s/gpu LR: 0.000215 Logit Scale: 18.769 Contrastive_loss: 0.094119 (0.092491) Loss: 0.094119 (0.092491)
+ 2024-08-16,05:01:59 | INFO | Train Epoch: 40 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.528/s, 204.528/s/gpu LR: 0.000215 Logit Scale: 18.779 Contrastive_loss: 0.10408 (0.096353) Loss: 0.10408 (0.096353)
+ 2024-08-16,05:02:00 | INFO | Eval Epoch: 41 [256 / 3000] Clip Loss: 7.132548
+ 2024-08-16,05:02:05 | INFO | Eval Epoch: 41 image_to_text_mean_rank: 1011.8893 image_to_text_median_rank: 803.0000 image_to_text_R@1: 0.0030 image_to_text_R@5: 0.0137 image_to_text_R@10: 0.0220 text_to_image_mean_rank: 999.8167 text_to_image_median_rank: 784.0000 text_to_image_R@1: 0.0033 text_to_image_R@5: 0.0133 text_to_image_R@10: 0.0220 clip_val_loss: 7.2351 epoch: 41.0000 num_samples: 3000.0000
+ 2024-08-16,05:02:06 | INFO | Start epoch 41
+ 2024-08-16,05:02:09 | INFO | Train Epoch: 41 [ 256/27000 (1%)] Data (t): 1.459 Batch (t): 2.706, 94.6002/s, 94.6002/s/gpu LR: 0.000215 Logit Scale: 18.782 Contrastive_loss: 0.085834 (0.085834) Loss: 0.085834 (0.085834)
+ 2024-08-16,05:04:14 | INFO | Train Epoch: 41 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.523/s, 204.523/s/gpu LR: 0.000220 Logit Scale: 19.049 Contrastive_loss: 4.7974 (2.4416) Loss: 4.7974 (2.4416)
+ 2024-08-16,05:04:19 | INFO | Train Epoch: 41 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.451/s, 204.451/s/gpu LR: 0.000221 Logit Scale: 19.044 Contrastive_loss: 4.6688 (3.1840) Loss: 4.6688 (3.1840)
+ 2024-08-16,05:04:21 | INFO | Eval Epoch: 42 [256 / 3000] Clip Loss: 4.982571
+ 2024-08-16,05:04:26 | INFO | Eval Epoch: 42 image_to_text_mean_rank: 754.0190 image_to_text_median_rank: 534.0000 image_to_text_R@1: 0.0013 image_to_text_R@5: 0.0113 image_to_text_R@10: 0.0213 text_to_image_mean_rank: 667.4670 text_to_image_median_rank: 441.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0110 text_to_image_R@10: 0.0227 clip_val_loss: 5.0287 epoch: 42.0000 num_samples: 3000.0000
+ 2024-08-16,05:04:27 | INFO | Start epoch 42
+ 2024-08-16,05:04:30 | INFO | Train Epoch: 42 [ 256/27000 (1%)] Data (t): 1.476 Batch (t): 2.722, 94.0346/s, 94.0346/s/gpu LR: 0.000221 Logit Scale: 19.043 Contrastive_loss: 4.5297 (4.5297) Loss: 4.5297 (4.5297)
+ 2024-08-16,05:06:35 | INFO | Train Epoch: 42 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.332/s, 204.332/s/gpu LR: 0.000226 Logit Scale: 19.079 Contrastive_loss: 2.9309 (3.7303) Loss: 2.9309 (3.7303)
+ 2024-08-16,05:06:40 | INFO | Train Epoch: 42 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.256/s, 204.256/s/gpu LR: 0.000226 Logit Scale: 19.079 Contrastive_loss: 3.1745 (3.5451) Loss: 3.1745 (3.5451)
+ 2024-08-16,05:06:42 | INFO | Eval Epoch: 43 [256 / 3000] Clip Loss: 5.689608
+ 2024-08-16,05:06:47 | INFO | Eval Epoch: 43 image_to_text_mean_rank: 761.4207 image_to_text_median_rank: 525.0000 image_to_text_R@1: 0.0027 image_to_text_R@5: 0.0130 image_to_text_R@10: 0.0227 text_to_image_mean_rank: 737.0500 text_to_image_median_rank: 503.0000 text_to_image_R@1: 0.0033 text_to_image_R@5: 0.0167 text_to_image_R@10: 0.0263 clip_val_loss: 5.9352 epoch: 43.0000 num_samples: 3000.0000
+ 2024-08-16,05:06:48 | INFO | Start epoch 43
+ 2024-08-16,05:06:51 | INFO | Train Epoch: 43 [ 256/27000 (1%)] Data (t): 1.538 Batch (t): 2.784, 91.9524/s, 91.9524/s/gpu LR: 0.000226 Logit Scale: 19.080 Contrastive_loss: 1.8070 (1.8070) Loss: 1.8070 (1.8070)
+ 2024-08-16,05:08:56 | INFO | Train Epoch: 43 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.467/s, 204.467/s/gpu LR: 0.000231 Logit Scale: 19.650 Contrastive_loss: 1.5848 (1.6959) Loss: 1.5848 (1.6959)
+ 2024-08-16,05:09:01 | INFO | Train Epoch: 43 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.253, 204.317/s, 204.317/s/gpu LR: 0.000231 Logit Scale: 19.674 Contrastive_loss: 1.6317 (1.6745) Loss: 1.6317 (1.6745)
+ 2024-08-16,05:09:03 | INFO | Eval Epoch: 44 [256 / 3000] Clip Loss: 6.758216
+ 2024-08-16,05:09:07 | INFO | Eval Epoch: 44 image_to_text_mean_rank: 887.7003 image_to_text_median_rank: 669.0000 image_to_text_R@1: 0.0010 image_to_text_R@5: 0.0093 image_to_text_R@10: 0.0173 text_to_image_mean_rank: 871.1403 text_to_image_median_rank: 657.0000 text_to_image_R@1: 0.0043 text_to_image_R@5: 0.0137 text_to_image_R@10: 0.0223 clip_val_loss: 6.9089 epoch: 44.0000 num_samples: 3000.0000
+ 2024-08-16,05:09:09 | INFO | Start epoch 44
+ 2024-08-16,05:09:11 | INFO | Train Epoch: 44 [ 256/27000 (1%)] Data (t): 1.472 Batch (t): 2.719, 94.1646/s, 94.1646/s/gpu LR: 0.000231 Logit Scale: 19.680 Contrastive_loss: 0.64078 (0.64078) Loss: 0.64078 (0.64078)
+ 2024-08-16,05:11:17 | INFO | Train Epoch: 44 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.252, 204.727/s, 204.727/s/gpu LR: 0.000236 Logit Scale: 20.250 Contrastive_loss: 0.27891 (0.45985) Loss: 0.27891 (0.45985)
+ 2024-08-16,05:11:22 | INFO | Train Epoch: 44 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.648/s, 204.648/s/gpu LR: 0.000236 Logit Scale: 20.270 Contrastive_loss: 0.23861 (0.38610) Loss: 0.23861 (0.38610)
+ 2024-08-16,05:11:23 | INFO | Eval Epoch: 45 [256 / 3000] Clip Loss: 7.229187
+ 2024-08-16,05:11:28 | INFO | Eval Epoch: 45 image_to_text_mean_rank: 923.1860 image_to_text_median_rank: 682.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0190 text_to_image_mean_rank: 912.5870 text_to_image_median_rank: 674.0000 text_to_image_R@1: 0.0027 text_to_image_R@5: 0.0107 text_to_image_R@10: 0.0200 clip_val_loss: 7.3445 epoch: 45.0000 num_samples: 3000.0000
+ 2024-08-16,05:11:29 | INFO | Start epoch 45
+ 2024-08-16,05:11:32 | INFO | Train Epoch: 45 [ 256/27000 (1%)] Data (t): 1.496 Batch (t): 2.740, 93.4189/s, 93.4189/s/gpu LR: 0.000236 Logit Scale: 20.275 Contrastive_loss: 0.11670 (0.11670) Loss: 0.11670 (0.11670)
+ 2024-08-16,05:13:37 | INFO | Train Epoch: 45 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.578/s, 204.578/s/gpu LR: 0.000241 Logit Scale: 20.541 Contrastive_loss: 0.10041 (0.10856) Loss: 0.10041 (0.10856)
+ 2024-08-16,05:13:42 | INFO | Train Epoch: 45 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.456/s, 204.456/s/gpu LR: 0.000242 Logit Scale: 20.549 Contrastive_loss: 0.039388 (0.085500) Loss: 0.039388 (0.085500)
+ 2024-08-16,05:13:44 | INFO | Eval Epoch: 46 [256 / 3000] Clip Loss: 7.590382
+ 2024-08-16,05:13:49 | INFO | Eval Epoch: 46 image_to_text_mean_rank: 955.1153 image_to_text_median_rank: 728.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0107 image_to_text_R@10: 0.0187 text_to_image_mean_rank: 954.1097 text_to_image_median_rank: 736.0000 text_to_image_R@1: 0.0013 text_to_image_R@5: 0.0130 text_to_image_R@10: 0.0230 clip_val_loss: 7.6622 epoch: 46.0000 num_samples: 3000.0000
+ 2024-08-16,05:13:50 | INFO | Start epoch 46
+ 2024-08-16,05:13:53 | INFO | Train Epoch: 46 [ 256/27000 (1%)] Data (t): 1.480 Batch (t): 2.723, 94.0089/s, 94.0089/s/gpu LR: 0.000242 Logit Scale: 20.551 Contrastive_loss: 0.056549 (0.056549) Loss: 0.056549 (0.056549)
+ 2024-08-16,05:15:58 | INFO | Train Epoch: 46 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.542/s, 204.542/s/gpu LR: 0.000247 Logit Scale: 20.695 Contrastive_loss: 0.031992 (0.044271) Loss: 0.031992 (0.044271)
+ 2024-08-16,05:16:03 | INFO | Train Epoch: 46 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.581/s, 204.581/s/gpu LR: 0.000247 Logit Scale: 20.701 Contrastive_loss: 0.046248 (0.044930) Loss: 0.046248 (0.044930)
+ 2024-08-16,05:16:05 | INFO | Eval Epoch: 47 [256 / 3000] Clip Loss: 7.512671
+ 2024-08-16,05:16:10 | INFO | Eval Epoch: 47 image_to_text_mean_rank: 973.5813 image_to_text_median_rank: 747.0000 image_to_text_R@1: 0.0017 image_to_text_R@5: 0.0097 image_to_text_R@10: 0.0177 text_to_image_mean_rank: 973.0503 text_to_image_median_rank: 743.0000 text_to_image_R@1: 0.0010 text_to_image_R@5: 0.0100 text_to_image_R@10: 0.0210 clip_val_loss: 7.7592 epoch: 47.0000 num_samples: 3000.0000
+ 2024-08-16,05:16:11 | INFO | Start epoch 47
+ 2024-08-16,05:16:14 | INFO | Train Epoch: 47 [ 256/27000 (1%)] Data (t): 1.489 Batch (t): 2.734, 93.6261/s, 93.6261/s/gpu LR: 0.000247 Logit Scale: 20.702 Contrastive_loss: 0.042419 (0.042419) Loss: 0.042419 (0.042419)
+ 2024-08-16,05:18:19 | INFO | Train Epoch: 47 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.251, 204.663/s, 204.663/s/gpu LR: 0.000252 Logit Scale: 20.817 Contrastive_loss: 0.037442 (0.039930) Loss: 0.037442 (0.039930)
+ 2024-08-16,05:18:24 | INFO | Train Epoch: 47 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.251, 204.630/s, 204.630/s/gpu LR: 0.000252 Logit Scale: 20.821 Contrastive_loss: 0.057514 (0.045792) Loss: 0.057514 (0.045792)
+ 2024-08-16,05:18:25 | INFO | Eval Epoch: 48 [256 / 3000] Clip Loss: 7.678425
+ 2024-08-16,05:18:30 | INFO | Eval Epoch: 48 image_to_text_mean_rank: 995.8417 image_to_text_median_rank: 768.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0120 image_to_text_R@10: 0.0183 text_to_image_mean_rank: 996.3487 text_to_image_median_rank: 770.0000 text_to_image_R@1: 0.0030 text_to_image_R@5: 0.0130 text_to_image_R@10: 0.0197 clip_val_loss: 7.8886 epoch: 48.0000 num_samples: 3000.0000
+ 2024-08-16,05:18:32 | INFO | Start epoch 48
+ 2024-08-16,05:18:34 | INFO | Train Epoch: 48 [ 256/27000 (1%)] Data (t): 1.486 Batch (t): 2.731, 93.7329/s, 93.7329/s/gpu LR: 0.000252 Logit Scale: 20.822 Contrastive_loss: 0.037203 (0.037203) Loss: 0.037203 (0.037203)
+ 2024-08-16,05:20:39 | INFO | Train Epoch: 48 [25856/27000 (96%)] Data (t): 0.000 Batch (t): 1.250, 204.601/s, 204.601/s/gpu LR: 0.000257 Logit Scale: 20.930 Contrastive_loss: 0.030419 (0.033811) Loss: 0.030419 (0.033811)
+ 2024-08-16,05:20:44 | INFO | Train Epoch: 48 [26880/27000 (100%)] Data (t): 0.002 Batch (t): 1.252, 204.401/s, 204.401/s/gpu LR: 0.000257 Logit Scale: 20.934 Contrastive_loss: 0.017202 (0.028274) Loss: 0.017202 (0.028274)
+ 2024-08-16,05:20:46 | INFO | Eval Epoch: 49 [256 / 3000] Clip Loss: 7.609719
+ 2024-08-16,05:20:51 | INFO | Eval Epoch: 49 image_to_text_mean_rank: 1004.3947 image_to_text_median_rank: 827.0000 image_to_text_R@1: 0.0020 image_to_text_R@5: 0.0107 image_to_text_R@10: 0.0193 text_to_image_mean_rank: 1001.3020 text_to_image_median_rank: 814.0000 text_to_image_R@1: 0.0020 text_to_image_R@5: 0.0107 text_to_image_R@10: 0.0200 clip_val_loss: 7.9246 epoch: 49.0000 num_samples: 3000.0000
+ 2024-08-16,05:20:52 | INFO | Start epoch 49
+ 2024-08-16,05:20:55 | INFO | Train Epoch: 49 [ 256/27000 (1%)] Data (t): 1.470 Batch (t): 2.716, 94.2600/s, 94.2600/s/gpu LR: 0.000257 Logit Scale: 20.936 Contrastive_loss: 0.031021 (0.031021) Loss: 0.031021 (0.031021)
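The evaluation summary lines in out.log above pack every metric into one line, which makes the trend hard to see: validation loss climbs from 6.9089 to 7.9246 while the training loss falls to a few hundredths. The sketch below shows one way to pull those summaries into dictionaries. The helper name `parse_eval_line` and the regular expression are illustrative assumptions, not part of the training code, and the sample lines are abbreviated copies of two summaries from the log.

```python
import re

# Matches "name: number" pairs such as "image_to_text_R@1: 0.0020".
METRIC = re.compile(r"([\w@]+): (-?\d+(?:\.\d+)?)")


def parse_eval_line(line):
    """Return the metrics of one 'Eval Epoch' summary line as a dict, or None."""
    marker = "Eval Epoch: "
    # Per-batch lines ("[256 / 3000] Clip Loss: ...") carry no clip_val_loss field.
    if marker not in line or "clip_val_loss" not in line:
        return None
    body = line.split(marker, 1)[1]
    return {key: float(value) for key, value in METRIC.findall(body)}


# Abbreviated sample lines; only a few of the logged fields are kept.
LOG_LINES = [
    "2024-08-16,05:09:07 | INFO | Eval Epoch: 44 image_to_text_R@1: 0.0010 "
    "clip_val_loss: 6.9089 epoch: 44.0000 num_samples: 3000.0000",
    "2024-08-16,05:20:46 | INFO | Eval Epoch: 49 [256 / 3000] Clip Loss: 7.609719",
    "2024-08-16,05:20:51 | INFO | Eval Epoch: 49 image_to_text_R@1: 0.0020 "
    "clip_val_loss: 7.9246 epoch: 49.0000 num_samples: 3000.0000",
]

rows = [row for row in map(parse_eval_line, LOG_LINES) if row is not None]
for row in rows:
    print(int(row["epoch"]), row["clip_val_loss"], row["image_to_text_R@1"])
```

Running the same function over the full log file would give one row per evaluation, which can then be plotted against the training loss to check for overfitting.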
checkpoint/test00/params.txt ADDED
@@ -0,0 +1,96 @@
+ accum_freq: 1
+ aug_cfg: {}
+ batch_size: 256
+ beta1: 0.9
+ beta2: 0.999
+ checkpoint_path: ./logs/2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp/checkpoints
+ coca_caption_loss_weight: 2.0
+ coca_contrastive_loss_weight: 1.0
+ copy_codebase: False
+ csv_caption_key: title
+ csv_img_key: filepath
+ csv_separator: ,
+ dataset_resampled: False
+ dataset_type: auto
+ ddp_static_graph: False
+ debug: False
+ delete_previous_checkpoint: False
+ device: cuda:0
+ dist_backend: nccl
+ dist_url: env://
+ distill: False
+ distill_model: None
+ distill_pretrained: None
+ distributed: False
+ epochs: 100
+ epochs_cooldown: None
+ eps: 1e-08
+ force_custom_text: False
+ force_image_size: None
+ force_patch_dropout: None
+ force_quick_gelu: False
+ gather_with_grad: False
+ grad_checkpointing: False
+ grad_clip_norm: None
+ horovod: False
+ image_interpolation: None
+ image_mean: None
+ image_resize_mode: None
+ image_std: None
+ imagenet_v2: None
+ imagenet_val: None
+ local_loss: False
+ local_rank: 0
+ lock_image: False
+ lock_image_freeze_bn_stats: False
+ lock_image_unlocked_groups: 0
+ lock_text: True
+ lock_text_freeze_layer_norm: False
+ lock_text_unlocked_layers: 0
+ log_every_n_steps: 100
+ log_level: 20
+ log_local: False
+ log_path: ./logs/2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp/out.log
+ logs: ./logs/
+ lr: 0.0005
+ lr_cooldown_end: 0.0
+ lr_cooldown_power: 1.0
+ lr_scheduler: cosine
+ model: Align-fMRI-Encoder-small
+ name: 2024_08_16-03_25_52-model_Align-fMRI-Encoder-small-lr_0.0005-b_256-j_4-p_amp
+ no_set_device_rank: False
+ precision: amp
+ pretrained: 
+ pretrained_image: False
+ rank: 0
+ remote_sync: None
+ remote_sync_frequency: 300
+ remote_sync_protocol: s3
+ report_to: 
+ resume: None
+ save_frequency: 1
+ save_most_recent: False
+ seed: 0
+ siglip: False
+ skip_scheduler: False
+ tensorboard: False
+ tensorboard_path: 
+ torchcompile: False
+ torchscript: False
+ trace: False
+ train_data: /root/autodl-tmp/.autodl/Projects/fMRI2TextAligner/notebooks/train.csv
+ train_data_upsampling_factors: None
+ train_num_samples: None
+ use_bn_sync: False
+ use_bnb_linear: None
+ val_data: /root/autodl-tmp/.autodl/Projects/fMRI2TextAligner/notebooks/val.csv
+ val_frequency: 1
+ val_num_samples: None
+ wandb: False
+ wandb_notes: 
+ wandb_project_name: open-clip
+ warmup: 10000
+ wd: 0.2
+ workers: 4
+ world_size: 1
+ zeroshot_frequency: 2
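The `lr`, `warmup`, `batch_size` and `epochs` values above can be checked against the `LR:` column in out.log. The sketch below assumes a linear warmup of the form `base_lr * (step + 1) / warmup` and assumes the last partial batch of each epoch is dropped, so that 27000 samples give 105 steps per epoch. Neither assumption is stated in the params file; they are chosen because they reproduce the logged rates at the start of epochs 44 and 49 (0.000231 and 0.000257). The function name `warmup_lr` is illustrative.

```python
def warmup_lr(step, base_lr=0.0005, warmup_steps=10000):
    """Linear warmup: the rate grows in proportion to the optimizer step."""
    return base_lr * (step + 1) / warmup_steps


# 27000 training samples at batch_size 256; assumes the partial last batch is dropped.
steps_per_epoch = 27000 // 256
total_steps = 100 * steps_per_epoch  # epochs: 100

for epoch in (44, 49):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch}: step {step}, lr {warmup_lr(step):.6f}")

print(f"warmup covers {10000 / total_steps:.0%} of {total_steps} steps")
```

Under these assumptions the run has 10,500 optimizer steps in total, so a 10,000-step warmup leaves only the last 500 steps for the cosine decay. If that is not intended, a smaller `warmup` value may be worth trying.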