Create full_with_bank_1m_samples_output.txt

Files changed (1)

training_metrics/full_with_bank_1m_samples_output.txt +99 -0

@@ -0,0 +1,99 @@

+=================================================================
+CO-TRAIN: Student + Alignment Bank (unfrozen)
+=================================================================
+  Device: cuda
+  LR encoder: 0.0001  LR bank: 0.0005
+  Bank weight: 0.2
+=================================================================
+PHASE 0: LOAD CACHED EMBEDDINGS
+=================================================================
+  bert: torch.Size([500000, 768])
+  modern: torch.Size([500000, 768])
+  roberta: torch.Size([500000, 768])
+  albert: torch.Size([500000, 768])
+  distil: torch.Size([500000, 768])
+  Captions: 500,000, using 500,000
+=================================================================
+PHASE 1: GPA ALIGNMENT
+=================================================================
+  GPA iter 1: delta=1.99174462
+  GPA iter 5: delta=0.00009400
+  GPA iter 10: delta=0.00001988
+  GPA iter 15: delta=0.00000849
+  cos(consensus, bert): 0.9880
+  cos(consensus, modern): 0.9831
+  cos(consensus, roberta): 0.9885
+  cos(consensus, albert): 0.9864
+  cos(consensus, distil): 0.9909
+  Consensus CV: 0.2543
+=================================================================
+PHASE 2: LOAD MODEL (unfrozen)
+=================================================================
+Loading weights: 100%
+ 112/112 [00:00<00:00, 3856.51it/s, Materializing param=token_emb.weight]
+  Encoder: 25,958,016 params
+  Bank: 6,466,944 params (present)
+  Total: 32,424,960 params (ALL unfrozen)
+=================================================================
+PHASE 3: TOKENIZE
+=================================================================
+  Tokenizing 500,000 captions...
+  Train: 495,000  Val: 5,000
+=================================================================
+PHASE 4: CO-TRAIN (encoder + bank)
+=================================================================
+  Tensorboard: runs/cotrain_20260313_072033
+E 1/2: 100%|██████████| 3868/3868 [11:28<00:00,  5.62batch/s, bank=0.2956, cos=0.901, loss=0.0720]
+  E 1: 688s  step=3868
+    Student: v_cos=0.8939±0.0407  v_acc=0.999  v_cv=0.2198  eff_dim=74.1
+    Losses:  nce=0.0086  mse=0.0003  bank=0.2956
+    Bank:    agr=0.000000  ortho=0.000002  entropy=2.8501  emb_cv=0.2118
+             exp_cos=0.535±0.001  disagree=0.000000  spread=0.01467
+    Context: geo_eff_dim=16.6  geo_cv=0.4483
+    ★ New best: v_cos=0.8939
+E 2/2: 100%|██████████| 3868/3868 [11:29<00:00,  5.61batch/s, bank=0.3114, cos=0.895, loss=0.0817]
+  E 2: 689s  step=7736
+    Student: v_cos=0.8917±0.0400  v_acc=0.999  v_cv=0.2086  eff_dim=73.7
+    Losses:  nce=0.0118  mse=0.0003  bank=0.3114
+    Bank:    agr=0.000000  ortho=0.000002  entropy=2.7060  emb_cv=0.1957
+             exp_cos=0.558±0.001  disagree=0.000000  spread=0.01583
+    Context: geo_eff_dim=15.8  geo_cv=0.5315
+=================================================================
+PHASE 5: VERIFICATION
+=================================================================
+  Enriched: torch.Size([10, 896])
+  Geo: {'expert_cos_mean': 0.5349530577659607, 'expert_cos_std': 0.001167053822427988, 'cross_expert_cos': 0.045003507286310196, 'cross_expert_cos_std': 0.03178434446454048, 'anchor_max_cos': 0.7455679774284363, 'anchor_mean_cos': -0.04277874901890755, 'disagreement_ratio': 0.0006186707178130746, 'norm_ratio_spread': 0.4806589186191559}
+  Pairwise cosines:
+    [0]↔[1]: 0.788  (A cat sitting on a windowsill ↔ A dog playing in the park)
+    [0]↔[2]: 0.622  (A cat sitting on a windowsill ↔ A still life painting with flo)
+    [0]↔[3]: 0.741  (A cat sitting on a windowsill ↔ A child riding a bicycle)
+    [1]↔[2]: 0.582  (A dog playing in the park ↔ A still life painting with flo)
+    [1]↔[3]: 0.851  (A dog playing in the park ↔ A child riding a bicycle)
+    [2]↔[3]: 0.639  (A still life painting with flo ↔ A child riding a bicycle)
+=================================================================
+SUMMARY
+=================================================================
+  Best v_cos:    0.8939
+  Final v_cv:    0.2029
+  Consensus CV:  0.2543
+  Val R@1:       0.999
+  Encoder LR:    0.0001
+  Bank LR:       0.0005
+  Bank weight:   0.2
+  Saved: cotrain_best.pt, cotrain_final.pt
+  Tensorboard: runs/cotrain_20260313_072033
+=================================================================
+DONE
+=================================================================