Upload checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins
checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/wandb/offline-run-20260129_221543-vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins-run0/files/output.log
CHANGED
|
@@ -850,6 +850,15 @@ wandb: For more information, check out the docs at: https://weave-docs.wandb.ai/
|
|
| 850 |
[[34m2026-01-30 02:20:25[39m] (step=0000839) Train Loss mse: 0.0258, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 851 |
[[34m2026-01-30 02:20:41[39m] (step=0000840) Train Loss mse: 0.0242, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 852 |
[[34m2026-01-30 02:20:58[39m] (step=0000841) Train Loss mse: 0.0250, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 853 |
FullyShardedDataParallel(
|
| 854 |
(_fsdp_wrapped_module): Bagel(
|
| 855 |
(language_model): Qwen2ForCausalLM(
|
|
@@ -1043,15 +1052,13 @@ Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce
|
|
| 1043 |
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 1044 |
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 1045 |
ce_avg: 0.0, mse_avg: 0.024334488436579704
|
| 1046 |
-
|
| 1047 |
-
|
| 1048 |
-
[
|
| 1049 |
-
[
|
| 1050 |
-
[
|
| 1051 |
-
[
|
| 1052 |
-
|
| 1053 |
-
[[34m2026-01-30 02:23:12[39m] (step=0000849) Train Loss mse: 0.0229, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1054 |
-
[[34m2026-01-30 02:23:29[39m] (step=0000850) Train Loss mse: 0.0265, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1055 |
[[34m2026-01-30 02:23:46[39m] (step=0000851) Train Loss mse: 0.0230, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1056 |
[[34m2026-01-30 02:24:03[39m] (step=0000852) Train Loss mse: 0.0253, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1057 |
[[34m2026-01-30 02:24:20[39m] (step=0000853) Train Loss mse: 0.0239, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
@@ -2200,6 +2207,20 @@ ce_avg: 0.0, mse_avg: 0.024334488436579704
|
|
| 2200 |
[[34m2026-01-30 07:47:18[39m] (step=0001996) Train Loss mse: 0.0242, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2201 |
[[34m2026-01-30 07:47:34[39m] (step=0001997) Train Loss mse: 0.0238, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2202 |
[[34m2026-01-30 07:47:51[39m] (step=0001998) Train Loss mse: 0.0238, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2203 |
[[34m2026-01-30 07:48:08[39m] (step=0001999) Train Loss mse: 0.0235, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2204 |
[[34m2026-01-30 07:49:58[39m] (step=0002000) Train Loss mse: 0.0214, Train Loss ce: 0.0000, Train Steps/Sec: 0.01,
|
| 2205 |
[[34m2026-01-30 07:50:15[39m] (step=0002001) Train Loss mse: 0.0226, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
@@ -2264,20 +2285,6 @@ ce_avg: 0.0, mse_avg: 0.024334488436579704
|
|
| 2264 |
[[34m2026-01-30 08:06:43[39m] (step=0002060) Train Loss mse: 0.0223, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2265 |
[[34m2026-01-30 08:06:59[39m] (step=0002061) Train Loss mse: 0.0213, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2266 |
[[34m2026-01-30 08:07:16[39m] (step=0002062) Train Loss mse: 0.0225, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2267 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step2000
|
| 2268 |
-
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 2269 |
-
[eval debug] first 3 batch fingerprints:
|
| 2270 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2271 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2272 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2273 |
-
ce_avg: 0.0, mse_avg: 0.024451851844787598
|
| 2274 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step2500
|
| 2275 |
-
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 2276 |
-
[eval debug] first 3 batch fingerprints:
|
| 2277 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2278 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2279 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2280 |
-
ce_avg: 0.0, mse_avg: 0.024314723908901215
|
| 2281 |
[[34m2026-01-30 08:07:33[39m] (step=0002063) Train Loss mse: 0.0232, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2282 |
[[34m2026-01-30 08:07:50[39m] (step=0002064) Train Loss mse: 0.0223, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2283 |
[[34m2026-01-30 08:08:07[39m] (step=0002065) Train Loss mse: 0.0223, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
@@ -3322,28 +3329,34 @@ ce_avg: 0.0, mse_avg: 0.024314723908901215
|
|
| 3322 |
[[34m2026-01-30 13:01:56[39m] (step=0003104) Train Loss mse: 0.0206, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3323 |
[[34m2026-01-30 13:02:13[39m] (step=0003105) Train Loss mse: 0.0218, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3324 |
[[34m2026-01-30 13:02:30[39m] (step=0003106) Train Loss mse: 0.0204, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3325 |
-
|
| 3326 |
-
|
| 3327 |
-
[
|
| 3328 |
-
|
| 3329 |
-
|
| 3330 |
-
|
| 3331 |
-
|
| 3332 |
-
|
| 3333 |
-
|
| 3334 |
-
[
|
| 3335 |
-
|
| 3336 |
-
|
| 3337 |
-
|
| 3338 |
-
|
| 3339 |
-
|
| 3340 |
-
|
| 3341 |
-
[
|
| 3342 |
-
|
| 3343 |
-
|
| 3344 |
-
|
| 3345 |
-
|
| 3346 |
-
|
| 3347 |
[[34m2026-01-30 13:10:37[39m] (step=0003135) Train Loss mse: 0.0237, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3348 |
[[34m2026-01-30 13:10:54[39m] (step=0003136) Train Loss mse: 0.0221, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3349 |
[[34m2026-01-30 13:11:11[39m] (step=0003137) Train Loss mse: 0.0219, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
@@ -3364,6 +3377,20 @@ ce_avg: 0.0, mse_avg: 0.024172412231564522
|
|
| 3364 |
[[34m2026-01-30 13:15:23[39m] (step=0003152) Train Loss mse: 0.0230, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3365 |
[[34m2026-01-30 13:15:40[39m] (step=0003153) Train Loss mse: 0.0212, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3366 |
[[34m2026-01-30 13:15:56[39m] (step=0003154) Train Loss mse: 0.0249, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3367 |
[[34m2026-01-30 13:16:13[39m] (step=0003155) Train Loss mse: 0.0206, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3368 |
[[34m2026-01-30 13:16:30[39m] (step=0003156) Train Loss mse: 0.0216, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3369 |
[[34m2026-01-30 13:16:47[39m] (step=0003157) Train Loss mse: 0.0233, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
@@ -4389,20 +4416,6 @@ ce_avg: 0.0, mse_avg: 0.024172412231564522
|
|
| 4389 |
[[34m2026-01-30 18:05:06[39m] (step=0004177) Train Loss mse: 0.0218, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4390 |
[[34m2026-01-30 18:05:22[39m] (step=0004178) Train Loss mse: 0.0224, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4391 |
[[34m2026-01-30 18:05:39[39m] (step=0004179) Train Loss mse: 0.0202, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4392 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step4500
|
| 4393 |
-
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 4394 |
-
[eval debug] first 3 batch fingerprints:
|
| 4395 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4396 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4397 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4398 |
-
ce_avg: 0.0, mse_avg: 0.02414196915924549
|
| 4399 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step5000
|
| 4400 |
-
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 4401 |
-
[eval debug] first 3 batch fingerprints:
|
| 4402 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4403 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4404 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4405 |
-
ce_avg: 0.0, mse_avg: 0.024185948073863983
|
| 4406 |
[[34m2026-01-30 18:05:56[39m] (step=0004180) Train Loss mse: 0.0244, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4407 |
[[34m2026-01-30 18:06:13[39m] (step=0004181) Train Loss mse: 0.0221, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4408 |
[[34m2026-01-30 18:06:30[39m] (step=0004182) Train Loss mse: 0.0214, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
@@ -4420,7 +4433,21 @@ ce_avg: 0.0, mse_avg: 0.024185948073863983
|
|
| 4420 |
[[34m2026-01-30 18:09:51[39m] (step=0004194) Train Loss mse: 0.0225, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4421 |
[[34m2026-01-30 18:10:08[39m] (step=0004195) Train Loss mse: 0.0217, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4422 |
[[34m2026-01-30 18:10:25[39m] (step=0004196) Train Loss mse: 0.0208, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4423 |
-
|
| 4424 |
[[34m2026-01-30 18:10:58[39m] (step=0004198) Train Loss mse: 0.0209, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4425 |
[[34m2026-01-30 18:11:15[39m] (step=0004199) Train Loss mse: 0.0245, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4426 |
[[34m2026-01-30 18:11:32[39m] (step=0004200) Train Loss mse: 0.0206, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
|
|
| 850 |
[[34m2026-01-30 02:20:25[39m] (step=0000839) Train Loss mse: 0.0258, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 851 |
[[34m2026-01-30 02:20:41[39m] (step=0000840) Train Loss mse: 0.0242, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 852 |
[[34m2026-01-30 02:20:58[39m] (step=0000841) Train Loss mse: 0.0250, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 853 |
+
[[34m2026-01-30 02:21:15[39m] (step=0000842) Train Loss mse: 0.0278, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 854 |
+
[[34m2026-01-30 02:21:32[39m] (step=0000843) Train Loss mse: 0.0222, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 855 |
+
[[34m2026-01-30 02:21:49[39m] (step=0000844) Train Loss mse: 0.0243, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 856 |
+
[[34m2026-01-30 02:22:05[39m] (step=0000845) Train Loss mse: 0.0232, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 857 |
+
[[34m2026-01-30 02:22:22[39m] (step=0000846) Train Loss mse: 0.0242, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 858 |
+
[[34m2026-01-30 02:22:39[39m] (step=0000847) Train Loss mse: 0.0240, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 859 |
+
[[34m2026-01-30 02:22:56[39m] (step=0000848) Train Loss mse: 0.0224, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 860 |
+
[[34m2026-01-30 02:23:12[39m] (step=0000849) Train Loss mse: 0.0229, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 861 |
+
[[34m2026-01-30 02:23:29[39m] (step=0000850) Train Loss mse: 0.0265, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 862 |
FullyShardedDataParallel(
|
| 863 |
(_fsdp_wrapped_module): Bagel(
|
| 864 |
(language_model): Qwen2ForCausalLM(
|
|
|
|
| 1052 |
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 1053 |
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 1054 |
ce_avg: 0.0, mse_avg: 0.024334488436579704
|
| 1055 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step2000
|
| 1056 |
+
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 1057 |
+
[eval debug] first 3 batch fingerprints:
|
| 1058 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 1059 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 1060 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 1061 |
+
ce_avg: 0.0, mse_avg: 0.024451851844787598
|
| 1062 |
[[34m2026-01-30 02:23:46[39m] (step=0000851) Train Loss mse: 0.0230, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1063 |
[[34m2026-01-30 02:24:03[39m] (step=0000852) Train Loss mse: 0.0253, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1064 |
[[34m2026-01-30 02:24:20[39m] (step=0000853) Train Loss mse: 0.0239, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
|
|
| 2207 |
[[34m2026-01-30 07:47:18[39m] (step=0001996) Train Loss mse: 0.0242, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2208 |
[[34m2026-01-30 07:47:34[39m] (step=0001997) Train Loss mse: 0.0238, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2209 |
[[34m2026-01-30 07:47:51[39m] (step=0001998) Train Loss mse: 0.0238, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2210 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step2500
|
| 2211 |
+
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 2212 |
+
[eval debug] first 3 batch fingerprints:
|
| 2213 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2214 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2215 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2216 |
+
ce_avg: 0.0, mse_avg: 0.024314723908901215
|
| 2217 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step3000
|
| 2218 |
+
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 2219 |
+
[eval debug] first 3 batch fingerprints:
|
| 2220 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2221 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2222 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 2223 |
+
ce_avg: 0.0, mse_avg: 0.024401195347309113
|
| 2224 |
[[34m2026-01-30 07:48:08[39m] (step=0001999) Train Loss mse: 0.0235, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2225 |
[[34m2026-01-30 07:49:58[39m] (step=0002000) Train Loss mse: 0.0214, Train Loss ce: 0.0000, Train Steps/Sec: 0.01,
|
| 2226 |
[[34m2026-01-30 07:50:15[39m] (step=0002001) Train Loss mse: 0.0226, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
|
|
| 2285 |
[[34m2026-01-30 08:06:43[39m] (step=0002060) Train Loss mse: 0.0223, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2286 |
[[34m2026-01-30 08:06:59[39m] (step=0002061) Train Loss mse: 0.0213, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2287 |
[[34m2026-01-30 08:07:16[39m] (step=0002062) Train Loss mse: 0.0225, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2288 |
[[34m2026-01-30 08:07:33[39m] (step=0002063) Train Loss mse: 0.0232, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2289 |
[[34m2026-01-30 08:07:50[39m] (step=0002064) Train Loss mse: 0.0223, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 2290 |
[[34m2026-01-30 08:08:07[39m] (step=0002065) Train Loss mse: 0.0223, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
|
|
| 3329 |
[[34m2026-01-30 13:01:56[39m] (step=0003104) Train Loss mse: 0.0206, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3330 |
[[34m2026-01-30 13:02:13[39m] (step=0003105) Train Loss mse: 0.0218, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3331 |
[[34m2026-01-30 13:02:30[39m] (step=0003106) Train Loss mse: 0.0204, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3332 |
+
[[34m2026-01-30 13:02:47[39m] (step=0003107) Train Loss mse: 0.0242, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3333 |
+
[[34m2026-01-30 13:03:04[39m] (step=0003108) Train Loss mse: 0.0230, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3334 |
+
[[34m2026-01-30 13:03:20[39m] (step=0003109) Train Loss mse: 0.0224, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3335 |
+
[[34m2026-01-30 13:03:37[39m] (step=0003110) Train Loss mse: 0.0218, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3336 |
+
[[34m2026-01-30 13:03:54[39m] (step=0003111) Train Loss mse: 0.0227, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3337 |
+
[[34m2026-01-30 13:04:11[39m] (step=0003112) Train Loss mse: 0.0232, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3338 |
+
[[34m2026-01-30 13:04:28[39m] (step=0003113) Train Loss mse: 0.0216, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3339 |
+
[[34m2026-01-30 13:04:44[39m] (step=0003114) Train Loss mse: 0.0245, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3340 |
+
[[34m2026-01-30 13:05:01[39m] (step=0003115) Train Loss mse: 0.0215, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3341 |
+
[[34m2026-01-30 13:05:18[39m] (step=0003116) Train Loss mse: 0.0194, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3342 |
+
[[34m2026-01-30 13:05:35[39m] (step=0003117) Train Loss mse: 0.0217, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3343 |
+
[[34m2026-01-30 13:05:51[39m] (step=0003118) Train Loss mse: 0.0233, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3344 |
+
[[34m2026-01-30 13:06:08[39m] (step=0003119) Train Loss mse: 0.0210, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3345 |
+
[[34m2026-01-30 13:06:25[39m] (step=0003120) Train Loss mse: 0.0239, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3346 |
+
[[34m2026-01-30 13:06:42[39m] (step=0003121) Train Loss mse: 0.0241, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3347 |
+
[[34m2026-01-30 13:06:59[39m] (step=0003122) Train Loss mse: 0.0211, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3348 |
+
[[34m2026-01-30 13:07:16[39m] (step=0003123) Train Loss mse: 0.0229, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3349 |
+
[[34m2026-01-30 13:07:33[39m] (step=0003124) Train Loss mse: 0.0223, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3350 |
+
[[34m2026-01-30 13:07:50[39m] (step=0003125) Train Loss mse: 0.0215, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3351 |
+
[[34m2026-01-30 13:08:06[39m] (step=0003126) Train Loss mse: 0.0209, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3352 |
+
[[34m2026-01-30 13:08:23[39m] (step=0003127) Train Loss mse: 0.0225, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3353 |
+
[[34m2026-01-30 13:08:40[39m] (step=0003128) Train Loss mse: 0.0219, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3354 |
+
[[34m2026-01-30 13:08:57[39m] (step=0003129) Train Loss mse: 0.0218, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3355 |
+
[[34m2026-01-30 13:09:13[39m] (step=0003130) Train Loss mse: 0.0230, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3356 |
+
[[34m2026-01-30 13:09:30[39m] (step=0003131) Train Loss mse: 0.0237, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3357 |
+
[[34m2026-01-30 13:09:47[39m] (step=0003132) Train Loss mse: 0.0203, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3358 |
+
[[34m2026-01-30 13:10:04[39m] (step=0003133) Train Loss mse: 0.0233, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3359 |
+
[[34m2026-01-30 13:10:20[39m] (step=0003134) Train Loss mse: 0.0218, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3360 |
[[34m2026-01-30 13:10:37[39m] (step=0003135) Train Loss mse: 0.0237, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3361 |
[[34m2026-01-30 13:10:54[39m] (step=0003136) Train Loss mse: 0.0221, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3362 |
[[34m2026-01-30 13:11:11[39m] (step=0003137) Train Loss mse: 0.0219, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
|
|
| 3377 |
[[34m2026-01-30 13:15:23[39m] (step=0003152) Train Loss mse: 0.0230, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3378 |
[[34m2026-01-30 13:15:40[39m] (step=0003153) Train Loss mse: 0.0212, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3379 |
[[34m2026-01-30 13:15:56[39m] (step=0003154) Train Loss mse: 0.0249, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3380 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step3500
|
| 3381 |
+
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 3382 |
+
[eval debug] first 3 batch fingerprints:
|
| 3383 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 3384 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 3385 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 3386 |
+
ce_avg: 0.0, mse_avg: 0.024277793243527412
|
| 3387 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step4000
|
| 3388 |
+
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 3389 |
+
[eval debug] first 3 batch fingerprints:
|
| 3390 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 3391 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 3392 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 3393 |
+
ce_avg: 0.0, mse_avg: 0.024172412231564522
|
| 3394 |
[[34m2026-01-30 13:16:13[39m] (step=0003155) Train Loss mse: 0.0206, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3395 |
[[34m2026-01-30 13:16:30[39m] (step=0003156) Train Loss mse: 0.0216, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 3396 |
[[34m2026-01-30 13:16:47[39m] (step=0003157) Train Loss mse: 0.0233, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
|
|
| 4416 |
[[34m2026-01-30 18:05:06[39m] (step=0004177) Train Loss mse: 0.0218, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4417 |
[[34m2026-01-30 18:05:22[39m] (step=0004178) Train Loss mse: 0.0224, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4418 |
[[34m2026-01-30 18:05:39[39m] (step=0004179) Train Loss mse: 0.0202, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4419 |
[[34m2026-01-30 18:05:56[39m] (step=0004180) Train Loss mse: 0.0244, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4420 |
[[34m2026-01-30 18:06:13[39m] (step=0004181) Train Loss mse: 0.0221, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4421 |
[[34m2026-01-30 18:06:30[39m] (step=0004182) Train Loss mse: 0.0214, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
|
|
|
| 4433 |
[[34m2026-01-30 18:09:51[39m] (step=0004194) Train Loss mse: 0.0225, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4434 |
[[34m2026-01-30 18:10:08[39m] (step=0004195) Train Loss mse: 0.0217, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4435 |
[[34m2026-01-30 18:10:25[39m] (step=0004196) Train Loss mse: 0.0208, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4436 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step4500
|
| 4437 |
+
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 4438 |
+
[eval debug] first 3 batch fingerprints:
|
| 4439 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4440 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4441 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4442 |
+
ce_avg: 0.0, mse_avg: 0.02414196915924549
|
| 4443 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_mental_rotation_3d_pad3_by_axis_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_mental_rotation_3d_pad3_by_axis_one_img_lr2e_5_mse_only_ins_step5000
|
| 4444 |
+
Preparing Dataset vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce/vlm_gym_mental_rotation_3d_pad3_by_axis_val
|
| 4445 |
+
[eval debug] first 3 batch fingerprints:
|
| 4446 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4447 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4448 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_mental_rotation_3d_pad3_by_axis_mse_loss_only_evalonce'}]
|
| 4449 |
+
ce_avg: 0.0, mse_avg: 0.024185948073863983
|
| 4450 |
+
[[34m2026-01-30 18:10:41[39m] (step=0004197) Train Loss mse: 0.0209, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4451 |
[[34m2026-01-30 18:10:58[39m] (step=0004198) Train Loss mse: 0.0209, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4452 |
[[34m2026-01-30 18:11:15[39m] (step=0004199) Train Loss mse: 0.0245, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4453 |
[[34m2026-01-30 18:11:32[39m] (step=0004200) Train Loss mse: 0.0206, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|