[2025-04-14 23:49:16] Experiment directory created at /nvme-data/Komal/documents/results/reasoning
[2025-04-14 23:49:21] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-04-14 23:50:31] Dataset contains 11
[2025-04-14 23:50:31] Training for 2000 epochs...
[2025-04-14 23:50:31] Beginning epoch 0...
[2025-04-14 23:54:27] Experiment directory created at /nvme-data/Komal/documents/results/reasoning
[2025-04-14 23:54:32] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-04-14 23:55:41] Dataset contains 11
[2025-04-14 23:55:41] Training for 2000 epochs...
[2025-04-14 23:55:41] Beginning epoch 0...
[2025-04-14 23:57:33] (step=0000001) Train Loss: 0.2672, Train Steps/Sec: 0.01, Epoch: 0.8, LR: 0.001
[2025-04-14 23:57:51] Beginning epoch 1...
[2025-04-14 23:58:58] (step=0000002) Train Loss: 0.3903, Train Steps/Sec: 0.01, Epoch: 1.6, LR: 0.001
[2025-04-15 00:00:11] Beginning epoch 2...
[2025-04-15 00:01:21] (step=0000003) Train Loss: 0.1595, Train Steps/Sec: 0.01, Epoch: 2.4, LR: 0.001
[2025-04-15 00:02:38] Beginning epoch 3...
[2025-04-15 00:02:49] (step=0000004) Train Loss: 0.4530, Train Steps/Sec: 0.01, Epoch: 3.2, LR: 0.001
[2025-04-15 00:04:55] (step=0000005) Train Loss: 0.1531, Train Steps/Sec: 0.01, Epoch: 4.0, LR: 0.001
[2025-04-15 00:04:55] Beginning epoch 4...
[2025-04-15 00:06:29] (step=0000006) Train Loss: 0.3319, Train Steps/Sec: 0.01, Epoch: 4.8, LR: 0.001
[2025-04-15 00:07:00] Beginning epoch 5...
[2025-04-15 00:08:30] (step=0000007) Train Loss: 0.3095, Train Steps/Sec: 0.01, Epoch: 5.6, LR: 0.001
[2025-04-15 00:09:19] Beginning epoch 6...
[2025-04-15 00:10:30] (step=0000008) Train Loss: 0.2363, Train Steps/Sec: 0.01, Epoch: 6.4, LR: 0.001
[2025-04-15 00:11:39] Beginning epoch 7...
[2025-04-15 00:11:52] (step=0000009) Train Loss: 0.2768, Train Steps/Sec: 0.01, Epoch: 7.2, LR: 0.001
[2025-04-15 00:13:50] (step=0000010) Train Loss: 0.2952, Train Steps/Sec: 0.01, Epoch: 8.0, LR: 0.001
[2025-04-15 00:13:50] Beginning epoch 8...
[2025-04-15 00:15:45] (step=0000011) Train Loss: 0.2245, Train Steps/Sec: 0.01, Epoch: 8.8, LR: 0.001
[2025-04-15 00:15:54] Beginning epoch 9...
[2025-04-15 00:17:25] (step=0000012) Train Loss: 0.3390, Train Steps/Sec: 0.01, Epoch: 9.6, LR: 0.001
[2025-04-15 00:17:48] Beginning epoch 10...
[2025-04-15 00:18:58] (step=0000013) Train Loss: 0.3779, Train Steps/Sec: 0.01, Epoch: 10.4, LR: 0.001
[2025-04-15 00:19:56] Beginning epoch 11...
[2025-04-15 00:20:09] (step=0000014) Train Loss: 0.3431, Train Steps/Sec: 0.01, Epoch: 11.2, LR: 0.001
[2025-04-15 00:22:15] (step=0000015) Train Loss: 0.2042, Train Steps/Sec: 0.01, Epoch: 12.0, LR: 0.001
[2025-04-15 00:22:15] Beginning epoch 12...
[2025-04-15 00:24:06] (step=0000016) Train Loss: 0.3091, Train Steps/Sec: 0.01, Epoch: 12.8, LR: 0.001
[2025-04-15 00:24:42] Beginning epoch 13...
[2025-04-15 00:26:02] (step=0000017) Train Loss: 0.1912, Train Steps/Sec: 0.01, Epoch: 13.6, LR: 0.001
[2025-04-15 00:26:42] Beginning epoch 14...
[2025-04-15 00:27:24] (step=0000018) Train Loss: 0.2614, Train Steps/Sec: 0.01, Epoch: 14.4, LR: 0.001
[2025-04-15 00:28:22] Beginning epoch 15...
[2025-04-15 00:28:57] (step=0000019) Train Loss: 0.3927, Train Steps/Sec: 0.01, Epoch: 15.2, LR: 0.001
[2025-04-15 00:30:16] (step=0000020) Train Loss: 0.3333, Train Steps/Sec: 0.01, Epoch: 16.0, LR: 0.001
[2025-04-15 00:30:16] Beginning epoch 16...
[2025-04-15 00:31:50] (step=0000021) Train Loss: 0.3344, Train Steps/Sec: 0.01, Epoch: 16.8, LR: 0.001
[2025-04-15 00:32:26] Beginning epoch 17...
[2025-04-15 00:34:08] (step=0000022) Train Loss: 0.1887, Train Steps/Sec: 0.01, Epoch: 17.6, LR: 0.001
[2025-04-15 00:34:35] Beginning epoch 18...
[2025-04-15 00:35:57] (step=0000023) Train Loss: 0.1543, Train Steps/Sec: 0.01, Epoch: 18.4, LR: 0.001
[2025-04-15 00:36:33] Beginning epoch 19...
[2025-04-15 00:36:44] (step=0000024) Train Loss: 0.4895, Train Steps/Sec: 0.02, Epoch: 19.2, LR: 0.001
[2025-04-15 00:38:31] (step=0000025) Train Loss: 0.2012, Train Steps/Sec: 0.01, Epoch: 20.0, LR: 0.001
[2025-04-15 00:38:31] Beginning epoch 20...
[2025-04-15 00:40:27] (step=0000026) Train Loss: 0.3008, Train Steps/Sec: 0.01, Epoch: 20.8, LR: 0.001
[2025-04-15 00:40:58] Beginning epoch 21...
[2025-04-15 00:42:11] (step=0000027) Train Loss: 0.2562, Train Steps/Sec: 0.01, Epoch: 21.6, LR: 0.001
[2025-04-15 00:43:05] Beginning epoch 22...
[2025-04-15 00:44:21] (step=0000028) Train Loss: 0.3388, Train Steps/Sec: 0.01, Epoch: 22.4, LR: 0.001
[2025-04-15 00:45:32] Beginning epoch 23...
[2025-04-15 00:45:43] (step=0000029) Train Loss: 0.2314, Train Steps/Sec: 0.01, Epoch: 23.2, LR: 0.001
[2025-04-15 00:47:23] (step=0000030) Train Loss: 0.2929, Train Steps/Sec: 0.01, Epoch: 24.0, LR: 0.001
[2025-04-15 00:47:23] Beginning epoch 24...
[2025-04-15 00:49:00] (step=0000031) Train Loss: 0.3094, Train Steps/Sec: 0.01, Epoch: 24.8, LR: 0.001
[2025-04-15 00:49:31] Beginning epoch 25...
[2025-04-15 00:50:45] (step=0000032) Train Loss: 0.2451, Train Steps/Sec: 0.01, Epoch: 25.6, LR: 0.001
[2025-04-15 00:51:41] Beginning epoch 26...
[2025-04-15 00:52:24] (step=0000033) Train Loss: 0.3336, Train Steps/Sec: 0.01, Epoch: 26.4, LR: 0.001
[2025-04-15 00:53:45] Beginning epoch 27...
[2025-04-15 00:54:25] (step=0000034) Train Loss: 0.2826, Train Steps/Sec: 0.01, Epoch: 27.2, LR: 0.001
[2025-04-15 00:55:49] (step=0000035) Train Loss: 0.2395, Train Steps/Sec: 0.01, Epoch: 28.0, LR: 0.001
[2025-04-15 00:55:49] Beginning epoch 28...
[2025-04-15 00:57:39] (step=0000036) Train Loss: 0.1711, Train Steps/Sec: 0.01, Epoch: 28.8, LR: 0.001
[2025-04-15 00:58:15] Beginning epoch 29...
[2025-04-15 00:59:34] (step=0000037) Train Loss: 0.2347, Train Steps/Sec: 0.01, Epoch: 29.6, LR: 0.001
[2025-04-15 01:00:41] Beginning epoch 30...
[2025-04-15 01:01:38] (step=0000038) Train Loss: 0.3853, Train Steps/Sec: 0.01, Epoch: 30.4, LR: 0.001
[2025-04-15 01:02:39] Beginning epoch 31...
[2025-04-15 01:02:52] (step=0000039) Train Loss: 0.2006, Train Steps/Sec: 0.01, Epoch: 31.2, LR: 0.001
[2025-04-15 01:04:50] (step=0000040) Train Loss: 0.1824, Train Steps/Sec: 0.01, Epoch: 32.0, LR: 0.001
[2025-04-15 01:04:50] Beginning epoch 32...
[2025-04-15 01:06:19] (step=0000041) Train Loss: 0.2093, Train Steps/Sec: 0.01, Epoch: 32.8, LR: 0.001
[2025-04-15 01:06:56] Beginning epoch 33...
[2025-04-15 01:08:08] (step=0000042) Train Loss: 0.3087, Train Steps/Sec: 0.01, Epoch: 33.6, LR: 0.001
[2025-04-15 01:09:21] Beginning epoch 34...
[2025-04-15 01:10:31] (step=0000043) Train Loss: 0.2557, Train Steps/Sec: 0.01, Epoch: 34.4, LR: 0.001
[2025-04-15 01:11:24] Beginning epoch 35...
[2025-04-15 01:11:29] (step=0000044) Train Loss: 0.1965, Train Steps/Sec: 0.02, Epoch: 35.2, LR: 0.001
[2025-04-15 01:13:27] (step=0000045) Train Loss: 0.3648, Train Steps/Sec: 0.01, Epoch: 36.0, LR: 0.001
[2025-04-15 01:13:27] Beginning epoch 36...
[2025-04-15 01:15:28] (step=0000046) Train Loss: 0.3300, Train Steps/Sec: 0.01, Epoch: 36.8, LR: 0.001
[2025-04-15 01:15:46] Beginning epoch 37...
[2025-04-15 01:17:23] (step=0000047) Train Loss: 0.2603, Train Steps/Sec: 0.01, Epoch: 37.6, LR: 0.001
[2025-04-15 01:17:50] Beginning epoch 38...
[2025-04-15 01:18:38] (step=0000048) Train Loss: 0.4495, Train Steps/Sec: 0.01, Epoch: 38.4, LR: 0.001
[2025-04-15 01:19:59] Beginning epoch 39...
[2025-04-15 01:20:39] (step=0000049) Train Loss: 0.2713, Train Steps/Sec: 0.01, Epoch: 39.2, LR: 0.001
[2025-04-15 01:22:09] (step=0000050) Train Loss: 0.2648, Train Steps/Sec: 0.01, Epoch: 40.0, LR: 0.001
[2025-04-15 01:22:09] Beginning epoch 40...
[2025-04-15 01:24:04] (step=0000051) Train Loss: 0.2900, Train Steps/Sec: 0.01, Epoch: 40.8, LR: 0.001
[2025-04-15 01:24:35] Beginning epoch 41...
[2025-04-15 01:25:33] (step=0000052) Train Loss: 0.2112, Train Steps/Sec: 0.01, Epoch: 41.6, LR: 0.001
[2025-04-15 01:26:41] Beginning epoch 42...
[2025-04-15 01:27:06] (step=0000053) Train Loss: 0.2525, Train Steps/Sec: 0.01, Epoch: 42.4, LR: 0.001
[2025-04-15 01:28:44] Beginning epoch 43...
[2025-04-15 01:29:00] (step=0000054) Train Loss: 0.1994, Train Steps/Sec: 0.01, Epoch: 43.2, LR: 0.001
[2025-04-15 01:30:53] (step=0000055) Train Loss: 0.2839, Train Steps/Sec: 0.01, Epoch: 44.0, LR: 0.001
[2025-04-15 01:30:53] Beginning epoch 44...
[2025-04-15 01:32:26] (step=0000056) Train Loss: 0.2093, Train Steps/Sec: 0.01, Epoch: 44.8, LR: 0.001
[2025-04-15 01:33:02] Beginning epoch 45...
[2025-04-15 01:34:07] (step=0000057) Train Loss: 0.2409, Train Steps/Sec: 0.01, Epoch: 45.6, LR: 0.001
[2025-04-15 01:35:20] Beginning epoch 46...
[2025-04-15 01:36:31] (step=0000058) Train Loss: 0.2261, Train Steps/Sec: 0.01, Epoch: 46.4, LR: 0.001
[2025-04-15 01:37:20] Beginning epoch 47...
[2025-04-15 01:37:59] (step=0000059) Train Loss: 0.2391, Train Steps/Sec: 0.01, Epoch: 47.2, LR: 0.001
[2025-04-15 01:39:25] (step=0000060) Train Loss: 0.2063, Train Steps/Sec: 0.01, Epoch: 48.0, LR: 0.001
[2025-04-15 01:39:25] Beginning epoch 48...
[2025-04-15 01:41:20] (step=0000061) Train Loss: 0.3564, Train Steps/Sec: 0.01, Epoch: 48.8, LR: 0.001
[2025-04-15 01:41:22] Beginning epoch 49...
[2025-04-15 01:42:47] (step=0000062) Train Loss: 0.2492, Train Steps/Sec: 0.01, Epoch: 49.6, LR: 0.001
[2025-04-15 01:43:07] Beginning epoch 50...
[2025-04-15 01:43:58] (step=0000063) Train Loss: 0.2369, Train Steps/Sec: 0.01, Epoch: 50.4, LR: 0.001
[2025-04-15 01:45:41] Beginning epoch 51...
[2025-04-15 01:46:20] (step=0000064) Train Loss: 0.2297, Train Steps/Sec: 0.01, Epoch: 51.2, LR: 0.001
[2025-04-15 01:48:07] (step=0000065) Train Loss: 0.2419, Train Steps/Sec: 0.01, Epoch: 52.0, LR: 0.001
[2025-04-15 01:48:07] Beginning epoch 52...
[2025-04-15 01:49:58] (step=0000066) Train Loss: 0.2183, Train Steps/Sec: 0.01, Epoch: 52.8, LR: 0.001
[2025-04-15 01:50:34] Beginning epoch 53...
[2025-04-15 01:51:59] (step=0000067) Train Loss: 0.4026, Train Steps/Sec: 0.01, Epoch: 53.6, LR: 0.001
[2025-04-15 01:53:01] Beginning epoch 54...
[2025-04-15 01:53:45] (step=0000068) Train Loss: 0.1955, Train Steps/Sec: 0.01, Epoch: 54.4, LR: 0.001
[2025-04-15 01:55:10] Beginning epoch 55...
[2025-04-15 01:55:30] (step=0000069) Train Loss: 0.2061, Train Steps/Sec: 0.01, Epoch: 55.2, LR: 0.001
[2025-04-15 01:56:56] (step=0000070) Train Loss: 0.2585, Train Steps/Sec: 0.01, Epoch: 56.0, LR: 0.001
[2025-04-15 01:56:56] Beginning epoch 56...
[2025-04-15 01:58:35] (step=0000071) Train Loss: 0.2579, Train Steps/Sec: 0.01, Epoch: 56.8, LR: 0.001
[2025-04-15 01:59:06] Beginning epoch 57...
[2025-04-15 02:00:10] (step=0000072) Train Loss: 0.2481, Train Steps/Sec: 0.01, Epoch: 57.6, LR: 0.001
[2025-04-15 02:01:05] Beginning epoch 58...
[2025-04-15 02:01:57] (step=0000073) Train Loss: 0.0944, Train Steps/Sec: 0.01, Epoch: 58.4, LR: 0.001
[2025-04-15 02:03:13] Beginning epoch 59...
[2025-04-15 02:03:52] (step=0000074) Train Loss: 0.3327, Train Steps/Sec: 0.01, Epoch: 59.2, LR: 0.001
[2025-04-15 02:05:24] (step=0000075) Train Loss: 0.2456, Train Steps/Sec: 0.01, Epoch: 60.0, LR: 0.001
[2025-04-15 02:05:24] Beginning epoch 60...
[2025-04-15 02:07:20] (step=0000076) Train Loss: 0.1806, Train Steps/Sec: 0.01, Epoch: 60.8, LR: 0.001
[2025-04-15 02:07:29] Beginning epoch 61...
[2025-04-15 02:09:09] (step=0000077) Train Loss: 0.4123, Train Steps/Sec: 0.01, Epoch: 61.6, LR: 0.001
[2025-04-15 02:09:49] Beginning epoch 62...
[2025-04-15 02:10:45] (step=0000078) Train Loss: 0.2114, Train Steps/Sec: 0.01, Epoch: 62.4, LR: 0.001
[2025-04-15 02:12:01] Beginning epoch 63...
[2025-04-15 02:12:13] (step=0000079) Train Loss: 0.3019, Train Steps/Sec: 0.01, Epoch: 63.2, LR: 0.001
[2025-04-15 02:14:06] (step=0000080) Train Loss: 0.2305, Train Steps/Sec: 0.01, Epoch: 64.0, LR: 0.001
[2025-04-15 02:14:06] Beginning epoch 64...
[2025-04-15 02:15:40] (step=0000081) Train Loss: 0.2290, Train Steps/Sec: 0.01, Epoch: 64.8, LR: 0.001
[2025-04-15 02:16:11] Beginning epoch 65...
[2025-04-15 02:17:41] (step=0000082) Train Loss: 0.2106, Train Steps/Sec: 0.01, Epoch: 65.6, LR: 0.001
[2025-04-15 02:18:22] Beginning epoch 66...
[2025-04-15 02:18:51] (step=0000083) Train Loss: 0.2686, Train Steps/Sec: 0.01, Epoch: 66.4, LR: 0.001
[2025-04-15 02:20:40] Beginning epoch 67...
[2025-04-15 02:21:25] (step=0000084) Train Loss: 0.1892, Train Steps/Sec: 0.01, Epoch: 67.2, LR: 0.001
[2025-04-15 02:22:59] (step=0000085) Train Loss: 0.2743, Train Steps/Sec: 0.01, Epoch: 68.0, LR: 0.001
[2025-04-15 02:22:59] Beginning epoch 68...
[2025-04-15 02:24:50] (step=0000086) Train Loss: 0.2092, Train Steps/Sec: 0.01, Epoch: 68.8, LR: 0.001
[2025-04-15 02:25:27] Beginning epoch 69...
[2025-04-15 02:26:47] (step=0000087) Train Loss: 0.2941, Train Steps/Sec: 0.01, Epoch: 69.6, LR: 0.001
[2025-04-15 02:27:32] Beginning epoch 70...
[2025-04-15 02:28:15] (step=0000088) Train Loss: 0.3616, Train Steps/Sec: 0.01, Epoch: 70.4, LR: 0.001
[2025-04-15 02:29:42] Beginning epoch 71...
[2025-04-15 02:29:56] (step=0000089) Train Loss: 0.2385, Train Steps/Sec: 0.01, Epoch: 71.2, LR: 0.001
[2025-04-15 02:31:30] (step=0000090) Train Loss: 0.2272, Train Steps/Sec: 0.01, Epoch: 72.0, LR: 0.001
[2025-04-15 02:31:30] Beginning epoch 72...
[2025-04-15 02:33:34] (step=0000091) Train Loss: 0.2758, Train Steps/Sec: 0.01, Epoch: 72.8, LR: 0.001
[2025-04-15 02:34:05] Beginning epoch 73...
[2025-04-15 02:35:25] (step=0000092) Train Loss: 0.2010, Train Steps/Sec: 0.01, Epoch: 73.6, LR: 0.001
[2025-04-15 02:36:11] Beginning epoch 74...
[2025-04-15 02:37:08] (step=0000093) Train Loss: 0.3143, Train Steps/Sec: 0.01, Epoch: 74.4, LR: 0.001
[2025-04-15 02:38:23] Beginning epoch 75...
[2025-04-15 02:38:27] (step=0000094) Train Loss: 0.1236, Train Steps/Sec: 0.01, Epoch: 75.2, LR: 0.001
[2025-04-15 02:40:28] (step=0000095) Train Loss: 0.3191, Train Steps/Sec: 0.01, Epoch: 76.0, LR: 0.001
[2025-04-15 02:40:28] Beginning epoch 76...
[2025-04-15 02:42:08] (step=0000096) Train Loss: 0.2954, Train Steps/Sec: 0.01, Epoch: 76.8, LR: 0.001
[2025-04-15 02:42:40] Beginning epoch 77...
[2025-04-15 02:44:14] (step=0000097) Train Loss: 0.1790, Train Steps/Sec: 0.01, Epoch: 77.6, LR: 0.001
[2025-04-15 02:44:59] Beginning epoch 78...
[2025-04-15 02:45:43] (step=0000098) Train Loss: 0.3077, Train Steps/Sec: 0.01, Epoch: 78.4, LR: 0.001
[2025-04-15 02:47:27] Beginning epoch 79...
[2025-04-15 02:48:06] (step=0000099) Train Loss: 0.2787, Train Steps/Sec: 0.01, Epoch: 79.2, LR: 0.001
[2025-04-15 02:49:40] (step=0000100) Train Loss: 0.2578, Train Steps/Sec: 0.01, Epoch: 80.0, LR: 0.001
[2025-04-15 02:49:40] Beginning epoch 80...
[2025-04-15 02:51:29] (step=0000101) Train Loss: 0.2697, Train Steps/Sec: 0.01, Epoch: 80.8, LR: 0.001
[2025-04-15 02:52:00] Beginning epoch 81...
[2025-04-15 02:53:31] (step=0000102) Train Loss: 0.2689, Train Steps/Sec: 0.01, Epoch: 81.6, LR: 0.001
[2025-04-15 02:54:20] Beginning epoch 82...
[2025-04-15 02:55:02] (step=0000103) Train Loss: 0.3377, Train Steps/Sec: 0.01, Epoch: 82.4, LR: 0.001
[2025-04-15 02:56:29] Beginning epoch 83...
[2025-04-15 02:56:51] (step=0000104) Train Loss: 0.2699, Train Steps/Sec: 0.01, Epoch: 83.2, LR: 0.001
[2025-04-15 02:58:28] (step=0000105) Train Loss: 0.2513, Train Steps/Sec: 0.01, Epoch: 84.0, LR: 0.001
[2025-04-15 02:58:28] Beginning epoch 84...
[2025-04-15 03:00:29] (step=0000106) Train Loss: 0.2509, Train Steps/Sec: 0.01, Epoch: 84.8, LR: 0.001
[2025-04-15 03:00:38] Beginning epoch 85...
[2025-04-15 03:01:57] (step=0000107) Train Loss: 0.2658, Train Steps/Sec: 0.01, Epoch: 85.6, LR: 0.001
[2025-04-15 03:02:38] Beginning epoch 86...
[2025-04-15 03:03:26] (step=0000108) Train Loss: 0.2827, Train Steps/Sec: 0.01, Epoch: 86.4, LR: 0.001
[2025-04-15 03:05:04] Beginning epoch 87...
[2025-04-15 03:05:16] (step=0000109) Train Loss: 0.2019, Train Steps/Sec: 0.01, Epoch: 87.2, LR: 0.001
[2025-04-15 03:07:31] (step=0000110) Train Loss: 0.2477, Train Steps/Sec: 0.01, Epoch: 88.0, LR: 0.001
[2025-04-15 03:07:31] Beginning epoch 88...
[2025-04-15 03:09:21] (step=0000111) Train Loss: 0.3240, Train Steps/Sec: 0.01, Epoch: 88.8, LR: 0.001
[2025-04-15 03:09:57] Beginning epoch 89...
[2025-04-15 03:10:55] (step=0000112) Train Loss: 0.1754, Train Steps/Sec: 0.01, Epoch: 89.6, LR: 0.001
[2025-04-15 03:12:02] Beginning epoch 90...
[2025-04-15 03:13:13] (step=0000113) Train Loss: 0.3208, Train Steps/Sec: 0.01, Epoch: 90.4, LR: 0.001
[2025-04-15 03:14:29] Beginning epoch 91...
[2025-04-15 03:15:05] (step=0000114) Train Loss: 0.2123, Train Steps/Sec: 0.01, Epoch: 91.2, LR: 0.001
[2025-04-15 03:16:56] (step=0000115) Train Loss: 0.2754, Train Steps/Sec: 0.01, Epoch: 92.0, LR: 0.001
[2025-04-15 03:16:56] Beginning epoch 92...
[2025-04-15 03:18:58] (step=0000116) Train Loss: 0.1969, Train Steps/Sec: 0.01, Epoch: 92.8, LR: 0.001
[2025-04-15 03:19:16] Beginning epoch 93...
[2025-04-15 03:20:43] (step=0000117) Train Loss: 0.2010, Train Steps/Sec: 0.01, Epoch: 93.6, LR: 0.001
[2025-04-15 03:21:51] Beginning epoch 94...
[2025-04-15 03:22:43] (step=0000118) Train Loss: 0.3363, Train Steps/Sec: 0.01, Epoch: 94.4, LR: 0.001
[2025-04-15 03:24:26] Beginning epoch 95...
[2025-04-15 03:24:48] (step=0000119) Train Loss: 0.3105, Train Steps/Sec: 0.01, Epoch: 95.2, LR: 0.001
[2025-04-15 03:26:46] (step=0000120) Train Loss: 0.2346, Train Steps/Sec: 0.01, Epoch: 96.0, LR: 0.001
[2025-04-15 03:26:46] Beginning epoch 96...
[2025-04-15 03:28:20] (step=0000121) Train Loss: 0.2713, Train Steps/Sec: 0.01, Epoch: 96.8, LR: 0.001
[2025-04-15 03:28:51] Beginning epoch 97...
[2025-04-15 03:30:44] (step=0000122) Train Loss: 0.2567, Train Steps/Sec: 0.01, Epoch: 97.6, LR: 0.001
[2025-04-15 03:31:11] Beginning epoch 98...
[2025-04-15 03:31:55] (step=0000123) Train Loss: 0.2453, Train Steps/Sec: 0.01, Epoch: 98.4, LR: 0.001
[2025-04-15 03:32:54] Beginning epoch 99...
[2025-04-15 03:33:41] (step=0000124) Train Loss: 0.3643, Train Steps/Sec: 0.01, Epoch: 99.2, LR: 0.001
[2025-04-15 03:35:11] (step=0000125) Train Loss: 0.2543, Train Steps/Sec: 0.01, Epoch: 100.0, LR: 0.001
[2025-04-15 03:35:11] Beginning epoch 100...
[2025-04-15 03:37:11] (step=0000126) Train Loss: 0.2647, Train Steps/Sec: 0.01, Epoch: 100.8, LR: 0.001
[2025-04-15 03:37:47] Beginning epoch 101...
[2025-04-15 03:39:29] (step=0000127) Train Loss: 0.3946, Train Steps/Sec: 0.01, Epoch: 101.6, LR: 0.001
[2025-04-15 03:40:18] Beginning epoch 102...
[2025-04-15 03:41:06] (step=0000128) Train Loss: 0.0589, Train Steps/Sec: 0.01, Epoch: 102.4, LR: 0.001
[2025-04-15 03:42:00] Beginning epoch 103...
[2025-04-15 03:42:13] (step=0000129) Train Loss: 0.3102, Train Steps/Sec: 0.01, Epoch: 103.2, LR: 0.001
[2025-04-15 03:44:11] (step=0000130) Train Loss: 0.2297, Train Steps/Sec: 0.01, Epoch: 104.0, LR: 0.001
[2025-04-15 03:44:11] Beginning epoch 104...
[2025-04-15 03:46:16] (step=0000131) Train Loss: 0.1871, Train Steps/Sec: 0.01, Epoch: 104.8, LR: 0.001
[2025-04-15 03:46:47] Beginning epoch 105...
[2025-04-15 03:48:34] (step=0000132) Train Loss: 0.2079, Train Steps/Sec: 0.01, Epoch: 105.6, LR: 0.001
[2025-04-15 03:49:23] Beginning epoch 106...
[2025-04-15 03:49:52] (step=0000133) Train Loss: 0.2651, Train Steps/Sec: 0.01, Epoch: 106.4, LR: 0.001
[2025-04-15 03:51:35] Beginning epoch 107...
[2025-04-15 03:52:10] (step=0000134) Train Loss: 0.1823, Train Steps/Sec: 0.01, Epoch: 107.2, LR: 0.001
[2025-04-15 03:54:11] (step=0000135) Train Loss: 0.2668, Train Steps/Sec: 0.01, Epoch: 108.0, LR: 0.001
[2025-04-15 03:54:11] Beginning epoch 108...
[2025-04-15 03:56:14] (step=0000136) Train Loss: 0.2457, Train Steps/Sec: 0.01, Epoch: 108.8, LR: 0.001
[2025-04-15 03:56:23] Beginning epoch 109...
[2025-04-15 03:58:18] (step=0000137) Train Loss: 0.2163, Train Steps/Sec: 0.01, Epoch: 109.6, LR: 0.001
[2025-04-15 03:58:37] Beginning epoch 110...
[2025-04-15 03:59:55] (step=0000138) Train Loss: 0.3361, Train Steps/Sec: 0.01, Epoch: 110.4, LR: 0.001
[2025-04-15 04:00:51] Beginning epoch 111...
[2025-04-15 04:01:31] (step=0000139) Train Loss: 0.3424, Train Steps/Sec: 0.01, Epoch: 111.2, LR: 0.001
[2025-04-15 04:02:52] (step=0000140) Train Loss: 0.2292, Train Steps/Sec: 0.01, Epoch: 112.0, LR: 0.001
[2025-04-15 04:02:52] Beginning epoch 112...
[2025-04-15 04:04:26] (step=0000141) Train Loss: 0.2516, Train Steps/Sec: 0.01, Epoch: 112.8, LR: 0.001
[2025-04-15 04:05:03] Beginning epoch 113...
[2025-04-15 04:06:52] (step=0000142) Train Loss: 0.2228, Train Steps/Sec: 0.01, Epoch: 113.6, LR: 0.001
[2025-04-15 04:07:42] Beginning epoch 114...
[2025-04-15 04:08:26] (step=0000143) Train Loss: 0.2709, Train Steps/Sec: 0.01, Epoch: 114.4, LR: 0.001
[2025-04-15 04:09:46] Beginning epoch 115...
[2025-04-15 04:09:57] (step=0000144) Train Loss: 0.2673, Train Steps/Sec: 0.01, Epoch: 115.2, LR: 0.001
[2025-04-15 04:11:48] (step=0000145) Train Loss: 0.2573, Train Steps/Sec: 0.01, Epoch: 116.0, LR: 0.001
[2025-04-15 04:11:48] Beginning epoch 116...
[2025-04-15 04:13:22] (step=0000146) Train Loss: 0.3600, Train Steps/Sec: 0.01, Epoch: 116.8, LR: 0.001
[2025-04-15 04:13:54] Beginning epoch 117...
[2025-04-15 04:14:50] (step=0000147) Train Loss: 0.1737, Train Steps/Sec: 0.01, Epoch: 117.6, LR: 0.001
[2025-04-15 04:15:57] Beginning epoch 118...
[2025-04-15 04:16:41] (step=0000148) Train Loss: 0.2470, Train Steps/Sec: 0.01, Epoch: 118.4, LR: 0.001
[2025-04-15 04:18:25] Beginning epoch 119...
[2025-04-15 04:18:38] (step=0000149) Train Loss: 0.2519, Train Steps/Sec: 0.01, Epoch: 119.2, LR: 0.001
[2025-04-15 04:20:36] (step=0000150) Train Loss: 0.2159, Train Steps/Sec: 0.01, Epoch: 120.0, LR: 0.001
[2025-04-15 04:20:36] Beginning epoch 120...
[2025-04-15 04:22:19] (step=0000151) Train Loss: 0.2553, Train Steps/Sec: 0.01, Epoch: 120.8, LR: 0.001
[2025-04-15 04:22:28] Beginning epoch 121...
[2025-04-15 04:24:10] (step=0000152) Train Loss: 0.3431, Train Steps/Sec: 0.01, Epoch: 121.6, LR: 0.001
[2025-04-15 04:24:55] Beginning epoch 122...
[2025-04-15 04:25:59] (step=0000153) Train Loss: 0.2349, Train Steps/Sec: 0.01, Epoch: 122.4, LR: 0.001
[2025-04-15 04:26:53] Beginning epoch 123...
[2025-04-15 04:27:06] (step=0000154) Train Loss: 0.2885, Train Steps/Sec: 0.01, Epoch: 123.2, LR: 0.001
[2025-04-15 04:28:58] (step=0000155) Train Loss: 0.2485, Train Steps/Sec: 0.01, Epoch: 124.0, LR: 0.001
[2025-04-15 04:28:58] Beginning epoch 124...
[2025-04-15 04:30:59] (step=0000156) Train Loss: 0.2129, Train Steps/Sec: 0.01, Epoch: 124.8, LR: 0.001
[2025-04-15 04:31:32] Beginning epoch 125...
[2025-04-15 04:32:32] (step=0000157) Train Loss: 0.2395, Train Steps/Sec: 0.01, Epoch: 125.6, LR: 0.001
[2025-04-15 04:33:47] Beginning epoch 126...
[2025-04-15 04:35:05] (step=0000158) Train Loss: 0.1318, Train Steps/Sec: 0.01, Epoch: 126.4, LR: 0.001
[2025-04-15 04:35:37] Beginning epoch 127...
[2025-04-15 04:35:50] (step=0000159) Train Loss: 0.3105, Train Steps/Sec: 0.02, Epoch: 127.2, LR: 0.001
[2025-04-15 04:37:48] (step=0000160) Train Loss: 0.2897, Train Steps/Sec: 0.01, Epoch: 128.0, LR: 0.001
[2025-04-15 04:37:48] Beginning epoch 128...
[2025-04-15 04:39:38] (step=0000161) Train Loss: 0.2824, Train Steps/Sec: 0.01, Epoch: 128.8, LR: 0.001
[2025-04-15 04:40:15] Beginning epoch 129...
[2025-04-15 04:41:36] (step=0000162) Train Loss: 0.2271, Train Steps/Sec: 0.01, Epoch: 129.6, LR: 0.001
[2025-04-15 04:42:38] Beginning epoch 130...
[2025-04-15 04:43:33] (step=0000163) Train Loss: 0.2675, Train Steps/Sec: 0.01, Epoch: 130.4, LR: 0.001
[2025-04-15 04:44:38] Beginning epoch 131...
[2025-04-15 04:45:12] (step=0000164) Train Loss: 0.1517, Train Steps/Sec: 0.01, Epoch: 131.2, LR: 0.001
[2025-04-15 04:47:08] (step=0000165) Train Loss: 0.2210, Train Steps/Sec: 0.01, Epoch: 132.0, LR: 0.001
[2025-04-15 04:47:08] Beginning epoch 132...
[2025-04-15 04:48:24] (step=0000166) Train Loss: 0.3368, Train Steps/Sec: 0.01, Epoch: 132.8, LR: 0.001
[2025-04-15 04:49:06] Beginning epoch 133...
[2025-04-15 04:50:37] (step=0000167) Train Loss: 0.2628, Train Steps/Sec: 0.01, Epoch: 133.6, LR: 0.001
[2025-04-15 04:51:46] Beginning epoch 134...
[2025-04-15 04:53:09] (step=0000168) Train Loss: 0.1799, Train Steps/Sec: 0.01, Epoch: 134.4, LR: 0.001
[2025-04-15 04:54:01] Beginning epoch 135...
[2025-04-15 04:54:12] (step=0000169) Train Loss: 0.3507, Train Steps/Sec: 0.02, Epoch: 135.2, LR: 0.001
[2025-04-15 04:56:13] (step=0000170) Train Loss: 0.1818, Train Steps/Sec: 0.01, Epoch: 136.0, LR: 0.001
[2025-04-15 04:56:13] Beginning epoch 136...
[2025-04-15 04:57:58] (step=0000171) Train Loss: 0.2477, Train Steps/Sec: 0.01, Epoch: 136.8, LR: 0.001
[2025-04-15 04:58:35] Beginning epoch 137...
[2025-04-15 05:00:31] (step=0000172) Train Loss: 0.1745, Train Steps/Sec: 0.01, Epoch: 137.6, LR: 0.001
[2025-04-15 05:00:51] Beginning epoch 138...
[2025-04-15 05:01:46] (step=0000173) Train Loss: 0.2535, Train Steps/Sec: 0.01, Epoch: 138.4, LR: 0.001
[2025-04-15 05:03:04] Beginning epoch 139...
[2025-04-15 05:03:45] (step=0000174) Train Loss: 0.3238, Train Steps/Sec: 0.01, Epoch: 139.2, LR: 0.001
[2025-04-15 05:05:36] (step=0000175) Train Loss: 0.2956, Train Steps/Sec: 0.01, Epoch: 140.0, LR: 0.001
[2025-04-15 05:05:36] Beginning epoch 140...
[2025-04-15 05:07:15] (step=0000176) Train Loss: 0.2344, Train Steps/Sec: 0.01, Epoch: 140.8, LR: 0.001
[2025-04-15 05:07:24] Beginning epoch 141...
[2025-04-15 05:08:50] (step=0000177) Train Loss: 0.2148, Train Steps/Sec: 0.01, Epoch: 141.6, LR: 0.001
[2025-04-15 05:09:31] Beginning epoch 142...
[2025-04-15 05:10:21] (step=0000178) Train Loss: 0.2683, Train Steps/Sec: 0.01, Epoch: 142.4, LR: 0.001
[2025-04-15 05:11:21] Beginning epoch 143...
[2025-04-15 05:11:36] (step=0000179) Train Loss: 0.2384, Train Steps/Sec: 0.01, Epoch: 143.2, LR: 0.001
[2025-04-15 05:13:24] (step=0000180) Train Loss: 0.3070, Train Steps/Sec: 0.01, Epoch: 144.0, LR: 0.001
[2025-04-15 05:13:24] Beginning epoch 144...
[2025-04-15 05:15:20] (step=0000181) Train Loss: 0.1952, Train Steps/Sec: 0.01, Epoch: 144.8, LR: 0.001
[2025-04-15 05:15:29] Beginning epoch 145...
[2025-04-15 05:16:29] (step=0000182) Train Loss: 0.2972, Train Steps/Sec: 0.01, Epoch: 145.6, LR: 0.001
[2025-04-15 05:17:10] Beginning epoch 146...
[2025-04-15 05:18:10] (step=0000183) Train Loss: 0.2085, Train Steps/Sec: 0.01, Epoch: 146.4, LR: 0.001
[2025-04-15 05:19:10] Beginning epoch 147...
[2025-04-15 05:19:20] (step=0000184) Train Loss: 0.3408, Train Steps/Sec: 0.01, Epoch: 147.2, LR: 0.001
[2025-04-15 05:20:56] (step=0000185) Train Loss: 0.2597, Train Steps/Sec: 0.01, Epoch: 148.0, LR: 0.001
[2025-04-15 05:20:56] Beginning epoch 148...
[2025-04-15 05:22:56] (step=0000186) Train Loss: 0.2107, Train Steps/Sec: 0.01, Epoch: 148.8, LR: 0.001
[2025-04-15 05:23:32] Beginning epoch 149...
[2025-04-15 05:25:13] (step=0000187) Train Loss: 0.1497, Train Steps/Sec: 0.01, Epoch: 149.6, LR: 0.001
[2025-04-15 05:25:39] Beginning epoch 150...
[2025-04-15 05:26:43] (step=0000188) Train Loss: 0.2783, Train Steps/Sec: 0.01, Epoch: 150.4, LR: 0.001
[2025-04-15 05:27:58] Beginning epoch 151...
[2025-04-15 05:28:11] (step=0000189) Train Loss: 0.2441, Train Steps/Sec: 0.01, Epoch: 151.2, LR: 0.001
[2025-04-15 05:30:14] (step=0000190) Train Loss: 0.2093, Train Steps/Sec: 0.01, Epoch: 152.0, LR: 0.001
[2025-04-15 05:30:14] Beginning epoch 152...
[2025-04-15 05:37:46] (step=0000191) Train Loss: 0.2585, Train Steps/Sec: 0.00, Epoch: 152.8, LR: 0.001
[2025-04-15 05:38:28] Beginning epoch 153...
[2025-04-15 05:39:21] (step=0000192) Train Loss: 0.2389, Train Steps/Sec: 0.01, Epoch: 153.6, LR: 0.001
[2025-04-15 05:40:29] Beginning epoch 154...
[2025-04-15 05:41:57] (step=0000193) Train Loss: 0.1764, Train Steps/Sec: 0.01, Epoch: 154.4, LR: 0.001
[2025-04-15 05:42:34] Beginning epoch 155...
[2025-04-15 05:42:46] (step=0000194) Train Loss: 0.3061, Train Steps/Sec: 0.02, Epoch: 155.2, LR: 0.001
[2025-04-15 05:44:53] (step=0000195) Train Loss: 0.1910, Train Steps/Sec: 0.01, Epoch: 156.0, LR: 0.001
[2025-04-15 05:44:53] Beginning epoch 156...
[2025-04-15 05:47:11] (step=0000196) Train Loss: 0.2148, Train Steps/Sec: 0.01, Epoch: 156.8, LR: 0.001
[2025-04-15 05:47:20] Beginning epoch 157...
[2025-04-15 05:48:53] (step=0000197) Train Loss: 0.2280, Train Steps/Sec: 0.01, Epoch: 157.6, LR: 0.001
[2025-04-15 05:49:56] Beginning epoch 158...
[2025-04-15 05:51:07] (step=0000198) Train Loss: 0.1983, Train Steps/Sec: 0.01, Epoch: 158.4, LR: 0.001
[2025-04-15 05:51:56] Beginning epoch 159...
[2025-04-15 05:52:42] (step=0000199) Train Loss: 0.1970, Train Steps/Sec: 0.01, Epoch: 159.2, LR: 0.001
[2025-04-15 05:54:16] (step=0000200) Train Loss: 0.2408, Train Steps/Sec: 0.01, Epoch: 160.0, LR: 0.001
[2025-04-15 05:54:16] Beginning epoch 160...
[2025-04-15 05:56:13] (step=0000201) Train Loss: 0.2592, Train Steps/Sec: 0.01, Epoch: 160.8, LR: 0.001
[2025-04-15 05:56:32] Beginning epoch 161...
[2025-04-15 05:57:24] (step=0000202) Train Loss: 0.2491, Train Steps/Sec: 0.01, Epoch: 161.6, LR: 0.001
[2025-04-15 05:58:05] Beginning epoch 162...
[2025-04-15 05:59:15] (step=0000203) Train Loss: 0.1980, Train Steps/Sec: 0.01, Epoch: 162.4, LR: 0.001
[2025-04-15 06:00:32] Beginning epoch 163...
[2025-04-15 06:01:07] (step=0000204) Train Loss: 0.2904, Train Steps/Sec: 0.01, Epoch: 163.2, LR: 0.001
[2025-04-15 06:02:15] (step=0000205) Train Loss: 0.2480, Train Steps/Sec: 0.01, Epoch: 164.0, LR: 0.001
[2025-04-15 06:02:15] Beginning epoch 164...
[2025-04-15 06:03:44] (step=0000206) Train Loss: 0.2942, Train Steps/Sec: 0.01, Epoch: 164.8, LR: 0.001
[2025-04-15 06:03:54] Beginning epoch 165...
[2025-04-15 06:05:15] (step=0000207) Train Loss: 0.2712, Train Steps/Sec: 0.01, Epoch: 165.6, LR: 0.001
[2025-04-15 06:05:55] Beginning epoch 166...
[2025-04-15 06:07:01] (step=0000208) Train Loss: 0.2337, Train Steps/Sec: 0.01, Epoch: 166.4, LR: 0.001
[2025-04-15 06:10:37] Beginning epoch 167...
[2025-04-15 06:11:27] (step=0000209) Train Loss: 0.2630, Train Steps/Sec: 0.00, Epoch: 167.2, LR: 0.001
[2025-04-15 06:12:58] (step=0000210) Train Loss: 0.1926, Train Steps/Sec: 0.01, Epoch: 168.0, LR: 0.001
[2025-04-15 06:12:58] Beginning epoch 168...
[2025-04-15 06:15:16] (step=0000211) Train Loss: 0.1995, Train Steps/Sec: 0.01, Epoch: 168.8, LR: 0.001
[2025-04-15 06:15:26] Beginning epoch 169...
[2025-04-15 06:16:55] (step=0000212) Train Loss: 0.2848, Train Steps/Sec: 0.01, Epoch: 169.6, LR: 0.001
[2025-04-15 06:17:35] Beginning epoch 170...
[2025-04-15 06:18:31] (step=0000213) Train Loss: 0.2331, Train Steps/Sec: 0.01, Epoch: 170.4, LR: 0.001
[2025-04-15 06:19:29] Beginning epoch 171...
[2025-04-15 06:19:33] (step=0000214) Train Loss: 0.1270, Train Steps/Sec: 0.02, Epoch: 171.2, LR: 0.001
[2025-04-15 06:21:34] (step=0000215) Train Loss: 0.3472, Train Steps/Sec: 0.01, Epoch: 172.0, LR: 0.001
[2025-04-15 06:21:34] Beginning epoch 172...
[2025-04-15 06:22:56] (step=0000216) Train Loss: 0.2669, Train Steps/Sec: 0.01, Epoch: 172.8, LR: 0.001
[2025-04-15 06:23:27] Beginning epoch 173...
[2025-04-15 06:25:09] (step=0000217) Train Loss: 0.2187, Train Steps/Sec: 0.01, Epoch: 173.6, LR: 0.001
[2025-04-15 06:25:32] Beginning epoch 174...
[2025-04-15 06:26:16] (step=0000218) Train Loss: 0.2630, Train Steps/Sec: 0.02, Epoch: 174.4, LR: 0.001
[2025-04-15 06:27:25] Beginning epoch 175...
[2025-04-15 06:27:36] (step=0000219) Train Loss: 0.3476, Train Steps/Sec: 0.01, Epoch: 175.2, LR: 0.001
[2025-04-15 06:29:24] (step=0000220) Train Loss: 0.1469, Train Steps/Sec: 0.01, Epoch: 176.0, LR: 0.001
[2025-04-15 06:29:24] Beginning epoch 176...
[2025-04-15 06:30:39] (step=0000221) Train Loss: 0.2685, Train Steps/Sec: 0.01, Epoch: 176.8, LR: 0.001
[2025-04-15 06:31:16] Beginning epoch 177...
[2025-04-15 06:31:49] (step=0000222) Train Loss: 0.2199, Train Steps/Sec: 0.01, Epoch: 177.6, LR: 0.001