Training 1/1 epoch (loss 2.0025): 0%| | 0/1250 [00:05<?, ?it/s]
Training 1/1 epoch (loss 2.0025): 0%| | 1/1250 [00:05<1:51:04, 5.34s/it]
Training 1/1 epoch (loss 2.0061): 0%| | 1/1250 [00:06<1:51:04, 5.34s/it]
Training 1/1 epoch (loss 2.0061): 0%| | 2/1250 [00:06<1:05:15, 3.14s/it]
Training 1/1 epoch (loss 2.1095): 0%| | 2/1250 [00:07<1:05:15, 3.14s/it]
Training 1/1 epoch (loss 2.1095): 0%| | 3/1250 [00:07<38:18, 1.84s/it]
Training 1/1 epoch (loss 2.1373): 0%| | 3/1250 [00:07<38:18, 1.84s/it]
Training 1/1 epoch (loss 2.1373): 0%| | 4/1250 [00:07<26:42, 1.29s/it]
Training 1/1 epoch (loss 2.0339): 0%| | 4/1250 [00:08<26:42, 1.29s/it]
Training 1/1 epoch (loss 2.0339): 0%| | 5/1250 [00:08<20:36, 1.01it/s]
Training 1/1 epoch (loss 2.1314): 0%| | 5/1250 [00:08<20:36, 1.01it/s]
Training 1/1 epoch (loss 2.1314): 0%| | 6/1250 [00:08<16:47, 1.23it/s]
Training 1/1 epoch (loss 1.9370): 0%| | 6/1250 [00:08<16:47, 1.23it/s]
Training 1/1 epoch (loss 1.9370): 1%| | 7/1250 [00:08<13:21, 1.55it/s]
Training 1/1 epoch (loss 2.0154): 1%| | 7/1250 [00:09<13:21, 1.55it/s]
Training 1/1 epoch (loss 2.0154): 1%| | 8/1250 [00:09<11:36, 1.78it/s]
Training 1/1 epoch (loss 2.0167): 1%| | 8/1250 [00:09<11:36, 1.78it/s]
Training 1/1 epoch (loss 2.0167): 1%| | 9/1250 [00:09<10:00, 2.07it/s]
Training 1/1 epoch (loss 2.0118): 1%| | 9/1250 [00:09<10:00, 2.07it/s]
Training 1/1 epoch (loss 2.0118): 1%| | 10/1250 [00:09<09:13, 2.24it/s]
Training 1/1 epoch (loss 1.9870): 1%| | 10/1250 [00:10<09:13, 2.24it/s]
Training 1/1 epoch (loss 1.9870): 1%| | 11/1250 [00:10<08:29, 2.43it/s]
Training 1/1 epoch (loss 1.9650): 1%| | 11/1250 [00:10<08:29, 2.43it/s]
Training 1/1 epoch (loss 1.9650): 1%| | 12/1250 [00:10<07:47, 2.65it/s]
Training 1/1 epoch (loss 2.0230): 1%| | 12/1250 [00:10<07:47, 2.65it/s]
Training 1/1 epoch (loss 2.0230): 1%| | 13/1250 [00:10<07:26, 2.77it/s]
Training 1/1 epoch (loss 2.1126): 1%| | 13/1250 [00:11<07:26, 2.77it/s]
Training 1/1 epoch (loss 2.1126): 1%| | 14/1250 [00:11<07:01, 2.93it/s]
Training 1/1 epoch (loss 2.0582): 1%| | 14/1250 [00:11<07:01, 2.93it/s]
Training 1/1 epoch (loss 2.0582): 1%| | 15/1250 [00:11<06:54, 2.98it/s]
Training 1/1 epoch (loss 2.1321): 1%| | 15/1250 [00:11<06:54, 2.98it/s]
Training 1/1 epoch (loss 2.1321): 1%|β | 16/1250 [00:11<07:04, 2.90it/s]
Training 1/1 epoch (loss 2.0962): 1%|β | 16/1250 [00:12<07:04, 2.90it/s]
Training 1/1 epoch (loss 2.0962): 1%|β | 17/1250 [00:12<07:07, 2.89it/s]
Training 1/1 epoch (loss 2.0088): 1%|β | 17/1250 [00:12<07:07, 2.89it/s]
Training 1/1 epoch (loss 2.0088): 1%|β | 18/1250 [00:12<06:50, 3.00it/s]
Training 1/1 epoch (loss 2.0530): 1%|β | 18/1250 [00:12<06:50, 3.00it/s]
Training 1/1 epoch (loss 2.0530): 2%|β | 19/1250 [00:12<06:44, 3.04it/s]
Training 1/1 epoch (loss 1.9629): 2%|β | 19/1250 [00:13<06:44, 3.04it/s]
Training 1/1 epoch (loss 1.9629): 2%|β | 20/1250 [00:13<06:39, 3.08it/s]
Training 1/1 epoch (loss 2.0804): 2%|β | 20/1250 [00:13<06:39, 3.08it/s]
Training 1/1 epoch (loss 2.0804): 2%|β | 21/1250 [00:13<06:31, 3.14it/s]
Training 1/1 epoch (loss 1.9713): 2%|β | 21/1250 [00:13<06:31, 3.14it/s]
Training 1/1 epoch (loss 1.9713): 2%|β | 22/1250 [00:13<06:49, 3.00it/s]
Training 1/1 epoch (loss 2.0580): 2%|β | 22/1250 [00:14<06:49, 3.00it/s]
Training 1/1 epoch (loss 2.0580): 2%|β | 23/1250 [00:14<06:55, 2.95it/s]
Training 1/1 epoch (loss 2.0814): 2%|β | 23/1250 [00:14<06:55, 2.95it/s]
Training 1/1 epoch (loss 2.0814): 2%|β | 24/1250 [00:14<06:55, 2.95it/s]
Training 1/1 epoch (loss 1.9873): 2%|β | 24/1250 [00:14<06:55, 2.95it/s]
Training 1/1 epoch (loss 1.9873): 2%|β | 25/1250 [00:14<06:52, 2.97it/s]
Training 1/1 epoch (loss 1.9861): 2%|β | 25/1250 [00:15<06:52, 2.97it/s]
Training 1/1 epoch (loss 1.9861): 2%|β | 26/1250 [00:15<06:44, 3.03it/s]
Training 1/1 epoch (loss 1.9038): 2%|β | 26/1250 [00:15<06:44, 3.03it/s]
Training 1/1 epoch (loss 1.9038): 2%|β | 27/1250 [00:15<06:39, 3.06it/s]
Training 1/1 epoch (loss 2.0115): 2%|β | 27/1250 [00:15<06:39, 3.06it/s]
Training 1/1 epoch (loss 2.0115): 2%|β | 28/1250 [00:15<06:36, 3.08it/s]
Training 1/1 epoch (loss 2.0269): 2%|β | 28/1250 [00:16<06:36, 3.08it/s]
Training 1/1 epoch (loss 2.0269): 2%|β | 29/1250 [00:16<07:03, 2.88it/s]
Training 1/1 epoch (loss 1.9070): 2%|β | 29/1250 [00:16<07:03, 2.88it/s]
Training 1/1 epoch (loss 1.9070): 2%|β | 30/1250 [00:16<06:50, 2.97it/s]
Training 1/1 epoch (loss 2.0913): 2%|β | 30/1250 [00:16<06:50, 2.97it/s]
Training 1/1 epoch (loss 2.0913): 2%|β | 31/1250 [00:16<06:39, 3.05it/s]
Training 1/1 epoch (loss 2.0652): 2%|β | 31/1250 [00:17<06:39, 3.05it/s]
Training 1/1 epoch (loss 2.0652): 3%|β | 32/1250 [00:17<06:43, 3.02it/s]
Training 1/1 epoch (loss 1.8499): 3%|β | 32/1250 [00:17<06:43, 3.02it/s]
Training 1/1 epoch (loss 1.8499): 3%|β | 33/1250 [00:17<06:35, 3.08it/s]
Training 1/1 epoch (loss 1.7884): 3%|β | 33/1250 [00:17<06:35, 3.08it/s]
Training 1/1 epoch (loss 1.7884): 3%|β | 34/1250 [00:17<06:36, 3.07it/s]
Training 1/1 epoch (loss 1.9903): 3%|β | 34/1250 [00:18<06:36, 3.07it/s]
Training 1/1 epoch (loss 1.9903): 3%|β | 35/1250 [00:18<06:42, 3.02it/s]
Training 1/1 epoch (loss 1.9684): 3%|β | 35/1250 [00:18<06:42, 3.02it/s]
Training 1/1 epoch (loss 1.9684): 3%|β | 36/1250 [00:18<06:36, 3.07it/s]
Training 1/1 epoch (loss 1.9977): 3%|β | 36/1250 [00:18<06:36, 3.07it/s]
Training 1/1 epoch (loss 1.9977): 3%|β | 37/1250 [00:18<06:37, 3.05it/s]
Training 1/1 epoch (loss 1.8858): 3%|β | 37/1250 [00:19<06:37, 3.05it/s]
Training 1/1 epoch (loss 1.8858): 3%|β | 38/1250 [00:19<06:33, 3.08it/s]
Training 1/1 epoch (loss 1.8441): 3%|β | 38/1250 [00:19<06:33, 3.08it/s]
Training 1/1 epoch (loss 1.8441): 3%|β | 39/1250 [00:19<06:29, 3.11it/s]
Training 1/1 epoch (loss 1.8456): 3%|β | 39/1250 [00:19<06:29, 3.11it/s]
Training 1/1 epoch (loss 1.8456): 3%|β | 40/1250 [00:19<06:38, 3.04it/s]
Training 1/1 epoch (loss 1.8507): 3%|β | 40/1250 [00:20<06:38, 3.04it/s]
Training 1/1 epoch (loss 1.8507): 3%|β | 41/1250 [00:20<06:50, 2.94it/s]
Training 1/1 epoch (loss 1.9277): 3%|β | 41/1250 [00:20<06:50, 2.94it/s]
Training 1/1 epoch (loss 1.9277): 3%|β | 42/1250 [00:20<06:40, 3.02it/s]
Training 1/1 epoch (loss 1.8635): 3%|β | 42/1250 [00:20<06:40, 3.02it/s]
Training 1/1 epoch (loss 1.8635): 3%|β | 43/1250 [00:20<06:33, 3.07it/s]
Training 1/1 epoch (loss 1.8961): 3%|β | 43/1250 [00:21<06:33, 3.07it/s]
Training 1/1 epoch (loss 1.8961): 4%|β | 44/1250 [00:21<06:24, 3.14it/s]
Training 1/1 epoch (loss 1.9990): 4%|β | 44/1250 [00:21<06:24, 3.14it/s]
Training 1/1 epoch (loss 1.9990): 4%|β | 45/1250 [00:21<06:18, 3.19it/s]
Training 1/1 epoch (loss 1.8932): 4%|β | 45/1250 [00:21<06:18, 3.19it/s]
Training 1/1 epoch (loss 1.8932): 4%|β | 46/1250 [00:21<06:21, 3.15it/s]
Training 1/1 epoch (loss 1.8435): 4%|β | 46/1250 [00:22<06:21, 3.15it/s]
Training 1/1 epoch (loss 1.8435): 4%|β | 47/1250 [00:22<06:28, 3.09it/s]
Training 1/1 epoch (loss 1.8206): 4%|β | 47/1250 [00:22<06:28, 3.09it/s]
Training 1/1 epoch (loss 1.8206): 4%|β | 48/1250 [00:22<06:35, 3.04it/s]
Training 1/1 epoch (loss 1.7980): 4%|β | 48/1250 [00:22<06:35, 3.04it/s]
Training 1/1 epoch (loss 1.7980): 4%|β | 49/1250 [00:22<06:23, 3.13it/s]
Training 1/1 epoch (loss 1.8185): 4%|β | 49/1250 [00:23<06:23, 3.13it/s]
Training 1/1 epoch (loss 1.8185): 4%|β | 50/1250 [00:23<06:19, 3.16it/s]
Training 1/1 epoch (loss 1.7631): 4%|β | 50/1250 [00:23<06:19, 3.16it/s]
Training 1/1 epoch (loss 1.7631): 4%|β | 51/1250 [00:23<06:22, 3.13it/s]
Training 1/1 epoch (loss 1.9100): 4%|β | 51/1250 [00:23<06:22, 3.13it/s]
Training 1/1 epoch (loss 1.9100): 4%|β | 52/1250 [00:23<06:23, 3.13it/s]
Training 1/1 epoch (loss 1.9352): 4%|β | 52/1250 [00:24<06:23, 3.13it/s]
Training 1/1 epoch (loss 1.9352): 4%|β | 53/1250 [00:24<06:40, 2.99it/s]
Training 1/1 epoch (loss 1.9555): 4%|β | 53/1250 [00:24<06:40, 2.99it/s]
Training 1/1 epoch (loss 1.9555): 4%|β | 54/1250 [00:24<06:47, 2.94it/s]
Training 1/1 epoch (loss 1.9252): 4%|β | 54/1250 [00:24<06:47, 2.94it/s]
Training 1/1 epoch (loss 1.9252): 4%|β | 55/1250 [00:24<06:36, 3.01it/s]
Training 1/1 epoch (loss 1.8979): 4%|β | 55/1250 [00:25<06:36, 3.01it/s]
Training 1/1 epoch (loss 1.8979): 4%|β | 56/1250 [00:25<06:33, 3.04it/s]
Training 1/1 epoch (loss 1.6819): 4%|β | 56/1250 [00:25<06:33, 3.04it/s]
Training 1/1 epoch (loss 1.6819): 5%|β | 57/1250 [00:25<06:24, 3.11it/s]
Training 1/1 epoch (loss 1.7166): 5%|β | 57/1250 [00:25<06:24, 3.11it/s]
Training 1/1 epoch (loss 1.7166): 5%|β | 58/1250 [00:25<06:15, 3.18it/s]
Training 1/1 epoch (loss 1.7133): 5%|β | 58/1250 [00:25<06:15, 3.18it/s]
Training 1/1 epoch (loss 1.7133): 5%|β | 59/1250 [00:25<06:28, 3.07it/s]
Training 1/1 epoch (loss 1.7444): 5%|β | 59/1250 [00:26<06:28, 3.07it/s]
Training 1/1 epoch (loss 1.7444): 5%|β | 60/1250 [00:26<06:41, 2.96it/s]
Training 1/1 epoch (loss 1.8374): 5%|β | 60/1250 [00:26<06:41, 2.96it/s]
Training 1/1 epoch (loss 1.8374): 5%|β | 61/1250 [00:26<06:39, 2.98it/s]
Training 1/1 epoch (loss 1.8440): 5%|β | 61/1250 [00:26<06:39, 2.98it/s]
Training 1/1 epoch (loss 1.8440): 5%|β | 62/1250 [00:26<06:30, 3.04it/s]
Training 1/1 epoch (loss 1.8713): 5%|β | 62/1250 [00:27<06:30, 3.04it/s]
Training 1/1 epoch (loss 1.8713): 5%|β | 63/1250 [00:27<06:24, 3.09it/s]
Training 1/1 epoch (loss 1.8771): 5%|β | 63/1250 [00:27<06:24, 3.09it/s]
Training 1/1 epoch (loss 1.8771): 5%|β | 64/1250 [00:27<06:21, 3.11it/s]
Training 1/1 epoch (loss 1.7059): 5%|β | 64/1250 [00:27<06:21, 3.11it/s]
Training 1/1 epoch (loss 1.7059): 5%|β | 65/1250 [00:27<06:25, 3.08it/s]
Training 1/1 epoch (loss 1.8282): 5%|β | 65/1250 [00:28<06:25, 3.08it/s]
Training 1/1 epoch (loss 1.8282): 5%|β | 66/1250 [00:28<06:42, 2.94it/s]
Training 1/1 epoch (loss 1.7099): 5%|β | 66/1250 [00:28<06:42, 2.94it/s]
Training 1/1 epoch (loss 1.7099): 5%|β | 67/1250 [00:28<06:30, 3.03it/s]
Training 1/1 epoch (loss 1.7050): 5%|β | 67/1250 [00:28<06:30, 3.03it/s]
Training 1/1 epoch (loss 1.7050): 5%|β | 68/1250 [00:28<06:24, 3.07it/s]
Training 1/1 epoch (loss 1.7280): 5%|β | 68/1250 [00:29<06:24, 3.07it/s]
Training 1/1 epoch (loss 1.7280): 6%|β | 69/1250 [00:29<06:25, 3.06it/s]
Training 1/1 epoch (loss 1.6351): 6%|β | 69/1250 [00:29<06:25, 3.06it/s]
Training 1/1 epoch (loss 1.6351): 6%|β | 70/1250 [00:29<06:21, 3.09it/s]
Training 1/1 epoch (loss 1.5403): 6%|β | 70/1250 [00:29<06:21, 3.09it/s]
Training 1/1 epoch (loss 1.5403): 6%|β | 71/1250 [00:29<06:32, 3.01it/s]
Training 1/1 epoch (loss 1.8749): 6%|β | 71/1250 [00:30<06:32, 3.01it/s]
Training 1/1 epoch (loss 1.8749): 6%|β | 72/1250 [00:30<06:38, 2.96it/s]
Training 1/1 epoch (loss 1.7015): 6%|β | 72/1250 [00:30<06:38, 2.96it/s]
Training 1/1 epoch (loss 1.7015): 6%|β | 73/1250 [00:30<06:32, 3.00it/s]
Training 1/1 epoch (loss 1.7549): 6%|β | 73/1250 [00:30<06:32, 3.00it/s]
Training 1/1 epoch (loss 1.7549): 6%|β | 74/1250 [00:30<06:22, 3.07it/s]
Training 1/1 epoch (loss 1.7548): 6%|β | 74/1250 [00:31<06:22, 3.07it/s]
Training 1/1 epoch (loss 1.7548): 6%|β | 75/1250 [00:31<06:16, 3.12it/s]
Training 1/1 epoch (loss 1.8467): 6%|β | 75/1250 [00:31<06:16, 3.12it/s]
Training 1/1 epoch (loss 1.8467): 6%|β | 76/1250 [00:31<06:15, 3.12it/s]
Training 1/1 epoch (loss 1.6975): 6%|β | 76/1250 [00:31<06:15, 3.12it/s]
Training 1/1 epoch (loss 1.6975): 6%|β | 77/1250 [00:31<06:11, 3.16it/s]
Training 1/1 epoch (loss 1.6910): 6%|β | 77/1250 [00:32<06:11, 3.16it/s]
Training 1/1 epoch (loss 1.6910): 6%|β | 78/1250 [00:32<06:27, 3.02it/s]
Training 1/1 epoch (loss 1.8507): 6%|β | 78/1250 [00:32<06:27, 3.02it/s]
Training 1/1 epoch (loss 1.8507): 6%|β | 79/1250 [00:32<06:27, 3.03it/s]
Training 1/1 epoch (loss 1.6599): 6%|β | 79/1250 [00:32<06:27, 3.03it/s]
Training 1/1 epoch (loss 1.6599): 6%|β | 80/1250 [00:32<06:22, 3.06it/s]
Training 1/1 epoch (loss 1.7499): 6%|β | 80/1250 [00:33<06:22, 3.06it/s]
Training 1/1 epoch (loss 1.7499): 6%|β | 81/1250 [00:33<06:17, 3.10it/s]
Training 1/1 epoch (loss 1.7430): 6%|β | 81/1250 [00:33<06:17, 3.10it/s]
Training 1/1 epoch (loss 1.7430): 7%|β | 82/1250 [00:33<06:28, 3.01it/s]
Training 1/1 epoch (loss 1.7564): 7%|β | 82/1250 [00:33<06:28, 3.01it/s]
Training 1/1 epoch (loss 1.7564): 7%|β | 83/1250 [00:33<06:31, 2.98it/s]
Training 1/1 epoch (loss 1.8237): 7%|β | 83/1250 [00:34<06:31, 2.98it/s]
Training 1/1 epoch (loss 1.8237): 7%|β | 84/1250 [00:34<06:37, 2.94it/s]
Training 1/1 epoch (loss 1.7649): 7%|β | 84/1250 [00:34<06:37, 2.94it/s]
Training 1/1 epoch (loss 1.7649): 7%|β | 85/1250 [00:34<06:32, 2.97it/s]
Training 1/1 epoch (loss 1.7802): 7%|β | 85/1250 [00:34<06:32, 2.97it/s]
Training 1/1 epoch (loss 1.7802): 7%|β | 86/1250 [00:34<06:24, 3.03it/s]
Training 1/1 epoch (loss 1.6137): 7%|β | 86/1250 [00:35<06:24, 3.03it/s]
Training 1/1 epoch (loss 1.6137): 7%|β | 87/1250 [00:35<06:17, 3.08it/s]
Training 1/1 epoch (loss 1.7510): 7%|β | 87/1250 [00:35<06:17, 3.08it/s]
Training 1/1 epoch (loss 1.7510): 7%|β | 88/1250 [00:35<06:18, 3.07it/s]
Training 1/1 epoch (loss 1.7444): 7%|β | 88/1250 [00:35<06:18, 3.07it/s]
Training 1/1 epoch (loss 1.7444): 7%|β | 89/1250 [00:35<06:16, 3.08it/s]
Training 1/1 epoch (loss 1.7187): 7%|β | 89/1250 [00:36<06:16, 3.08it/s]
Training 1/1 epoch (loss 1.7187): 7%|β | 90/1250 [00:36<06:09, 3.14it/s]
Training 1/1 epoch (loss 1.7448): 7%|β | 90/1250 [00:36<06:09, 3.14it/s]
Training 1/1 epoch (loss 1.7448): 7%|β | 91/1250 [00:36<06:08, 3.14it/s]
Training 1/1 epoch (loss 1.6124): 7%|β | 91/1250 [00:36<06:08, 3.14it/s]
Training 1/1 epoch (loss 1.6124): 7%|β | 92/1250 [00:36<06:22, 3.03it/s]
Training 1/1 epoch (loss 1.7273): 7%|β | 92/1250 [00:37<06:22, 3.03it/s]
Training 1/1 epoch (loss 1.7273): 7%|β | 93/1250 [00:37<06:13, 3.10it/s]
Training 1/1 epoch (loss 1.7605): 7%|β | 93/1250 [00:37<06:13, 3.10it/s]
Training 1/1 epoch (loss 1.7605): 8%|β | 94/1250 [00:37<06:11, 3.11it/s]
Training 1/1 epoch (loss 1.7138): 8%|β | 94/1250 [00:37<06:11, 3.11it/s]
Training 1/1 epoch (loss 1.7138): 8%|β | 95/1250 [00:37<06:03, 3.18it/s]
Training 1/1 epoch (loss 1.7958): 8%|β | 95/1250 [00:38<06:03, 3.18it/s]
Training 1/1 epoch (loss 1.7958): 8%|β | 96/1250 [00:38<06:23, 3.01it/s]
Training 1/1 epoch (loss 1.7015): 8%|β | 96/1250 [00:38<06:23, 3.01it/s]
Training 1/1 epoch (loss 1.7015): 8%|β | 97/1250 [00:38<06:35, 2.91it/s]
Training 1/1 epoch (loss 1.7864): 8%|β | 97/1250 [00:38<06:35, 2.91it/s]
Training 1/1 epoch (loss 1.7864): 8%|β | 98/1250 [00:38<06:35, 2.91it/s]
Training 1/1 epoch (loss 1.6665): 8%|β | 98/1250 [00:39<06:35, 2.91it/s]
Training 1/1 epoch (loss 1.6665): 8%|β | 99/1250 [00:39<06:22, 3.01it/s]
Training 1/1 epoch (loss 1.5827): 8%|β | 99/1250 [00:39<06:22, 3.01it/s]
Training 1/1 epoch (loss 1.5827): 8%|β | 100/1250 [00:39<06:14, 3.07it/s]
Training 1/1 epoch (loss 1.5312): 8%|β | 100/1250 [00:39<06:14, 3.07it/s]
Training 1/1 epoch (loss 1.5312): 8%|β | 101/1250 [00:39<06:19, 3.03it/s]
Training 1/1 epoch (loss 1.7691): 8%|β | 101/1250 [00:40<06:19, 3.03it/s]
Training 1/1 epoch (loss 1.7691): 8%|β | 102/1250 [00:40<06:26, 2.97it/s]
Training 1/1 epoch (loss 1.7201): 8%|β | 102/1250 [00:40<06:26, 2.97it/s]
Training 1/1 epoch (loss 1.7201): 8%|β | 103/1250 [00:40<06:29, 2.95it/s]
Training 1/1 epoch (loss 1.7312): 8%|β | 103/1250 [00:40<06:29, 2.95it/s]
Training 1/1 epoch (loss 1.7312): 8%|β | 104/1250 [00:40<06:24, 2.98it/s]
Training 1/1 epoch (loss 1.8135): 8%|β | 104/1250 [00:41<06:24, 2.98it/s]
Training 1/1 epoch (loss 1.8135): 8%|β | 105/1250 [00:41<06:16, 3.04it/s]
Training 1/1 epoch (loss 1.7244): 8%|β | 105/1250 [00:41<06:16, 3.04it/s]
Training 1/1 epoch (loss 1.7244): 8%|β | 106/1250 [00:41<06:06, 3.12it/s]
Training 1/1 epoch (loss 1.6460): 8%|β | 106/1250 [00:41<06:06, 3.12it/s]
Training 1/1 epoch (loss 1.6460): 9%|β | 107/1250 [00:41<06:06, 3.12it/s]
Training 1/1 epoch (loss 1.7568): 9%|β | 107/1250 [00:42<06:06, 3.12it/s]
Training 1/1 epoch (loss 1.7568): 9%|β | 108/1250 [00:42<06:10, 3.08it/s]
Training 1/1 epoch (loss 1.6587): 9%|β | 108/1250 [00:42<06:10, 3.08it/s]
Training 1/1 epoch (loss 1.6587): 9%|β | 109/1250 [00:42<06:21, 2.99it/s]
Training 1/1 epoch (loss 1.7155): 9%|β | 109/1250 [00:42<06:21, 2.99it/s]
Training 1/1 epoch (loss 1.7155): 9%|β | 110/1250 [00:42<06:22, 2.98it/s]
Training 1/1 epoch (loss 1.6509): 9%|β | 110/1250 [00:43<06:22, 2.98it/s]
Training 1/1 epoch (loss 1.6509): 9%|β | 111/1250 [00:43<06:13, 3.05it/s]
Training 1/1 epoch (loss 1.6643): 9%|β | 111/1250 [00:43<06:13, 3.05it/s]
Training 1/1 epoch (loss 1.6643): 9%|β | 112/1250 [00:43<06:14, 3.04it/s]
Training 1/1 epoch (loss 1.6104): 9%|β | 112/1250 [00:43<06:14, 3.04it/s]
Training 1/1 epoch (loss 1.6104): 9%|β | 113/1250 [00:43<06:13, 3.04it/s]
Training 1/1 epoch (loss 1.6168): 9%|β | 113/1250 [00:44<06:13, 3.04it/s]
Training 1/1 epoch (loss 1.6168): 9%|β | 114/1250 [00:44<06:09, 3.07it/s]
Training 1/1 epoch (loss 1.6033): 9%|β | 114/1250 [00:44<06:09, 3.07it/s]
Training 1/1 epoch (loss 1.6033): 9%|β | 115/1250 [00:44<06:17, 3.01it/s]
Training 1/1 epoch (loss 1.6847): 9%|β | 115/1250 [00:44<06:17, 3.01it/s]
Training 1/1 epoch (loss 1.6847): 9%|β | 116/1250 [00:44<06:15, 3.02it/s]
Training 1/1 epoch (loss 1.7735): 9%|β | 116/1250 [00:45<06:15, 3.02it/s]
Training 1/1 epoch (loss 1.7735): 9%|β | 117/1250 [00:45<06:41, 2.82it/s]
Training 1/1 epoch (loss 1.7128): 9%|β | 117/1250 [00:45<06:41, 2.82it/s]
Training 1/1 epoch (loss 1.7128): 9%|β | 118/1250 [00:45<06:28, 2.91it/s]
Training 1/1 epoch (loss 1.7020): 9%|β | 118/1250 [00:45<06:28, 2.91it/s]
Training 1/1 epoch (loss 1.7020): 10%|β | 119/1250 [00:45<06:27, 2.92it/s]
Training 1/1 epoch (loss 1.6180): 10%|β | 119/1250 [00:46<06:27, 2.92it/s]
Training 1/1 epoch (loss 1.6180): 10%|β | 120/1250 [00:46<06:52, 2.74it/s]
Training 1/1 epoch (loss 1.7101): 10%|β | 120/1250 [00:46<06:52, 2.74it/s]
Training 1/1 epoch (loss 1.7101): 10%|β | 121/1250 [00:46<06:34, 2.86it/s]
Training 1/1 epoch (loss 1.7494): 10%|β | 121/1250 [00:46<06:34, 2.86it/s]
Training 1/1 epoch (loss 1.7494): 10%|β | 122/1250 [00:46<06:31, 2.88it/s]
Training 1/1 epoch (loss 1.6206): 10%|β | 122/1250 [00:47<06:31, 2.88it/s]
Training 1/1 epoch (loss 1.6206): 10%|β | 123/1250 [00:47<06:27, 2.91it/s]
Training 1/1 epoch (loss 1.6345): 10%|β | 123/1250 [00:47<06:27, 2.91it/s]
Training 1/1 epoch (loss 1.6345): 10%|β | 124/1250 [00:47<06:12, 3.02it/s]
Training 1/1 epoch (loss 1.7739): 10%|β | 124/1250 [00:47<06:12, 3.02it/s]
Training 1/1 epoch (loss 1.7739): 10%|β | 125/1250 [00:47<06:07, 3.06it/s]
Training 1/1 epoch (loss 1.6055): 10%|β | 125/1250 [00:48<06:07, 3.06it/s]
Training 1/1 epoch (loss 1.6055): 10%|β | 126/1250 [00:48<06:26, 2.91it/s]
Training 1/1 epoch (loss 1.7526): 10%|β | 126/1250 [00:48<06:26, 2.91it/s]
Training 1/1 epoch (loss 1.7526): 10%|β | 127/1250 [00:48<06:24, 2.92it/s]
Training 1/1 epoch (loss 1.6757): 10%|β | 127/1250 [00:48<06:24, 2.92it/s]
Training 1/1 epoch (loss 1.6757): 10%|β | 128/1250 [00:48<06:20, 2.95it/s]
Training 1/1 epoch (loss 1.7244): 10%|β | 128/1250 [00:49<06:20, 2.95it/s]
Training 1/1 epoch (loss 1.7244): 10%|β | 129/1250 [00:49<06:17, 2.97it/s]
Training 1/1 epoch (loss 1.6343): 10%|β | 129/1250 [00:49<06:17, 2.97it/s]
Training 1/1 epoch (loss 1.6343): 10%|β | 130/1250 [00:49<06:13, 3.00it/s]
Training 1/1 epoch (loss 1.6563): 10%|β | 130/1250 [00:49<06:13, 3.00it/s]
Training 1/1 epoch (loss 1.6563): 10%|β | 131/1250 [00:49<06:12, 3.01it/s]
Training 1/1 epoch (loss 1.6773): 10%|β | 131/1250 [00:50<06:12, 3.01it/s]
Training 1/1 epoch (loss 1.6773): 11%|β | 132/1250 [00:50<06:26, 2.90it/s]
Training 1/1 epoch (loss 1.7196): 11%|β | 132/1250 [00:50<06:26, 2.90it/s]
Training 1/1 epoch (loss 1.7196): 11%|β | 133/1250 [00:50<06:21, 2.93it/s]
Training 1/1 epoch (loss 1.6251): 11%|β | 133/1250 [00:50<06:21, 2.93it/s]
Training 1/1 epoch (loss 1.6251): 11%|β | 134/1250 [00:50<06:06, 3.05it/s]
Training 1/1 epoch (loss 1.7006): 11%|β | 134/1250 [00:51<06:06, 3.05it/s]
Training 1/1 epoch (loss 1.7006): 11%|β | 135/1250 [00:51<06:04, 3.06it/s]
Training 1/1 epoch (loss 1.7195): 11%|β | 135/1250 [00:51<06:04, 3.06it/s]
Training 1/1 epoch (loss 1.7195): 11%|β | 136/1250 [00:51<06:02, 3.07it/s]
Training 1/1 epoch (loss 1.6672): 11%|β | 136/1250 [00:51<06:02, 3.07it/s]
Training 1/1 epoch (loss 1.6672): 11%|β | 137/1250 [00:51<06:04, 3.05it/s]
Training 1/1 epoch (loss 1.5522): 11%|β | 137/1250 [00:52<06:04, 3.05it/s]
Training 1/1 epoch (loss 1.5522): 11%|β | 138/1250 [00:52<06:03, 3.06it/s]
Training 1/1 epoch (loss 1.5814): 11%|β | 138/1250 [00:52<06:03, 3.06it/s]
Training 1/1 epoch (loss 1.5814): 11%|β | 139/1250 [00:52<06:04, 3.05it/s]
Training 1/1 epoch (loss 1.7004): 11%|β | 139/1250 [00:52<06:04, 3.05it/s]
Training 1/1 epoch (loss 1.7004): 11%|β | 140/1250 [00:52<05:58, 3.10it/s]
Training 1/1 epoch (loss 1.7011): 11%|β | 140/1250 [00:53<05:58, 3.10it/s]
Training 1/1 epoch (loss 1.7011): 11%|ββ | 141/1250 [00:53<05:52, 3.15it/s]
Training 1/1 epoch (loss 1.6075): 11%|ββ | 141/1250 [00:53<05:52, 3.15it/s]
Training 1/1 epoch (loss 1.6075): 11%|ββ | 142/1250 [00:53<05:57, 3.10it/s]
Training 1/1 epoch (loss 1.7217): 11%|ββ | 142/1250 [00:53<05:57, 3.10it/s]
Training 1/1 epoch (loss 1.7217): 11%|ββ | 143/1250 [00:53<05:53, 3.13it/s]
Training 1/1 epoch (loss 1.6726): 11%|ββ | 143/1250 [00:54<05:53, 3.13it/s]
Training 1/1 epoch (loss 1.6726): 12%|ββ | 144/1250 [00:54<06:05, 3.03it/s]
Training 1/1 epoch (loss 1.5462): 12%|ββ | 144/1250 [00:54<06:05, 3.03it/s]
Training 1/1 epoch (loss 1.5462): 12%|ββ | 145/1250 [00:54<06:06, 3.01it/s]
Training 1/1 epoch (loss 1.6133): 12%|ββ | 145/1250 [00:54<06:06, 3.01it/s]
Training 1/1 epoch (loss 1.6133): 12%|ββ | 146/1250 [00:54<06:00, 3.06it/s]
Training 1/1 epoch (loss 1.7072): 12%|ββ | 146/1250 [00:55<06:00, 3.06it/s]
Training 1/1 epoch (loss 1.7072): 12%|ββ | 147/1250 [00:55<05:55, 3.10it/s]
Training 1/1 epoch (loss 1.5206): 12%|ββ | 147/1250 [00:55<05:55, 3.10it/s]
Training 1/1 epoch (loss 1.5206): 12%|ββ | 148/1250 [00:55<05:52, 3.13it/s]
Training 1/1 epoch (loss 1.5996): 12%|ββ | 148/1250 [00:55<05:52, 3.13it/s]
Training 1/1 epoch (loss 1.5996): 12%|ββ | 149/1250 [00:55<06:05, 3.01it/s]
Training 1/1 epoch (loss 1.7479): 12%|ββ | 149/1250 [00:56<06:05, 3.01it/s]
Training 1/1 epoch (loss 1.7479): 12%|ββ | 150/1250 [00:56<06:01, 3.04it/s]
Training 1/1 epoch (loss 1.6373): 12%|ββ | 150/1250 [00:56<06:01, 3.04it/s]
Training 1/1 epoch (loss 1.6373): 12%|ββ | 151/1250 [00:56<06:04, 3.02it/s]
Training 1/1 epoch (loss 1.6293): 12%|ββ | 151/1250 [00:56<06:04, 3.02it/s]
Training 1/1 epoch (loss 1.6293): 12%|ββ | 152/1250 [00:56<06:04, 3.01it/s]
Training 1/1 epoch (loss 1.5753): 12%|ββ | 152/1250 [00:57<06:04, 3.01it/s]
Training 1/1 epoch (loss 1.5753): 12%|ββ | 153/1250 [00:57<06:06, 2.99it/s]
Training 1/1 epoch (loss 1.6842): 12%|ββ | 153/1250 [00:57<06:06, 2.99it/s]
Training 1/1 epoch (loss 1.6842): 12%|ββ | 154/1250 [00:57<06:02, 3.02it/s]
Training 1/1 epoch (loss 1.6403): 12%|ββ | 154/1250 [00:57<06:02, 3.02it/s]
Training 1/1 epoch (loss 1.6403): 12%|ββ | 155/1250 [00:57<06:16, 2.91it/s]
Training 1/1 epoch (loss 1.7937): 12%|ββ | 155/1250 [00:58<06:16, 2.91it/s]
Training 1/1 epoch (loss 1.7937): 12%|ββ | 156/1250 [00:58<06:22, 2.86it/s]
Training 1/1 epoch (loss 1.6188): 12%|ββ | 156/1250 [00:58<06:22, 2.86it/s]
Training 1/1 epoch (loss 1.6188): 13%|ββ | 157/1250 [00:58<06:17, 2.89it/s]
Training 1/1 epoch (loss 1.6577): 13%|ββ | 157/1250 [00:58<06:17, 2.89it/s]
Training 1/1 epoch (loss 1.6577): 13%|ββ | 158/1250 [00:58<06:07, 2.97it/s]
Training 1/1 epoch (loss 1.7033): 13%|ββ | 158/1250 [00:59<06:07, 2.97it/s]
Training 1/1 epoch (loss 1.7033): 13%|ββ | 159/1250 [00:59<06:00, 3.02it/s]
Training 1/1 epoch (loss 1.6625): 13%|ββ | 159/1250 [00:59<06:00, 3.02it/s]
Training 1/1 epoch (loss 1.6625): 13%|ββ | 160/1250 [00:59<05:59, 3.03it/s]
Training 1/1 epoch (loss 1.7135): 13%|ββ | 160/1250 [00:59<05:59, 3.03it/s]
Training 1/1 epoch (loss 1.7135): 13%|ββ | 161/1250 [00:59<05:53, 3.08it/s]
Training 1/1 epoch (loss 1.6255): 13%|ββ | 161/1250 [01:00<05:53, 3.08it/s]
Training 1/1 epoch (loss 1.6255): 13%|ββ | 162/1250 [01:00<06:02, 3.00it/s]
Training 1/1 epoch (loss 1.6038): 13%|ββ | 162/1250 [01:00<06:02, 3.00it/s]
Training 1/1 epoch (loss 1.6038): 13%|ββ | 163/1250 [01:00<06:10, 2.93it/s]
Training 1/1 epoch (loss 1.6483): 13%|ββ | 163/1250 [01:00<06:10, 2.93it/s]
Training 1/1 epoch (loss 1.6483): 13%|ββ | 164/1250 [01:00<06:25, 2.82it/s]
Training 1/1 epoch (loss 1.7052): 13%|ββ | 164/1250 [01:01<06:25, 2.82it/s]
Training 1/1 epoch (loss 1.7052): 13%|ββ | 165/1250 [01:01<06:12, 2.91it/s]
Training 1/1 epoch (loss 1.6509): 13%|ββ | 165/1250 [01:01<06:12, 2.91it/s]
Training 1/1 epoch (loss 1.6509): 13%|ββ | 166/1250 [01:01<06:00, 3.00it/s]
Training 1/1 epoch (loss 1.6489): 13%|ββ | 166/1250 [01:01<06:00, 3.00it/s]
Training 1/1 epoch (loss 1.6489): 13%|ββ | 167/1250 [01:01<05:55, 3.05it/s]
Training 1/1 epoch (loss 1.7147): 13%|ββ | 167/1250 [01:02<05:55, 3.05it/s]
Training 1/1 epoch (loss 1.7147): 13%|ββ | 168/1250 [01:02<06:06, 2.96it/s]
Training 1/1 epoch (loss 1.5914): 13%|ββ | 168/1250 [01:02<06:06, 2.96it/s]
Training 1/1 epoch (loss 1.5914): 14%|ββ | 169/1250 [01:02<06:11, 2.91it/s]
Training 1/1 epoch (loss 1.7178): 14%|ββ | 169/1250 [01:02<06:11, 2.91it/s]
Training 1/1 epoch (loss 1.7178): 14%|ββ | 170/1250 [01:02<06:04, 2.97it/s]
Training 1/1 epoch (loss 1.5484): 14%|ββ | 170/1250 [01:03<06:04, 2.97it/s]
Training 1/1 epoch (loss 1.5484): 14%|ββ | 171/1250 [01:03<05:53, 3.05it/s]
Training 1/1 epoch (loss 1.6309): 14%|ββ | 171/1250 [01:03<05:53, 3.05it/s]
Training 1/1 epoch (loss 1.6309): 14%|ββ | 172/1250 [01:03<05:51, 3.07it/s]
Training 1/1 epoch (loss 1.4909): 14%|ββ | 172/1250 [01:03<05:51, 3.07it/s]
Training 1/1 epoch (loss 1.4909): 14%|ββ | 173/1250 [01:03<05:52, 3.05it/s]
Training 1/1 epoch (loss 1.5342): 14%|ββ | 173/1250 [01:04<05:52, 3.05it/s]
Training 1/1 epoch (loss 1.5342): 14%|ββ | 174/1250 [01:04<05:51, 3.06it/s]
Training 1/1 epoch (loss 1.6587): 14%|ββ | 174/1250 [01:04<05:51, 3.06it/s]
Training 1/1 epoch (loss 1.6587): 14%|ββ | 175/1250 [01:04<05:55, 3.02it/s]
Training 1/1 epoch (loss 1.6396): 14%|ββ | 175/1250 [01:04<05:55, 3.02it/s]
Training 1/1 epoch (loss 1.6396): 14%|ββ | 176/1250 [01:04<05:55, 3.02it/s]
Training 1/1 epoch (loss 1.6741): 14%|ββ | 176/1250 [01:05<05:55, 3.02it/s]
Training 1/1 epoch (loss 1.6741): 14%|ββ | 177/1250 [01:05<05:51, 3.05it/s]
Training 1/1 epoch (loss 1.6257): 14%|ββ | 177/1250 [01:05<05:51, 3.05it/s]
Training 1/1 epoch (loss 1.6257): 14%|ββ | 178/1250 [01:05<05:41, 3.14it/s]
Training 1/1 epoch (loss 1.5932): 14%|ββ | 178/1250 [01:05<05:41, 3.14it/s]
Training 1/1 epoch (loss 1.5932): 14%|ββ | 179/1250 [01:05<05:43, 3.11it/s]
Training 1/1 epoch (loss 1.6069): 14%|ββ | 179/1250 [01:06<05:43, 3.11it/s]
Training 1/1 epoch (loss 1.6069): 14%|ββ | 180/1250 [01:06<06:02, 2.95it/s]
Training 1/1 epoch (loss 1.6143): 14%|ββ | 180/1250 [01:06<06:02, 2.95it/s]
Training 1/1 epoch (loss 1.6143): 14%|ββ | 181/1250 [01:06<06:00, 2.96it/s]
Training 1/1 epoch (loss 1.7559): 14%|ββ | 181/1250 [01:06<06:00, 2.96it/s]
Training 1/1 epoch (loss 1.7559): 15%|ββ | 182/1250 [01:06<05:53, 3.02it/s]
Training 1/1 epoch (loss 1.6504): 15%|ββ | 182/1250 [01:07<05:53, 3.02it/s]
Training 1/1 epoch (loss 1.6504): 15%|ββ | 183/1250 [01:07<05:45, 3.09it/s]
Training 1/1 epoch (loss 1.7136): 15%|ββ | 183/1250 [01:07<05:45, 3.09it/s]
Training 1/1 epoch (loss 1.7136): 15%|ββ | 184/1250 [01:07<05:55, 3.00it/s]
Training 1/1 epoch (loss 1.6592): 15%|ββ | 184/1250 [01:07<05:55, 3.00it/s]
Training 1/1 epoch (loss 1.6592): 15%|ββ | 185/1250 [01:07<05:48, 3.05it/s]
Training 1/1 epoch (loss 1.7109): 15%|ββ | 185/1250 [01:08<05:48, 3.05it/s]
Training 1/1 epoch (loss 1.7109): 15%|ββ | 186/1250 [01:08<05:47, 3.06it/s]
Training 1/1 epoch (loss 1.6405): 15%|ββ | 186/1250 [01:08<05:47, 3.06it/s]
Training 1/1 epoch (loss 1.6405): 15%|ββ | 187/1250 [01:08<05:53, 3.01it/s]
Training 1/1 epoch (loss 1.6203): 15%|ββ | 187/1250 [01:08<05:53, 3.01it/s]
Training 1/1 epoch (loss 1.6203): 15%|ββ | 188/1250 [01:08<05:52, 3.02it/s]
Training 1/1 epoch (loss 1.6863): 15%|ββ | 188/1250 [01:09<05:52, 3.02it/s]
Training 1/1 epoch (loss 1.6863): 15%|ββ | 189/1250 [01:09<05:51, 3.02it/s]
Training 1/1 epoch (loss 1.6963): 15%|ββ | 189/1250 [01:09<05:51, 3.02it/s]
Training 1/1 epoch (loss 1.6963): 15%|ββ | 190/1250 [01:09<05:43, 3.09it/s]
Training 1/1 epoch (loss 1.6401): 15%|ββ | 190/1250 [01:09<05:43, 3.09it/s]
Training 1/1 epoch (loss 1.6401): 15%|ββ | 191/1250 [01:09<05:34, 3.17it/s]
Training 1/1 epoch (loss 1.6719): 15%|ββ | 191/1250 [01:10<05:34, 3.17it/s]
Training 1/1 epoch (loss 1.6719): 15%|ββ | 192/1250 [01:10<05:46, 3.05it/s]
Training 1/1 epoch (loss 1.5388): 15%|ββ | 192/1250 [01:10<05:46, 3.05it/s]
Training 1/1 epoch (loss 1.5388): 15%|ββ | 193/1250 [01:10<05:51, 3.00it/s]
Training 1/1 epoch (loss 1.5542): 15%|ββ | 193/1250 [01:10<05:51, 3.00it/s]
Training 1/1 epoch (loss 1.5542): 16%|ββ | 194/1250 [01:10<05:48, 3.03it/s]
Training 1/1 epoch (loss 1.6488): 16%|ββ | 194/1250 [01:11<05:48, 3.03it/s]
Training 1/1 epoch (loss 1.6488): 16%|ββ | 195/1250 [01:11<05:47, 3.03it/s]
Training 1/1 epoch (loss 1.6269): 16%|ββ | 195/1250 [01:11<05:47, 3.03it/s]
Training 1/1 epoch (loss 1.6269): 16%|ββ | 196/1250 [01:11<05:45, 3.05it/s]
Training 1/1 epoch (loss 1.6381): 16%|ββ | 196/1250 [01:11<05:45, 3.05it/s]
Training 1/1 epoch (loss 1.6381): 16%|ββ | 197/1250 [01:11<05:43, 3.06it/s]
Training 1/1 epoch (loss 1.6572): 16%|ββ | 197/1250 [01:12<05:43, 3.06it/s]
Training 1/1 epoch (loss 1.6572): 16%|ββ | 198/1250 [01:12<05:50, 3.00it/s]
Training 1/1 epoch (loss 1.7127): 16%|ββ | 198/1250 [01:12<05:50, 3.00it/s]
Training 1/1 epoch (loss 1.7127): 16%|ββ | 199/1250 [01:12<05:55, 2.95it/s]
Training 1/1 epoch (loss 1.5168): 16%|ββ | 199/1250 [01:12<05:55, 2.95it/s]
Training 1/1 epoch (loss 1.5168): 16%|ββ | 200/1250 [01:12<05:50, 2.99it/s]
Training 1/1 epoch (loss 1.5615): 16%|ββ | 200/1250 [01:13<05:50, 2.99it/s]
Training 1/1 epoch (loss 1.5615): 16%|ββ | 201/1250 [01:13<05:46, 3.03it/s]
Training 1/1 epoch (loss 1.5241): 16%|ββ | 201/1250 [01:13<05:46, 3.03it/s]
Training 1/1 epoch (loss 1.5241): 16%|ββ | 202/1250 [01:13<05:37, 3.10it/s]
Training 1/1 epoch (loss 1.6296): 16%|ββ | 202/1250 [01:13<05:37, 3.10it/s]
Training 1/1 epoch (loss 1.6296): 16%|ββ | 203/1250 [01:13<05:40, 3.08it/s]
Training 1/1 epoch (loss 1.6001): 16%|ββ | 203/1250 [01:13<05:40, 3.08it/s]
Training 1/1 epoch (loss 1.6001): 16%|ββ | 204/1250 [01:13<05:42, 3.06it/s]
Training 1/1 epoch (loss 1.5036): 16%|ββ | 204/1250 [01:14<05:42, 3.06it/s]
Training 1/1 epoch (loss 1.5036): 16%|ββ | 205/1250 [01:14<06:06, 2.85it/s]
Training 1/1 epoch (loss 1.7072): 16%|ββ | 205/1250 [01:14<06:06, 2.85it/s]
Training 1/1 epoch (loss 1.7072): 16%|ββ | 206/1250 [01:14<06:24, 2.72it/s]
Training 1/1 epoch (loss 1.6386): 16%|ββ | 206/1250 [01:15<06:24, 2.72it/s]
Training 1/1 epoch (loss 1.6386): 17%|ββ | 207/1250 [01:15<06:20, 2.74it/s]
Training 1/1 epoch (loss 1.6602): 17%|ββ | 207/1250 [01:15<06:20, 2.74it/s]
Training 1/1 epoch (loss 1.6602): 17%|ββ | 208/1250 [01:15<06:07, 2.83it/s]
Training 1/1 epoch (loss 1.6150): 17%|ββ | 208/1250 [01:15<06:07, 2.83it/s]
Training 1/1 epoch (loss 1.6150): 17%|ββ | 209/1250 [01:15<06:03, 2.86it/s]
Training 1/1 epoch (loss 1.6153): 17%|ββ | 209/1250 [01:16<06:03, 2.86it/s]
Training 1/1 epoch (loss 1.6153): 17%|ββ | 210/1250 [01:16<06:05, 2.84it/s]
Training 1/1 epoch (loss 1.6349): 17%|ββ | 210/1250 [01:16<06:05, 2.84it/s]
Training 1/1 epoch (loss 1.6349): 17%|ββ | 211/1250 [01:16<06:13, 2.78it/s]
Training 1/1 epoch (loss 1.6174): 17%|ββ | 211/1250 [01:16<06:13, 2.78it/s]
Training 1/1 epoch (loss 1.6174): 17%|ββ | 212/1250 [01:16<05:55, 2.92it/s]
Training 1/1 epoch (loss 1.6462): 17%|ββ | 212/1250 [01:17<05:55, 2.92it/s]
Training 1/1 epoch (loss 1.6462): 17%|ββ | 213/1250 [01:17<05:45, 3.00it/s]
Training 1/1 epoch (loss 1.5557): 17%|ββ | 213/1250 [01:17<05:45, 3.00it/s]
Training 1/1 epoch (loss 1.5557): 17%|ββ | 214/1250 [01:17<05:47, 2.98it/s]
Training 1/1 epoch (loss 1.6338): 17%|ββ | 214/1250 [01:17<05:47, 2.98it/s]
Training 1/1 epoch (loss 1.6338): 17%|ββ | 215/1250 [01:17<05:45, 3.00it/s]
Training 1/1 epoch (loss 1.6456): 17%|ββ | 215/1250 [01:18<05:45, 3.00it/s]
Training 1/1 epoch (loss 1.6456): 17%|ββ | 216/1250 [01:18<06:05, 2.83it/s]
Training 1/1 epoch (loss 1.7086): 17%|ββ | 216/1250 [01:18<06:05, 2.83it/s]
Training 1/1 epoch (loss 1.7086): 17%|ββ | 217/1250 [01:18<05:55, 2.91it/s]
Training 1/1 epoch (loss 1.6360): 17%|ββ | 217/1250 [01:18<05:55, 2.91it/s]
Training 1/1 epoch (loss 1.6360): 17%|ββ | 218/1250 [01:18<05:43, 3.01it/s]
Training 1/1 epoch (loss 1.6479): 17%|ββ | 218/1250 [01:19<05:43, 3.01it/s]
Training 1/1 epoch (loss 1.6479): 18%|ββ | 219/1250 [01:19<05:35, 3.07it/s]
Training 1/1 epoch (loss 1.6044): 18%|ββ | 219/1250 [01:19<05:35, 3.07it/s]
Training 1/1 epoch (loss 1.6044): 18%|ββ | 220/1250 [01:19<05:27, 3.15it/s]
Training 1/1 epoch (loss 1.4782): 18%|ββ | 220/1250 [01:19<05:27, 3.15it/s]
Training 1/1 epoch (loss 1.4782): 18%|ββ | 221/1250 [01:19<05:47, 2.96it/s]
Training 1/1 epoch (loss 1.6171): 18%|ββ | 221/1250 [01:20<05:47, 2.96it/s]
Training 1/1 epoch (loss 1.6171): 18%|ββ | 222/1250 [01:20<05:56, 2.88it/s]
Training 1/1 epoch (loss 1.6052): 18%|ββ | 222/1250 [01:20<05:56, 2.88it/s]
Training 1/1 epoch (loss 1.6052): 18%|ββ | 223/1250 [01:20<05:50, 2.93it/s]
Training 1/1 epoch (loss 1.7536): 18%|ββ | 223/1250 [01:20<05:50, 2.93it/s]
Training 1/1 epoch (loss 1.7536): 18%|ββ | 224/1250 [01:20<05:42, 3.00it/s]
Training 1/1 epoch (loss 1.7144): 18%|ββ | 224/1250 [01:21<05:42, 3.00it/s]
Training 1/1 epoch (loss 1.7144): 18%|ββ | 225/1250 [01:21<05:41, 3.01it/s]
Training 1/1 epoch (loss 1.5455): 18%|ββ | 225/1250 [01:21<05:41, 3.01it/s]
Training 1/1 epoch (loss 1.5455): 18%|ββ | 226/1250 [01:21<05:34, 3.06it/s]
Training 1/1 epoch (loss 1.6433): 18%|ββ | 226/1250 [01:21<05:34, 3.06it/s]
Training 1/1 epoch (loss 1.6433): 18%|ββ | 227/1250 [01:21<05:38, 3.02it/s]
Training 1/1 epoch (loss 1.7469): 18%|ββ | 227/1250 [01:22<05:38, 3.02it/s]
Training 1/1 epoch (loss 1.7469): 18%|ββ | 228/1250 [01:22<06:20, 2.68it/s]
Training 1/1 epoch (loss 1.5943): 18%|ββ | 228/1250 [01:22<06:20, 2.68it/s]
Training 1/1 epoch (loss 1.5943): 18%|ββ | 229/1250 [01:22<06:00, 2.83it/s]
Training 1/1 epoch (loss 1.7100): 18%|ββ | 229/1250 [01:22<06:00, 2.83it/s]
Training 1/1 epoch (loss 1.7100): 18%|ββ | 230/1250 [01:22<05:44, 2.96it/s]
Training 1/1 epoch (loss 1.5117): 18%|ββ | 230/1250 [01:23<05:44, 2.96it/s]
Training 1/1 epoch (loss 1.5117): 18%|ββ | 231/1250 [01:23<05:33, 3.06it/s]
Training 1/1 epoch (loss 1.6362): 18%|ββ | 231/1250 [01:23<05:33, 3.06it/s]
Training 1/1 epoch (loss 1.6362): 19%|ββ | 232/1250 [01:23<05:46, 2.94it/s]
Training 1/1 epoch (loss 1.6859): 19%|ββ | 232/1250 [01:23<05:46, 2.94it/s]
Training 1/1 epoch (loss 1.6859): 19%|ββ | 233/1250 [01:23<05:39, 2.99it/s]
Training 1/1 epoch (loss 1.5388): 19%|ββ | 233/1250 [01:24<05:39, 2.99it/s]
Training 1/1 epoch (loss 1.5388): 19%|ββ | 234/1250 [01:24<05:51, 2.89it/s]
Training 1/1 epoch (loss 1.4877): 19%|ββ | 234/1250 [01:24<05:51, 2.89it/s]
Training 1/1 epoch (loss 1.4877): 19%|ββ | 235/1250 [01:24<05:45, 2.93it/s]
Training 1/1 epoch (loss 1.6822): 19%|ββ | 235/1250 [01:24<05:45, 2.93it/s]
Training 1/1 epoch (loss 1.6822): 19%|ββ | 236/1250 [01:24<05:33, 3.04it/s]
Training 1/1 epoch (loss 1.5888): 19%|ββ | 236/1250 [01:25<05:33, 3.04it/s]
Training 1/1 epoch (loss 1.5888): 19%|ββ | 237/1250 [01:25<05:25, 3.12it/s]
Training 1/1 epoch (loss 1.7018): 19%|ββ | 237/1250 [01:25<05:25, 3.12it/s]
Training 1/1 epoch (loss 1.7018): 19%|ββ | 238/1250 [01:25<05:22, 3.14it/s]
Training 1/1 epoch (loss 1.5647): 19%|ββ | 238/1250 [01:25<05:22, 3.14it/s]
Training 1/1 epoch (loss 1.5647): 19%|ββ | 239/1250 [01:25<05:26, 3.09it/s]
Training 1/1 epoch (loss 1.6897): 19%|ββ | 239/1250 [01:26<05:26, 3.09it/s]
Training 1/1 epoch (loss 1.6897): 19%|ββ | 240/1250 [01:26<05:48, 2.90it/s]
Training 1/1 epoch (loss 1.6972): 19%|ββ | 240/1250 [01:26<05:48, 2.90it/s]
Training 1/1 epoch (loss 1.6972): 19%|ββ | 241/1250 [01:26<05:47, 2.91it/s]
Training 1/1 epoch (loss 1.5131): 19%|ββ | 241/1250 [01:26<05:47, 2.91it/s]
Training 1/1 epoch (loss 1.5131): 19%|ββ | 242/1250 [01:26<05:45, 2.92it/s]
Training 1/1 epoch (loss 1.6371): 19%|ββ | 242/1250 [01:27<05:45, 2.92it/s]
Training 1/1 epoch (loss 1.6371): 19%|ββ | 243/1250 [01:27<05:34, 3.01it/s]
Training 1/1 epoch (loss 1.5939): 19%|ββ | 243/1250 [01:27<05:34, 3.01it/s]
Training 1/1 epoch (loss 1.5939): 20%|ββ | 244/1250 [01:27<05:29, 3.06it/s]
Training 1/1 epoch (loss 1.5540): 20%|ββ | 244/1250 [01:27<05:29, 3.06it/s]
Training 1/1 epoch (loss 1.5540): 20%|ββ | 245/1250 [01:27<05:49, 2.88it/s]
Training 1/1 epoch (loss 1.6070): 20%|ββ | 245/1250 [01:28<05:49, 2.88it/s]
Training 1/1 epoch (loss 1.6070): 20%|ββ | 246/1250 [01:28<06:14, 2.68it/s]
Training 1/1 epoch (loss 1.6770): 20%|ββ | 246/1250 [01:28<06:14, 2.68it/s]
Training 1/1 epoch (loss 1.6770): 20%|ββ | 247/1250 [01:28<05:51, 2.86it/s]
Training 1/1 epoch (loss 1.6093): 20%|ββ | 247/1250 [01:29<05:51, 2.86it/s]
Training 1/1 epoch (loss 1.6093): 20%|ββ | 248/1250 [01:29<05:47, 2.88it/s]
Training 1/1 epoch (loss 1.6457): 20%|ββ | 248/1250 [01:29<05:47, 2.88it/s]
Training 1/1 epoch (loss 1.6457): 20%|ββ | 249/1250 [01:29<05:33, 3.00it/s]
Training 1/1 epoch (loss 1.6275): 20%|ββ | 249/1250 [01:29<05:33, 3.00it/s]
Training 1/1 epoch (loss 1.6275): 20%|ββ | 250/1250 [01:29<05:26, 3.06it/s]
Training 1/1 epoch (loss 1.6736): 20%|ββ | 250/1250 [01:30<05:26, 3.06it/s]
Training 1/1 epoch (loss 1.6736): 20%|ββ | 251/1250 [01:30<05:35, 2.98it/s]
Training 1/1 epoch (loss 1.7277): 20%|ββ | 251/1250 [01:30<05:35, 2.98it/s]
Training 1/1 epoch (loss 1.7277): 20%|ββ | 252/1250 [01:30<05:40, 2.93it/s]
Training 1/1 epoch (loss 1.6705): 20%|ββ | 252/1250 [01:30<05:40, 2.93it/s]
Training 1/1 epoch (loss 1.6705): 20%|ββ | 253/1250 [01:30<05:31, 3.00it/s]
Training 1/1 epoch (loss 1.5868): 20%|ββ | 253/1250 [01:31<05:31, 3.00it/s]
Training 1/1 epoch (loss 1.5868): 20%|ββ | 254/1250 [01:31<05:41, 2.92it/s]
Training 1/1 epoch (loss 1.6084): 20%|ββ | 254/1250 [01:31<05:41, 2.92it/s]
Training 1/1 epoch (loss 1.6084): 20%|ββ | 255/1250 [01:31<05:40, 2.93it/s]
Training 1/1 epoch (loss 1.5542): 20%|ββ | 255/1250 [01:31<05:40, 2.93it/s]
Training 1/1 epoch (loss 1.5542): 20%|ββ | 256/1250 [01:31<05:28, 3.02it/s]
Training 1/1 epoch (loss 1.6128): 20%|ββ | 256/1250 [01:32<05:28, 3.02it/s]
Training 1/1 epoch (loss 1.6128): 21%|ββ | 257/1250 [01:32<05:33, 2.98it/s]
Training 1/1 epoch (loss 1.5427): 21%|ββ | 257/1250 [01:32<05:33, 2.98it/s]
Training 1/1 epoch (loss 1.5427): 21%|ββ | 258/1250 [01:32<05:45, 2.87it/s]
Training 1/1 epoch (loss 1.5273): 21%|ββ | 258/1250 [01:32<05:45, 2.87it/s]
Training 1/1 epoch (loss 1.5273): 21%|ββ | 259/1250 [01:32<05:37, 2.94it/s]
Training 1/1 epoch (loss 1.7221): 21%|ββ | 259/1250 [01:33<05:37, 2.94it/s]
Training 1/1 epoch (loss 1.7221): 21%|ββ | 260/1250 [01:33<05:26, 3.04it/s]
Training 1/1 epoch (loss 1.5717): 21%|ββ | 260/1250 [01:33<05:26, 3.04it/s]
Training 1/1 epoch (loss 1.5717): 21%|ββ | 261/1250 [01:33<05:16, 3.12it/s]
Training 1/1 epoch (loss 1.5828): 21%|ββ | 261/1250 [01:33<05:16, 3.12it/s]
Training 1/1 epoch (loss 1.5828): 21%|ββ | 262/1250 [01:33<05:14, 3.14it/s]
Training 1/1 epoch (loss 1.7143): 21%|ββ | 262/1250 [01:33<05:14, 3.14it/s]
Training 1/1 epoch (loss 1.7143): 21%|ββ | 263/1250 [01:33<05:19, 3.09it/s]
Training 1/1 epoch (loss 1.6139): 21%|ββ | 263/1250 [01:34<05:19, 3.09it/s]
Training 1/1 epoch (loss 1.6139): 21%|ββ | 264/1250 [01:34<05:29, 2.99it/s]
Training 1/1 epoch (loss 1.5215): 21%|ββ | 264/1250 [01:34<05:29, 2.99it/s]
Training 1/1 epoch (loss 1.5215): 21%|ββ | 265/1250 [01:34<05:25, 3.03it/s]
Training 1/1 epoch (loss 1.5978): 21%|ββ | 265/1250 [01:35<05:25, 3.03it/s]
Training 1/1 epoch (loss 1.5978): 21%|βββ | 266/1250 [01:35<05:47, 2.83it/s]
Training 1/1 epoch (loss 1.5593): 21%|βββ | 266/1250 [01:35<05:47, 2.83it/s]
Training 1/1 epoch (loss 1.5593): 21%|βββ | 267/1250 [01:35<06:06, 2.68it/s]
Training 1/1 epoch (loss 1.6031): 21%|βββ | 267/1250 [01:35<06:06, 2.68it/s]
Training 1/1 epoch (loss 1.6031): 21%|βββ | 268/1250 [01:35<06:08, 2.67it/s]
Training 1/1 epoch (loss 1.5090): 21%|βββ | 268/1250 [01:36<06:08, 2.67it/s]
Training 1/1 epoch (loss 1.5090): 22%|βββ | 269/1250 [01:36<05:52, 2.78it/s]
Training 1/1 epoch (loss 1.5928): 22%|βββ | 269/1250 [01:36<05:52, 2.78it/s]
Training 1/1 epoch (loss 1.5928): 22%|βββ | 270/1250 [01:36<05:44, 2.84it/s]
Training 1/1 epoch (loss 1.6158): 22%|βββ | 270/1250 [01:36<05:44, 2.84it/s]
Training 1/1 epoch (loss 1.6158): 22%|βββ | 271/1250 [01:36<05:32, 2.94it/s]
Training 1/1 epoch (loss 1.6039): 22%|βββ | 271/1250 [01:37<05:32, 2.94it/s]
Training 1/1 epoch (loss 1.6039): 22%|βββ | 272/1250 [01:37<05:25, 3.01it/s]
Training 1/1 epoch (loss 1.5105): 22%|βββ | 272/1250 [01:37<05:25, 3.01it/s]
Training 1/1 epoch (loss 1.5105): 22%|βββ | 273/1250 [01:37<05:36, 2.90it/s]
Training 1/1 epoch (loss 1.5658): 22%|βββ | 273/1250 [01:37<05:36, 2.90it/s]
Training 1/1 epoch (loss 1.5658): 22%|βββ | 274/1250 [01:37<05:35, 2.91it/s]
Training 1/1 epoch (loss 1.5938): 22%|βββ | 274/1250 [01:38<05:35, 2.91it/s]
Training 1/1 epoch (loss 1.5938): 22%|βββ | 275/1250 [01:38<05:36, 2.89it/s]
Training 1/1 epoch (loss 1.5544): 22%|βββ | 275/1250 [01:38<05:36, 2.89it/s]
Training 1/1 epoch (loss 1.5544): 22%|βββ | 276/1250 [01:38<05:33, 2.92it/s]
Training 1/1 epoch (loss 1.5722): 22%|βββ | 276/1250 [01:38<05:33, 2.92it/s]
Training 1/1 epoch (loss 1.5722): 22%|βββ | 277/1250 [01:38<05:27, 2.97it/s]
Training 1/1 epoch (loss 1.5927): 22%|βββ | 277/1250 [01:39<05:27, 2.97it/s]
Training 1/1 epoch (loss 1.5927): 22%|βββ | 278/1250 [01:39<05:16, 3.07it/s]
Training 1/1 epoch (loss 1.6895): 22%|βββ | 278/1250 [01:39<05:16, 3.07it/s]
Training 1/1 epoch (loss 1.6895): 22%|βββ | 279/1250 [01:39<05:29, 2.95it/s]
Training 1/1 epoch (loss 1.7311): 22%|βββ | 279/1250 [01:39<05:29, 2.95it/s]
Training 1/1 epoch (loss 1.7311): 22%|βββ | 280/1250 [01:39<05:30, 2.94it/s]
Training 1/1 epoch (loss 1.6755): 22%|βββ | 280/1250 [01:40<05:30, 2.94it/s]
Training 1/1 epoch (loss 1.6755): 22%|βββ | 281/1250 [01:40<05:28, 2.95it/s]
Training 1/1 epoch (loss 1.5794): 22%|βββ | 281/1250 [01:40<05:28, 2.95it/s]
Training 1/1 epoch (loss 1.5794): 23%|βββ | 282/1250 [01:40<05:26, 2.97it/s]
Training 1/1 epoch (loss 1.5393): 23%|βββ | 282/1250 [01:40<05:26, 2.97it/s]
Training 1/1 epoch (loss 1.5393): 23%|βββ | 283/1250 [01:40<05:19, 3.02it/s]
Training 1/1 epoch (loss 1.5164): 23%|βββ | 283/1250 [01:41<05:19, 3.02it/s]
Training 1/1 epoch (loss 1.5164): 23%|βββ | 284/1250 [01:41<05:08, 3.13it/s]
Training 1/1 epoch (loss 1.6751): 23%|βββ | 284/1250 [01:41<05:08, 3.13it/s]
Training 1/1 epoch (loss 1.6751): 23%|βββ | 285/1250 [01:41<05:05, 3.16it/s]
Training 1/1 epoch (loss 1.5478): 23%|βββ | 285/1250 [01:41<05:05, 3.16it/s]
Training 1/1 epoch (loss 1.5478): 23%|βββ | 286/1250 [01:41<05:03, 3.18it/s]
Training 1/1 epoch (loss 1.6563): 23%|βββ | 286/1250 [01:42<05:03, 3.18it/s]
Training 1/1 epoch (loss 1.6563): 23%|βββ | 287/1250 [01:42<05:01, 3.19it/s]
Training 1/1 epoch (loss 1.5295): 23%|βββ | 287/1250 [01:42<05:01, 3.19it/s]
Training 1/1 epoch (loss 1.5295): 23%|βββ | 288/1250 [01:42<05:12, 3.08it/s]
Training 1/1 epoch (loss 1.6710): 23%|βββ | 288/1250 [01:42<05:12, 3.08it/s]
Training 1/1 epoch (loss 1.6710): 23%|βββ | 289/1250 [01:42<05:16, 3.04it/s]
Training 1/1 epoch (loss 1.5751): 23%|βββ | 289/1250 [01:43<05:16, 3.04it/s]
Training 1/1 epoch (loss 1.5751): 23%|βββ | 290/1250 [01:43<05:08, 3.11it/s]
Training 1/1 epoch (loss 1.5927): 23%|βββ | 290/1250 [01:43<05:08, 3.11it/s]
Training 1/1 epoch (loss 1.5927): 23%|βββ | 291/1250 [01:43<04:59, 3.20it/s]
Training 1/1 epoch (loss 1.6131): 23%|βββ | 291/1250 [01:43<04:59, 3.20it/s]
Training 1/1 epoch (loss 1.6131): 23%|βββ | 292/1250 [01:43<05:03, 3.16it/s]
Training 1/1 epoch (loss 1.6467): 23%|βββ | 292/1250 [01:44<05:03, 3.16it/s]
Training 1/1 epoch (loss 1.6467): 23%|βββ | 293/1250 [01:44<04:59, 3.20it/s]
Training 1/1 epoch (loss 1.6036): 23%|βββ | 293/1250 [01:44<04:59, 3.20it/s]
Training 1/1 epoch (loss 1.6036): 24%|βββ | 294/1250 [01:44<05:08, 3.10it/s]
Training 1/1 epoch (loss 1.6298): 24%|βββ | 294/1250 [01:44<05:08, 3.10it/s]
Training 1/1 epoch (loss 1.6298): 24%|βββ | 295/1250 [01:44<05:04, 3.13it/s]
Training 1/1 epoch (loss 1.6126): 24%|βββ | 295/1250 [01:45<05:04, 3.13it/s]
Training 1/1 epoch (loss 1.6126): 24%|βββ | 296/1250 [01:45<05:12, 3.05it/s]
Training 1/1 epoch (loss 1.7168): 24%|βββ | 296/1250 [01:45<05:12, 3.05it/s]
Training 1/1 epoch (loss 1.7168): 24%|βββ | 297/1250 [01:45<05:05, 3.12it/s]
Training 1/1 epoch (loss 1.5556): 24%|βββ | 297/1250 [01:45<05:05, 3.12it/s]
Training 1/1 epoch (loss 1.5556): 24%|βββ | 298/1250 [01:45<05:02, 3.14it/s]
Training 1/1 epoch (loss 1.5828): 24%|βββ | 298/1250 [01:45<05:02, 3.14it/s]
Training 1/1 epoch (loss 1.5828): 24%|βββ | 299/1250 [01:45<05:01, 3.15it/s]
Training 1/1 epoch (loss 1.5870): 24%|βββ | 299/1250 [01:46<05:01, 3.15it/s]
Training 1/1 epoch (loss 1.5870): 24%|βββ | 300/1250 [01:46<05:05, 3.11it/s]
Training 1/1 epoch (loss 1.6951): 24%|βββ | 300/1250 [01:46<05:05, 3.11it/s]
Training 1/1 epoch (loss 1.6951): 24%|βββ | 301/1250 [01:46<05:10, 3.06it/s]
Training 1/1 epoch (loss 1.6014): 24%|βββ | 301/1250 [01:46<05:10, 3.06it/s]
Training 1/1 epoch (loss 1.6014): 24%|βββ | 302/1250 [01:46<04:59, 3.17it/s]
Training 1/1 epoch (loss 1.6361): 24%|βββ | 302/1250 [01:47<04:59, 3.17it/s]
Training 1/1 epoch (loss 1.6361): 24%|βββ | 303/1250 [01:47<04:59, 3.16it/s]
Training 1/1 epoch (loss 1.6392): 24%|βββ | 303/1250 [01:47<04:59, 3.16it/s]
Training 1/1 epoch (loss 1.6392): 24%|βββ | 304/1250 [01:47<04:57, 3.18it/s]
Training 1/1 epoch (loss 1.6090): 24%|βββ | 304/1250 [01:47<04:57, 3.18it/s]
Training 1/1 epoch (loss 1.6090): 24%|βββ | 305/1250 [01:47<05:15, 3.00it/s]
Training 1/1 epoch (loss 1.5447): 24%|βββ | 305/1250 [01:48<05:15, 3.00it/s]
Training 1/1 epoch (loss 1.5447): 24%|βββ | 306/1250 [01:48<05:14, 3.00it/s]
Training 1/1 epoch (loss 1.6214): 24%|βββ | 306/1250 [01:48<05:14, 3.00it/s]
Training 1/1 epoch (loss 1.6214): 25%|βββ | 307/1250 [01:48<05:18, 2.96it/s]
Training 1/1 epoch (loss 1.5637): 25%|βββ | 307/1250 [01:48<05:18, 2.96it/s]
Training 1/1 epoch (loss 1.5637): 25%|βββ | 308/1250 [01:48<05:09, 3.04it/s]
Training 1/1 epoch (loss 1.6563): 25%|βββ | 308/1250 [01:49<05:09, 3.04it/s]
Training 1/1 epoch (loss 1.6563): 25%|βββ | 309/1250 [01:49<05:02, 3.11it/s]
Training 1/1 epoch (loss 1.6813): 25%|βββ | 309/1250 [01:49<05:02, 3.11it/s]
Training 1/1 epoch (loss 1.6813): 25%|βββ | 310/1250 [01:49<04:58, 3.15it/s]
Training 1/1 epoch (loss 1.5708): 25%|βββ | 310/1250 [01:49<04:58, 3.15it/s]
Training 1/1 epoch (loss 1.5708): 25%|βββ | 311/1250 [01:49<05:00, 3.13it/s]
Training 1/1 epoch (loss 1.6274): 25%|βββ | 311/1250 [01:50<05:00, 3.13it/s]
Training 1/1 epoch (loss 1.6274): 25%|βββ | 312/1250 [01:50<05:19, 2.94it/s]
Training 1/1 epoch (loss 1.6267): 25%|βββ | 312/1250 [01:50<05:19, 2.94it/s]
Training 1/1 epoch (loss 1.6267): 25%|βββ | 313/1250 [01:50<05:20, 2.92it/s]
Training 1/1 epoch (loss 1.5943): 25%|βββ | 313/1250 [01:50<05:20, 2.92it/s]
Training 1/1 epoch (loss 1.5943): 25%|βββ | 314/1250 [01:50<05:38, 2.77it/s]
Training 1/1 epoch (loss 1.6432): 25%|βββ | 314/1250 [01:51<05:38, 2.77it/s]
Training 1/1 epoch (loss 1.6432): 25%|βββ | 315/1250 [01:51<05:33, 2.80it/s]
Training 1/1 epoch (loss 1.4095): 25%|βββ | 315/1250 [01:51<05:33, 2.80it/s]
Training 1/1 epoch (loss 1.4095): 25%|βββ | 316/1250 [01:51<05:19, 2.92it/s]
Training 1/1 epoch (loss 1.5688): 25%|βββ | 316/1250 [01:51<05:19, 2.92it/s]
Training 1/1 epoch (loss 1.5688): 25%|βββ | 317/1250 [01:51<05:09, 3.02it/s]
Training 1/1 epoch (loss 1.6449): 25%|βββ | 317/1250 [01:52<05:09, 3.02it/s]
Training 1/1 epoch (loss 1.6449): 25%|βββ | 318/1250 [01:52<05:07, 3.03it/s]
Training 1/1 epoch (loss 1.6766): 25%|βββ | 318/1250 [01:52<05:07, 3.03it/s]
Training 1/1 epoch (loss 1.6766): 26%|βββ | 319/1250 [01:52<05:25, 2.86it/s]
Training 1/1 epoch (loss 1.4774): 26%|βββ | 319/1250 [01:53<05:25, 2.86it/s]
Training 1/1 epoch (loss 1.4774): 26%|βββ | 320/1250 [01:53<05:40, 2.73it/s]
Training 1/1 epoch (loss 1.6102): 26%|βββ | 320/1250 [01:53<05:40, 2.73it/s]
Training 1/1 epoch (loss 1.6102): 26%|βββ | 321/1250 [01:53<05:43, 2.71it/s]
Training 1/1 epoch (loss 1.6485): 26%|βββ | 321/1250 [01:53<05:43, 2.71it/s]
Training 1/1 epoch (loss 1.6485): 26%|βββ | 322/1250 [01:53<05:42, 2.71it/s]
Training 1/1 epoch (loss 1.5127): 26%|βββ | 322/1250 [01:54<05:42, 2.71it/s]
Training 1/1 epoch (loss 1.5127): 26%|βββ | 323/1250 [01:54<05:36, 2.75it/s]
Training 1/1 epoch (loss 1.6061): 26%|βββ | 323/1250 [01:54<05:36, 2.75it/s]
Training 1/1 epoch (loss 1.6061): 26%|βββ | 324/1250 [01:54<05:43, 2.70it/s]
Training 1/1 epoch (loss 1.6090): 26%|βββ | 324/1250 [01:54<05:43, 2.70it/s]
Training 1/1 epoch (loss 1.6090): 26%|βββ | 325/1250 [01:54<05:29, 2.81it/s]
Training 1/1 epoch (loss 1.6839): 26%|βββ | 325/1250 [01:55<05:29, 2.81it/s]
Training 1/1 epoch (loss 1.6839): 26%|βββ | 326/1250 [01:55<05:13, 2.95it/s]
Training 1/1 epoch (loss 1.6015): 26%|βββ | 326/1250 [01:55<05:13, 2.95it/s]
Training 1/1 epoch (loss 1.6015): 26%|βββ | 327/1250 [01:55<05:00, 3.07it/s]
Training 1/1 epoch (loss 1.5252): 26%|βββ | 327/1250 [01:55<05:00, 3.07it/s]
Training 1/1 epoch (loss 1.5252): 26%|βββ | 328/1250 [01:55<04:59, 3.07it/s]
Training 1/1 epoch (loss 1.6129): 26%|βββ | 328/1250 [01:56<04:59, 3.07it/s]
Training 1/1 epoch (loss 1.6129): 26%|βββ | 329/1250 [01:56<04:56, 3.10it/s]
Training 1/1 epoch (loss 1.6711): 26%|βββ | 329/1250 [01:56<04:56, 3.10it/s]
Training 1/1 epoch (loss 1.6711): 26%|βββ | 330/1250 [01:56<05:31, 2.77it/s]
Training 1/1 epoch (loss 1.6621): 26%|βββ | 330/1250 [01:56<05:31, 2.77it/s]
Training 1/1 epoch (loss 1.6621): 26%|βββ | 331/1250 [01:56<05:31, 2.78it/s]
Training 1/1 epoch (loss 1.5470): 26%|βββ | 331/1250 [01:57<05:31, 2.78it/s]
Training 1/1 epoch (loss 1.5470): 27%|βββ | 332/1250 [01:57<05:16, 2.90it/s]
Training 1/1 epoch (loss 1.4893): 27%|βββ | 332/1250 [01:57<05:16, 2.90it/s]
Training 1/1 epoch (loss 1.4893): 27%|βββ | 333/1250 [01:57<05:11, 2.95it/s]
Training 1/1 epoch (loss 1.5145): 27%|βββ | 333/1250 [01:57<05:11, 2.95it/s]
Training 1/1 epoch (loss 1.5145): 27%|βββ | 334/1250 [01:57<05:16, 2.90it/s]
Training 1/1 epoch (loss 1.5349): 27%|βββ | 334/1250 [01:58<05:16, 2.90it/s]
Training 1/1 epoch (loss 1.5349): 27%|βββ | 335/1250 [01:58<05:17, 2.88it/s]
Training 1/1 epoch (loss 1.5335): 27%|βββ | 335/1250 [01:58<05:17, 2.88it/s]
Training 1/1 epoch (loss 1.5335): 27%|βββ | 336/1250 [01:58<05:39, 2.69it/s]
Training 1/1 epoch (loss 1.6054): 27%|βββ | 336/1250 [01:59<05:39, 2.69it/s]
Training 1/1 epoch (loss 1.6054): 27%|βββ | 337/1250 [01:59<05:24, 2.82it/s]
Training 1/1 epoch (loss 1.6002): 27%|βββ | 337/1250 [01:59<05:24, 2.82it/s]
Training 1/1 epoch (loss 1.6002): 27%|βββ | 338/1250 [01:59<07:12, 2.11it/s]
Training 1/1 epoch (loss 1.5967): 27%|βββ | 338/1250 [02:00<07:12, 2.11it/s]
Training 1/1 epoch (loss 1.5967): 27%|βββ | 339/1250 [02:00<06:35, 2.30it/s]
Training 1/1 epoch (loss 1.5495): 27%|βββ | 339/1250 [02:00<06:35, 2.30it/s]
Training 1/1 epoch (loss 1.5495): 27%|βββ | 340/1250 [02:00<06:08, 2.47it/s]
Training 1/1 epoch (loss 1.6200): 27%|βββ | 340/1250 [02:00<06:08, 2.47it/s]
Training 1/1 epoch (loss 1.6200): 27%|βββ | 341/1250 [02:00<05:51, 2.59it/s]
Training 1/1 epoch (loss 1.5883): 27%|βββ | 341/1250 [02:01<05:51, 2.59it/s]
Training 1/1 epoch (loss 1.5883): 27%|βββ | 342/1250 [02:01<05:32, 2.73it/s]
Training 1/1 epoch (loss 1.6733): 27%|βββ | 342/1250 [02:01<05:32, 2.73it/s]
Training 1/1 epoch (loss 1.6733): 27%|βββ | 343/1250 [02:01<05:13, 2.89it/s]
Training 1/1 epoch (loss 1.6710): 27%|βββ | 343/1250 [02:01<05:13, 2.89it/s]
Training 1/1 epoch (loss 1.6710): 28%|βββ | 344/1250 [02:01<05:11, 2.91it/s]
Training 1/1 epoch (loss 1.6250): 28%|βββ | 344/1250 [02:02<05:11, 2.91it/s]
Training 1/1 epoch (loss 1.6250): 28%|βββ | 345/1250 [02:02<05:01, 3.00it/s]
Training 1/1 epoch (loss 1.5616): 28%|βββ | 345/1250 [02:02<05:01, 3.00it/s]
Training 1/1 epoch (loss 1.5616): 28%|βββ | 346/1250 [02:02<05:02, 2.99it/s]
Training 1/1 epoch (loss 1.5430): 28%|βββ | 346/1250 [02:02<05:02, 2.99it/s]
Training 1/1 epoch (loss 1.5430): 28%|βββ | 347/1250 [02:02<05:06, 2.95it/s]
Training 1/1 epoch (loss 1.6039): 28%|βββ | 347/1250 [02:03<05:06, 2.95it/s]
Training 1/1 epoch (loss 1.6039): 28%|βββ | 348/1250 [02:03<05:11, 2.89it/s]
Training 1/1 epoch (loss 1.6082): 28%|βββ | 348/1250 [02:03<05:11, 2.89it/s]
Training 1/1 epoch (loss 1.6082): 28%|βββ | 349/1250 [02:03<05:18, 2.83it/s]
Training 1/1 epoch (loss 1.4704): 28%|βββ | 349/1250 [02:03<05:18, 2.83it/s]
Training 1/1 epoch (loss 1.4704): 28%|βββ | 350/1250 [02:03<05:20, 2.81it/s]
Training 1/1 epoch (loss 1.4788): 28%|βββ | 350/1250 [02:04<05:20, 2.81it/s]
Training 1/1 epoch (loss 1.4788): 28%|βββ | 351/1250 [02:04<05:21, 2.80it/s]
Training 1/1 epoch (loss 1.6381): 28%|βββ | 351/1250 [02:04<05:21, 2.80it/s]
Training 1/1 epoch (loss 1.6381): 28%|βββ | 352/1250 [02:04<05:43, 2.61it/s]
Training 1/1 epoch (loss 1.5997): 28%|βββ | 352/1250 [02:04<05:43, 2.61it/s]
Training 1/1 epoch (loss 1.5997): 28%|βββ | 353/1250 [02:04<05:30, 2.72it/s]
Training 1/1 epoch (loss 1.6104): 28%|βββ | 353/1250 [02:05<05:30, 2.72it/s]
Training 1/1 epoch (loss 1.6104): 28%|βββ | 354/1250 [02:05<05:22, 2.77it/s]
Training 1/1 epoch (loss 1.7510): 28%|βββ | 354/1250 [02:05<05:22, 2.77it/s]
Training 1/1 epoch (loss 1.7510): 28%|βββ | 355/1250 [02:05<05:07, 2.91it/s]
Training 1/1 epoch (loss 1.5433): 28%|βββ | 355/1250 [02:05<05:07, 2.91it/s]
Training 1/1 epoch (loss 1.5433): 28%|βββ | 356/1250 [02:05<04:55, 3.03it/s]
Training 1/1 epoch (loss 1.6681): 28%|βββ | 356/1250 [02:06<04:55, 3.03it/s]
Training 1/1 epoch (loss 1.6681): 29%|βββ | 357/1250 [02:06<04:53, 3.04it/s]
Training 1/1 epoch (loss 1.6715): 29%|βββ | 357/1250 [02:06<04:53, 3.04it/s]
Training 1/1 epoch (loss 1.6715): 29%|βββ | 358/1250 [02:06<04:54, 3.03it/s]
Training 1/1 epoch (loss 1.6712): 29%|βββ | 358/1250 [02:06<04:54, 3.03it/s]
Training 1/1 epoch (loss 1.6712): 29%|βββ | 359/1250 [02:06<04:49, 3.08it/s]
Training 1/1 epoch (loss 1.5895): 29%|βββ | 359/1250 [02:07<04:49, 3.08it/s]
Training 1/1 epoch (loss 1.5895): 29%|βββ | 360/1250 [02:07<04:45, 3.11it/s]
Training 1/1 epoch (loss 1.5263): 29%|βββ | 360/1250 [02:07<04:45, 3.11it/s]
Training 1/1 epoch (loss 1.5263): 29%|βββ | 361/1250 [02:07<04:46, 3.11it/s]
Training 1/1 epoch (loss 1.5522): 29%|βββ | 361/1250 [02:07<04:46, 3.11it/s]
Training 1/1 epoch (loss 1.5522): 29%|βββ | 362/1250 [02:07<04:37, 3.20it/s]
Training 1/1 epoch (loss 1.6224): 29%|βββ | 362/1250 [02:08<04:37, 3.20it/s]
Training 1/1 epoch (loss 1.6224): 29%|βββ | 363/1250 [02:08<04:44, 3.12it/s]
Training 1/1 epoch (loss 1.5473): 29%|βββ | 363/1250 [02:08<04:44, 3.12it/s]
Training 1/1 epoch (loss 1.5473): 29%|βββ | 364/1250 [02:08<04:54, 3.01it/s]
Training 1/1 epoch (loss 1.7393): 29%|βββ | 364/1250 [02:08<04:54, 3.01it/s]
Training 1/1 epoch (loss 1.7393): 29%|βββ | 365/1250 [02:08<05:07, 2.87it/s]
Training 1/1 epoch (loss 1.4828): 29%|βββ | 365/1250 [02:09<05:07, 2.87it/s]
Training 1/1 epoch (loss 1.4828): 29%|βββ | 366/1250 [02:09<04:56, 2.98it/s]
Training 1/1 epoch (loss 1.6772): 29%|βββ | 366/1250 [02:09<04:56, 2.98it/s]
Training 1/1 epoch (loss 1.6772): 29%|βββ | 367/1250 [02:09<04:45, 3.09it/s]
Training 1/1 epoch (loss 1.6025): 29%|βββ | 367/1250 [02:09<04:45, 3.09it/s]
Training 1/1 epoch (loss 1.6025): 29%|βββ | 368/1250 [02:09<04:48, 3.06it/s]
Training 1/1 epoch (loss 1.5806): 29%|βββ | 368/1250 [02:10<04:48, 3.06it/s]
Training 1/1 epoch (loss 1.5806): 30%|βββ | 369/1250 [02:10<04:54, 2.99it/s]
Training 1/1 epoch (loss 1.6951): 30%|βββ | 369/1250 [02:10<04:54, 2.99it/s]
Training 1/1 epoch (loss 1.6951): 30%|βββ | 370/1250 [02:10<05:05, 2.88it/s]
Training 1/1 epoch (loss 1.5532): 30%|βββ | 370/1250 [02:10<05:05, 2.88it/s]
Training 1/1 epoch (loss 1.5532): 30%|βββ | 371/1250 [02:10<04:55, 2.97it/s]
Training 1/1 epoch (loss 1.5018): 30%|βββ | 371/1250 [02:11<04:55, 2.97it/s]
Training 1/1 epoch (loss 1.5018): 30%|βββ | 372/1250 [02:11<04:48, 3.04it/s]
Training 1/1 epoch (loss 1.5470): 30%|βββ | 372/1250 [02:11<04:48, 3.04it/s]
Training 1/1 epoch (loss 1.5470): 30%|βββ | 373/1250 [02:11<04:42, 3.10it/s]
Training 1/1 epoch (loss 1.6161): 30%|βββ | 373/1250 [02:11<04:42, 3.10it/s]
Training 1/1 epoch (loss 1.6161): 30%|βββ | 374/1250 [02:11<04:32, 3.21it/s]
Training 1/1 epoch (loss 1.5409): 30%|βββ | 374/1250 [02:12<04:32, 3.21it/s]
Training 1/1 epoch (loss 1.5409): 30%|βββ | 375/1250 [02:12<04:33, 3.20it/s]
Training 1/1 epoch (loss 1.5500): 30%|βββ | 375/1250 [02:12<04:33, 3.20it/s]
Training 1/1 epoch (loss 1.5500): 30%|βββ | 376/1250 [02:12<04:52, 2.98it/s]
Training 1/1 epoch (loss 1.5552): 30%|βββ | 376/1250 [02:12<04:52, 2.98it/s]
Training 1/1 epoch (loss 1.5552): 30%|βββ | 377/1250 [02:12<04:47, 3.04it/s]
Training 1/1 epoch (loss 1.5489): 30%|βββ | 377/1250 [02:13<04:47, 3.04it/s]
Training 1/1 epoch (loss 1.5489): 30%|βββ | 378/1250 [02:13<04:43, 3.08it/s]
Training 1/1 epoch (loss 1.6958): 30%|βββ | 378/1250 [02:13<04:43, 3.08it/s]
Training 1/1 epoch (loss 1.6958): 30%|βββ | 379/1250 [02:13<04:36, 3.14it/s]
Training 1/1 epoch (loss 1.5941): 30%|βββ | 379/1250 [02:13<04:36, 3.14it/s]
Training 1/1 epoch (loss 1.5941): 30%|βββ | 380/1250 [02:13<04:28, 3.24it/s]
Training 1/1 epoch (loss 1.6410): 30%|βββ | 380/1250 [02:14<04:28, 3.24it/s]
Training 1/1 epoch (loss 1.6410): 30%|βββ | 381/1250 [02:14<04:46, 3.03it/s]
Training 1/1 epoch (loss 1.5711): 30%|βββ | 381/1250 [02:14<04:46, 3.03it/s]
Training 1/1 epoch (loss 1.5711): 31%|βββ | 382/1250 [02:14<04:48, 3.01it/s]
Training 1/1 epoch (loss 1.4735): 31%|βββ | 382/1250 [02:14<04:48, 3.01it/s]
Training 1/1 epoch (loss 1.4735): 31%|βββ | 383/1250 [02:14<04:45, 3.03it/s]
Training 1/1 epoch (loss 1.6788): 31%|βββ | 383/1250 [02:15<04:45, 3.03it/s]
Training 1/1 epoch (loss 1.6788): 31%|βββ | 384/1250 [02:15<04:49, 2.99it/s]
Training 1/1 epoch (loss 1.5759): 31%|βββ | 384/1250 [02:15<04:49, 2.99it/s]
Training 1/1 epoch (loss 1.5759): 31%|βββ | 385/1250 [02:15<04:41, 3.08it/s]
Training 1/1 epoch (loss 1.5840): 31%|βββ | 385/1250 [02:15<04:41, 3.08it/s]
Training 1/1 epoch (loss 1.5840): 31%|βββ | 386/1250 [02:15<04:39, 3.09it/s]
Training 1/1 epoch (loss 1.5222): 31%|βββ | 386/1250 [02:16<04:39, 3.09it/s]
Training 1/1 epoch (loss 1.5222): 31%|βββ | 387/1250 [02:16<04:30, 3.19it/s]
Training 1/1 epoch (loss 1.6192): 31%|βββ | 387/1250 [02:16<04:30, 3.19it/s]
Training 1/1 epoch (loss 1.6192): 31%|βββ | 388/1250 [02:16<04:38, 3.09it/s]
Training 1/1 epoch (loss 1.6874): 31%|βββ | 388/1250 [02:16<04:38, 3.09it/s]
Training 1/1 epoch (loss 1.6874): 31%|βββ | 389/1250 [02:16<04:37, 3.10it/s]
Training 1/1 epoch (loss 1.6277): 31%|βββ | 389/1250 [02:16<04:37, 3.10it/s]
Training 1/1 epoch (loss 1.6277): 31%|βββ | 390/1250 [02:16<04:32, 3.15it/s]
Training 1/1 epoch (loss 1.5674): 31%|βββ | 390/1250 [02:17<04:32, 3.15it/s]
Training 1/1 epoch (loss 1.5674): 31%|ββββ | 391/1250 [02:17<04:25, 3.24it/s]
Training 1/1 epoch (loss 1.6080): 31%|ββββ | 391/1250 [02:17<04:25, 3.24it/s]
Training 1/1 epoch (loss 1.6080): 31%|ββββ | 392/1250 [02:17<04:30, 3.17it/s]
Training 1/1 epoch (loss 1.5808): 31%|ββββ | 392/1250 [02:17<04:30, 3.17it/s]
Training 1/1 epoch (loss 1.5808): 31%|ββββ | 393/1250 [02:17<04:32, 3.15it/s]
Training 1/1 epoch (loss 1.6048): 31%|ββββ | 393/1250 [02:18<04:32, 3.15it/s]
Training 1/1 epoch (loss 1.6048): 32%|ββββ | 394/1250 [02:18<04:35, 3.11it/s]
Training 1/1 epoch (loss 1.4598): 32%|ββββ | 394/1250 [02:18<04:35, 3.11it/s]
Training 1/1 epoch (loss 1.4598): 32%|ββββ | 395/1250 [02:18<04:48, 2.96it/s]
Training 1/1 epoch (loss 1.5735): 32%|ββββ | 395/1250 [02:18<04:48, 2.96it/s]
Training 1/1 epoch (loss 1.5735): 32%|ββββ | 396/1250 [02:18<04:46, 2.98it/s]
Training 1/1 epoch (loss 1.6423): 32%|ββββ | 396/1250 [02:19<04:46, 2.98it/s]
Training 1/1 epoch (loss 1.6423): 32%|ββββ | 397/1250 [02:19<04:50, 2.94it/s]
Training 1/1 epoch (loss 1.5781): 32%|ββββ | 397/1250 [02:19<04:50, 2.94it/s]
Training 1/1 epoch (loss 1.5781): 32%|ββββ | 398/1250 [02:19<04:36, 3.08it/s]
Training 1/1 epoch (loss 1.5440): 32%|ββββ | 398/1250 [02:19<04:36, 3.08it/s]
Training 1/1 epoch (loss 1.5440): 32%|ββββ | 399/1250 [02:19<04:38, 3.06it/s]
Training 1/1 epoch (loss 1.4194): 32%|ββββ | 399/1250 [02:20<04:38, 3.06it/s]
Training 1/1 epoch (loss 1.4194): 32%|ββββ | 400/1250 [02:20<04:59, 2.84it/s]
Training 1/1 epoch (loss 1.5606): 32%|ββββ | 400/1250 [02:20<04:59, 2.84it/s]
Training 1/1 epoch (loss 1.5606): 32%|ββββ | 401/1250 [02:20<04:56, 2.87it/s]
Training 1/1 epoch (loss 1.5525): 32%|ββββ | 401/1250 [02:20<04:56, 2.87it/s]
Training 1/1 epoch (loss 1.5525): 32%|ββββ | 402/1250 [02:20<04:41, 3.01it/s]
Training 1/1 epoch (loss 1.5812): 32%|ββββ | 402/1250 [02:21<04:41, 3.01it/s]
Training 1/1 epoch (loss 1.5812): 32%|ββββ | 403/1250 [02:21<04:34, 3.09it/s]
Training 1/1 epoch (loss 1.5651): 32%|ββββ | 403/1250 [02:21<04:34, 3.09it/s]
Training 1/1 epoch (loss 1.5651): 32%|ββββ | 404/1250 [02:21<04:30, 3.12it/s]
Training 1/1 epoch (loss 1.5520): 32%|ββββ | 404/1250 [02:21<04:30, 3.12it/s]
Training 1/1 epoch (loss 1.5520): 32%|ββββ | 405/1250 [02:21<04:28, 3.15it/s]
Training 1/1 epoch (loss 1.5062): 32%|ββββ | 405/1250 [02:22<04:28, 3.15it/s]
Training 1/1 epoch (loss 1.5062): 32%|ββββ | 406/1250 [02:22<04:30, 3.13it/s]
Training 1/1 epoch (loss 1.6264): 32%|ββββ | 406/1250 [02:22<04:30, 3.13it/s]
Training 1/1 epoch (loss 1.6264): 33%|ββββ | 407/1250 [02:22<04:34, 3.07it/s]
Training 1/1 epoch (loss 1.7141): 33%|ββββ | 407/1250 [02:22<04:34, 3.07it/s]
Training 1/1 epoch (loss 1.7141): 33%|ββββ | 408/1250 [02:22<04:37, 3.03it/s]
Training 1/1 epoch (loss 1.4806): 33%|ββββ | 408/1250 [02:23<04:37, 3.03it/s]
Training 1/1 epoch (loss 1.4806): 33%|ββββ | 409/1250 [02:23<04:32, 3.09it/s]
Training 1/1 epoch (loss 1.5548): 33%|ββββ | 409/1250 [02:23<04:32, 3.09it/s]
Training 1/1 epoch (loss 1.5548): 33%|ββββ | 410/1250 [02:23<04:29, 3.12it/s]
Training 1/1 epoch (loss 1.6440): 33%|ββββ | 410/1250 [02:23<04:29, 3.12it/s]
Training 1/1 epoch (loss 1.6440): 33%|ββββ | 411/1250 [02:23<04:23, 3.18it/s]
Training 1/1 epoch (loss 1.6685): 33%|ββββ | 411/1250 [02:24<04:23, 3.18it/s]
Training 1/1 epoch (loss 1.6685): 33%|ββββ | 412/1250 [02:24<04:26, 3.14it/s]
Training 1/1 epoch (loss 1.6321): 33%|ββββ | 412/1250 [02:24<04:26, 3.14it/s]
Training 1/1 epoch (loss 1.6321): 33%|ββββ | 413/1250 [02:24<04:36, 3.03it/s]
Training 1/1 epoch (loss 1.6345): 33%|ββββ | 413/1250 [02:24<04:36, 3.03it/s]
Training 1/1 epoch (loss 1.6345): 33%|ββββ | 414/1250 [02:24<04:29, 3.11it/s]
Training 1/1 epoch (loss 1.5570): 33%|ββββ | 414/1250 [02:25<04:29, 3.11it/s]
Training 1/1 epoch (loss 1.5570): 33%|ββββ | 415/1250 [02:25<04:27, 3.12it/s]
Training 1/1 epoch (loss 1.5400): 33%|ββββ | 415/1250 [02:25<04:27, 3.12it/s]
Training 1/1 epoch (loss 1.5400): 33%|ββββ | 416/1250 [02:25<04:26, 3.13it/s]
Training 1/1 epoch (loss 1.5810): 33%|ββββ | 416/1250 [02:25<04:26, 3.13it/s]
Training 1/1 epoch (loss 1.5810): 33%|ββββ | 417/1250 [02:25<04:24, 3.15it/s]
Training 1/1 epoch (loss 1.4769): 33%|ββββ | 417/1250 [02:26<04:24, 3.15it/s]
Training 1/1 epoch (loss 1.4769): 33%|ββββ | 418/1250 [02:26<04:20, 3.19it/s]
Training 1/1 epoch (loss 1.6649): 33%|ββββ | 418/1250 [02:26<04:20, 3.19it/s]
Training 1/1 epoch (loss 1.6649): 34%|ββββ | 419/1250 [02:26<04:20, 3.19it/s]
Training 1/1 epoch (loss 1.5848): 34%|ββββ | 419/1250 [02:26<04:20, 3.19it/s]
Training 1/1 epoch (loss 1.5848): 34%|ββββ | 420/1250 [02:26<04:30, 3.07it/s]
Training 1/1 epoch (loss 1.6661): 34%|ββββ | 420/1250 [02:27<04:30, 3.07it/s]
Training 1/1 epoch (loss 1.6661): 34%|ββββ | 421/1250 [02:27<04:22, 3.16it/s]
Training 1/1 epoch (loss 1.7186): 34%|ββββ | 421/1250 [02:27<04:22, 3.16it/s]
Training 1/1 epoch (loss 1.7186): 34%|ββββ | 422/1250 [02:27<04:15, 3.24it/s]
Training 1/1 epoch (loss 1.6724): 34%|ββββ | 422/1250 [02:27<04:15, 3.24it/s]
Training 1/1 epoch (loss 1.6724): 34%|ββββ | 423/1250 [02:27<04:40, 2.94it/s]
Training 1/1 epoch (loss 1.6397): 34%|ββββ | 423/1250 [02:28<04:40, 2.94it/s]
Training 1/1 epoch (loss 1.6397): 34%|ββββ | 424/1250 [02:28<04:39, 2.96it/s]
Training 1/1 epoch (loss 1.7289): 34%|ββββ | 424/1250 [02:28<04:39, 2.96it/s]
Training 1/1 epoch (loss 1.7289): 34%|ββββ | 425/1250 [02:28<04:34, 3.01it/s]
Training 1/1 epoch (loss 1.7364): 34%|ββββ | 425/1250 [02:28<04:34, 3.01it/s]
Training 1/1 epoch (loss 1.7364): 34%|ββββ | 426/1250 [02:28<05:01, 2.73it/s]
Training 1/1 epoch (loss 1.5318): 34%|ββββ | 426/1250 [02:29<05:01, 2.73it/s]
Training 1/1 epoch (loss 1.5318): 34%|ββββ | 427/1250 [02:29<04:47, 2.86it/s]
Training 1/1 epoch (loss 1.6076): 34%|ββββ | 427/1250 [02:29<04:47, 2.86it/s]
Training 1/1 epoch (loss 1.6076): 34%|ββββ | 428/1250 [02:29<04:34, 2.99it/s]
Training 1/1 epoch (loss 1.5975): 34%|ββββ | 428/1250 [02:29<04:34, 2.99it/s]
Training 1/1 epoch (loss 1.5975): 34%|ββββ | 429/1250 [02:29<04:41, 2.91it/s]
Training 1/1 epoch (loss 1.6277): 34%|ββββ | 429/1250 [02:30<04:41, 2.91it/s]
Training 1/1 epoch (loss 1.6277): 34%|ββββ | 430/1250 [02:30<04:49, 2.83it/s]
Training 1/1 epoch (loss 1.5431): 34%|ββββ | 430/1250 [02:30<04:49, 2.83it/s]
Training 1/1 epoch (loss 1.5431): 34%|ββββ | 431/1250 [02:30<04:44, 2.87it/s]
Training 1/1 epoch (loss 1.4651): 34%|ββββ | 431/1250 [02:30<04:44, 2.87it/s]
Training 1/1 epoch (loss 1.4651): 35%|ββββ | 432/1250 [02:30<04:39, 2.93it/s]
Training 1/1 epoch (loss 1.6423): 35%|ββββ | 432/1250 [02:31<04:39, 2.93it/s]
Training 1/1 epoch (loss 1.6423): 35%|ββββ | 433/1250 [02:31<04:32, 2.99it/s]
Training 1/1 epoch (loss 1.5034): 35%|ββββ | 433/1250 [02:31<04:32, 2.99it/s]
Training 1/1 epoch (loss 1.5034): 35%|ββββ | 434/1250 [02:31<04:22, 3.11it/s]
Training 1/1 epoch (loss 1.6181): 35%|ββββ | 434/1250 [02:31<04:22, 3.11it/s]
Training 1/1 epoch (loss 1.6181): 35%|ββββ | 435/1250 [02:31<04:19, 3.13it/s]
Training 1/1 epoch (loss 1.6415): 35%|ββββ | 435/1250 [02:32<04:19, 3.13it/s]
Training 1/1 epoch (loss 1.6415): 35%|ββββ | 436/1250 [02:32<04:22, 3.10it/s]
Training 1/1 epoch (loss 1.4412): 35%|ββββ | 436/1250 [02:32<04:22, 3.10it/s]
Training 1/1 epoch (loss 1.4412): 35%|ββββ | 437/1250 [02:32<04:19, 3.14it/s]
Training 1/1 epoch (loss 1.4939): 35%|ββββ | 437/1250 [02:32<04:19, 3.14it/s]
Training 1/1 epoch (loss 1.4939): 35%|ββββ | 438/1250 [02:32<04:15, 3.18it/s]
Training 1/1 epoch (loss 1.6320): 35%|ββββ | 438/1250 [02:33<04:15, 3.18it/s]
Training 1/1 epoch (loss 1.6320): 35%|ββββ | 439/1250 [02:33<04:15, 3.18it/s]
Training 1/1 epoch (loss 1.6461): 35%|ββββ | 439/1250 [02:33<04:15, 3.18it/s]
Training 1/1 epoch (loss 1.6461): 35%|ββββ | 440/1250 [02:33<04:17, 3.15it/s]
Training 1/1 epoch (loss 1.4778): 35%|ββββ | 440/1250 [02:33<04:17, 3.15it/s]
Training 1/1 epoch (loss 1.4778): 35%|ββββ | 441/1250 [02:33<04:18, 3.13it/s]
Training 1/1 epoch (loss 1.6374): 35%|ββββ | 441/1250 [02:34<04:18, 3.13it/s]
Training 1/1 epoch (loss 1.6374): 35%|ββββ | 442/1250 [02:34<04:17, 3.13it/s]
Training 1/1 epoch (loss 1.4930): 35%|ββββ | 442/1250 [02:34<04:17, 3.13it/s]
Training 1/1 epoch (loss 1.4930): 35%|ββββ | 443/1250 [02:34<04:21, 3.09it/s]
Training 1/1 epoch (loss 1.6086): 35%|ββββ | 443/1250 [02:35<04:21, 3.09it/s]
Training 1/1 epoch (loss 1.6086): 36%|ββββ | 444/1250 [02:35<06:03, 2.22it/s]
Training 1/1 epoch (loss 1.6976): 36%|ββββ | 444/1250 [02:35<06:03, 2.22it/s]
Training 1/1 epoch (loss 1.6976): 36%|ββββ | 445/1250 [02:35<05:25, 2.47it/s]
Training 1/1 epoch (loss 1.5860): 36%|ββββ | 445/1250 [02:35<05:25, 2.47it/s]
Training 1/1 epoch (loss 1.5860): 36%|ββββ | 446/1250 [02:35<05:02, 2.66it/s]
Training 1/1 epoch (loss 1.5478): 36%|ββββ | 446/1250 [02:36<05:02, 2.66it/s]
Training 1/1 epoch (loss 1.5478): 36%|ββββ | 447/1250 [02:36<04:54, 2.73it/s]
Training 1/1 epoch (loss 1.5977): 36%|ββββ | 447/1250 [02:36<04:54, 2.73it/s]
Training 1/1 epoch (loss 1.5977): 36%|ββββ | 448/1250 [02:36<04:48, 2.78it/s]
Training 1/1 epoch (loss 1.5446): 36%|ββββ | 448/1250 [02:36<04:48, 2.78it/s]
Training 1/1 epoch (loss 1.5446): 36%|ββββ | 449/1250 [02:36<04:49, 2.77it/s]
Training 1/1 epoch (loss 1.6645): 36%|ββββ | 449/1250 [02:37<04:49, 2.77it/s]
Training 1/1 epoch (loss 1.6645): 36%|ββββ | 450/1250 [02:37<04:33, 2.93it/s]
Training 1/1 epoch (loss 1.6707): 36%|ββββ | 450/1250 [02:37<04:33, 2.93it/s]
Training 1/1 epoch (loss 1.6707): 36%|ββββ | 451/1250 [02:37<04:29, 2.96it/s]
Training 1/1 epoch (loss 1.6320): 36%|ββββ | 451/1250 [02:37<04:29, 2.96it/s]
Training 1/1 epoch (loss 1.6320): 36%|ββββ | 452/1250 [02:37<04:21, 3.05it/s]
Training 1/1 epoch (loss 1.6627): 36%|ββββ | 452/1250 [02:37<04:21, 3.05it/s]
Training 1/1 epoch (loss 1.6627): 36%|ββββ | 453/1250 [02:37<04:19, 3.07it/s]
Training 1/1 epoch (loss 1.6500): 36%|ββββ | 453/1250 [02:38<04:19, 3.07it/s]
Training 1/1 epoch (loss 1.6500): 36%|ββββ | 454/1250 [02:38<04:24, 3.01it/s]
Training 1/1 epoch (loss 1.6804): 36%|ββββ | 454/1250 [02:38<04:24, 3.01it/s]
Training 1/1 epoch (loss 1.6804): 36%|ββββ | 455/1250 [02:38<04:24, 3.00it/s]
Training 1/1 epoch (loss 1.5469): 36%|ββββ | 455/1250 [02:39<04:24, 3.00it/s]
Training 1/1 epoch (loss 1.5469): 36%|ββββ | 456/1250 [02:39<04:23, 3.01it/s]
Training 1/1 epoch (loss 1.5104): 36%|ββββ | 456/1250 [02:39<04:23, 3.01it/s]
Training 1/1 epoch (loss 1.5104): 37%|ββββ | 457/1250 [02:39<04:16, 3.09it/s]
Training 1/1 epoch (loss 1.5762): 37%|ββββ | 457/1250 [02:39<04:16, 3.09it/s]
Training 1/1 epoch (loss 1.5762): 37%|ββββ | 458/1250 [02:39<04:14, 3.11it/s]
Training 1/1 epoch (loss 1.6359): 37%|ββββ | 458/1250 [02:39<04:14, 3.11it/s]
Training 1/1 epoch (loss 1.6359): 37%|ββββ | 459/1250 [02:39<04:15, 3.10it/s]
Training 1/1 epoch (loss 1.5895): 37%|ββββ | 459/1250 [02:40<04:15, 3.10it/s]
Training 1/1 epoch (loss 1.5895): 37%|ββββ | 460/1250 [02:40<04:19, 3.05it/s]
Training 1/1 epoch (loss 1.5720): 37%|ββββ | 460/1250 [02:40<04:19, 3.05it/s]
Training 1/1 epoch (loss 1.5720): 37%|ββββ | 461/1250 [02:40<04:37, 2.84it/s]
Training 1/1 epoch (loss 1.5889): 37%|ββββ | 461/1250 [02:41<04:37, 2.84it/s]
Training 1/1 epoch (loss 1.5889): 37%|ββββ | 462/1250 [02:41<04:28, 2.93it/s]
Training 1/1 epoch (loss 1.5692): 37%|ββββ | 462/1250 [02:41<04:28, 2.93it/s]
Training 1/1 epoch (loss 1.5692): 37%|ββββ | 463/1250 [02:41<04:21, 3.01it/s]
Training 1/1 epoch (loss 1.5641): 37%|ββββ | 463/1250 [02:41<04:21, 3.01it/s]
Training 1/1 epoch (loss 1.5641): 37%|ββββ | 464/1250 [02:41<04:17, 3.05it/s]
Training 1/1 epoch (loss 1.5356): 37%|ββββ | 464/1250 [02:41<04:17, 3.05it/s]
Training 1/1 epoch (loss 1.5356): 37%|ββββ | 465/1250 [02:41<04:19, 3.03it/s]
Training 1/1 epoch (loss 1.5776): 37%|ββββ | 465/1250 [02:42<04:19, 3.03it/s]
Training 1/1 epoch (loss 1.5776): 37%|ββββ | 466/1250 [02:42<04:14, 3.08it/s]
Training 1/1 epoch (loss 1.5931): 37%|ββββ | 466/1250 [02:42<04:14, 3.08it/s]
Training 1/1 epoch (loss 1.5931): 37%|ββββ | 467/1250 [02:42<04:17, 3.04it/s]
Training 1/1 epoch (loss 1.5348): 37%|ββββ | 467/1250 [02:42<04:17, 3.04it/s]
Training 1/1 epoch (loss 1.5348): 37%|ββββ | 468/1250 [02:42<04:13, 3.09it/s]
Training 1/1 epoch (loss 1.5332): 37%|ββββ | 468/1250 [02:43<04:13, 3.09it/s]
Training 1/1 epoch (loss 1.5332): 38%|ββββ | 469/1250 [02:43<04:09, 3.13it/s]
Training 1/1 epoch (loss 1.4508): 38%|ββββ | 469/1250 [02:43<04:09, 3.13it/s]
Training 1/1 epoch (loss 1.4508): 38%|ββββ | 470/1250 [02:43<04:04, 3.19it/s]
Training 1/1 epoch (loss 1.4672): 38%|ββββ | 470/1250 [02:43<04:04, 3.19it/s]
Training 1/1 epoch (loss 1.4672): 38%|ββββ | 471/1250 [02:43<04:09, 3.12it/s]
Training 1/1 epoch (loss 1.6180): 38%|ββββ | 471/1250 [02:44<04:09, 3.12it/s]
Training 1/1 epoch (loss 1.6180): 38%|ββββ | 472/1250 [02:44<04:13, 3.07it/s]
Training 1/1 epoch (loss 1.5695): 38%|ββββ | 472/1250 [02:44<04:13, 3.07it/s]
Training 1/1 epoch (loss 1.5695): 38%|ββββ | 473/1250 [02:44<04:15, 3.04it/s]
Training 1/1 epoch (loss 1.5001): 38%|ββββ | 473/1250 [02:44<04:15, 3.04it/s]
Training 1/1 epoch (loss 1.5001): 38%|ββββ | 474/1250 [02:44<04:16, 3.02it/s]
Training 1/1 epoch (loss 1.4415): 38%|ββββ | 474/1250 [02:45<04:16, 3.02it/s]
Training 1/1 epoch (loss 1.4415): 38%|ββββ | 475/1250 [02:45<04:30, 2.86it/s]
Training 1/1 epoch (loss 1.5828): 38%|ββββ | 475/1250 [02:45<04:30, 2.86it/s]
Training 1/1 epoch (loss 1.5828): 38%|ββββ | 476/1250 [02:45<04:29, 2.87it/s]
Training 1/1 epoch (loss 1.5718): 38%|ββββ | 476/1250 [02:46<04:29, 2.87it/s]
Training 1/1 epoch (loss 1.5718): 38%|ββββ | 477/1250 [02:46<04:34, 2.82it/s]
Training 1/1 epoch (loss 1.6308): 38%|ββββ | 477/1250 [02:46<04:34, 2.82it/s]
Training 1/1 epoch (loss 1.6308): 38%|ββββ | 478/1250 [02:46<04:26, 2.89it/s]
Training 1/1 epoch (loss 1.5572): 38%|ββββ | 478/1250 [02:46<04:26, 2.89it/s]
Training 1/1 epoch (loss 1.5572): 38%|ββββ | 479/1250 [02:46<04:30, 2.85it/s]
Training 1/1 epoch (loss 1.5445): 38%|ββββ | 479/1250 [02:47<04:30, 2.85it/s]
Training 1/1 epoch (loss 1.5445): 38%|ββββ | 480/1250 [02:47<04:24, 2.91it/s]
Training 1/1 epoch (loss 1.4971): 38%|ββββ | 480/1250 [02:47<04:24, 2.91it/s]
Training 1/1 epoch (loss 1.4971): 38%|ββββ | 481/1250 [02:47<04:16, 3.00it/s]
Training 1/1 epoch (loss 1.4669): 38%|ββββ | 481/1250 [02:47<04:16, 3.00it/s]
Training 1/1 epoch (loss 1.4669): 39%|ββββ | 482/1250 [02:47<04:10, 3.06it/s]
Training 1/1 epoch (loss 1.5631): 39%|ββββ | 482/1250 [02:47<04:10, 3.06it/s]
Training 1/1 epoch (loss 1.5631): 39%|ββββ | 483/1250 [02:47<04:10, 3.06it/s]
Training 1/1 epoch (loss 1.5626): 39%|ββββ | 483/1250 [02:48<04:10, 3.06it/s]
Training 1/1 epoch (loss 1.5626): 39%|ββββ | 484/1250 [02:48<04:10, 3.06it/s]
Training 1/1 epoch (loss 1.6418): 39%|ββββ | 484/1250 [02:48<04:10, 3.06it/s]
Training 1/1 epoch (loss 1.6418): 39%|ββββ | 485/1250 [02:48<04:13, 3.01it/s]
Training 1/1 epoch (loss 1.5728): 39%|ββββ | 485/1250 [02:48<04:13, 3.01it/s]
Training 1/1 epoch (loss 1.5728): 39%|ββββ | 486/1250 [02:48<04:07, 3.09it/s]
Training 1/1 epoch (loss 1.4971): 39%|ββββ | 486/1250 [02:49<04:07, 3.09it/s]
Training 1/1 epoch (loss 1.4971): 39%|ββββ | 487/1250 [02:49<04:04, 3.12it/s]
Training 1/1 epoch (loss 1.6212): 39%|ββββ | 487/1250 [02:49<04:04, 3.12it/s]
Training 1/1 epoch (loss 1.6212): 39%|ββββ | 488/1250 [02:49<04:12, 3.02it/s]
Training 1/1 epoch (loss 1.4947): 39%|ββββ | 488/1250 [02:49<04:12, 3.02it/s]
Training 1/1 epoch (loss 1.4947): 39%|ββββ | 489/1250 [02:49<04:11, 3.02it/s]
Training 1/1 epoch (loss 1.6791): 39%|ββββ | 489/1250 [02:50<04:11, 3.02it/s]
Training 1/1 epoch (loss 1.6791): 39%|ββββ | 490/1250 [02:50<04:08, 3.05it/s]
Training 1/1 epoch (loss 1.5923): 39%|ββββ | 490/1250 [02:50<04:08, 3.05it/s]
Training 1/1 epoch (loss 1.5923): 39%|ββββ | 491/1250 [02:50<04:50, 2.61it/s]
Training 1/1 epoch (loss 1.5701): 39%|ββββ | 491/1250 [02:51<04:50, 2.61it/s]
Training 1/1 epoch (loss 1.5701): 39%|ββββ | 492/1250 [02:51<04:45, 2.66it/s]
Training 1/1 epoch (loss 1.5429): 39%|ββββ | 492/1250 [02:51<04:45, 2.66it/s]
Training 1/1 epoch (loss 1.5429): 39%|ββββ | 493/1250 [02:51<04:40, 2.70it/s]
Training 1/1 epoch (loss 1.5094): 39%|ββββ | 493/1250 [02:51<04:40, 2.70it/s]
Training 1/1 epoch (loss 1.5094): 40%|ββββ | 494/1250 [02:51<04:32, 2.77it/s]
Training 1/1 epoch (loss 1.5811): 40%|ββββ | 494/1250 [02:52<04:32, 2.77it/s]
Training 1/1 epoch (loss 1.5811): 40%|ββββ | 495/1250 [02:52<04:32, 2.77it/s]
Training 1/1 epoch (loss 1.6165): 40%|ββββ | 495/1250 [02:52<04:32, 2.77it/s]
Training 1/1 epoch (loss 1.6165): 40%|ββββ | 496/1250 [02:52<04:42, 2.67it/s]
Training 1/1 epoch (loss 1.5320): 40%|ββββ | 496/1250 [02:52<04:42, 2.67it/s]
Training 1/1 epoch (loss 1.5320): 40%|ββββ | 497/1250 [02:52<04:34, 2.74it/s]
Training 1/1 epoch (loss 1.5143): 40%|ββββ | 497/1250 [02:53<04:34, 2.74it/s]
Training 1/1 epoch (loss 1.5143): 40%|ββββ | 498/1250 [02:53<04:19, 2.90it/s]
Training 1/1 epoch (loss 1.6379): 40%|ββββ | 498/1250 [02:53<04:19, 2.90it/s]
Training 1/1 epoch (loss 1.6379): 40%|ββββ | 499/1250 [02:53<04:17, 2.92it/s]
Training 1/1 epoch (loss 1.5052): 40%|ββββ | 499/1250 [02:53<04:17, 2.92it/s]
Training 1/1 epoch (loss 1.5052): 40%|ββββ | 500/1250 [02:53<04:07, 3.03it/s]
Training 1/1 epoch (loss 1.5211): 40%|ββββ | 500/1250 [02:54<04:07, 3.03it/s]
Training 1/1 epoch (loss 1.5211): 40%|ββββ | 501/1250 [02:54<04:03, 3.08it/s]
Training 1/1 epoch (loss 1.5569): 40%|ββββ | 501/1250 [02:54<04:03, 3.08it/s]
Training 1/1 epoch (loss 1.5569): 40%|ββββ | 502/1250 [02:54<04:01, 3.10it/s]
Training 1/1 epoch (loss 1.4826): 40%|ββββ | 502/1250 [02:54<04:01, 3.10it/s]
Training 1/1 epoch (loss 1.4826): 40%|ββββ | 503/1250 [02:54<04:04, 3.06it/s]
Training 1/1 epoch (loss 1.6105): 40%|ββββ | 503/1250 [02:55<04:04, 3.06it/s]
Training 1/1 epoch (loss 1.6105): 40%|ββββ | 504/1250 [02:55<04:08, 3.00it/s]
Training 1/1 epoch (loss 1.5121): 40%|ββββ | 504/1250 [02:55<04:08, 3.00it/s]
Training 1/1 epoch (loss 1.5121): 40%|ββββ | 505/1250 [02:55<03:59, 3.11it/s]
Training 1/1 epoch (loss 1.4784): 40%|ββββ | 505/1250 [02:55<03:59, 3.11it/s]
Training 1/1 epoch (loss 1.4784): 40%|ββββ | 506/1250 [02:55<04:03, 3.05it/s]
Training 1/1 epoch (loss 1.5300): 40%|ββββ | 506/1250 [02:56<04:03, 3.05it/s]
Training 1/1 epoch (loss 1.5300): 41%|ββββ | 507/1250 [02:56<04:05, 3.02it/s]
Training 1/1 epoch (loss 1.5866): 41%|ββββ | 507/1250 [02:56<04:05, 3.02it/s]
Training 1/1 epoch (loss 1.5866): 41%|ββββ | 508/1250 [02:56<04:02, 3.05it/s]
Training 1/1 epoch (loss 1.5370): 41%|ββββ | 508/1250 [02:56<04:02, 3.05it/s]
Training 1/1 epoch (loss 1.5370): 41%|ββββ | 509/1250 [02:56<04:06, 3.01it/s]
Training 1/1 epoch (loss 1.6328): 41%|ββββ | 509/1250 [02:57<04:06, 3.01it/s]
Training 1/1 epoch (loss 1.6328): 41%|ββββ | 510/1250 [02:57<03:59, 3.09it/s]
Training 1/1 epoch (loss 1.6216): 41%|ββββ | 510/1250 [02:57<03:59, 3.09it/s]
Training 1/1 epoch (loss 1.6216): 41%|ββββ | 511/1250 [02:57<03:54, 3.15it/s]
Training 1/1 epoch (loss 1.6231): 41%|ββββ | 511/1250 [02:57<03:54, 3.15it/s]
Training 1/1 epoch (loss 1.6231): 41%|ββββ | 512/1250 [02:57<03:55, 3.13it/s]
Training 1/1 epoch (loss 1.5906): 41%|ββββ | 512/1250 [02:58<03:55, 3.13it/s]
Training 1/1 epoch (loss 1.5906): 41%|ββββ | 513/1250 [02:58<04:06, 2.99it/s]
Training 1/1 epoch (loss 1.6143): 41%|ββββ | 513/1250 [02:58<04:06, 2.99it/s]
Training 1/1 epoch (loss 1.6143): 41%|ββββ | 514/1250 [02:58<03:59, 3.07it/s]
Training 1/1 epoch (loss 1.5638): 41%|ββββ | 514/1250 [02:58<03:59, 3.07it/s]
Training 1/1 epoch (loss 1.5638): 41%|ββββ | 515/1250 [02:58<04:04, 3.01it/s]
Training 1/1 epoch (loss 1.4185): 41%|ββββ | 515/1250 [02:59<04:04, 3.01it/s]
Training 1/1 epoch (loss 1.4185): 41%|βββββ | 516/1250 [02:59<03:57, 3.09it/s]
Training 1/1 epoch (loss 1.4724): 41%|βββββ | 516/1250 [02:59<03:57, 3.09it/s]
Training 1/1 epoch (loss 1.4724): 41%|βββββ | 517/1250 [02:59<03:53, 3.13it/s]
Training 1/1 epoch (loss 1.6071): 41%|βββββ | 517/1250 [02:59<03:53, 3.13it/s]
Training 1/1 epoch (loss 1.6071): 41%|βββββ | 518/1250 [02:59<03:49, 3.19it/s]
Training 1/1 epoch (loss 1.6825): 41%|βββββ | 518/1250 [03:00<03:49, 3.19it/s]
Training 1/1 epoch (loss 1.6825): 42%|βββββ | 519/1250 [03:00<04:12, 2.89it/s]
Training 1/1 epoch (loss 1.5490): 42%|βββββ | 519/1250 [03:00<04:12, 2.89it/s]
Training 1/1 epoch (loss 1.5490): 42%|βββββ | 520/1250 [03:00<04:06, 2.96it/s]
Training 1/1 epoch (loss 1.6199): 42%|βββββ | 520/1250 [03:00<04:06, 2.96it/s]
Training 1/1 epoch (loss 1.6199): 42%|βββββ | 521/1250 [03:00<04:06, 2.95it/s]
Training 1/1 epoch (loss 1.5604): 42%|βββββ | 521/1250 [03:01<04:06, 2.95it/s]
Training 1/1 epoch (loss 1.5604): 42%|βββββ | 522/1250 [03:01<04:12, 2.88it/s]
Training 1/1 epoch (loss 1.6412): 42%|βββββ | 522/1250 [03:01<04:12, 2.88it/s]
Training 1/1 epoch (loss 1.6412): 42%|βββββ | 523/1250 [03:01<04:26, 2.73it/s]
Training 1/1 epoch (loss 1.4999): 42%|βββββ | 523/1250 [03:01<04:26, 2.73it/s]
Training 1/1 epoch (loss 1.4999): 42%|βββββ | 524/1250 [03:01<04:30, 2.69it/s]
Training 1/1 epoch (loss 1.5667): 42%|βββββ | 524/1250 [03:02<04:30, 2.69it/s]
Training 1/1 epoch (loss 1.5667): 42%|βββββ | 525/1250 [03:02<04:41, 2.57it/s]
Training 1/1 epoch (loss 1.6243): 42%|βββββ | 525/1250 [03:02<04:41, 2.57it/s]
Training 1/1 epoch (loss 1.6243): 42%|βββββ | 526/1250 [03:02<04:55, 2.45it/s]
Training 1/1 epoch (loss 1.5905): 42%|βββββ | 526/1250 [03:03<04:55, 2.45it/s]
Training 1/1 epoch (loss 1.5905): 42%|βββββ | 527/1250 [03:03<04:54, 2.46it/s]
Training 1/1 epoch (loss 1.6059): 42%|βββββ | 527/1250 [03:03<04:54, 2.46it/s]
Training 1/1 epoch (loss 1.6059): 42%|βββββ | 528/1250 [03:03<04:57, 2.43it/s]
Training 1/1 epoch (loss 1.5396): 42%|βββββ | 528/1250 [03:04<04:57, 2.43it/s]
Training 1/1 epoch (loss 1.5396): 42%|βββββ | 529/1250 [03:04<05:00, 2.40it/s]
Training 1/1 epoch (loss 1.6206): 42%|βββββ | 529/1250 [03:04<05:00, 2.40it/s]
Training 1/1 epoch (loss 1.6206): 42%|βββββ | 530/1250 [03:04<05:05, 2.36it/s]
Training 1/1 epoch (loss 1.5264): 42%|βββββ | 530/1250 [03:04<05:05, 2.36it/s]
Training 1/1 epoch (loss 1.5264): 42%|βββββ | 531/1250 [03:04<05:04, 2.36it/s]
Training 1/1 epoch (loss 1.4780): 42%|βββββ | 531/1250 [03:05<05:04, 2.36it/s]
Training 1/1 epoch (loss 1.4780): 43%|βββββ | 532/1250 [03:05<05:02, 2.38it/s]
Training 1/1 epoch (loss 1.4864): 43%|βββββ | 532/1250 [03:05<05:02, 2.38it/s]
Training 1/1 epoch (loss 1.4864): 43%|βββββ | 533/1250 [03:05<04:57, 2.41it/s]
Training 1/1 epoch (loss 1.6297): 43%|βββββ | 533/1250 [03:06<04:57, 2.41it/s]
Training 1/1 epoch (loss 1.6297): 43%|βββββ | 534/1250 [03:06<04:53, 2.44it/s]
Training 1/1 epoch (loss 1.5705): 43%|βββββ | 534/1250 [03:06<04:53, 2.44it/s]
Training 1/1 epoch (loss 1.5705): 43%|βββββ | 535/1250 [03:06<05:04, 2.35it/s]
Training 1/1 epoch (loss 1.5919): 43%|βββββ | 535/1250 [03:07<05:04, 2.35it/s]
Training 1/1 epoch (loss 1.5919): 43%|βββββ | 536/1250 [03:07<05:06, 2.33it/s]
Training 1/1 epoch (loss 1.4289): 43%|βββββ | 536/1250 [03:07<05:06, 2.33it/s]
Training 1/1 epoch (loss 1.4289): 43%|βββββ | 537/1250 [03:07<04:57, 2.39it/s]
Training 1/1 epoch (loss 1.5196): 43%|βββββ | 537/1250 [03:07<04:57, 2.39it/s]
Training 1/1 epoch (loss 1.5196): 43%|βββββ | 538/1250 [03:07<04:38, 2.56it/s]
Training 1/1 epoch (loss 1.4980): 43%|βββββ | 538/1250 [03:08<04:38, 2.56it/s]
Training 1/1 epoch (loss 1.4980): 43%|βββββ | 539/1250 [03:08<04:22, 2.70it/s]
Training 1/1 epoch (loss 1.6320): 43%|βββββ | 539/1250 [03:08<04:22, 2.70it/s]
Training 1/1 epoch (loss 1.6320): 43%|βββββ | 540/1250 [03:08<04:19, 2.74it/s]
Training 1/1 epoch (loss 1.5796): 43%|βββββ | 540/1250 [03:08<04:19, 2.74it/s]
Training 1/1 epoch (loss 1.5796): 43%|βββββ | 541/1250 [03:08<04:20, 2.72it/s]
Training 1/1 epoch (loss 1.5933): 43%|βββββ | 541/1250 [03:09<04:20, 2.72it/s]
Training 1/1 epoch (loss 1.5933): 43%|βββββ | 542/1250 [03:09<04:16, 2.76it/s]
Training 1/1 epoch (loss 1.5705): 43%|βββββ | 542/1250 [03:09<04:16, 2.76it/s]
Training 1/1 epoch (loss 1.5705): 43%|βββββ | 543/1250 [03:09<04:04, 2.89it/s]
Training 1/1 epoch (loss 1.6099): 43%|βββββ | 543/1250 [03:09<04:04, 2.89it/s]
Training 1/1 epoch (loss 1.6099): 44%|βββββ | 544/1250 [03:09<03:58, 2.95it/s]
Training 1/1 epoch (loss 1.5870): 44%|βββββ | 544/1250 [03:10<03:58, 2.95it/s]
Training 1/1 epoch (loss 1.5870): 44%|βββββ | 545/1250 [03:10<04:07, 2.85it/s]
Training 1/1 epoch (loss 1.4679): 44%|βββββ | 545/1250 [03:10<04:07, 2.85it/s]
Training 1/1 epoch (loss 1.4679): 44%|βββββ | 546/1250 [03:10<04:02, 2.90it/s]
Training 1/1 epoch (loss 1.5485): 44%|βββββ | 546/1250 [03:10<04:02, 2.90it/s]
Training 1/1 epoch (loss 1.5485): 44%|βββββ | 547/1250 [03:10<04:00, 2.93it/s]
Training 1/1 epoch (loss 1.5745): 44%|βββββ | 547/1250 [03:11<04:00, 2.93it/s]
Training 1/1 epoch (loss 1.5745): 44%|βββββ | 548/1250 [03:11<03:56, 2.97it/s]
Training 1/1 epoch (loss 1.5693): 44%|βββββ | 548/1250 [03:11<03:56, 2.97it/s]
Training 1/1 epoch (loss 1.5693): 44%|βββββ | 549/1250 [03:11<03:59, 2.93it/s]
Training 1/1 epoch (loss 1.4975): 44%|βββββ | 549/1250 [03:11<03:59, 2.93it/s]
Training 1/1 epoch (loss 1.4975): 44%|βββββ | 550/1250 [03:11<04:00, 2.91it/s]
Training 1/1 epoch (loss 1.5892): 44%|βββββ | 550/1250 [03:12<04:00, 2.91it/s]
Training 1/1 epoch (loss 1.5892): 44%|βββββ | 551/1250 [03:12<04:07, 2.83it/s]
Training 1/1 epoch (loss 1.5555): 44%|βββββ | 551/1250 [03:12<04:07, 2.83it/s]
Training 1/1 epoch (loss 1.5555): 44%|βββββ | 552/1250 [03:12<04:05, 2.85it/s]
Training 1/1 epoch (loss 1.3951): 44%|βββββ | 552/1250 [03:12<04:05, 2.85it/s]
Training 1/1 epoch (loss 1.3951): 44%|βββββ | 553/1250 [03:12<03:57, 2.94it/s]
Training 1/1 epoch (loss 1.5651): 44%|βββββ | 553/1250 [03:13<03:57, 2.94it/s]
Training 1/1 epoch (loss 1.5651): 44%|βββββ | 554/1250 [03:13<03:50, 3.02it/s]
Training 1/1 epoch (loss 1.5935): 44%|βββββ | 554/1250 [03:13<03:50, 3.02it/s]
Training 1/1 epoch (loss 1.5935): 44%|βββββ | 555/1250 [03:13<03:45, 3.08it/s]
Training 1/1 epoch (loss 1.6183): 44%|βββββ | 555/1250 [03:13<03:45, 3.08it/s]
Training 1/1 epoch (loss 1.6183): 44%|βββββ | 556/1250 [03:13<03:38, 3.17it/s]
Training 1/1 epoch (loss 1.6070): 44%|βββββ | 556/1250 [03:14<03:38, 3.17it/s]
Training 1/1 epoch (loss 1.6070): 45%|βββββ | 557/1250 [03:14<03:37, 3.19it/s]
Training 1/1 epoch (loss 1.5314): 45%|βββββ | 557/1250 [03:14<03:37, 3.19it/s]
Training 1/1 epoch (loss 1.5314): 45%|βββββ | 558/1250 [03:14<03:39, 3.16it/s]
Training 1/1 epoch (loss 1.5149): 45%|βββββ | 558/1250 [03:14<03:39, 3.16it/s]
Training 1/1 epoch (loss 1.5149): 45%|βββββ | 559/1250 [03:14<03:42, 3.10it/s]
Training 1/1 epoch (loss 1.5681): 45%|βββββ | 559/1250 [03:15<03:42, 3.10it/s]
Training 1/1 epoch (loss 1.5681): 45%|βββββ | 560/1250 [03:15<03:50, 2.99it/s]
Training 1/1 epoch (loss 1.6535): 45%|βββββ | 560/1250 [03:15<03:50, 2.99it/s]
Training 1/1 epoch (loss 1.6535): 45%|βββββ | 561/1250 [03:15<03:46, 3.04it/s]
Training 1/1 epoch (loss 1.5790): 45%|βββββ | 561/1250 [03:15<03:46, 3.04it/s]
Training 1/1 epoch (loss 1.5790): 45%|βββββ | 562/1250 [03:15<03:40, 3.12it/s]
Training 1/1 epoch (loss 1.5462): 45%|βββββ | 562/1250 [03:16<03:40, 3.12it/s]
Training 1/1 epoch (loss 1.5462): 45%|βββββ | 563/1250 [03:16<03:39, 3.13it/s]
Training 1/1 epoch (loss 1.6233): 45%|βββββ | 563/1250 [03:16<03:39, 3.13it/s]
Training 1/1 epoch (loss 1.6233): 45%|βββββ | 564/1250 [03:16<03:39, 3.13it/s]
Training 1/1 epoch (loss 1.5984): 45%|βββββ | 564/1250 [03:16<03:39, 3.13it/s]
Training 1/1 epoch (loss 1.5984): 45%|βββββ | 565/1250 [03:16<03:42, 3.08it/s]
Training 1/1 epoch (loss 1.5777): 45%|βββββ | 565/1250 [03:17<03:42, 3.08it/s]
Training 1/1 epoch (loss 1.5777): 45%|βββββ | 566/1250 [03:17<03:40, 3.10it/s]
Training 1/1 epoch (loss 1.5382): 45%|βββββ | 566/1250 [03:17<03:40, 3.10it/s]
Training 1/1 epoch (loss 1.5382): 45%|βββββ | 567/1250 [03:17<03:35, 3.18it/s]
Training 1/1 epoch (loss 1.5715): 45%|βββββ | 567/1250 [03:17<03:35, 3.18it/s]
Training 1/1 epoch (loss 1.5715): 45%|βββββ | 568/1250 [03:17<03:40, 3.09it/s]
Training 1/1 epoch (loss 1.5353): 45%|βββββ | 568/1250 [03:18<03:40, 3.09it/s]
Training 1/1 epoch (loss 1.5353): 46%|βββββ | 569/1250 [03:18<03:45, 3.02it/s]
Training 1/1 epoch (loss 1.5787): 46%|βββββ | 569/1250 [03:18<03:45, 3.02it/s]
Training 1/1 epoch (loss 1.5787): 46%|βββββ | 570/1250 [03:18<03:48, 2.98it/s]
Training 1/1 epoch (loss 1.4551): 46%|βββββ | 570/1250 [03:18<03:48, 2.98it/s]
Training 1/1 epoch (loss 1.4551): 46%|βββββ | 571/1250 [03:18<03:43, 3.04it/s]
Training 1/1 epoch (loss 1.6713): 46%|βββββ | 571/1250 [03:19<03:43, 3.04it/s]
Training 1/1 epoch (loss 1.6713): 46%|βββββ | 572/1250 [03:19<03:39, 3.09it/s]
Training 1/1 epoch (loss 1.6150): 46%|βββββ | 572/1250 [03:19<03:39, 3.09it/s]
Training 1/1 epoch (loss 1.6150): 46%|βββββ | 573/1250 [03:19<03:33, 3.18it/s]
Training 1/1 epoch (loss 1.6411): 46%|βββββ | 573/1250 [03:19<03:33, 3.18it/s]
Training 1/1 epoch (loss 1.6411): 46%|βββββ | 574/1250 [03:19<03:33, 3.16it/s]
Training 1/1 epoch (loss 1.4893): 46%|βββββ | 574/1250 [03:19<03:33, 3.16it/s]
Training 1/1 epoch (loss 1.4893): 46%|βββββ | 575/1250 [03:19<03:34, 3.15it/s]
Training 1/1 epoch (loss 1.6210): 46%|βββββ | 575/1250 [03:20<03:34, 3.15it/s]
Training 1/1 epoch (loss 1.6210): 46%|βββββ | 576/1250 [03:20<03:45, 2.99it/s]
Training 1/1 epoch (loss 1.6285): 46%|βββββ | 576/1250 [03:20<03:45, 2.99it/s]
Training 1/1 epoch (loss 1.6285): 46%|βββββ | 577/1250 [03:20<03:50, 2.92it/s]
Training 1/1 epoch (loss 1.5206): 46%|βββββ | 577/1250 [03:21<03:50, 2.92it/s]
Training 1/1 epoch (loss 1.5206): 46%|βββββ | 578/1250 [03:21<03:48, 2.94it/s]
Training 1/1 epoch (loss 1.3809): 46%|βββββ | 578/1250 [03:21<03:48, 2.94it/s]
Training 1/1 epoch (loss 1.3809): 46%|βββββ | 579/1250 [03:21<03:40, 3.04it/s]
Training 1/1 epoch (loss 1.5945): 46%|βββββ | 579/1250 [03:21<03:40, 3.04it/s]
Training 1/1 epoch (loss 1.5945): 46%|βββββ | 580/1250 [03:21<03:37, 3.08it/s]
Training 1/1 epoch (loss 1.4797): 46%|βββββ | 580/1250 [03:21<03:37, 3.08it/s]
Training 1/1 epoch (loss 1.4797): 46%|βββββ | 581/1250 [03:21<03:37, 3.08it/s]
Training 1/1 epoch (loss 1.6083): 46%|βββββ | 581/1250 [03:22<03:37, 3.08it/s]
Training 1/1 epoch (loss 1.6083): 47%|βββββ | 582/1250 [03:22<03:43, 2.98it/s]
Training 1/1 epoch (loss 1.5278): 47%|βββββ | 582/1250 [03:22<03:43, 2.98it/s]
Training 1/1 epoch (loss 1.5278): 47%|βββββ | 583/1250 [03:22<03:38, 3.05it/s]
Training 1/1 epoch (loss 1.5978): 47%|βββββ | 583/1250 [03:22<03:38, 3.05it/s]
Training 1/1 epoch (loss 1.5978): 47%|βββββ | 584/1250 [03:22<03:44, 2.97it/s]
Training 1/1 epoch (loss 1.6347): 47%|βββββ | 584/1250 [03:23<03:44, 2.97it/s]
Training 1/1 epoch (loss 1.6347): 47%|βββββ | 585/1250 [03:23<03:41, 3.00it/s]
Training 1/1 epoch (loss 1.6076): 47%|βββββ | 585/1250 [03:23<03:41, 3.00it/s]
Training 1/1 epoch (loss 1.6076): 47%|βββββ | 586/1250 [03:23<03:38, 3.04it/s]
Training 1/1 epoch (loss 1.5612): 47%|βββββ | 586/1250 [03:23<03:38, 3.04it/s]
Training 1/1 epoch (loss 1.5612): 47%|βββββ | 587/1250 [03:23<03:34, 3.09it/s]
Training 1/1 epoch (loss 1.6899): 47%|βββββ | 587/1250 [03:24<03:34, 3.09it/s]
Training 1/1 epoch (loss 1.6899): 47%|βββββ | 588/1250 [03:24<03:30, 3.14it/s]
Training 1/1 epoch (loss 1.4311): 47%|βββββ | 588/1250 [03:24<03:30, 3.14it/s]
Training 1/1 epoch (loss 1.4311): 47%|βββββ | 589/1250 [03:24<03:35, 3.06it/s]
Training 1/1 epoch (loss 1.6677): 47%|βββββ | 589/1250 [03:24<03:35, 3.06it/s]
Training 1/1 epoch (loss 1.6677): 47%|βββββ | 590/1250 [03:24<03:32, 3.10it/s]
Training 1/1 epoch (loss 1.5952): 47%|βββββ | 590/1250 [03:25<03:32, 3.10it/s]
Training 1/1 epoch (loss 1.5952): 47%|βββββ | 591/1250 [03:25<03:29, 3.14it/s]
Training 1/1 epoch (loss 1.5115): 47%|βββββ | 591/1250 [03:25<03:29, 3.14it/s]
Training 1/1 epoch (loss 1.5115): 47%|βββββ | 592/1250 [03:25<03:30, 3.12it/s]
Training 1/1 epoch (loss 1.5559): 47%|βββββ | 592/1250 [03:25<03:30, 3.12it/s]
Training 1/1 epoch (loss 1.5559): 47%|βββββ | 593/1250 [03:25<03:27, 3.17it/s]
Training 1/1 epoch (loss 1.4984): 47%|βββββ | 593/1250 [03:26<03:27, 3.17it/s]
Training 1/1 epoch (loss 1.4984): 48%|βββββ | 594/1250 [03:26<03:26, 3.17it/s]
Training 1/1 epoch (loss 1.5408): 48%|βββββ | 594/1250 [03:26<03:26, 3.17it/s]
Training 1/1 epoch (loss 1.5408): 48%|βββββ | 595/1250 [03:26<03:43, 2.93it/s]
Training 1/1 epoch (loss 1.4894): 48%|βββββ | 595/1250 [03:26<03:43, 2.93it/s]
Training 1/1 epoch (loss 1.4894): 48%|βββββ | 596/1250 [03:26<03:48, 2.86it/s]
Training 1/1 epoch (loss 1.5969): 48%|βββββ | 596/1250 [03:27<03:48, 2.86it/s]
Training 1/1 epoch (loss 1.5969): 48%|βββββ | 597/1250 [03:27<03:53, 2.80it/s]
Training 1/1 epoch (loss 1.5364): 48%|βββββ | 597/1250 [03:27<03:53, 2.80it/s]
Training 1/1 epoch (loss 1.5364): 48%|βββββ | 598/1250 [03:27<03:53, 2.79it/s]
Training 1/1 epoch (loss 1.4546): 48%|βββββ | 598/1250 [03:28<03:53, 2.79it/s]
Training 1/1 epoch (loss 1.4546): 48%|βββββ | 599/1250 [03:28<03:49, 2.83it/s]
Training 1/1 epoch (loss 1.4540): 48%|βββββ | 599/1250 [03:28<03:49, 2.83it/s]
Training 1/1 epoch (loss 1.4540): 48%|βββββ | 600/1250 [03:28<03:59, 2.72it/s]
Training 1/1 epoch (loss 1.4980): 48%|βββββ | 600/1250 [03:28<03:59, 2.72it/s]
Training 1/1 epoch (loss 1.4980): 48%|βββββ | 601/1250 [03:28<03:53, 2.78it/s]
Training 1/1 epoch (loss 1.6110): 48%|βββββ | 601/1250 [03:29<03:53, 2.78it/s]
Training 1/1 epoch (loss 1.6110): 48%|βββββ | 602/1250 [03:29<03:42, 2.91it/s]
Training 1/1 epoch (loss 1.6674): 48%|βββββ | 602/1250 [03:29<03:42, 2.91it/s]
Training 1/1 epoch (loss 1.6674): 48%|βββββ | 603/1250 [03:29<03:43, 2.90it/s]
Training 1/1 epoch (loss 1.4991): 48%|βββββ | 603/1250 [03:29<03:43, 2.90it/s]
Training 1/1 epoch (loss 1.4991): 48%|βββββ | 604/1250 [03:29<03:35, 3.00it/s]
Training 1/1 epoch (loss 1.5593): 48%|βββββ | 604/1250 [03:30<03:35, 3.00it/s]
Training 1/1 epoch (loss 1.5593): 48%|βββββ | 605/1250 [03:30<03:43, 2.89it/s]
Training 1/1 epoch (loss 1.4671): 48%|βββββ | 605/1250 [03:30<03:43, 2.89it/s]
Training 1/1 epoch (loss 1.4671): 48%|βββββ | 606/1250 [03:30<03:44, 2.86it/s]
Training 1/1 epoch (loss 1.5818): 48%|βββββ | 606/1250 [03:30<03:44, 2.86it/s]
Training 1/1 epoch (loss 1.5818): 49%|βββββ | 607/1250 [03:30<03:48, 2.82it/s]
Training 1/1 epoch (loss 1.6155): 49%|βββββ | 607/1250 [03:31<03:48, 2.82it/s]
Training 1/1 epoch (loss 1.6155): 49%|βββββ | 608/1250 [03:31<03:42, 2.88it/s]
Training 1/1 epoch (loss 1.5732): 49%|βββββ | 608/1250 [03:31<03:42, 2.88it/s]
Training 1/1 epoch (loss 1.5732): 49%|βββββ | 609/1250 [03:31<03:30, 3.05it/s]
Training 1/1 epoch (loss 1.5103): 49%|βββββ | 609/1250 [03:31<03:30, 3.05it/s]
Training 1/1 epoch (loss 1.5103): 49%|βββββ | 610/1250 [03:31<03:32, 3.01it/s]
Training 1/1 epoch (loss 1.4958): 49%|βββββ | 610/1250 [03:32<03:32, 3.01it/s]
Training 1/1 epoch (loss 1.4958): 49%|βββββ | 611/1250 [03:32<03:22, 3.15it/s]
Training 1/1 epoch (loss 1.6313): 49%|βββββ | 611/1250 [03:32<03:22, 3.15it/s]
Training 1/1 epoch (loss 1.6313): 49%|βββββ | 612/1250 [03:32<03:28, 3.05it/s]
Training 1/1 epoch (loss 1.4736): 49%|βββββ | 612/1250 [03:32<03:28, 3.05it/s]
Training 1/1 epoch (loss 1.4736): 49%|βββββ | 613/1250 [03:32<03:33, 2.98it/s]
Training 1/1 epoch (loss 1.6702): 49%|βββββ | 613/1250 [03:33<03:33, 2.98it/s]
Training 1/1 epoch (loss 1.6702): 49%|βββββ | 614/1250 [03:33<03:27, 3.06it/s]
Training 1/1 epoch (loss 1.4929): 49%|βββββ | 614/1250 [03:33<03:27, 3.06it/s]
Training 1/1 epoch (loss 1.4929): 49%|βββββ | 615/1250 [03:33<03:23, 3.11it/s]
Training 1/1 epoch (loss 1.6564): 49%|βββββ | 615/1250 [03:33<03:23, 3.11it/s]
Training 1/1 epoch (loss 1.6564): 49%|βββββ | 616/1250 [03:33<03:27, 3.05it/s]
Training 1/1 epoch (loss 1.4992): 49%|βββββ | 616/1250 [03:34<03:27, 3.05it/s]
Training 1/1 epoch (loss 1.4992): 49%|βββββ | 617/1250 [03:34<03:26, 3.07it/s]
Training 1/1 epoch (loss 1.6698): 49%|βββββ | 617/1250 [03:34<03:26, 3.07it/s]
Training 1/1 epoch (loss 1.6698): 49%|βββββ | 618/1250 [03:34<03:30, 3.00it/s]
Training 1/1 epoch (loss 1.5076): 49%|βββββ | 618/1250 [03:34<03:30, 3.00it/s]
Training 1/1 epoch (loss 1.5076): 50%|βββββ | 619/1250 [03:34<03:43, 2.82it/s]
Training 1/1 epoch (loss 1.4736): 50%|βββββ | 619/1250 [03:35<03:43, 2.82it/s]
Training 1/1 epoch (loss 1.4736): 50%|βββββ | 620/1250 [03:35<03:45, 2.79it/s]
Training 1/1 epoch (loss 1.5016): 50%|βββββ | 620/1250 [03:35<03:45, 2.79it/s]
Training 1/1 epoch (loss 1.5016): 50%|βββββ | 621/1250 [03:35<03:43, 2.81it/s]
Training 1/1 epoch (loss 1.5647): 50%|βββββ | 621/1250 [03:35<03:43, 2.81it/s]
Training 1/1 epoch (loss 1.5647): 50%|βββββ | 622/1250 [03:35<03:43, 2.81it/s]
Training 1/1 epoch (loss 1.5357): 50%|βββββ | 622/1250 [03:36<03:43, 2.81it/s]
Training 1/1 epoch (loss 1.5357): 50%|βββββ | 623/1250 [03:36<03:42, 2.82it/s]
Training 1/1 epoch (loss 1.5774): 50%|βββββ | 623/1250 [03:36<03:42, 2.82it/s]
Training 1/1 epoch (loss 1.5774): 50%|βββββ | 624/1250 [03:36<03:56, 2.65it/s]
Training 1/1 epoch (loss 1.5092): 50%|βββββ | 624/1250 [03:37<03:56, 2.65it/s]
Training 1/1 epoch (loss 1.5092): 50%|βββββ | 625/1250 [03:37<03:51, 2.70it/s]
Training 1/1 epoch (loss 1.5632): 50%|βββββ | 625/1250 [03:37<03:51, 2.70it/s]
Training 1/1 epoch (loss 1.5632): 50%|βββββ | 626/1250 [03:37<03:39, 2.85it/s]
Training 1/1 epoch (loss 1.7102): 50%|βββββ | 626/1250 [03:37<03:39, 2.85it/s]
Training 1/1 epoch (loss 1.7102): 50%|βββββ | 627/1250 [03:37<03:35, 2.89it/s]
Training 1/1 epoch (loss 1.5602): 50%|βββββ | 627/1250 [03:37<03:35, 2.89it/s]
Training 1/1 epoch (loss 1.5602): 50%|βββββ | 628/1250 [03:37<03:33, 2.91it/s]
Training 1/1 epoch (loss 1.5394): 50%|βββββ | 628/1250 [03:38<03:33, 2.91it/s]
Training 1/1 epoch (loss 1.5394): 50%|βββββ | 629/1250 [03:38<03:28, 2.98it/s]
Training 1/1 epoch (loss 1.6313): 50%|βββββ | 629/1250 [03:38<03:28, 2.98it/s]
Training 1/1 epoch (loss 1.6313): 50%|βββββ | 630/1250 [03:38<03:25, 3.02it/s]
Training 1/1 epoch (loss 1.6110): 50%|βββββ | 630/1250 [03:38<03:25, 3.02it/s]
Training 1/1 epoch (loss 1.6110): 50%|βββββ | 631/1250 [03:38<03:24, 3.03it/s]
Training 1/1 epoch (loss 1.5054): 50%|βββββ | 631/1250 [03:39<03:24, 3.03it/s]
Training 1/1 epoch (loss 1.5054): 51%|βββββ | 632/1250 [03:39<03:25, 3.01it/s]
Training 1/1 epoch (loss 1.5611): 51%|βββββ | 632/1250 [03:39<03:25, 3.01it/s]
Training 1/1 epoch (loss 1.5611): 51%|βββββ | 633/1250 [03:39<03:20, 3.08it/s]
Training 1/1 epoch (loss 1.5268): 51%|βββββ | 633/1250 [03:39<03:20, 3.08it/s]
Training 1/1 epoch (loss 1.5268): 51%|βββββ | 634/1250 [03:39<03:15, 3.16it/s]
Training 1/1 epoch (loss 1.4175): 51%|βββββ | 634/1250 [03:40<03:15, 3.16it/s]
Training 1/1 epoch (loss 1.4175): 51%|βββββ | 635/1250 [03:40<03:15, 3.14it/s]
Training 1/1 epoch (loss 1.5070): 51%|βββββ | 635/1250 [03:40<03:15, 3.14it/s]
Training 1/1 epoch (loss 1.5070): 51%|βββββ | 636/1250 [03:40<03:15, 3.14it/s]
Training 1/1 epoch (loss 1.5028): 51%|βββββ | 636/1250 [03:40<03:15, 3.14it/s]
Training 1/1 epoch (loss 1.5028): 51%|βββββ | 637/1250 [03:40<03:21, 3.05it/s]
Training 1/1 epoch (loss 1.5524): 51%|βββββ | 637/1250 [03:41<03:21, 3.05it/s]
Training 1/1 epoch (loss 1.5524): 51%|βββββ | 638/1250 [03:41<03:18, 3.09it/s]
Training 1/1 epoch (loss 1.6153): 51%|βββββ | 638/1250 [03:41<03:18, 3.09it/s]
Training 1/1 epoch (loss 1.6153): 51%|βββββ | 639/1250 [03:41<03:11, 3.20it/s]
Training 1/1 epoch (loss 1.5490): 51%|βββββ | 639/1250 [03:41<03:11, 3.20it/s]
Training 1/1 epoch (loss 1.5490): 51%|βββββ | 640/1250 [03:41<03:18, 3.07it/s]
Training 1/1 epoch (loss 1.5697): 51%|βββββ | 640/1250 [03:42<03:18, 3.07it/s]
Training 1/1 epoch (loss 1.5697): 51%|ββββββ | 641/1250 [03:42<03:17, 3.08it/s]
Training 1/1 epoch (loss 1.5100): 51%|ββββββ | 641/1250 [03:42<03:17, 3.08it/s]
Training 1/1 epoch (loss 1.5100): 51%|ββββββ | 642/1250 [03:42<03:16, 3.09it/s]
Training 1/1 epoch (loss 1.6577): 51%|ββββββ | 642/1250 [03:42<03:16, 3.09it/s]
Training 1/1 epoch (loss 1.6577): 51%|ββββββ | 643/1250 [03:42<03:32, 2.85it/s]
Training 1/1 epoch (loss 1.4945): 51%|ββββββ | 643/1250 [03:43<03:32, 2.85it/s]
Training 1/1 epoch (loss 1.4945): 52%|ββββββ | 644/1250 [03:43<03:26, 2.94it/s]
Training 1/1 epoch (loss 1.4853): 52%|ββββββ | 644/1250 [03:43<03:26, 2.94it/s]
Training 1/1 epoch (loss 1.4853): 52%|ββββββ | 645/1250 [03:43<03:19, 3.03it/s]
Training 1/1 epoch (loss 1.6125): 52%|ββββββ | 645/1250 [03:43<03:19, 3.03it/s]
Training 1/1 epoch (loss 1.6125): 52%|ββββββ | 646/1250 [03:43<03:15, 3.09it/s]
Training 1/1 epoch (loss 1.4895): 52%|ββββββ | 646/1250 [03:44<03:15, 3.09it/s]
Training 1/1 epoch (loss 1.4895): 52%|ββββββ | 647/1250 [03:44<03:13, 3.11it/s]
Training 1/1 epoch (loss 1.5908): 52%|ββββββ | 647/1250 [03:44<03:13, 3.11it/s]
Training 1/1 epoch (loss 1.5908): 52%|ββββββ | 648/1250 [03:44<03:21, 2.99it/s]
Training 1/1 epoch (loss 1.4284): 52%|ββββββ | 648/1250 [03:44<03:21, 2.99it/s]
Training 1/1 epoch (loss 1.4284): 52%|ββββββ | 649/1250 [03:44<03:28, 2.88it/s]
Training 1/1 epoch (loss 1.6306): 52%|ββββββ | 649/1250 [03:45<03:28, 2.88it/s]
Training 1/1 epoch (loss 1.6306): 52%|ββββββ | 650/1250 [03:45<03:22, 2.96it/s]
Training 1/1 epoch (loss 1.4964): 52%|ββββββ | 650/1250 [03:45<03:22, 2.96it/s]
Training 1/1 epoch (loss 1.4964): 52%|ββββββ | 651/1250 [03:45<03:18, 3.02it/s]
Training 1/1 epoch (loss 1.5366): 52%|ββββββ | 651/1250 [03:45<03:18, 3.02it/s]
Training 1/1 epoch (loss 1.5366): 52%|ββββββ | 652/1250 [03:45<03:12, 3.11it/s]
Training 1/1 epoch (loss 1.5395): 52%|ββββββ | 652/1250 [03:46<03:12, 3.11it/s]
Training 1/1 epoch (loss 1.5395): 52%|ββββββ | 653/1250 [03:46<03:11, 3.11it/s]
Training 1/1 epoch (loss 1.6482): 52%|ββββββ | 653/1250 [03:46<03:11, 3.11it/s]
Training 1/1 epoch (loss 1.6482): 52%|ββββββ | 654/1250 [03:46<03:10, 3.13it/s]
Training 1/1 epoch (loss 1.5043): 52%|ββββββ | 654/1250 [03:46<03:10, 3.13it/s]
Training 1/1 epoch (loss 1.5043): 52%|ββββββ | 655/1250 [03:46<03:17, 3.01it/s]
Training 1/1 epoch (loss 1.5558): 52%|ββββββ | 655/1250 [03:47<03:17, 3.01it/s]
Training 1/1 epoch (loss 1.5558): 52%|ββββββ | 656/1250 [03:47<03:16, 3.02it/s]
Training 1/1 epoch (loss 1.6135): 52%|ββββββ | 656/1250 [03:47<03:16, 3.02it/s]
Training 1/1 epoch (loss 1.6135): 53%|ββββββ | 657/1250 [03:47<03:09, 3.13it/s]
Training 1/1 epoch (loss 1.4348): 53%|ββββββ | 657/1250 [03:47<03:09, 3.13it/s]
Training 1/1 epoch (loss 1.4348): 53%|ββββββ | 658/1250 [03:47<03:10, 3.11it/s]
Training 1/1 epoch (loss 1.4710): 53%|ββββββ | 658/1250 [03:48<03:10, 3.11it/s]
Training 1/1 epoch (loss 1.4710): 53%|ββββββ | 659/1250 [03:48<03:16, 3.01it/s]
Training 1/1 epoch (loss 1.3850): 53%|ββββββ | 659/1250 [03:48<03:16, 3.01it/s]
Training 1/1 epoch (loss 1.3850): 53%|ββββββ | 660/1250 [03:48<03:13, 3.05it/s]
Training 1/1 epoch (loss 1.5009): 53%|ββββββ | 660/1250 [03:48<03:13, 3.05it/s]
Training 1/1 epoch (loss 1.5009): 53%|ββββββ | 661/1250 [03:48<03:20, 2.94it/s]
Training 1/1 epoch (loss 1.5064): 53%|ββββββ | 661/1250 [03:49<03:20, 2.94it/s]
Training 1/1 epoch (loss 1.5064): 53%|ββββββ | 662/1250 [03:49<03:15, 3.00it/s]
Training 1/1 epoch (loss 1.5555): 53%|ββββββ | 662/1250 [03:49<03:15, 3.00it/s]
Training 1/1 epoch (loss 1.5555): 53%|ββββββ | 663/1250 [03:49<03:11, 3.07it/s]
Training 1/1 epoch (loss 1.5485): 53%|ββββββ | 663/1250 [03:49<03:11, 3.07it/s]
Training 1/1 epoch (loss 1.5485): 53%|ββββββ | 664/1250 [03:49<03:11, 3.05it/s]
Training 1/1 epoch (loss 1.5486): 53%|ββββββ | 664/1250 [03:50<03:11, 3.05it/s]
Training 1/1 epoch (loss 1.5486): 53%|ββββββ | 665/1250 [03:50<03:08, 3.10it/s]
Training 1/1 epoch (loss 1.6450): 53%|ββββββ | 665/1250 [03:50<03:08, 3.10it/s]
Training 1/1 epoch (loss 1.6450): 53%|ββββββ | 666/1250 [03:50<03:09, 3.08it/s]
Training 1/1 epoch (loss 1.5050): 53%|ββββββ | 666/1250 [03:50<03:09, 3.08it/s]
Training 1/1 epoch (loss 1.5050): 53%|ββββββ | 667/1250 [03:50<03:10, 3.07it/s]
Training 1/1 epoch (loss 1.5568): 53%|ββββββ | 667/1250 [03:51<03:10, 3.07it/s]
Training 1/1 epoch (loss 1.5568): 53%|ββββββ | 668/1250 [03:51<03:09, 3.07it/s]
Training 1/1 epoch (loss 1.5551): 53%|ββββββ | 668/1250 [03:51<03:09, 3.07it/s]
Training 1/1 epoch (loss 1.5551): 54%|ββββββ | 669/1250 [03:51<03:08, 3.09it/s]
Training 1/1 epoch (loss 1.5806): 54%|ββββββ | 669/1250 [03:51<03:08, 3.09it/s]
Training 1/1 epoch (loss 1.5806): 54%|ββββββ | 670/1250 [03:51<03:02, 3.19it/s]
Training 1/1 epoch (loss 1.4892): 54%|ββββββ | 670/1250 [03:51<03:02, 3.19it/s]
Training 1/1 epoch (loss 1.4892): 54%|ββββββ | 671/1250 [03:51<03:01, 3.18it/s]
Training 1/1 epoch (loss 1.4861): 54%|ββββββ | 671/1250 [03:52<03:01, 3.18it/s]
Training 1/1 epoch (loss 1.4861): 54%|ββββββ | 672/1250 [03:52<03:07, 3.08it/s]
Training 1/1 epoch (loss 1.5145): 54%|ββββββ | 672/1250 [03:52<03:07, 3.08it/s]
Training 1/1 epoch (loss 1.5145): 54%|ββββββ | 673/1250 [03:52<03:07, 3.08it/s]
Training 1/1 epoch (loss 1.5171): 54%|ββββββ | 673/1250 [03:52<03:07, 3.08it/s]
Training 1/1 epoch (loss 1.5171): 54%|ββββββ | 674/1250 [03:52<03:06, 3.08it/s]
Training 1/1 epoch (loss 1.5632): 54%|ββββββ | 674/1250 [03:53<03:06, 3.08it/s]
Training 1/1 epoch (loss 1.5632): 54%|ββββββ | 675/1250 [03:53<03:12, 2.98it/s]
Training 1/1 epoch (loss 1.5361): 54%|ββββββ | 675/1250 [03:53<03:12, 2.98it/s]
Training 1/1 epoch (loss 1.5361): 54%|ββββββ | 676/1250 [03:53<03:07, 3.07it/s]
Training 1/1 epoch (loss 1.5601): 54%|ββββββ | 676/1250 [03:53<03:07, 3.07it/s]
Training 1/1 epoch (loss 1.5601): 54%|ββββββ | 677/1250 [03:53<03:03, 3.12it/s]
Training 1/1 epoch (loss 1.5361): 54%|ββββββ | 677/1250 [03:54<03:03, 3.12it/s]
Training 1/1 epoch (loss 1.5361): 54%|ββββββ | 678/1250 [03:54<03:04, 3.09it/s]
Training 1/1 epoch (loss 1.5355): 54%|ββββββ | 678/1250 [03:54<03:04, 3.09it/s]
Training 1/1 epoch (loss 1.5355): 54%|ββββββ | 679/1250 [03:54<03:06, 3.07it/s]
Training 1/1 epoch (loss 1.5598): 54%|ββββββ | 679/1250 [03:54<03:06, 3.07it/s]
Training 1/1 epoch (loss 1.5598): 54%|ββββββ | 680/1250 [03:54<03:06, 3.05it/s]
Training 1/1 epoch (loss 1.4968): 54%|ββββββ | 680/1250 [03:55<03:06, 3.05it/s]
Training 1/1 epoch (loss 1.4968): 54%|ββββββ | 681/1250 [03:55<03:04, 3.08it/s]
Training 1/1 epoch (loss 1.5678): 54%|ββββββ | 681/1250 [03:55<03:04, 3.08it/s]
Training 1/1 epoch (loss 1.5678): 55%|ββββββ | 682/1250 [03:55<02:58, 3.19it/s]
Training 1/1 epoch (loss 1.5511): 55%|ββββββ | 682/1250 [03:55<02:58, 3.19it/s]
Training 1/1 epoch (loss 1.5511): 55%|ββββββ | 683/1250 [03:55<03:00, 3.14it/s]
Training 1/1 epoch (loss 1.6106): 55%|ββββββ | 683/1250 [03:56<03:00, 3.14it/s]
Training 1/1 epoch (loss 1.6106): 55%|ββββββ | 684/1250 [03:56<02:58, 3.17it/s]
Training 1/1 epoch (loss 1.5098): 55%|ββββββ | 684/1250 [03:56<02:58, 3.17it/s]
Training 1/1 epoch (loss 1.5098): 55%|ββββββ | 685/1250 [03:56<03:00, 3.14it/s]
Training 1/1 epoch (loss 1.5290): 55%|ββββββ | 685/1250 [03:56<03:00, 3.14it/s]
Training 1/1 epoch (loss 1.5290): 55%|ββββββ | 686/1250 [03:56<03:00, 3.13it/s]
Training 1/1 epoch (loss 1.5427): 55%|ββββββ | 686/1250 [03:57<03:00, 3.13it/s]
Training 1/1 epoch (loss 1.5427): 55%|ββββββ | 687/1250 [03:57<02:57, 3.18it/s]
Training 1/1 epoch (loss 1.3919): 55%|ββββββ | 687/1250 [03:57<02:57, 3.18it/s]
Training 1/1 epoch (loss 1.3919): 55%|ββββββ | 688/1250 [03:57<02:56, 3.19it/s]
Training 1/1 epoch (loss 1.4859): 55%|ββββββ | 688/1250 [03:57<02:56, 3.19it/s]
Training 1/1 epoch (loss 1.4859): 55%|ββββββ | 689/1250 [03:57<02:56, 3.17it/s]
Training 1/1 epoch (loss 1.4165): 55%|ββββββ | 689/1250 [03:58<02:56, 3.17it/s]
Training 1/1 epoch (loss 1.4165): 55%|ββββββ | 690/1250 [03:58<02:58, 3.14it/s]
Training 1/1 epoch (loss 1.5640): 55%|ββββββ | 690/1250 [03:58<02:58, 3.14it/s]
Training 1/1 epoch (loss 1.5640): 55%|ββββββ | 691/1250 [03:58<03:02, 3.06it/s]
Training 1/1 epoch (loss 1.4841): 55%|ββββββ | 691/1250 [03:58<03:02, 3.06it/s]
Training 1/1 epoch (loss 1.4841): 55%|ββββββ | 692/1250 [03:58<03:08, 2.96it/s]
Training 1/1 epoch (loss 1.5701): 55%|ββββββ | 692/1250 [03:59<03:08, 2.96it/s]
Training 1/1 epoch (loss 1.5701): 55%|ββββββ | 693/1250 [03:59<03:02, 3.05it/s]
Training 1/1 epoch (loss 1.4587): 55%|ββββββ | 693/1250 [03:59<03:02, 3.05it/s]
Training 1/1 epoch (loss 1.4587): 56%|ββββββ | 694/1250 [03:59<03:17, 2.81it/s]
Training 1/1 epoch (loss 1.5958): 56%|ββββββ | 694/1250 [03:59<03:17, 2.81it/s]
Training 1/1 epoch (loss 1.5958): 56%|ββββββ | 695/1250 [03:59<03:10, 2.92it/s]
Training 1/1 epoch (loss 1.6715): 56%|ββββββ | 695/1250 [04:00<03:10, 2.92it/s]
Training 1/1 epoch (loss 1.6715): 56%|ββββββ | 696/1250 [04:00<03:13, 2.87it/s]
Training 1/1 epoch (loss 1.5506): 56%|ββββββ | 696/1250 [04:00<03:13, 2.87it/s]
Training 1/1 epoch (loss 1.5506): 56%|ββββββ | 697/1250 [04:00<03:12, 2.88it/s]
Training 1/1 epoch (loss 1.5667): 56%|ββββββ | 697/1250 [04:00<03:12, 2.88it/s]
Training 1/1 epoch (loss 1.5667): 56%|ββββββ | 698/1250 [04:00<03:09, 2.91it/s]
Training 1/1 epoch (loss 1.6549): 56%|ββββββ | 698/1250 [04:01<03:09, 2.91it/s]
Training 1/1 epoch (loss 1.6549): 56%|ββββββ | 699/1250 [04:01<03:05, 2.97it/s]
Training 1/1 epoch (loss 1.3779): 56%|ββββββ | 699/1250 [04:01<03:05, 2.97it/s]
Training 1/1 epoch (loss 1.3779): 56%|ββββββ | 700/1250 [04:01<03:03, 3.00it/s]
Training 1/1 epoch (loss 1.5554): 56%|ββββββ | 700/1250 [04:01<03:03, 3.00it/s]
Training 1/1 epoch (loss 1.5554): 56%|ββββββ | 701/1250 [04:01<02:58, 3.08it/s]
Training 1/1 epoch (loss 1.4032): 56%|ββββββ | 701/1250 [04:02<02:58, 3.08it/s]
Training 1/1 epoch (loss 1.4032): 56%|ββββββ | 702/1250 [04:02<02:56, 3.11it/s]
Training 1/1 epoch (loss 1.5705): 56%|ββββββ | 702/1250 [04:02<02:56, 3.11it/s]
Training 1/1 epoch (loss 1.5705): 56%|ββββββ | 703/1250 [04:02<02:54, 3.13it/s]
Training 1/1 epoch (loss 1.4012): 56%|ββββββ | 703/1250 [04:02<02:54, 3.13it/s]
Training 1/1 epoch (loss 1.4012): 56%|ββββββ | 704/1250 [04:02<03:09, 2.88it/s]
Training 1/1 epoch (loss 1.5793): 56%|ββββββ | 704/1250 [04:03<03:09, 2.88it/s]
Training 1/1 epoch (loss 1.5793): 56%|ββββββ | 705/1250 [04:03<03:04, 2.96it/s]
Training 1/1 epoch (loss 1.5035): 56%|ββββββ | 705/1250 [04:03<03:04, 2.96it/s]
Training 1/1 epoch (loss 1.5035): 56%|ββββββ | 706/1250 [04:03<02:59, 3.03it/s]
Training 1/1 epoch (loss 1.6331): 56%|ββββββ | 706/1250 [04:03<02:59, 3.03it/s]
Training 1/1 epoch (loss 1.6331): 57%|ββββββ | 707/1250 [04:03<03:03, 2.96it/s]
Training 1/1 epoch (loss 1.6060): 57%|ββββββ | 707/1250 [04:04<03:03, 2.96it/s]
Training 1/1 epoch (loss 1.6060): 57%|ββββββ | 708/1250 [04:04<03:00, 3.00it/s]
Training 1/1 epoch (loss 1.5164): 57%|ββββββ | 708/1250 [04:04<03:00, 3.00it/s]
Training 1/1 epoch (loss 1.5164): 57%|ββββββ | 709/1250 [04:04<02:59, 3.01it/s]
Training 1/1 epoch (loss 1.5647): 57%|ββββββ | 709/1250 [04:04<02:59, 3.01it/s]
Training 1/1 epoch (loss 1.5647): 57%|ββββββ | 710/1250 [04:04<03:02, 2.96it/s]
Training 1/1 epoch (loss 1.5979): 57%|ββββββ | 710/1250 [04:05<03:02, 2.96it/s]
Training 1/1 epoch (loss 1.5979): 57%|ββββββ | 711/1250 [04:05<02:56, 3.05it/s]
Training 1/1 epoch (loss 1.4790): 57%|ββββββ | 711/1250 [04:05<02:56, 3.05it/s]
Training 1/1 epoch (loss 1.4790): 57%|ββββββ | 712/1250 [04:05<02:53, 3.09it/s]
Training 1/1 epoch (loss 1.5869): 57%|ββββββ | 712/1250 [04:05<02:53, 3.09it/s]
Training 1/1 epoch (loss 1.5869): 57%|ββββββ | 713/1250 [04:05<02:53, 3.10it/s]
Training 1/1 epoch (loss 1.4049): 57%|ββββββ | 713/1250 [04:06<02:53, 3.10it/s]
Training 1/1 epoch (loss 1.4049): 57%|ββββββ | 714/1250 [04:06<02:53, 3.08it/s]
Training 1/1 epoch (loss 1.5129): 57%|ββββββ | 714/1250 [04:06<02:53, 3.08it/s]
Training 1/1 epoch (loss 1.5129): 57%|ββββββ | 715/1250 [04:06<02:50, 3.14it/s]
Training 1/1 epoch (loss 1.6267): 57%|ββββββ | 715/1250 [04:06<02:50, 3.14it/s]
Training 1/1 epoch (loss 1.6267): 57%|ββββββ | 716/1250 [04:06<02:52, 3.10it/s]
Training 1/1 epoch (loss 1.6487): 57%|ββββββ | 716/1250 [04:07<02:52, 3.10it/s]
Training 1/1 epoch (loss 1.6487): 57%|ββββββ | 717/1250 [04:07<02:51, 3.10it/s]
Training 1/1 epoch (loss 1.5512): 57%|ββββββ | 717/1250 [04:07<02:51, 3.10it/s]
Training 1/1 epoch (loss 1.5512): 57%|ββββββ | 718/1250 [04:07<02:48, 3.16it/s]
Training 1/1 epoch (loss 1.6256): 57%|ββββββ | 718/1250 [04:07<02:48, 3.16it/s]
Training 1/1 epoch (loss 1.6256): 58%|ββββββ | 719/1250 [04:07<02:44, 3.22it/s]
Training 1/1 epoch (loss 1.4680): 58%|ββββββ | 719/1250 [04:08<02:44, 3.22it/s]
Training 1/1 epoch (loss 1.4680): 58%|ββββββ | 720/1250 [04:08<02:47, 3.16it/s]
Training 1/1 epoch (loss 1.5120): 58%|ββββββ | 720/1250 [04:08<02:47, 3.16it/s]
Training 1/1 epoch (loss 1.5120): 58%|ββββββ | 721/1250 [04:08<02:53, 3.04it/s]
Training 1/1 epoch (loss 1.5342): 58%|ββββββ | 721/1250 [04:08<02:53, 3.04it/s]
Training 1/1 epoch (loss 1.5342): 58%|ββββββ | 722/1250 [04:08<02:55, 3.01it/s]
Training 1/1 epoch (loss 1.5233): 58%|ββββββ | 722/1250 [04:09<02:55, 3.01it/s]
Training 1/1 epoch (loss 1.5233): 58%|ββββββ | 723/1250 [04:09<03:02, 2.89it/s]
Training 1/1 epoch (loss 1.4893): 58%|ββββββ | 723/1250 [04:09<03:02, 2.89it/s]
Training 1/1 epoch (loss 1.4893): 58%|ββββββ | 724/1250 [04:09<02:55, 2.99it/s]
Training 1/1 epoch (loss 1.5236): 58%|ββββββ | 724/1250 [04:09<02:55, 2.99it/s]
Training 1/1 epoch (loss 1.5236): 58%|ββββββ | 725/1250 [04:09<02:48, 3.11it/s]
Training 1/1 epoch (loss 1.5478): 58%|ββββββ | 725/1250 [04:10<02:48, 3.11it/s]
Training 1/1 epoch (loss 1.5478): 58%|ββββββ | 726/1250 [04:10<02:46, 3.15it/s]
Training 1/1 epoch (loss 1.5741): 58%|ββββββ | 726/1250 [04:10<02:46, 3.15it/s]
Training 1/1 epoch (loss 1.5741): 58%|ββββββ | 727/1250 [04:10<02:45, 3.16it/s]
Training 1/1 epoch (loss 1.5890): 58%|ββββββ | 727/1250 [04:10<02:45, 3.16it/s]
Training 1/1 epoch (loss 1.5890): 58%|ββββββ | 728/1250 [04:10<02:49, 3.08it/s]
Training 1/1 epoch (loss 1.5317): 58%|ββββββ | 728/1250 [04:10<02:49, 3.08it/s]
Training 1/1 epoch (loss 1.5317): 58%|ββββββ | 729/1250 [04:10<02:50, 3.06it/s]
Training 1/1 epoch (loss 1.6553): 58%|ββββββ | 729/1250 [04:11<02:50, 3.06it/s]
Training 1/1 epoch (loss 1.6553): 58%|ββββββ | 730/1250 [04:11<02:47, 3.10it/s]
Training 1/1 epoch (loss 1.6161): 58%|ββββββ | 730/1250 [04:11<02:47, 3.10it/s]
Training 1/1 epoch (loss 1.6161): 58%|ββββββ | 731/1250 [04:11<02:50, 3.04it/s]
Training 1/1 epoch (loss 1.5242): 58%|ββββββ | 731/1250 [04:11<02:50, 3.04it/s]
Training 1/1 epoch (loss 1.5242): 59%|ββββββ | 732/1250 [04:11<02:45, 3.12it/s]
Training 1/1 epoch (loss 1.4447): 59%|ββββββ | 732/1250 [04:12<02:45, 3.12it/s]
Training 1/1 epoch (loss 1.4447): 59%|ββββββ | 733/1250 [04:12<02:52, 2.99it/s]
Training 1/1 epoch (loss 1.5695): 59%|ββββββ | 733/1250 [04:12<02:52, 2.99it/s]
Training 1/1 epoch (loss 1.5695): 59%|ββββββ | 734/1250 [04:12<02:51, 3.02it/s]
Training 1/1 epoch (loss 1.4556): 59%|ββββββ | 734/1250 [04:12<02:51, 3.02it/s]
Training 1/1 epoch (loss 1.4556): 59%|ββββββ | 735/1250 [04:12<02:48, 3.05it/s]
Training 1/1 epoch (loss 1.4958): 59%|ββββββ | 735/1250 [04:13<02:48, 3.05it/s]
Training 1/1 epoch (loss 1.4958): 59%|ββββββ | 736/1250 [04:13<02:50, 3.02it/s]
Training 1/1 epoch (loss 1.6340): 59%|ββββββ | 736/1250 [04:13<02:50, 3.02it/s]
Training 1/1 epoch (loss 1.6340): 59%|ββββββ | 737/1250 [04:13<02:46, 3.07it/s]
Training 1/1 epoch (loss 1.5707): 59%|ββββββ | 737/1250 [04:13<02:46, 3.07it/s]
Training 1/1 epoch (loss 1.5707): 59%|ββββββ | 738/1250 [04:13<02:44, 3.11it/s]
Training 1/1 epoch (loss 1.4948): 59%|ββββββ | 738/1250 [04:14<02:44, 3.11it/s]
Training 1/1 epoch (loss 1.4948): 59%|ββββββ | 739/1250 [04:14<02:50, 3.00it/s]
Training 1/1 epoch (loss 1.5104): 59%|ββββββ | 739/1250 [04:14<02:50, 3.00it/s]
Training 1/1 epoch (loss 1.5104): 59%|ββββββ | 740/1250 [04:14<02:50, 3.00it/s]
Training 1/1 epoch (loss 1.5528): 59%|ββββββ | 740/1250 [04:14<02:50, 3.00it/s]
Training 1/1 epoch (loss 1.5528): 59%|ββββββ | 741/1250 [04:14<02:54, 2.92it/s]
Training 1/1 epoch (loss 1.4689): 59%|ββββββ | 741/1250 [04:15<02:54, 2.92it/s]
Training 1/1 epoch (loss 1.4689): 59%|ββββββ | 742/1250 [04:15<02:50, 2.99it/s]
Training 1/1 epoch (loss 1.5141): 59%|ββββββ | 742/1250 [04:15<02:50, 2.99it/s]
Training 1/1 epoch (loss 1.5141): 59%|ββββββ | 743/1250 [04:15<02:46, 3.04it/s]
Training 1/1 epoch (loss 1.6149): 59%|ββββββ | 743/1250 [04:15<02:46, 3.04it/s]
Training 1/1 epoch (loss 1.6149): 60%|ββββββ | 744/1250 [04:15<02:48, 3.00it/s]
Training 1/1 epoch (loss 1.5124): 60%|ββββββ | 744/1250 [04:16<02:48, 3.00it/s]
Training 1/1 epoch (loss 1.5124): 60%|ββββββ | 745/1250 [04:16<02:51, 2.95it/s]
Training 1/1 epoch (loss 1.5428): 60%|ββββββ | 745/1250 [04:16<02:51, 2.95it/s]
Training 1/1 epoch (loss 1.5428): 60%|ββββββ | 746/1250 [04:16<02:47, 3.02it/s]
Training 1/1 epoch (loss 1.5416): 60%|ββββββ | 746/1250 [04:16<02:47, 3.02it/s]
Training 1/1 epoch (loss 1.5416): 60%|ββββββ | 747/1250 [04:16<02:46, 3.02it/s]
Training 1/1 epoch (loss 1.6113): 60%|ββββββ | 747/1250 [04:17<02:46, 3.02it/s]
Training 1/1 epoch (loss 1.6113): 60%|ββββββ | 748/1250 [04:17<02:42, 3.09it/s]
Training 1/1 epoch (loss 1.5208): 60%|ββββββ | 748/1250 [04:17<02:42, 3.09it/s]
Training 1/1 epoch (loss 1.5208): 60%|ββββββ | 749/1250 [04:17<02:40, 3.11it/s]
Training 1/1 epoch (loss 1.4705): 60%|ββββββ | 749/1250 [04:17<02:40, 3.11it/s]
Training 1/1 epoch (loss 1.4705): 60%|ββββββ | 750/1250 [04:17<02:40, 3.12it/s]
Training 1/1 epoch (loss 1.4951): 60%|ββββββ | 750/1250 [04:18<02:40, 3.12it/s]
Training 1/1 epoch (loss 1.4951): 60%|ββββββ | 751/1250 [04:18<02:41, 3.08it/s]
Training 1/1 epoch (loss 1.4975): 60%|ββββββ | 751/1250 [04:18<02:41, 3.08it/s]
Training 1/1 epoch (loss 1.4975): 60%|ββββββ | 752/1250 [04:18<02:50, 2.92it/s]
Training 1/1 epoch (loss 1.5712): 60%|ββββββ | 752/1250 [04:18<02:50, 2.92it/s]
Training 1/1 epoch (loss 1.5712): 60%|ββββββ | 753/1250 [04:18<02:50, 2.91it/s]
Training 1/1 epoch (loss 1.4695): 60%|ββββββ | 753/1250 [04:19<02:50, 2.91it/s]
Training 1/1 epoch (loss 1.4695): 60%|ββββββ | 754/1250 [04:19<02:45, 2.99it/s]
Training 1/1 epoch (loss 1.5575): 60%|ββββββ | 754/1250 [04:19<02:45, 2.99it/s]
Training 1/1 epoch (loss 1.5575): 60%|ββββββ | 755/1250 [04:19<02:47, 2.95it/s]
Training 1/1 epoch (loss 1.6193): 60%|ββββββ | 755/1250 [04:19<02:47, 2.95it/s]
Training 1/1 epoch (loss 1.6193): 60%|ββββββ | 756/1250 [04:19<02:43, 3.02it/s]
Training 1/1 epoch (loss 1.4771): 60%|ββββββ | 756/1250 [04:20<02:43, 3.02it/s]
Training 1/1 epoch (loss 1.4771): 61%|ββββββ | 757/1250 [04:20<02:43, 3.01it/s]
Training 1/1 epoch (loss 1.4873): 61%|ββββββ | 757/1250 [04:20<02:43, 3.01it/s]
Training 1/1 epoch (loss 1.4873): 61%|ββββββ | 758/1250 [04:20<02:42, 3.03it/s]
Training 1/1 epoch (loss 1.6699): 61%|ββββββ | 758/1250 [04:20<02:42, 3.03it/s]
Training 1/1 epoch (loss 1.6699): 61%|ββββββ | 759/1250 [04:20<02:43, 3.01it/s]
Training 1/1 epoch (loss 1.3874): 61%|ββββββ | 759/1250 [04:21<02:43, 3.01it/s]
Training 1/1 epoch (loss 1.3874): 61%|ββββββ | 760/1250 [04:21<02:44, 2.98it/s]
Training 1/1 epoch (loss 1.4359): 61%|ββββββ | 760/1250 [04:21<02:44, 2.98it/s]
Training 1/1 epoch (loss 1.4359): 61%|ββββββ | 761/1250 [04:21<02:43, 2.99it/s]
Training 1/1 epoch (loss 1.5229): 61%|ββββββ | 761/1250 [04:21<02:43, 2.99it/s]
Training 1/1 epoch (loss 1.5229): 61%|ββββββ | 762/1250 [04:21<02:43, 2.98it/s]
Training 1/1 epoch (loss 1.5644): 61%|ββββββ | 762/1250 [04:22<02:43, 2.98it/s]
Training 1/1 epoch (loss 1.5644): 61%|ββββββ | 763/1250 [04:22<02:41, 3.01it/s]
Training 1/1 epoch (loss 1.4707): 61%|ββββββ | 763/1250 [04:22<02:41, 3.01it/s]
Training 1/1 epoch (loss 1.4707): 61%|ββββββ | 764/1250 [04:22<02:39, 3.04it/s]
Training 1/1 epoch (loss 1.6647): 61%|ββββββ | 764/1250 [04:22<02:39, 3.04it/s]
Training 1/1 epoch (loss 1.6647): 61%|ββββββ | 765/1250 [04:22<02:43, 2.98it/s]
Training 1/1 epoch (loss 1.4921): 61%|ββββββ | 765/1250 [04:23<02:43, 2.98it/s]
Training 1/1 epoch (loss 1.4921): 61%|βββββββ | 766/1250 [04:23<02:37, 3.08it/s]
Training 1/1 epoch (loss 1.5770): 61%|βββββββ | 766/1250 [04:23<02:37, 3.08it/s]
Training 1/1 epoch (loss 1.5770): 61%|βββββββ | 767/1250 [04:23<02:37, 3.06it/s]
Training 1/1 epoch (loss 1.4378): 61%|βββββββ | 767/1250 [04:23<02:37, 3.06it/s]
Training 1/1 epoch (loss 1.4378): 61%|βββββββ | 768/1250 [04:23<02:36, 3.08it/s]
Training 1/1 epoch (loss 1.4862): 61%|βββββββ | 768/1250 [04:24<02:36, 3.08it/s]
Training 1/1 epoch (loss 1.4862): 62%|βββββββ | 769/1250 [04:24<02:36, 3.08it/s]
Training 1/1 epoch (loss 1.5283): 62%|βββββββ | 769/1250 [04:24<02:36, 3.08it/s]
Training 1/1 epoch (loss 1.5283): 62%|βββββββ | 770/1250 [04:24<02:37, 3.04it/s]
Training 1/1 epoch (loss 1.4122): 62%|βββββββ | 770/1250 [04:24<02:37, 3.04it/s]
Training 1/1 epoch (loss 1.4122): 62%|βββββββ | 771/1250 [04:24<02:43, 2.94it/s]
Training 1/1 epoch (loss 1.5304): 62%|βββββββ | 771/1250 [04:25<02:43, 2.94it/s]
Training 1/1 epoch (loss 1.5304): 62%|βββββββ | 772/1250 [04:25<02:36, 3.06it/s]
Training 1/1 epoch (loss 1.5475): 62%|βββββββ | 772/1250 [04:25<02:36, 3.06it/s]
Training 1/1 epoch (loss 1.5475): 62%|βββββββ | 773/1250 [04:25<02:33, 3.12it/s]
Training 1/1 epoch (loss 1.4357): 62%|βββββββ | 773/1250 [04:25<02:33, 3.12it/s]
Training 1/1 epoch (loss 1.4357): 62%|βββββββ | 774/1250 [04:25<02:33, 3.10it/s]
Training 1/1 epoch (loss 1.6378): 62%|βββββββ | 774/1250 [04:26<02:33, 3.10it/s]
Training 1/1 epoch (loss 1.6378): 62%|βββββββ | 775/1250 [04:26<02:31, 3.13it/s]
Training 1/1 epoch (loss 1.4972): 62%|βββββββ | 775/1250 [04:26<02:31, 3.13it/s]
Training 1/1 epoch (loss 1.4972): 62%|βββββββ | 776/1250 [04:26<02:34, 3.06it/s]
Training 1/1 epoch (loss 1.4782): 62%|βββββββ | 776/1250 [04:26<02:34, 3.06it/s]
Training 1/1 epoch (loss 1.4782): 62%|βββββββ | 777/1250 [04:26<02:38, 2.99it/s]
Training 1/1 epoch (loss 1.6839): 62%|βββββββ | 777/1250 [04:27<02:38, 2.99it/s]
Training 1/1 epoch (loss 1.6839): 62%|βββββββ | 778/1250 [04:27<02:30, 3.13it/s]
Training 1/1 epoch (loss 1.5379): 62%|βββββββ | 778/1250 [04:27<02:30, 3.13it/s]
Training 1/1 epoch (loss 1.5379): 62%|βββββββ | 779/1250 [04:27<02:28, 3.17it/s]
Training 1/1 epoch (loss 1.5217): 62%|βββββββ | 779/1250 [04:27<02:28, 3.17it/s]
Training 1/1 epoch (loss 1.5217): 62%|βββββββ | 780/1250 [04:27<02:26, 3.20it/s]
Training 1/1 epoch (loss 1.6488): 62%|βββββββ | 780/1250 [04:28<02:26, 3.20it/s]
Training 1/1 epoch (loss 1.6488): 62%|βββββββ | 781/1250 [04:28<02:25, 3.22it/s]
Training 1/1 epoch (loss 1.4970): 62%|βββββββ | 781/1250 [04:28<02:25, 3.22it/s]
Training 1/1 epoch (loss 1.4970): 63%|βββββββ | 782/1250 [04:28<02:29, 3.13it/s]
Training 1/1 epoch (loss 1.5148): 63%|βββββββ | 782/1250 [04:28<02:29, 3.13it/s]
Training 1/1 epoch (loss 1.5148): 63%|βββββββ | 783/1250 [04:28<02:28, 3.15it/s]
Training 1/1 epoch (loss 1.6265): 63%|βββββββ | 783/1250 [04:29<02:28, 3.15it/s]
Training 1/1 epoch (loss 1.6265): 63%|βββββββ | 784/1250 [04:29<02:31, 3.08it/s]
Training 1/1 epoch (loss 1.5748): 63%|βββββββ | 784/1250 [04:29<02:31, 3.08it/s]
Training 1/1 epoch (loss 1.5748): 63%|βββββββ | 785/1250 [04:29<02:27, 3.16it/s]
Training 1/1 epoch (loss 1.5247): 63%|βββββββ | 785/1250 [04:29<02:27, 3.16it/s]
Training 1/1 epoch (loss 1.5247): 63%|βββββββ | 786/1250 [04:29<02:41, 2.88it/s]
Training 1/1 epoch (loss 1.5718): 63%|βββββββ | 786/1250 [04:30<02:41, 2.88it/s]
Training 1/1 epoch (loss 1.5718): 63%|βββββββ | 787/1250 [04:30<02:44, 2.81it/s]
Training 1/1 epoch (loss 1.5385): 63%|βββββββ | 787/1250 [04:30<02:44, 2.81it/s]
Training 1/1 epoch (loss 1.5385): 63%|βββββββ | 788/1250 [04:30<02:42, 2.85it/s]
Training 1/1 epoch (loss 1.5869): 63%|βββββββ | 788/1250 [04:30<02:42, 2.85it/s]
Training 1/1 epoch (loss 1.5869): 63%|βββββββ | 789/1250 [04:30<02:37, 2.93it/s]
Training 1/1 epoch (loss 1.5246): 63%|βββββββ | 789/1250 [04:31<02:37, 2.93it/s]
Training 1/1 epoch (loss 1.5246): 63%|βββββββ | 790/1250 [04:31<02:33, 3.00it/s]
Training 1/1 epoch (loss 1.5363): 63%|βββββββ | 790/1250 [04:31<02:33, 3.00it/s]
Training 1/1 epoch (loss 1.5363): 63%|βββββββ | 791/1250 [04:31<02:26, 3.14it/s]
Training 1/1 epoch (loss 1.5363): 63%|βββββββ | 791/1250 [04:31<02:26, 3.14it/s]
Training 1/1 epoch (loss 1.5363): 63%|βββββββ | 792/1250 [04:31<02:26, 3.12it/s]
Training 1/1 epoch (loss 1.5538): 63%|βββββββ | 792/1250 [04:32<02:26, 3.12it/s]
Training 1/1 epoch (loss 1.5538): 63%|βββββββ | 793/1250 [04:32<02:31, 3.01it/s]
Training 1/1 epoch (loss 1.5381): 63%|βββββββ | 793/1250 [04:32<02:31, 3.01it/s]
Training 1/1 epoch (loss 1.5381): 64%|βββββββ | 794/1250 [04:32<02:33, 2.98it/s]
Training 1/1 epoch (loss 1.5610): 64%|βββββββ | 794/1250 [04:32<02:33, 2.98it/s]
Training 1/1 epoch (loss 1.5610): 64%|βββββββ | 795/1250 [04:32<02:30, 3.02it/s]
Training 1/1 epoch (loss 1.5536): 64%|βββββββ | 795/1250 [04:33<02:30, 3.02it/s]
Training 1/1 epoch (loss 1.5536): 64%|βββββββ | 796/1250 [04:33<02:28, 3.06it/s]
Training 1/1 epoch (loss 1.6396): 64%|βββββββ | 796/1250 [04:33<02:28, 3.06it/s]
Training 1/1 epoch (loss 1.6396): 64%|βββββββ | 797/1250 [04:33<02:26, 3.09it/s]
Training 1/1 epoch (loss 1.6358): 64%|βββββββ | 797/1250 [04:33<02:26, 3.09it/s]
Training 1/1 epoch (loss 1.6358): 64%|βββββββ | 798/1250 [04:33<02:23, 3.14it/s]
Training 1/1 epoch (loss 1.5081): 64%|βββββββ | 798/1250 [04:34<02:23, 3.14it/s]
Training 1/1 epoch (loss 1.5081): 64%|βββββββ | 799/1250 [04:34<02:24, 3.12it/s]
Training 1/1 epoch (loss 1.5358): 64%|βββββββ | 799/1250 [04:34<02:24, 3.12it/s]
Training 1/1 epoch (loss 1.5358): 64%|βββββββ | 800/1250 [04:34<02:27, 3.04it/s]
Training 1/1 epoch (loss 1.6399): 64%|βββββββ | 800/1250 [04:34<02:27, 3.04it/s]
Training 1/1 epoch (loss 1.6399): 64%|βββββββ | 801/1250 [04:34<02:31, 2.96it/s]
Training 1/1 epoch (loss 1.5044): 64%|βββββββ | 801/1250 [04:35<02:31, 2.96it/s]
Training 1/1 epoch (loss 1.5044): 64%|βββββββ | 802/1250 [04:35<02:33, 2.91it/s]
Training 1/1 epoch (loss 1.6364): 64%|βββββββ | 802/1250 [04:35<02:33, 2.91it/s]
Training 1/1 epoch (loss 1.6364): 64%|βββββββ | 803/1250 [04:35<02:29, 3.00it/s]
Training 1/1 epoch (loss 1.6202): 64%|βββββββ | 803/1250 [04:35<02:29, 3.00it/s]
Training 1/1 epoch (loss 1.6202): 64%|βββββββ | 804/1250 [04:35<02:23, 3.11it/s]
Training 1/1 epoch (loss 1.5521): 64%|βββββββ | 804/1250 [04:35<02:23, 3.11it/s]
Training 1/1 epoch (loss 1.5521): 64%|βββββββ | 805/1250 [04:35<02:19, 3.18it/s]
Training 1/1 epoch (loss 1.5733): 64%|βββββββ | 805/1250 [04:36<02:19, 3.18it/s]
Training 1/1 epoch (loss 1.5733): 64%|βββββββ | 806/1250 [04:36<02:19, 3.19it/s]
Training 1/1 epoch (loss 1.4210): 64%|βββββββ | 806/1250 [04:36<02:19, 3.19it/s]
Training 1/1 epoch (loss 1.4210): 65%|βββββββ | 807/1250 [04:36<02:21, 3.14it/s]
Training 1/1 epoch (loss 1.4747): 65%|βββββββ | 807/1250 [04:36<02:21, 3.14it/s]
Training 1/1 epoch (loss 1.4747): 65%|βββββββ | 808/1250 [04:36<02:25, 3.04it/s]
Training 1/1 epoch (loss 1.4363): 65%|βββββββ | 808/1250 [04:37<02:25, 3.04it/s]
Training 1/1 epoch (loss 1.4363): 65%|βββββββ | 809/1250 [04:37<02:21, 3.12it/s]
Training 1/1 epoch (loss 1.5225): 65%|βββββββ | 809/1250 [04:37<02:21, 3.12it/s]
Training 1/1 epoch (loss 1.5225): 65%|βββββββ | 810/1250 [04:37<02:17, 3.19it/s]
Training 1/1 epoch (loss 1.4501): 65%|βββββββ | 810/1250 [04:37<02:17, 3.19it/s]
Training 1/1 epoch (loss 1.4501): 65%|βββββββ | 811/1250 [04:37<02:15, 3.25it/s]
Training 1/1 epoch (loss 1.5825): 65%|βββββββ | 811/1250 [04:38<02:15, 3.25it/s]
Training 1/1 epoch (loss 1.5825): 65%|βββββββ | 812/1250 [04:38<02:15, 3.23it/s]
Training 1/1 epoch (loss 1.6420): 65%|βββββββ | 812/1250 [04:38<02:15, 3.23it/s]
Training 1/1 epoch (loss 1.6420): 65%|βββββββ | 813/1250 [04:38<02:19, 3.14it/s]
Training 1/1 epoch (loss 1.7040): 65%|βββββββ | 813/1250 [04:38<02:19, 3.14it/s]
Training 1/1 epoch (loss 1.7040): 65%|βββββββ | 814/1250 [04:38<02:21, 3.07it/s]
Training 1/1 epoch (loss 1.5806): 65%|βββββββ | 814/1250 [04:39<02:21, 3.07it/s]
Training 1/1 epoch (loss 1.5806): 65%|βββββββ | 815/1250 [04:39<02:15, 3.22it/s]
Training 1/1 epoch (loss 1.4987): 65%|βββββββ | 815/1250 [04:39<02:15, 3.22it/s]
Training 1/1 epoch (loss 1.4987): 65%|βββββββ | 816/1250 [04:39<02:17, 3.16it/s]
Training 1/1 epoch (loss 1.5076): 65%|βββββββ | 816/1250 [04:39<02:17, 3.16it/s]
Training 1/1 epoch (loss 1.5076): 65%|βββββββ | 817/1250 [04:39<02:16, 3.18it/s]
Training 1/1 epoch (loss 1.5428): 65%|βββββββ | 817/1250 [04:40<02:16, 3.18it/s]
Training 1/1 epoch (loss 1.5428): 65%|βββββββ | 818/1250 [04:40<02:12, 3.26it/s]
Training 1/1 epoch (loss 1.4815): 65%|βββββββ | 818/1250 [04:40<02:12, 3.26it/s]
Training 1/1 epoch (loss 1.4815): 66%|βββββββ | 819/1250 [04:40<02:16, 3.16it/s]
Training 1/1 epoch (loss 1.5303): 66%|βββββββ | 819/1250 [04:40<02:16, 3.16it/s]
Training 1/1 epoch (loss 1.5303): 66%|βββββββ | 820/1250 [04:40<02:17, 3.13it/s]
Training 1/1 epoch (loss 1.4957): 66%|βββββββ | 820/1250 [04:41<02:17, 3.13it/s]
Training 1/1 epoch (loss 1.4957): 66%|βββββββ | 821/1250 [04:41<02:15, 3.17it/s]
Training 1/1 epoch (loss 1.5775): 66%|βββββββ | 821/1250 [04:41<02:15, 3.17it/s]
Training 1/1 epoch (loss 1.5775): 66%|βββββββ | 822/1250 [04:41<02:11, 3.26it/s]
Training 1/1 epoch (loss 1.4156): 66%|βββββββ | 822/1250 [04:41<02:11, 3.26it/s]
Training 1/1 epoch (loss 1.4156): 66%|βββββββ | 823/1250 [04:41<02:10, 3.27it/s]
Training 1/1 epoch (loss 1.5131): 66%|βββββββ | 823/1250 [04:41<02:10, 3.27it/s]
Training 1/1 epoch (loss 1.5131): 66%|βββββββ | 824/1250 [04:41<02:12, 3.22it/s]
Training 1/1 epoch (loss 1.5455): 66%|βββββββ | 824/1250 [04:42<02:12, 3.22it/s]
Training 1/1 epoch (loss 1.5455): 66%|βββββββ | 825/1250 [04:42<02:26, 2.90it/s]
Training 1/1 epoch (loss 1.5988): 66%|βββββββ | 825/1250 [04:42<02:26, 2.90it/s]
Training 1/1 epoch (loss 1.5988): 66%|βββββββ | 826/1250 [04:42<02:22, 2.97it/s]
Training 1/1 epoch (loss 1.4333): 66%|βββββββ | 826/1250 [04:43<02:22, 2.97it/s]
Training 1/1 epoch (loss 1.4333): 66%|βββββββ | 827/1250 [04:43<02:23, 2.94it/s]
Training 1/1 epoch (loss 1.4552): 66%|βββββββ | 827/1250 [04:43<02:23, 2.94it/s]
Training 1/1 epoch (loss 1.4552): 66%|βββββββ | 828/1250 [04:43<02:19, 3.03it/s]
Training 1/1 epoch (loss 1.5200): 66%|βββββββ | 828/1250 [04:43<02:19, 3.03it/s]
Training 1/1 epoch (loss 1.5200): 66%|βββββββ | 829/1250 [04:43<02:15, 3.11it/s]
Training 1/1 epoch (loss 1.4884): 66%|βββββββ | 829/1250 [04:43<02:15, 3.11it/s]
Training 1/1 epoch (loss 1.4884): 66%|βββββββ | 830/1250 [04:43<02:12, 3.17it/s]
Training 1/1 epoch (loss 1.4938): 66%|βββββββ | 830/1250 [04:44<02:12, 3.17it/s]
Training 1/1 epoch (loss 1.4938): 66%|βββββββ | 831/1250 [04:44<02:11, 3.19it/s]
Training 1/1 epoch (loss 1.4009): 66%|βββββββ | 831/1250 [04:44<02:11, 3.19it/s]
Training 1/1 epoch (loss 1.4009): 67%|βββββββ | 832/1250 [04:44<02:14, 3.12it/s]
Training 1/1 epoch (loss 1.6065): 67%|βββββββ | 832/1250 [04:44<02:14, 3.12it/s]
Training 1/1 epoch (loss 1.6065): 67%|βββββββ | 833/1250 [04:44<02:21, 2.95it/s]
Training 1/1 epoch (loss 1.5781): 67%|βββββββ | 833/1250 [04:45<02:21, 2.95it/s]
Training 1/1 epoch (loss 1.5781): 67%|βββββββ | 834/1250 [04:45<02:16, 3.05it/s]
Training 1/1 epoch (loss 1.5358): 67%|βββββββ | 834/1250 [04:45<02:16, 3.05it/s]
Training 1/1 epoch (loss 1.5358): 67%|βββββββ | 835/1250 [04:45<02:20, 2.95it/s]
Training 1/1 epoch (loss 1.6046): 67%|βββββββ | 835/1250 [04:45<02:20, 2.95it/s]
Training 1/1 epoch (loss 1.6046): 67%|βββββββ | 836/1250 [04:45<02:16, 3.04it/s]
Training 1/1 epoch (loss 1.4452): 67%|βββββββ | 836/1250 [04:46<02:16, 3.04it/s]
Training 1/1 epoch (loss 1.4452): 67%|βββββββ | 837/1250 [04:46<02:11, 3.14it/s]
Training 1/1 epoch (loss 1.3903): 67%|βββββββ | 837/1250 [04:46<02:11, 3.14it/s]
Training 1/1 epoch (loss 1.3903): 67%|βββββββ | 838/1250 [04:46<02:15, 3.05it/s]
Training 1/1 epoch (loss 1.5863): 67%|βββββββ | 838/1250 [04:46<02:15, 3.05it/s]
Training 1/1 epoch (loss 1.5863): 67%|βββββββ | 839/1250 [04:46<02:13, 3.08it/s]
Training 1/1 epoch (loss 1.4333): 67%|βββββββ | 839/1250 [04:47<02:13, 3.08it/s]
Training 1/1 epoch (loss 1.4333): 67%|βββββββ | 840/1250 [04:47<02:11, 3.11it/s]
Training 1/1 epoch (loss 1.5048): 67%|βββββββ | 840/1250 [04:47<02:11, 3.11it/s]
Training 1/1 epoch (loss 1.5048): 67%|βββββββ | 841/1250 [04:47<02:09, 3.16it/s]
Training 1/1 epoch (loss 1.5996): 67%|βββββββ | 841/1250 [04:47<02:09, 3.16it/s]
Training 1/1 epoch (loss 1.5996): 67%|βββββββ | 842/1250 [04:47<02:09, 3.15it/s]
Training 1/1 epoch (loss 1.6154): 67%|βββββββ | 842/1250 [04:48<02:09, 3.15it/s]
Training 1/1 epoch (loss 1.6154): 67%|βββββββ | 843/1250 [04:48<02:08, 3.17it/s]
Training 1/1 epoch (loss 1.6039): 67%|βββββββ | 843/1250 [04:48<02:08, 3.17it/s]
Training 1/1 epoch (loss 1.6039): 68%|βββββββ | 844/1250 [04:48<02:07, 3.18it/s]
Training 1/1 epoch (loss 1.4767): 68%|βββββββ | 844/1250 [04:48<02:07, 3.18it/s]
Training 1/1 epoch (loss 1.4767): 68%|βββββββ | 845/1250 [04:48<02:08, 3.14it/s]
Training 1/1 epoch (loss 1.7192): 68%|βββββββ | 845/1250 [04:49<02:08, 3.14it/s]
Training 1/1 epoch (loss 1.7192): 68%|βββββββ | 846/1250 [04:49<02:07, 3.18it/s]
Training 1/1 epoch (loss 1.5316): 68%|βββββββ | 846/1250 [04:49<02:07, 3.18it/s]
Training 1/1 epoch (loss 1.5316): 68%|βββββββ | 847/1250 [04:49<02:04, 3.23it/s]
Training 1/1 epoch (loss 1.6052): 68%|βββββββ | 847/1250 [04:49<02:04, 3.23it/s]
Training 1/1 epoch (loss 1.6052): 68%|βββββββ | 848/1250 [04:49<02:08, 3.14it/s]
Training 1/1 epoch (loss 1.5869): 68%|βββββββ | 848/1250 [04:50<02:08, 3.14it/s]
Training 1/1 epoch (loss 1.5869): 68%|βββββββ | 849/1250 [04:50<02:08, 3.12it/s]
Training 1/1 epoch (loss 1.4886): 68%|βββββββ | 849/1250 [04:50<02:08, 3.12it/s]
Training 1/1 epoch (loss 1.4886): 68%|βββββββ | 850/1250 [04:50<02:07, 3.15it/s]
Training 1/1 epoch (loss 1.5364): 68%|βββββββ | 850/1250 [04:50<02:07, 3.15it/s]
Training 1/1 epoch (loss 1.5364): 68%|βββββββ | 851/1250 [04:50<02:12, 3.00it/s]
Training 1/1 epoch (loss 1.6339): 68%|βββββββ | 851/1250 [04:51<02:12, 3.00it/s]
Training 1/1 epoch (loss 1.6339): 68%|βββββββ | 852/1250 [04:51<02:09, 3.07it/s]
Training 1/1 epoch (loss 1.4726): 68%|βββββββ | 852/1250 [04:51<02:09, 3.07it/s]
Training 1/1 epoch (loss 1.4726): 68%|βββββββ | 853/1250 [04:51<02:07, 3.12it/s]
Training 1/1 epoch (loss 1.5097): 68%|βββββββ | 853/1250 [04:51<02:07, 3.12it/s]
Training 1/1 epoch (loss 1.5097): 68%|βββββββ | 854/1250 [04:51<02:04, 3.18it/s]
Training 1/1 epoch (loss 1.5988): 68%|βββββββ | 854/1250 [04:52<02:04, 3.18it/s]
Training 1/1 epoch (loss 1.5988): 68%|βββββββ | 855/1250 [04:52<02:09, 3.06it/s]
Training 1/1 epoch (loss 1.6620): 68%|βββββββ | 855/1250 [04:52<02:09, 3.06it/s]
Training 1/1 epoch (loss 1.6620): 68%|βββββββ | 856/1250 [04:52<02:14, 2.93it/s]
Training 1/1 epoch (loss 1.5514): 68%|βββββββ | 856/1250 [04:52<02:14, 2.93it/s]
Training 1/1 epoch (loss 1.5514): 69%|βββββββ | 857/1250 [04:52<02:15, 2.90it/s]
Training 1/1 epoch (loss 1.4706): 69%|βββββββ | 857/1250 [04:53<02:15, 2.90it/s]
Training 1/1 epoch (loss 1.4706): 69%|βββββββ | 858/1250 [04:53<02:12, 2.95it/s]
Training 1/1 epoch (loss 1.5011): 69%|βββββββ | 858/1250 [04:53<02:12, 2.95it/s]
Training 1/1 epoch (loss 1.5011): 69%|βββββββ | 859/1250 [04:53<02:09, 3.03it/s]
Training 1/1 epoch (loss 1.6174): 69%|βββββββ | 859/1250 [04:53<02:09, 3.03it/s]
Training 1/1 epoch (loss 1.6174): 69%|βββββββ | 860/1250 [04:53<02:05, 3.12it/s]
Training 1/1 epoch (loss 1.5749): 69%|βββββββ | 860/1250 [04:53<02:05, 3.12it/s]
Training 1/1 epoch (loss 1.5749): 69%|βββββββ | 861/1250 [04:53<02:03, 3.16it/s]
Training 1/1 epoch (loss 1.5075): 69%|βββββββ | 861/1250 [04:54<02:03, 3.16it/s]
Training 1/1 epoch (loss 1.5075): 69%|βββββββ | 862/1250 [04:54<02:04, 3.11it/s]
Training 1/1 epoch (loss 1.5597): 69%|βββββββ | 862/1250 [04:54<02:04, 3.11it/s]
Training 1/1 epoch (loss 1.5597): 69%|βββββββ | 863/1250 [04:54<02:03, 3.13it/s]
Training 1/1 epoch (loss 1.5604): 69%|βββββββ | 863/1250 [04:54<02:03, 3.13it/s]
Training 1/1 epoch (loss 1.5604): 69%|βββββββ | 864/1250 [04:54<02:06, 3.06it/s]
Training 1/1 epoch (loss 1.3939): 69%|βββββββ | 864/1250 [04:55<02:06, 3.06it/s]
Training 1/1 epoch (loss 1.3939): 69%|βββββββ | 865/1250 [04:55<02:04, 3.10it/s]
Training 1/1 epoch (loss 1.4725): 69%|βββββββ | 865/1250 [04:55<02:04, 3.10it/s]
Training 1/1 epoch (loss 1.4725): 69%|βββββββ | 866/1250 [04:55<02:00, 3.18it/s]
Training 1/1 epoch (loss 1.5243): 69%|βββββββ | 866/1250 [04:55<02:00, 3.18it/s]
Training 1/1 epoch (loss 1.5243): 69%|βββββββ | 867/1250 [04:55<01:59, 3.22it/s]
Training 1/1 epoch (loss 1.6171): 69%|βββββββ | 867/1250 [04:56<01:59, 3.22it/s]
Training 1/1 epoch (loss 1.6171): 69%|βββββββ | 868/1250 [04:56<01:58, 3.23it/s]
Training 1/1 epoch (loss 1.6152): 69%|βββββββ | 868/1250 [04:56<01:58, 3.23it/s]
Training 1/1 epoch (loss 1.6152): 70%|βββββββ | 869/1250 [04:56<01:58, 3.22it/s]
Training 1/1 epoch (loss 1.6057): 70%|βββββββ | 869/1250 [04:56<01:58, 3.22it/s]
Training 1/1 epoch (loss 1.6057): 70%|βββββββ | 870/1250 [04:56<02:00, 3.15it/s]
Training 1/1 epoch (loss 1.4954): 70%|βββββββ | 870/1250 [04:57<02:00, 3.15it/s]
Training 1/1 epoch (loss 1.4954): 70%|βββββββ | 871/1250 [04:57<01:58, 3.19it/s]
Training 1/1 epoch (loss 1.4014): 70%|βββββββ | 871/1250 [04:57<01:58, 3.19it/s]
Training 1/1 epoch (loss 1.4014): 70%|βββββββ | 872/1250 [04:57<01:59, 3.16it/s]
Training 1/1 epoch (loss 1.5711): 70%|βββββββ | 872/1250 [04:57<01:59, 3.16it/s]
Training 1/1 epoch (loss 1.5711): 70%|βββββββ | 873/1250 [04:57<01:58, 3.17it/s]
Training 1/1 epoch (loss 1.6424): 70%|βββββββ | 873/1250 [04:58<01:58, 3.17it/s]
Training 1/1 epoch (loss 1.6424): 70%|βββββββ | 874/1250 [04:58<01:58, 3.17it/s]
Training 1/1 epoch (loss 1.4240): 70%|βββββββ | 874/1250 [04:58<01:58, 3.17it/s]
Training 1/1 epoch (loss 1.4240): 70%|βββββββ | 875/1250 [04:58<01:56, 3.22it/s]
Training 1/1 epoch (loss 1.5732): 70%|βββββββ | 875/1250 [04:58<01:56, 3.22it/s]
Training 1/1 epoch (loss 1.5732): 70%|βββββββ | 876/1250 [04:58<01:56, 3.21it/s]
Training 1/1 epoch (loss 1.6116): 70%|βββββββ | 876/1250 [04:59<01:56, 3.21it/s]
Training 1/1 epoch (loss 1.6116): 70%|βββββββ | 877/1250 [04:59<01:58, 3.14it/s]
Training 1/1 epoch (loss 1.4791): 70%|βββββββ | 877/1250 [04:59<01:58, 3.14it/s]
Training 1/1 epoch (loss 1.4791): 70%|βββββββ | 878/1250 [04:59<01:56, 3.21it/s]
Training 1/1 epoch (loss 1.5430): 70%|βββββββ | 878/1250 [04:59<01:56, 3.21it/s]
Training 1/1 epoch (loss 1.5430): 70%|βββββββ | 879/1250 [04:59<01:54, 3.24it/s]
Training 1/1 epoch (loss 1.5690): 70%|βββββββ | 879/1250 [05:00<01:54, 3.24it/s]
Training 1/1 epoch (loss 1.5690): 70%|βββββββ | 880/1250 [05:00<02:12, 2.80it/s]
Training 1/1 epoch (loss 1.4871): 70%|βββββββ | 880/1250 [05:00<02:12, 2.80it/s]
Training 1/1 epoch (loss 1.4871): 70%|βββββββ | 881/1250 [05:00<02:06, 2.92it/s]
Training 1/1 epoch (loss 1.4610): 70%|βββββββ | 881/1250 [05:00<02:06, 2.92it/s]
Training 1/1 epoch (loss 1.4610): 71%|βββββββ | 882/1250 [05:00<02:06, 2.92it/s]
Training 1/1 epoch (loss 1.4793): 71%|βββββββ | 882/1250 [05:01<02:06, 2.92it/s]
Training 1/1 epoch (loss 1.4793): 71%|βββββββ | 883/1250 [05:01<02:01, 3.01it/s]
Training 1/1 epoch (loss 1.5542): 71%|βββββββ | 883/1250 [05:01<02:01, 3.01it/s]
Training 1/1 epoch (loss 1.5542): 71%|βββββββ | 884/1250 [05:01<02:02, 3.00it/s]
Training 1/1 epoch (loss 1.4983): 71%|βββββββ | 884/1250 [05:01<02:02, 3.00it/s]
Training 1/1 epoch (loss 1.4983): 71%|βββββββ | 885/1250 [05:01<01:58, 3.07it/s]
Training 1/1 epoch (loss 1.5561): 71%|βββββββ | 885/1250 [05:02<01:58, 3.07it/s]
Training 1/1 epoch (loss 1.5561): 71%|βββββββ | 886/1250 [05:02<01:58, 3.08it/s]
Training 1/1 epoch (loss 1.4674): 71%|βββββββ | 886/1250 [05:02<01:58, 3.08it/s]
Training 1/1 epoch (loss 1.4674): 71%|βββββββ | 887/1250 [05:02<01:56, 3.11it/s]
Training 1/1 epoch (loss 1.4388): 71%|βββββββ | 887/1250 [05:02<01:56, 3.11it/s]
Training 1/1 epoch (loss 1.4388): 71%|βββββββ | 888/1250 [05:02<02:03, 2.94it/s]
Training 1/1 epoch (loss 1.5322): 71%|βββββββ | 888/1250 [05:03<02:03, 2.94it/s]
Training 1/1 epoch (loss 1.5322): 71%|βββββββ | 889/1250 [05:03<02:01, 2.97it/s]
Training 1/1 epoch (loss 1.6113): 71%|βββββββ | 889/1250 [05:03<02:01, 2.97it/s]
Training 1/1 epoch (loss 1.6113): 71%|βββββββ | 890/1250 [05:03<01:59, 3.02it/s]
Training 1/1 epoch (loss 1.4490): 71%|βββββββ | 890/1250 [05:03<01:59, 3.02it/s]
Training 1/1 epoch (loss 1.4490): 71%|ββββββββ | 891/1250 [05:03<01:55, 3.10it/s]
Training 1/1 epoch (loss 1.5376): 71%|ββββββββ | 891/1250 [05:03<01:55, 3.10it/s]
Training 1/1 epoch (loss 1.5376): 71%|ββββββββ | 892/1250 [05:03<01:52, 3.17it/s]
Training 1/1 epoch (loss 1.4771): 71%|ββββββββ | 892/1250 [05:04<01:52, 3.17it/s]
Training 1/1 epoch (loss 1.4771): 71%|ββββββββ | 893/1250 [05:04<01:54, 3.12it/s]
Training 1/1 epoch (loss 1.3834): 71%|ββββββββ | 893/1250 [05:04<01:54, 3.12it/s]
Training 1/1 epoch (loss 1.3834): 72%|ββββββββ | 894/1250 [05:04<01:54, 3.12it/s]
Training 1/1 epoch (loss 1.5614): 72%|ββββββββ | 894/1250 [05:04<01:54, 3.12it/s]
Training 1/1 epoch (loss 1.5614): 72%|ββββββββ | 895/1250 [05:04<01:55, 3.08it/s]
Training 1/1 epoch (loss 1.6370): 72%|ββββββββ | 895/1250 [05:05<01:55, 3.08it/s]
Training 1/1 epoch (loss 1.6370): 72%|ββββββββ | 896/1250 [05:05<01:55, 3.07it/s]
Training 1/1 epoch (loss 1.5142): 72%|ββββββββ | 896/1250 [05:05<01:55, 3.07it/s]
Training 1/1 epoch (loss 1.5142): 72%|ββββββββ | 897/1250 [05:05<01:52, 3.14it/s]
Training 1/1 epoch (loss 1.5525): 72%|ββββββββ | 897/1250 [05:05<01:52, 3.14it/s]
Training 1/1 epoch (loss 1.5525): 72%|ββββββββ | 898/1250 [05:05<01:51, 3.15it/s]
Training 1/1 epoch (loss 1.5946): 72%|ββββββββ | 898/1250 [05:06<01:51, 3.15it/s]
Training 1/1 epoch (loss 1.5946): 72%|ββββββββ | 899/1250 [05:06<01:48, 3.23it/s]
Training 1/1 epoch (loss 1.6162): 72%|ββββββββ | 899/1250 [05:06<01:48, 3.23it/s]
Training 1/1 epoch (loss 1.6162): 72%|ββββββββ | 900/1250 [05:06<01:54, 3.05it/s]
Training 1/1 epoch (loss 1.4959): 72%|ββββββββ | 900/1250 [05:06<01:54, 3.05it/s]
Training 1/1 epoch (loss 1.4959): 72%|ββββββββ | 901/1250 [05:06<01:56, 3.00it/s]
Training 1/1 epoch (loss 1.5590): 72%|ββββββββ | 901/1250 [05:07<01:56, 3.00it/s]
Training 1/1 epoch (loss 1.5590): 72%|ββββββββ | 902/1250 [05:07<01:52, 3.10it/s]
Training 1/1 epoch (loss 1.5183): 72%|ββββββββ | 902/1250 [05:07<01:52, 3.10it/s]
Training 1/1 epoch (loss 1.5183): 72%|ββββββββ | 903/1250 [05:07<01:50, 3.15it/s]
Training 1/1 epoch (loss 1.5360): 72%|ββββββββ | 903/1250 [05:07<01:50, 3.15it/s]
Training 1/1 epoch (loss 1.5360): 72%|ββββββββ | 904/1250 [05:07<01:49, 3.16it/s]
Training 1/1 epoch (loss 1.5303): 72%|ββββββββ | 904/1250 [05:08<01:49, 3.16it/s]
Training 1/1 epoch (loss 1.5303): 72%|ββββββββ | 905/1250 [05:08<01:50, 3.13it/s]
Training 1/1 epoch (loss 1.4621): 72%|ββββββββ | 905/1250 [05:08<01:50, 3.13it/s]
Training 1/1 epoch (loss 1.4621): 72%|ββββββββ | 906/1250 [05:08<01:51, 3.08it/s]
Training 1/1 epoch (loss 1.5248): 72%|ββββββββ | 906/1250 [05:08<01:51, 3.08it/s]
Training 1/1 epoch (loss 1.5248): 73%|ββββββββ | 907/1250 [05:08<01:50, 3.09it/s]
Training 1/1 epoch (loss 1.6044): 73%|ββββββββ | 907/1250 [05:09<01:50, 3.09it/s]
Training 1/1 epoch (loss 1.6044): 73%|ββββββββ | 908/1250 [05:09<01:51, 3.06it/s]
Training 1/1 epoch (loss 1.5584): 73%|ββββββββ | 908/1250 [05:09<01:51, 3.06it/s]
Training 1/1 epoch (loss 1.5584): 73%|ββββββββ | 909/1250 [05:09<01:48, 3.13it/s]
Training 1/1 epoch (loss 1.5289): 73%|ββββββββ | 909/1250 [05:09<01:48, 3.13it/s]
Training 1/1 epoch (loss 1.5289): 73%|ββββββββ | 910/1250 [05:09<01:46, 3.18it/s]
Training 1/1 epoch (loss 1.5895): 73%|ββββββββ | 910/1250 [05:10<01:46, 3.18it/s]
Training 1/1 epoch (loss 1.5895): 73%|ββββββββ | 911/1250 [05:10<01:50, 3.08it/s]
Training 1/1 epoch (loss 1.5291): 73%|ββββββββ | 911/1250 [05:10<01:50, 3.08it/s]
Training 1/1 epoch (loss 1.5291): 73%|ββββββββ | 912/1250 [05:10<01:55, 2.93it/s]
Training 1/1 epoch (loss 1.5648): 73%|ββββββββ | 912/1250 [05:10<01:55, 2.93it/s]
Training 1/1 epoch (loss 1.5648): 73%|ββββββββ | 913/1250 [05:10<02:02, 2.75it/s]
Training 1/1 epoch (loss 1.5342): 73%|ββββββββ | 913/1250 [05:11<02:02, 2.75it/s]
Training 1/1 epoch (loss 1.5342): 73%|ββββββββ | 914/1250 [05:11<01:55, 2.90it/s]
Training 1/1 epoch (loss 1.4782): 73%|ββββββββ | 914/1250 [05:11<01:55, 2.90it/s]
Training 1/1 epoch (loss 1.4782): 73%|ββββββββ | 915/1250 [05:11<01:52, 2.99it/s]
Training 1/1 epoch (loss 1.4819): 73%|ββββββββ | 915/1250 [05:11<01:52, 2.99it/s]
Training 1/1 epoch (loss 1.4819): 73%|ββββββββ | 916/1250 [05:11<01:57, 2.84it/s]
Training 1/1 epoch (loss 1.5668): 73%|ββββββββ | 916/1250 [05:12<01:57, 2.84it/s]
Training 1/1 epoch (loss 1.5668): 73%|ββββββββ | 917/1250 [05:12<01:54, 2.92it/s]
Training 1/1 epoch (loss 1.5875): 73%|ββββββββ | 917/1250 [05:12<01:54, 2.92it/s]
Training 1/1 epoch (loss 1.5875): 73%|ββββββββ | 918/1250 [05:12<01:50, 3.01it/s]
Training 1/1 epoch (loss 1.5592): 73%|ββββββββ | 918/1250 [05:12<01:50, 3.01it/s]
Training 1/1 epoch (loss 1.5592): 74%|ββββββββ | 919/1250 [05:12<01:56, 2.85it/s]
Training 1/1 epoch (loss 1.6008): 74%|ββββββββ | 919/1250 [05:13<01:56, 2.85it/s]
Training 1/1 epoch (loss 1.6008): 74%|ββββββββ | 920/1250 [05:13<01:54, 2.87it/s]
Training 1/1 epoch (loss 1.5358): 74%|ββββββββ | 920/1250 [05:13<01:54, 2.87it/s]
Training 1/1 epoch (loss 1.5358): 74%|ββββββββ | 921/1250 [05:13<01:50, 2.97it/s]
Training 1/1 epoch (loss 1.6106): 74%|ββββββββ | 921/1250 [05:13<01:50, 2.97it/s]
Training 1/1 epoch (loss 1.6106): 74%|ββββββββ | 922/1250 [05:13<01:48, 3.02it/s]
Training 1/1 epoch (loss 1.4685): 74%|ββββββββ | 922/1250 [05:14<01:48, 3.02it/s]
Training 1/1 epoch (loss 1.4685): 74%|ββββββββ | 923/1250 [05:14<01:45, 3.10it/s]
Training 1/1 epoch (loss 1.6671): 74%|ββββββββ | 923/1250 [05:14<01:45, 3.10it/s]
Training 1/1 epoch (loss 1.6671): 74%|ββββββββ | 924/1250 [05:14<01:43, 3.14it/s]
Training 1/1 epoch (loss 1.5343): 74%|ββββββββ | 924/1250 [05:14<01:43, 3.14it/s]
Training 1/1 epoch (loss 1.5343): 74%|ββββββββ | 925/1250 [05:14<01:46, 3.04it/s]
Training 1/1 epoch (loss 1.4507): 74%|ββββββββ | 925/1250 [05:15<01:46, 3.04it/s]
Training 1/1 epoch (loss 1.4507): 74%|ββββββββ | 926/1250 [05:15<01:43, 3.14it/s]
Training 1/1 epoch (loss 1.5125): 74%|ββββββββ | 926/1250 [05:15<01:43, 3.14it/s]
Training 1/1 epoch (loss 1.5125): 74%|ββββββββ | 927/1250 [05:15<01:42, 3.16it/s]
Training 1/1 epoch (loss 1.4777): 74%|ββββββββ | 927/1250 [05:15<01:42, 3.16it/s]
Training 1/1 epoch (loss 1.4777): 74%|ββββββββ | 928/1250 [05:15<01:43, 3.11it/s]
Training 1/1 epoch (loss 1.4645): 74%|ββββββββ | 928/1250 [05:16<01:43, 3.11it/s]
Training 1/1 epoch (loss 1.4645): 74%|ββββββββ | 929/1250 [05:16<01:43, 3.11it/s]
Training 1/1 epoch (loss 1.5281): 74%|ββββββββ | 929/1250 [05:16<01:43, 3.11it/s]
Training 1/1 epoch (loss 1.5281): 74%|ββββββββ | 930/1250 [05:16<01:41, 3.17it/s]
Training 1/1 epoch (loss 1.6200): 74%|ββββββββ | 930/1250 [05:16<01:41, 3.17it/s]
Training 1/1 epoch (loss 1.6200): 74%|ββββββββ | 931/1250 [05:16<01:40, 3.17it/s]
Training 1/1 epoch (loss 1.6711): 74%|ββββββββ | 931/1250 [05:17<01:40, 3.17it/s]
Training 1/1 epoch (loss 1.6711): 75%|ββββββββ | 932/1250 [05:17<01:44, 3.04it/s]
Training 1/1 epoch (loss 1.5022): 75%|ββββββββ | 932/1250 [05:17<01:44, 3.04it/s]
Training 1/1 epoch (loss 1.5022): 75%|ββββββββ | 933/1250 [05:17<01:42, 3.09it/s]
Training 1/1 epoch (loss 1.4835): 75%|ββββββββ | 933/1250 [05:17<01:42, 3.09it/s]
Training 1/1 epoch (loss 1.4835): 75%|ββββββββ | 934/1250 [05:17<01:41, 3.10it/s]
Training 1/1 epoch (loss 1.4880): 75%|ββββββββ | 934/1250 [05:18<01:41, 3.10it/s]
Training 1/1 epoch (loss 1.4880): 75%|ββββββββ | 935/1250 [05:18<01:40, 3.14it/s]
Training 1/1 epoch (loss 1.4890): 75%|ββββββββ | 935/1250 [05:18<01:40, 3.14it/s]
Training 1/1 epoch (loss 1.4890): 75%|ββββββββ | 936/1250 [05:18<01:41, 3.10it/s]
Training 1/1 epoch (loss 1.5871): 75%|ββββββββ | 936/1250 [05:18<01:41, 3.10it/s]
Training 1/1 epoch (loss 1.5871): 75%|ββββββββ | 937/1250 [05:18<01:43, 3.03it/s]
Training 1/1 epoch (loss 1.4271): 75%|ββββββββ | 937/1250 [05:19<01:43, 3.03it/s]
Training 1/1 epoch (loss 1.4271): 75%|ββββββββ | 938/1250 [05:19<01:40, 3.11it/s]
Training 1/1 epoch (loss 1.6059): 75%|ββββββββ | 938/1250 [05:19<01:40, 3.11it/s]
Training 1/1 epoch (loss 1.6059): 75%|ββββββββ | 939/1250 [05:19<01:40, 3.09it/s]
Training 1/1 epoch (loss 1.6016): 75%|ββββββββ | 939/1250 [05:19<01:40, 3.09it/s]
Training 1/1 epoch (loss 1.6016): 75%|ββββββββ | 940/1250 [05:19<01:37, 3.17it/s]
Training 1/1 epoch (loss 1.6302): 75%|ββββββββ | 940/1250 [05:19<01:37, 3.17it/s]
Training 1/1 epoch (loss 1.6302): 75%|ββββββββ | 941/1250 [05:19<01:36, 3.20it/s]
Training 1/1 epoch (loss 1.4554): 75%|ββββββββ | 941/1250 [05:20<01:36, 3.20it/s]
Training 1/1 epoch (loss 1.4554): 75%|ββββββββ | 942/1250 [05:20<01:38, 3.14it/s]
Training 1/1 epoch (loss 1.4011): 75%|ββββββββ | 942/1250 [05:20<01:38, 3.14it/s]
Training 1/1 epoch (loss 1.4011): 75%|ββββββββ | 943/1250 [05:20<01:38, 3.10it/s]
Training 1/1 epoch (loss 1.5023): 75%|ββββββββ | 943/1250 [05:20<01:38, 3.10it/s]
Training 1/1 epoch (loss 1.5023): 76%|ββββββββ | 944/1250 [05:20<01:40, 3.04it/s]
Training 1/1 epoch (loss 1.5837): 76%|ββββββββ | 944/1250 [05:21<01:40, 3.04it/s]
Training 1/1 epoch (loss 1.5837): 76%|ββββββββ | 945/1250 [05:21<01:38, 3.08it/s]
Training 1/1 epoch (loss 1.5377): 76%|ββββββββ | 945/1250 [05:21<01:38, 3.08it/s]
Training 1/1 epoch (loss 1.5377): 76%|ββββββββ | 946/1250 [05:21<01:36, 3.14it/s]
Training 1/1 epoch (loss 1.5812): 76%|ββββββββ | 946/1250 [05:21<01:36, 3.14it/s]
Training 1/1 epoch (loss 1.5812): 76%|ββββββββ | 947/1250 [05:21<01:33, 3.23it/s]
Training 1/1 epoch (loss 1.4544): 76%|ββββββββ | 947/1250 [05:22<01:33, 3.23it/s]
Training 1/1 epoch (loss 1.4544): 76%|ββββββββ | 948/1250 [05:22<01:37, 3.11it/s]
Training 1/1 epoch (loss 1.5271): 76%|ββββββββ | 948/1250 [05:22<01:37, 3.11it/s]
Training 1/1 epoch (loss 1.5271): 76%|ββββββββ | 949/1250 [05:22<01:39, 3.03it/s]
Training 1/1 epoch (loss 1.6679): 76%|ββββββββ | 949/1250 [05:22<01:39, 3.03it/s]
Training 1/1 epoch (loss 1.6679): 76%|ββββββββ | 950/1250 [05:22<01:38, 3.04it/s]
Training 1/1 epoch (loss 1.5736): 76%|ββββββββ | 950/1250 [05:23<01:38, 3.04it/s]
Training 1/1 epoch (loss 1.5736): 76%|ββββββββ | 951/1250 [05:23<01:43, 2.88it/s]
Training 1/1 epoch (loss 1.4502): 76%|ββββββββ | 951/1250 [05:23<01:43, 2.88it/s]
Training 1/1 epoch (loss 1.4502): 76%|ββββββββ | 952/1250 [05:23<01:41, 2.95it/s]
Training 1/1 epoch (loss 1.4816): 76%|ββββββββ | 952/1250 [05:23<01:41, 2.95it/s]
Training 1/1 epoch (loss 1.4816): 76%|ββββββββ | 953/1250 [05:23<01:38, 3.01it/s]
Training 1/1 epoch (loss 1.6781): 76%|ββββββββ | 953/1250 [05:24<01:38, 3.01it/s]
Training 1/1 epoch (loss 1.6781): 76%|ββββββββ | 954/1250 [05:24<01:35, 3.09it/s]
Training 1/1 epoch (loss 1.5696): 76%|ββββββββ | 954/1250 [05:24<01:35, 3.09it/s]
Training 1/1 epoch (loss 1.5696): 76%|ββββββββ | 955/1250 [05:24<01:33, 3.15it/s]
Training 1/1 epoch (loss 1.5603): 76%|ββββββββ | 955/1250 [05:24<01:33, 3.15it/s]
Training 1/1 epoch (loss 1.5603): 76%|ββββββββ | 956/1250 [05:24<01:33, 3.13it/s]
Training 1/1 epoch (loss 1.5149): 76%|ββββββββ | 956/1250 [05:25<01:33, 3.13it/s]
Training 1/1 epoch (loss 1.5149): 77%|ββββββββ | 957/1250 [05:25<01:33, 3.15it/s]
Training 1/1 epoch (loss 1.4539): 77%|ββββββββ | 957/1250 [05:25<01:33, 3.15it/s]
Training 1/1 epoch (loss 1.4539): 77%|ββββββββ | 958/1250 [05:25<01:31, 3.18it/s]
Training 1/1 epoch (loss 1.4160): 77%|ββββββββ | 958/1250 [05:25<01:31, 3.18it/s]
Training 1/1 epoch (loss 1.4160): 77%|ββββββββ | 959/1250 [05:25<01:29, 3.24it/s]
Training 1/1 epoch (loss 1.5162): 77%|ββββββββ | 959/1250 [05:26<01:29, 3.24it/s]
Training 1/1 epoch (loss 1.5162): 77%|ββββββββ | 960/1250 [05:26<01:30, 3.20it/s]
Training 1/1 epoch (loss 1.5933): 77%|ββββββββ | 960/1250 [05:26<01:30, 3.20it/s]
Training 1/1 epoch (loss 1.5933): 77%|ββββββββ | 961/1250 [05:26<01:31, 3.17it/s]
Training 1/1 epoch (loss 1.6407): 77%|ββββββββ | 961/1250 [05:26<01:31, 3.17it/s]
Training 1/1 epoch (loss 1.6407): 77%|ββββββββ | 962/1250 [05:26<01:31, 3.16it/s]
Training 1/1 epoch (loss 1.4632): 77%|ββββββββ | 962/1250 [05:27<01:31, 3.16it/s]
Training 1/1 epoch (loss 1.4632): 77%|ββββββββ | 963/1250 [05:27<01:30, 3.16it/s]
Training 1/1 epoch (loss 1.5920): 77%|ββββββββ | 963/1250 [05:27<01:30, 3.16it/s]
Training 1/1 epoch (loss 1.5920): 77%|ββββββββ | 964/1250 [05:27<01:29, 3.19it/s]
Training 1/1 epoch (loss 1.5674): 77%|ββββββββ | 964/1250 [05:27<01:29, 3.19it/s]
Training 1/1 epoch (loss 1.5674): 77%|ββββββββ | 965/1250 [05:27<01:32, 3.10it/s]
Training 1/1 epoch (loss 1.4878): 77%|ββββββββ | 965/1250 [05:28<01:32, 3.10it/s]
Training 1/1 epoch (loss 1.4878): 77%|ββββββββ | 966/1250 [05:28<01:30, 3.12it/s]
Training 1/1 epoch (loss 1.5906): 77%|ββββββββ | 966/1250 [05:28<01:30, 3.12it/s]
Training 1/1 epoch (loss 1.5906): 77%|ββββββββ | 967/1250 [05:28<01:32, 3.07it/s]
Training 1/1 epoch (loss 1.5782): 77%|ββββββββ | 967/1250 [05:28<01:32, 3.07it/s]
Training 1/1 epoch (loss 1.5782): 77%|ββββββββ | 968/1250 [05:28<01:32, 3.04it/s]
Training 1/1 epoch (loss 1.4705): 77%|ββββββββ | 968/1250 [05:29<01:32, 3.04it/s]
Training 1/1 epoch (loss 1.4705): 78%|ββββββββ | 969/1250 [05:29<01:31, 3.08it/s]
Training 1/1 epoch (loss 1.5298): 78%|ββββββββ | 969/1250 [05:29<01:31, 3.08it/s]
Training 1/1 epoch (loss 1.5298): 78%|ββββββββ | 970/1250 [05:29<01:29, 3.12it/s]
Training 1/1 epoch (loss 1.5548): 78%|ββββββββ | 970/1250 [05:29<01:29, 3.12it/s]
Training 1/1 epoch (loss 1.5548): 78%|ββββββββ | 971/1250 [05:29<01:28, 3.15it/s]
Training 1/1 epoch (loss 1.4463): 78%|ββββββββ | 971/1250 [05:29<01:28, 3.15it/s]
Training 1/1 epoch (loss 1.4463): 78%|ββββββββ | 972/1250 [05:29<01:29, 3.11it/s]
Training 1/1 epoch (loss 1.5256): 78%|ββββββββ | 972/1250 [05:30<01:29, 3.11it/s]
Training 1/1 epoch (loss 1.5256): 78%|ββββββββ | 973/1250 [05:30<01:35, 2.89it/s]
Training 1/1 epoch (loss 1.4914): 78%|ββββββββ | 973/1250 [05:30<01:35, 2.89it/s]
Training 1/1 epoch (loss 1.4914): 78%|ββββββββ | 974/1250 [05:30<01:34, 2.93it/s]
Training 1/1 epoch (loss 1.4422): 78%|ββββββββ | 974/1250 [05:31<01:34, 2.93it/s]
Training 1/1 epoch (loss 1.4422): 78%|ββββββββ | 975/1250 [05:31<01:32, 2.98it/s]
Training 1/1 epoch (loss 1.4502): 78%|ββββββββ | 975/1250 [05:31<01:32, 2.98it/s]
Training 1/1 epoch (loss 1.4502): 78%|ββββββββ | 976/1250 [05:31<01:29, 3.07it/s]
Training 1/1 epoch (loss 1.5569): 78%|ββββββββ | 976/1250 [05:31<01:29, 3.07it/s]
Training 1/1 epoch (loss 1.5569): 78%|ββββββββ | 977/1250 [05:31<01:28, 3.10it/s]
Training 1/1 epoch (loss 1.5337): 78%|ββββββββ | 977/1250 [05:31<01:28, 3.10it/s]
Training 1/1 epoch (loss 1.5337): 78%|ββββββββ | 978/1250 [05:31<01:26, 3.13it/s]
Training 1/1 epoch (loss 1.4641): 78%|ββββββββ | 978/1250 [05:32<01:26, 3.13it/s]
Training 1/1 epoch (loss 1.4641): 78%|ββββββββ | 979/1250 [05:32<01:24, 3.21it/s]
Training 1/1 epoch (loss 1.6305): 78%|ββββββββ | 979/1250 [05:32<01:24, 3.21it/s]
Training 1/1 epoch (loss 1.6305): 78%|ββββββββ | 980/1250 [05:32<01:24, 3.20it/s]
Training 1/1 epoch (loss 1.5961): 78%|ββββββββ | 980/1250 [05:32<01:24, 3.20it/s]
Training 1/1 epoch (loss 1.5961): 78%|ββββββββ | 981/1250 [05:32<01:27, 3.06it/s]
Training 1/1 epoch (loss 1.5834): 78%|ββββββββ | 981/1250 [05:33<01:27, 3.06it/s]
Training 1/1 epoch (loss 1.5834): 79%|ββββββββ | 982/1250 [05:33<01:44, 2.57it/s]
Training 1/1 epoch (loss 1.4991): 79%|ββββββββ | 982/1250 [05:33<01:44, 2.57it/s]
Training 1/1 epoch (loss 1.4991): 79%|ββββββββ | 983/1250 [05:33<01:37, 2.73it/s]
Training 1/1 epoch (loss 1.5029): 79%|ββββββββ | 983/1250 [05:34<01:37, 2.73it/s]
Training 1/1 epoch (loss 1.5029): 79%|ββββββββ | 984/1250 [05:34<01:34, 2.83it/s]
Training 1/1 epoch (loss 1.6770): 79%|ββββββββ | 984/1250 [05:34<01:34, 2.83it/s]
Training 1/1 epoch (loss 1.6770): 79%|ββββββββ | 985/1250 [05:34<01:30, 2.94it/s]
Training 1/1 epoch (loss 1.4782): 79%|ββββββββ | 985/1250 [05:34<01:30, 2.94it/s]
Training 1/1 epoch (loss 1.4782): 79%|ββββββββ | 986/1250 [05:34<01:27, 3.00it/s]
Training 1/1 epoch (loss 1.5329): 79%|ββββββββ | 986/1250 [05:35<01:27, 3.00it/s]
Training 1/1 epoch (loss 1.5329): 79%|ββββββββ | 987/1250 [05:35<01:30, 2.89it/s]
Training 1/1 epoch (loss 1.4046): 79%|ββββββββ | 987/1250 [05:35<01:30, 2.89it/s]
Training 1/1 epoch (loss 1.4046): 79%|ββββββββ | 988/1250 [05:35<01:28, 2.97it/s]
Training 1/1 epoch (loss 1.4223): 79%|ββββββββ | 988/1250 [05:35<01:28, 2.97it/s]
Training 1/1 epoch (loss 1.4223): 79%|ββββββββ | 989/1250 [05:35<01:25, 3.06it/s]
Training 1/1 epoch (loss 1.5294): 79%|ββββββββ | 989/1250 [05:36<01:25, 3.06it/s]
Training 1/1 epoch (loss 1.5294): 79%|ββββββββ | 990/1250 [05:36<01:23, 3.12it/s]
Training 1/1 epoch (loss 1.3583): 79%|ββββββββ | 990/1250 [05:36<01:23, 3.12it/s]
Training 1/1 epoch (loss 1.3583): 79%|ββββββββ | 991/1250 [05:36<01:23, 3.11it/s]
Training 1/1 epoch (loss 1.5677): 79%|ββββββββ | 991/1250 [05:36<01:23, 3.11it/s]
Training 1/1 epoch (loss 1.5677): 79%|ββββββββ | 992/1250 [05:36<01:25, 3.02it/s]
Training 1/1 epoch (loss 1.5283): 79%|ββββββββ | 992/1250 [05:37<01:25, 3.02it/s]
Training 1/1 epoch (loss 1.5283): 79%|ββββββββ | 993/1250 [05:37<01:25, 3.02it/s]
Training 1/1 epoch (loss 1.5871): 79%|ββββββββ | 993/1250 [05:37<01:25, 3.02it/s]
Training 1/1 epoch (loss 1.5871): 80%|ββββββββ | 994/1250 [05:37<01:22, 3.11it/s]
Training 1/1 epoch (loss 1.5409): 80%|ββββββββ | 994/1250 [05:37<01:22, 3.11it/s]
Training 1/1 epoch (loss 1.5409): 80%|ββββββββ | 995/1250 [05:37<01:21, 3.12it/s]
Training 1/1 epoch (loss 1.5137): 80%|ββββββββ | 995/1250 [05:37<01:21, 3.12it/s]
Training 1/1 epoch (loss 1.5137): 80%|ββββββββ | 996/1250 [05:37<01:18, 3.22it/s]
Training 1/1 epoch (loss 1.4877): 80%|ββββββββ | 996/1250 [05:38<01:18, 3.22it/s]
Training 1/1 epoch (loss 1.4877): 80%|ββββββββ | 997/1250 [05:38<01:18, 3.21it/s]
Training 1/1 epoch (loss 1.5266): 80%|ββββββββ | 997/1250 [05:38<01:18, 3.21it/s]
Training 1/1 epoch (loss 1.5266): 80%|ββββββββ | 998/1250 [05:38<01:19, 3.15it/s]
Training 1/1 epoch (loss 1.5034): 80%|ββββββββ | 998/1250 [05:38<01:19, 3.15it/s]
Training 1/1 epoch (loss 1.5034): 80%|ββββββββ | 999/1250 [05:38<01:22, 3.04it/s]
Training 1/1 epoch (loss 1.4889): 80%|ββββββββ | 999/1250 [05:39<01:22, 3.04it/s]
Training 1/1 epoch (loss 1.4889): 80%|ββββββββ | 1000/1250 [05:39<01:22, 3.02it/s]
Training 1/1 epoch (loss 1.3863): 80%|ββββββββ | 1000/1250 [05:39<01:22, 3.02it/s]
Training 1/1 epoch (loss 1.3863): 80%|ββββββββ | 1001/1250 [05:39<01:21, 3.06it/s]
Training 1/1 epoch (loss 1.5961): 80%|ββββββββ | 1001/1250 [05:39<01:21, 3.06it/s]
Training 1/1 epoch (loss 1.5961): 80%|ββββββββ | 1002/1250 [05:39<01:18, 3.17it/s]
Training 1/1 epoch (loss 1.4582): 80%|ββββββββ | 1002/1250 [05:40<01:18, 3.17it/s]
Training 1/1 epoch (loss 1.4582): 80%|ββββββββ | 1003/1250 [05:40<01:16, 3.22it/s]
Training 1/1 epoch (loss 1.4236): 80%|ββββββββ | 1003/1250 [05:40<01:16, 3.22it/s]
Training 1/1 epoch (loss 1.4236): 80%|ββββββββ | 1004/1250 [05:40<01:17, 3.16it/s]
Training 1/1 epoch (loss 1.5849): 80%|ββββββββ | 1004/1250 [05:40<01:17, 3.16it/s]
Training 1/1 epoch (loss 1.5849): 80%|ββββββββ | 1005/1250 [05:40<01:16, 3.21it/s]
Training 1/1 epoch (loss 1.4895): 80%|ββββββββ | 1005/1250 [05:41<01:16, 3.21it/s]
Training 1/1 epoch (loss 1.4895): 80%|ββββββββ | 1006/1250 [05:41<01:17, 3.14it/s]
Training 1/1 epoch (loss 1.5844): 80%|ββββββββ | 1006/1250 [05:41<01:17, 3.14it/s]
Training 1/1 epoch (loss 1.5844): 81%|ββββββββ | 1007/1250 [05:41<01:17, 3.15it/s]
Training 1/1 epoch (loss 1.4831): 81%|ββββββββ | 1007/1250 [05:41<01:17, 3.15it/s]
Training 1/1 epoch (loss 1.4831): 81%|ββββββββ | 1008/1250 [05:41<01:17, 3.11it/s]
Training 1/1 epoch (loss 1.5491): 81%|ββββββββ | 1008/1250 [05:42<01:17, 3.11it/s]
Training 1/1 epoch (loss 1.5491): 81%|ββββββββ | 1009/1250 [05:42<01:15, 3.21it/s]
Training 1/1 epoch (loss 1.4080): 81%|ββββββββ | 1009/1250 [05:42<01:15, 3.21it/s]
Training 1/1 epoch (loss 1.4080): 81%|ββββββββ | 1010/1250 [05:42<01:16, 3.15it/s]
Training 1/1 epoch (loss 1.5406): 81%|ββββββββ | 1010/1250 [05:42<01:16, 3.15it/s]
Training 1/1 epoch (loss 1.5406): 81%|ββββββββ | 1011/1250 [05:42<01:14, 3.22it/s]
Training 1/1 epoch (loss 1.4998): 81%|ββββββββ | 1011/1250 [05:43<01:14, 3.22it/s]
Training 1/1 epoch (loss 1.4998): 81%|ββββββββ | 1012/1250 [05:43<01:14, 3.18it/s]
Training 1/1 epoch (loss 1.4927): 81%|ββββββββ | 1012/1250 [05:43<01:14, 3.18it/s]
Training 1/1 epoch (loss 1.4927): 81%|ββββββββ | 1013/1250 [05:43<01:16, 3.12it/s]
Training 1/1 epoch (loss 1.5068): 81%|ββββββββ | 1013/1250 [05:43<01:16, 3.12it/s]
Training 1/1 epoch (loss 1.5068): 81%|ββββββββ | 1014/1250 [05:43<01:17, 3.03it/s]
Training 1/1 epoch (loss 1.4415): 81%|ββββββββ | 1014/1250 [05:44<01:17, 3.03it/s]
Training 1/1 epoch (loss 1.4415): 81%|ββββββββ | 1015/1250 [05:44<01:17, 3.03it/s]
Training 1/1 epoch (loss 1.4476): 81%|ββββββββ | 1015/1250 [05:44<01:17, 3.03it/s]
Training 1/1 epoch (loss 1.4476): 81%|βββββββββ | 1016/1250 [05:44<01:17, 3.00it/s]
Training 1/1 epoch (loss 1.4620): 81%|βββββββββ | 1016/1250 [05:44<01:17, 3.00it/s]
Training 1/1 epoch (loss 1.4620): 81%|βββββββββ | 1017/1250 [05:44<01:16, 3.04it/s]
Training 1/1 epoch (loss 1.5235): 81%|βββββββββ | 1017/1250 [05:45<01:16, 3.04it/s]
Training 1/1 epoch (loss 1.5235): 81%|βββββββββ | 1018/1250 [05:45<01:20, 2.88it/s]
Training 1/1 epoch (loss 1.5125): 81%|βββββββββ | 1018/1250 [05:45<01:20, 2.88it/s]
Training 1/1 epoch (loss 1.5125): 82%|βββββββββ | 1019/1250 [05:45<01:18, 2.96it/s]
Training 1/1 epoch (loss 1.5321): 82%|βββββββββ | 1019/1250 [05:45<01:18, 2.96it/s]
Training 1/1 epoch (loss 1.5321): 82%|βββββββββ | 1020/1250 [05:45<01:15, 3.05it/s]
Training 1/1 epoch (loss 1.4380): 82%|βββββββββ | 1020/1250 [05:46<01:15, 3.05it/s]
Training 1/1 epoch (loss 1.4380): 82%|βββββββββ | 1021/1250 [05:46<01:12, 3.14it/s]
Training 1/1 epoch (loss 1.4302): 82%|βββββββββ | 1021/1250 [05:46<01:12, 3.14it/s]
Training 1/1 epoch (loss 1.4302): 82%|βββββββββ | 1022/1250 [05:46<01:13, 3.09it/s]
Training 1/1 epoch (loss 1.4822): 82%|βββββββββ | 1022/1250 [05:46<01:13, 3.09it/s]
Training 1/1 epoch (loss 1.4822): 82%|βββββββββ | 1023/1250 [05:46<01:14, 3.04it/s]
Training 1/1 epoch (loss 1.4193): 82%|βββββββββ | 1023/1250 [05:47<01:14, 3.04it/s]
Training 1/1 epoch (loss 1.4193): 82%|βββββββββ | 1024/1250 [05:47<01:15, 2.99it/s]
Training 1/1 epoch (loss 1.5143): 82%|βββββββββ | 1024/1250 [05:47<01:15, 2.99it/s]
Training 1/1 epoch (loss 1.5143): 82%|βββββββββ | 1025/1250 [05:47<01:13, 3.07it/s]
Training 1/1 epoch (loss 1.5920): 82%|βββββββββ | 1025/1250 [05:47<01:13, 3.07it/s]
Training 1/1 epoch (loss 1.5920): 82%|βββββββββ | 1026/1250 [05:47<01:11, 3.13it/s]
Training 1/1 epoch (loss 1.5817): 82%|βββββββββ | 1026/1250 [05:47<01:11, 3.13it/s]
Training 1/1 epoch (loss 1.5817): 82%|βββββββββ | 1027/1250 [05:47<01:11, 3.11it/s]
Training 1/1 epoch (loss 1.5970): 82%|βββββββββ | 1027/1250 [05:48<01:11, 3.11it/s]
Training 1/1 epoch (loss 1.5970): 82%|βββββββββ | 1028/1250 [05:48<01:10, 3.13it/s]
Training 1/1 epoch (loss 1.5138): 82%|βββββββββ | 1028/1250 [05:48<01:10, 3.13it/s]
Training 1/1 epoch (loss 1.5138): 82%|βββββββββ | 1029/1250 [05:48<01:13, 3.02it/s]
Training 1/1 epoch (loss 1.5569): 82%|βββββββββ | 1029/1250 [05:48<01:13, 3.02it/s]
Training 1/1 epoch (loss 1.5569): 82%|βββββββββ | 1030/1250 [05:48<01:13, 3.00it/s]
Training 1/1 epoch (loss 1.5531): 82%|βββββββββ | 1030/1250 [05:49<01:13, 3.00it/s]
Training 1/1 epoch (loss 1.5531): 82%|βββββββββ | 1031/1250 [05:49<01:11, 3.04it/s]
Training 1/1 epoch (loss 1.5214): 82%|βββββββββ | 1031/1250 [05:49<01:11, 3.04it/s]
Training 1/1 epoch (loss 1.5214): 83%|βββββββββ | 1032/1250 [05:49<01:10, 3.08it/s]
Training 1/1 epoch (loss 1.5384): 83%|βββββββββ | 1032/1250 [05:49<01:10, 3.08it/s]
Training 1/1 epoch (loss 1.5384): 83%|βββββββββ | 1033/1250 [05:49<01:09, 3.11it/s]
Training 1/1 epoch (loss 1.5956): 83%|βββββββββ | 1033/1250 [05:50<01:09, 3.11it/s]
Training 1/1 epoch (loss 1.5956): 83%|βββββββββ | 1034/1250 [05:50<01:08, 3.15it/s]
Training 1/1 epoch (loss 1.5909): 83%|βββββββββ | 1034/1250 [05:50<01:08, 3.15it/s]
Training 1/1 epoch (loss 1.5909): 83%|βββββββββ | 1035/1250 [05:50<01:07, 3.18it/s]
Training 1/1 epoch (loss 1.4137): 83%|βββββββββ | 1035/1250 [05:50<01:07, 3.18it/s]
Training 1/1 epoch (loss 1.4137): 83%|βββββββββ | 1036/1250 [05:50<01:07, 3.16it/s]
Training 1/1 epoch (loss 1.6276): 83%|βββββββββ | 1036/1250 [05:51<01:07, 3.16it/s]
Training 1/1 epoch (loss 1.6276): 83%|βββββββββ | 1037/1250 [05:51<01:08, 3.12it/s]
Training 1/1 epoch (loss 1.3863): 83%|βββββββββ | 1037/1250 [05:51<01:08, 3.12it/s]
Training 1/1 epoch (loss 1.3863): 83%|βββββββββ | 1038/1250 [05:51<01:07, 3.14it/s]
Training 1/1 epoch (loss 1.4579): 83%|βββββββββ | 1038/1250 [05:51<01:07, 3.14it/s]
Training 1/1 epoch (loss 1.4579): 83%|βββββββββ | 1039/1250 [05:51<01:06, 3.16it/s]
Training 1/1 epoch (loss 1.5178): 83%|βββββββββ | 1039/1250 [05:52<01:06, 3.16it/s]
Training 1/1 epoch (loss 1.5178): 83%|βββββββββ | 1040/1250 [05:52<01:06, 3.14it/s]
Training 1/1 epoch (loss 1.6098): 83%|βββββββββ | 1040/1250 [05:52<01:06, 3.14it/s]
Training 1/1 epoch (loss 1.6098): 83%|βββββββββ | 1041/1250 [05:52<01:09, 3.02it/s]
Training 1/1 epoch (loss 1.4959): 83%|βββββββββ | 1041/1250 [05:52<01:09, 3.02it/s]
Training 1/1 epoch (loss 1.4959): 83%|βββββββββ | 1042/1250 [05:52<01:11, 2.93it/s]
Training 1/1 epoch (loss 1.5264): 83%|βββββββββ | 1042/1250 [05:53<01:11, 2.93it/s]
Training 1/1 epoch (loss 1.5264): 83%|βββββββββ | 1043/1250 [05:53<01:08, 3.00it/s]
Training 1/1 epoch (loss 1.5144): 83%|βββββββββ | 1043/1250 [05:53<01:08, 3.00it/s]
Training 1/1 epoch (loss 1.5144): 84%|βββββββββ | 1044/1250 [05:53<01:08, 3.03it/s]
Training 1/1 epoch (loss 1.5125): 84%|βββββββββ | 1044/1250 [05:53<01:08, 3.03it/s]
Training 1/1 epoch (loss 1.5125): 84%|βββββββββ | 1045/1250 [05:53<01:11, 2.86it/s]
Training 1/1 epoch (loss 1.4989): 84%|βββββββββ | 1045/1250 [05:54<01:11, 2.86it/s]
Training 1/1 epoch (loss 1.4989): 84%|βββββββββ | 1046/1250 [05:54<01:08, 3.00it/s]
Training 1/1 epoch (loss 1.4628): 84%|βββββββββ | 1046/1250 [05:54<01:08, 3.00it/s]
Training 1/1 epoch (loss 1.4628): 84%|βββββββββ | 1047/1250 [05:54<01:06, 3.04it/s]
Training 1/1 epoch (loss 1.5085): 84%|βββββββββ | 1047/1250 [05:54<01:06, 3.04it/s]
Training 1/1 epoch (loss 1.5085): 84%|βββββββββ | 1048/1250 [05:54<01:06, 3.06it/s]
Training 1/1 epoch (loss 1.5718): 84%|βββββββββ | 1048/1250 [05:55<01:06, 3.06it/s]
Training 1/1 epoch (loss 1.5718): 84%|βββββββββ | 1049/1250 [05:55<01:05, 3.06it/s]
Training 1/1 epoch (loss 1.4702): 84%|βββββββββ | 1049/1250 [05:55<01:05, 3.06it/s]
Training 1/1 epoch (loss 1.4702): 84%|βββββββββ | 1050/1250 [05:55<01:04, 3.11it/s]
Training 1/1 epoch (loss 1.5218): 84%|βββββββββ | 1050/1250 [05:55<01:04, 3.11it/s]
Training 1/1 epoch (loss 1.5218): 84%|βββββββββ | 1051/1250 [05:55<01:02, 3.21it/s]
Training 1/1 epoch (loss 1.6182): 84%|βββββββββ | 1051/1250 [05:56<01:02, 3.21it/s]
Training 1/1 epoch (loss 1.6182): 84%|βββββββββ | 1052/1250 [05:56<01:01, 3.22it/s]
Training 1/1 epoch (loss 1.4541): 84%|βββββββββ | 1052/1250 [05:56<01:01, 3.22it/s]
Training 1/1 epoch (loss 1.4541): 84%|βββββββββ | 1053/1250 [05:56<01:01, 3.20it/s]
Training 1/1 epoch (loss 1.5987): 84%|βββββββββ | 1053/1250 [05:56<01:01, 3.20it/s]
Training 1/1 epoch (loss 1.5987): 84%|βββββββββ | 1054/1250 [05:56<01:00, 3.23it/s]
Training 1/1 epoch (loss 1.4995): 84%|βββββββββ | 1054/1250 [05:57<01:00, 3.23it/s]
Training 1/1 epoch (loss 1.4995): 84%|βββββββββ | 1055/1250 [05:57<01:02, 3.11it/s]
Training 1/1 epoch (loss 1.4815): 84%|βββββββββ | 1055/1250 [05:57<01:02, 3.11it/s]
Training 1/1 epoch (loss 1.4815): 84%|βββββββββ | 1056/1250 [05:57<01:02, 3.09it/s]
Training 1/1 epoch (loss 1.4193): 84%|βββββββββ | 1056/1250 [05:57<01:02, 3.09it/s]
Training 1/1 epoch (loss 1.4193): 85%|βββββββββ | 1057/1250 [05:57<01:01, 3.13it/s]
Training 1/1 epoch (loss 1.5636): 85%|βββββββββ | 1057/1250 [05:57<01:01, 3.13it/s]
Training 1/1 epoch (loss 1.5636): 85%|βββββββββ | 1058/1250 [05:57<01:00, 3.15it/s]
Training 1/1 epoch (loss 1.4078): 85%|βββββββββ | 1058/1250 [05:58<01:00, 3.15it/s]
Training 1/1 epoch (loss 1.4078): 85%|βββββββββ | 1059/1250 [05:58<01:00, 3.18it/s]
Training 1/1 epoch (loss 1.4246): 85%|βββββββββ | 1059/1250 [05:58<01:00, 3.18it/s]
Training 1/1 epoch (loss 1.4246): 85%|βββββββββ | 1060/1250 [05:58<00:59, 3.20it/s]
Training 1/1 epoch (loss 1.5666): 85%|βββββββββ | 1060/1250 [05:58<00:59, 3.20it/s]
Training 1/1 epoch (loss 1.5666): 85%|βββββββββ | 1061/1250 [05:58<01:03, 2.99it/s]
Training 1/1 epoch (loss 1.5006): 85%|βββββββββ | 1061/1250 [05:59<01:03, 2.99it/s]
Training 1/1 epoch (loss 1.5006): 85%|βββββββββ | 1062/1250 [05:59<01:01, 3.04it/s]
Training 1/1 epoch (loss 1.4770): 85%|βββββββββ | 1062/1250 [05:59<01:01, 3.04it/s]
Training 1/1 epoch (loss 1.4770): 85%|βββββββββ | 1063/1250 [05:59<00:59, 3.15it/s]
Training 1/1 epoch (loss 1.5701): 85%|βββββββββ | 1063/1250 [05:59<00:59, 3.15it/s]
Training 1/1 epoch (loss 1.5701): 85%|βββββββββ | 1064/1250 [05:59<01:02, 3.00it/s]
Training 1/1 epoch (loss 1.5631): 85%|βββββββββ | 1064/1250 [06:00<01:02, 3.00it/s]
Training 1/1 epoch (loss 1.5631): 85%|βββββββββ | 1065/1250 [06:00<01:00, 3.07it/s]
Training 1/1 epoch (loss 1.5540): 85%|βββββββββ | 1065/1250 [06:00<01:00, 3.07it/s]
Training 1/1 epoch (loss 1.5540): 85%|βββββββββ | 1066/1250 [06:00<01:02, 2.92it/s]
Training 1/1 epoch (loss 1.6083): 85%|βββββββββ | 1066/1250 [06:00<01:02, 2.92it/s]
Training 1/1 epoch (loss 1.6083): 85%|βββββββββ | 1067/1250 [06:00<01:02, 2.93it/s]
Training 1/1 epoch (loss 1.5385): 85%|βββββββββ | 1067/1250 [06:01<01:02, 2.93it/s]
Training 1/1 epoch (loss 1.5385): 85%|βββββββββ | 1068/1250 [06:01<01:00, 3.00it/s]
Training 1/1 epoch (loss 1.6379): 85%|βββββββββ | 1068/1250 [06:01<01:00, 3.00it/s]
Training 1/1 epoch (loss 1.6379): 86%|βββββββββ | 1069/1250 [06:01<00:58, 3.09it/s]
Training 1/1 epoch (loss 1.5622): 86%|βββββββββ | 1069/1250 [06:01<00:58, 3.09it/s]
Training 1/1 epoch (loss 1.5622): 86%|βββββββββ | 1070/1250 [06:01<00:57, 3.14it/s]
Training 1/1 epoch (loss 1.5487): 86%|βββββββββ | 1070/1250 [06:02<00:57, 3.14it/s]
Training 1/1 epoch (loss 1.5487): 86%|βββββββββ | 1071/1250 [06:02<00:56, 3.16it/s]
Training 1/1 epoch (loss 1.4144): 86%|βββββββββ | 1071/1250 [06:02<00:56, 3.16it/s]
Training 1/1 epoch (loss 1.4144): 86%|βββββββββ | 1072/1250 [06:02<00:56, 3.15it/s]
Training 1/1 epoch (loss 1.4811): 86%|βββββββββ | 1072/1250 [06:02<00:56, 3.15it/s]
Training 1/1 epoch (loss 1.4811): 86%|βββββββββ | 1073/1250 [06:02<00:58, 3.03it/s]
Training 1/1 epoch (loss 1.6016): 86%|βββββββββ | 1073/1250 [06:03<00:58, 3.03it/s]
Training 1/1 epoch (loss 1.6016): 86%|βββββββββ | 1074/1250 [06:03<00:57, 3.08it/s]
Training 1/1 epoch (loss 1.6015): 86%|βββββββββ | 1074/1250 [06:03<00:57, 3.08it/s]
Training 1/1 epoch (loss 1.6015): 86%|βββββββββ | 1075/1250 [06:03<00:55, 3.14it/s]
Training 1/1 epoch (loss 1.4948): 86%|βββββββββ | 1075/1250 [06:03<00:55, 3.14it/s]
Training 1/1 epoch (loss 1.4948): 86%|βββββββββ | 1076/1250 [06:03<00:54, 3.17it/s]
Training 1/1 epoch (loss 1.4165): 86%|βββββββββ | 1076/1250 [06:04<00:54, 3.17it/s]
Training 1/1 epoch (loss 1.4165): 86%|βββββββββ | 1077/1250 [06:04<00:58, 2.98it/s]
Training 1/1 epoch (loss 1.3891): 86%|βββββββββ | 1077/1250 [06:04<00:58, 2.98it/s]
Training 1/1 epoch (loss 1.3891): 86%|βββββββββ | 1078/1250 [06:04<00:57, 3.00it/s]
Training 1/1 epoch (loss 1.5889): 86%|βββββββββ | 1078/1250 [06:04<00:57, 3.00it/s]
Training 1/1 epoch (loss 1.5889): 86%|βββββββββ | 1079/1250 [06:04<00:55, 3.06it/s]
Training 1/1 epoch (loss 1.4684): 86%|βββββββββ | 1079/1250 [06:05<00:55, 3.06it/s]
Training 1/1 epoch (loss 1.4684): 86%|βββββββββ | 1080/1250 [06:05<00:55, 3.05it/s]
Training 1/1 epoch (loss 1.4919): 86%|βββββββββ | 1080/1250 [06:05<00:55, 3.05it/s]
Training 1/1 epoch (loss 1.4919): 86%|βββββββββ | 1081/1250 [06:05<00:54, 3.11it/s]
Training 1/1 epoch (loss 1.4283): 86%|βββββββββ | 1081/1250 [06:05<00:54, 3.11it/s]
Training 1/1 epoch (loss 1.4283): 87%|βββββββββ | 1082/1250 [06:05<00:52, 3.20it/s]
Training 1/1 epoch (loss 1.5916): 87%|βββββββββ | 1082/1250 [06:06<00:52, 3.20it/s]
Training 1/1 epoch (loss 1.5916): 87%|βββββββββ | 1083/1250 [06:06<00:51, 3.22it/s]
Training 1/1 epoch (loss 1.4223): 87%|βββββββββ | 1083/1250 [06:06<00:51, 3.22it/s]
Training 1/1 epoch (loss 1.4223): 87%|βββββββββ | 1084/1250 [06:06<00:51, 3.25it/s]
Training 1/1 epoch (loss 1.5726): 87%|βββββββββ | 1084/1250 [06:06<00:51, 3.25it/s]
Training 1/1 epoch (loss 1.5726): 87%|βββββββββ | 1085/1250 [06:06<00:51, 3.22it/s]
Training 1/1 epoch (loss 1.6210): 87%|βββββββββ | 1085/1250 [06:07<00:51, 3.22it/s]
Training 1/1 epoch (loss 1.6210): 87%|βββββββββ | 1086/1250 [06:07<00:51, 3.18it/s]
Training 1/1 epoch (loss 1.5089): 87%|βββββββββ | 1086/1250 [06:07<00:51, 3.18it/s]
Training 1/1 epoch (loss 1.5089): 87%|βββββββββ | 1087/1250 [06:07<00:52, 3.08it/s]
Training 1/1 epoch (loss 1.4973): 87%|βββββββββ | 1087/1250 [06:07<00:52, 3.08it/s]
Training 1/1 epoch (loss 1.4973): 87%|βββββββββ | 1088/1250 [06:07<00:52, 3.09it/s]
Training 1/1 epoch (loss 1.4214): 87%|βββββββββ | 1088/1250 [06:08<00:52, 3.09it/s]
Training 1/1 epoch (loss 1.4214): 87%|βββββββββ | 1089/1250 [06:08<00:51, 3.14it/s]
Training 1/1 epoch (loss 1.5623): 87%|βββββββββ | 1089/1250 [06:08<00:51, 3.14it/s]
Training 1/1 epoch (loss 1.5623): 87%|βββββββββ | 1090/1250 [06:08<00:50, 3.20it/s]
Training 1/1 epoch (loss 1.4239): 87%|βββββββββ | 1090/1250 [06:08<00:50, 3.20it/s]
Training 1/1 epoch (loss 1.4239): 87%|βββββββββ | 1091/1250 [06:08<00:51, 3.11it/s]
Training 1/1 epoch (loss 1.3967): 87%|βββββββββ | 1091/1250 [06:08<00:51, 3.11it/s]
Training 1/1 epoch (loss 1.3967): 87%|βββββββββ | 1092/1250 [06:08<00:51, 3.07it/s]
Training 1/1 epoch (loss 1.4487): 87%|βββββββββ | 1092/1250 [06:09<00:51, 3.07it/s]
Training 1/1 epoch (loss 1.4487): 87%|βββββββββ | 1093/1250 [06:09<00:49, 3.15it/s]
Training 1/1 epoch (loss 1.3463): 87%|βββββββββ | 1093/1250 [06:09<00:49, 3.15it/s]
Training 1/1 epoch (loss 1.3463): 88%|βββββββββ | 1094/1250 [06:09<00:50, 3.10it/s]
Training 1/1 epoch (loss 1.5077): 88%|βββββββββ | 1094/1250 [06:09<00:50, 3.10it/s]
Training 1/1 epoch (loss 1.5077): 88%|βββββββββ | 1095/1250 [06:09<00:49, 3.11it/s]
Training 1/1 epoch (loss 1.5868): 88%|βββββββββ | 1095/1250 [06:10<00:49, 3.11it/s]
Training 1/1 epoch (loss 1.5868): 88%|βββββββββ | 1096/1250 [06:10<00:49, 3.11it/s]
Training 1/1 epoch (loss 1.5641): 88%|βββββββββ | 1096/1250 [06:10<00:49, 3.11it/s]
Training 1/1 epoch (loss 1.5641): 88%|βββββββββ | 1097/1250 [06:10<00:47, 3.20it/s]
Training 1/1 epoch (loss 1.5654): 88%|βββββββββ | 1097/1250 [06:10<00:47, 3.20it/s]
Training 1/1 epoch (loss 1.5654): 88%|βββββββββ | 1098/1250 [06:10<00:48, 3.16it/s]
Training 1/1 epoch (loss 1.5231): 88%|βββββββββ | 1098/1250 [06:11<00:48, 3.16it/s]
Training 1/1 epoch (loss 1.5231): 88%|βββββββββ | 1099/1250 [06:11<00:48, 3.14it/s]
Training 1/1 epoch (loss 1.5237): 88%|βββββββββ | 1099/1250 [06:11<00:48, 3.14it/s]
Training 1/1 epoch (loss 1.5237): 88%|βββββββββ | 1100/1250 [06:11<00:46, 3.21it/s]
Training 1/1 epoch (loss 1.5343): 88%|βββββββββ | 1100/1250 [06:11<00:46, 3.21it/s]
Training 1/1 epoch (loss 1.5343): 88%|βββββββββ | 1101/1250 [06:11<00:45, 3.29it/s]
Training 1/1 epoch (loss 1.6430): 88%|βββββββββ | 1101/1250 [06:12<00:45, 3.29it/s]
Training 1/1 epoch (loss 1.6430): 88%|βββββββββ | 1102/1250 [06:12<00:45, 3.23it/s]
Training 1/1 epoch (loss 1.4826): 88%|βββββββββ | 1102/1250 [06:12<00:45, 3.23it/s]
Training 1/1 epoch (loss 1.4826): 88%|βββββββββ | 1103/1250 [06:12<00:45, 3.24it/s]
Training 1/1 epoch (loss 1.5137): 88%|βββββββββ | 1103/1250 [06:12<00:45, 3.24it/s]
Training 1/1 epoch (loss 1.5137): 88%|βββββββββ | 1104/1250 [06:12<00:45, 3.19it/s]
Training 1/1 epoch (loss 1.4560): 88%|βββββββββ | 1104/1250 [06:13<00:45, 3.19it/s]
Training 1/1 epoch (loss 1.4560): 88%|βββββββββ | 1105/1250 [06:13<00:46, 3.10it/s]
Training 1/1 epoch (loss 1.5660): 88%|βββββββββ | 1105/1250 [06:13<00:46, 3.10it/s]
Training 1/1 epoch (loss 1.5660): 88%|βββββββββ | 1106/1250 [06:13<00:45, 3.16it/s]
Training 1/1 epoch (loss 1.4244): 88%|βββββββββ | 1106/1250 [06:13<00:45, 3.16it/s]
Training 1/1 epoch (loss 1.4244): 89%|βββββββββ | 1107/1250 [06:13<00:44, 3.24it/s]
Training 1/1 epoch (loss 1.4205): 89%|βββββββββ | 1107/1250 [06:13<00:44, 3.24it/s]
Training 1/1 epoch (loss 1.4205): 89%|βββββββββ | 1108/1250 [06:13<00:44, 3.20it/s]
Training 1/1 epoch (loss 1.4582): 89%|βββββββββ | 1108/1250 [06:14<00:44, 3.20it/s]
Training 1/1 epoch (loss 1.4582): 89%|βββββββββ | 1109/1250 [06:14<00:44, 3.14it/s]
Training 1/1 epoch (loss 1.6196): 89%|βββββββββ | 1109/1250 [06:14<00:44, 3.14it/s]
Training 1/1 epoch (loss 1.6196): 89%|βββββββββ | 1110/1250 [06:14<00:48, 2.91it/s]
Training 1/1 epoch (loss 1.5100): 89%|βββββββββ | 1110/1250 [06:15<00:48, 2.91it/s]
Training 1/1 epoch (loss 1.5100): 89%|βββββββββ | 1111/1250 [06:15<00:49, 2.81it/s]
Training 1/1 epoch (loss 1.5168): 89%|βββββββββ | 1111/1250 [06:15<00:49, 2.81it/s]
Training 1/1 epoch (loss 1.5168): 89%|βββββββββ | 1112/1250 [06:15<00:48, 2.87it/s]
Training 1/1 epoch (loss 1.3217): 89%|βββββββββ | 1112/1250 [06:15<00:48, 2.87it/s]
Training 1/1 epoch (loss 1.3217): 89%|βββββββββ | 1113/1250 [06:15<00:45, 3.00it/s]
Training 1/1 epoch (loss 1.5053): 89%|βββββββββ | 1113/1250 [06:16<00:45, 3.00it/s]
Training 1/1 epoch (loss 1.5053): 89%|βββββββββ | 1114/1250 [06:16<00:44, 3.03it/s]
Training 1/1 epoch (loss 1.5441): 89%|βββββββββ | 1114/1250 [06:16<00:44, 3.03it/s]
Training 1/1 epoch (loss 1.5441): 89%|βββββββββ | 1115/1250 [06:16<00:43, 3.07it/s]
Training 1/1 epoch (loss 1.5006): 89%|βββββββββ | 1115/1250 [06:16<00:43, 3.07it/s]
Training 1/1 epoch (loss 1.5006): 89%|βββββββββ | 1116/1250 [06:16<00:42, 3.13it/s]
Training 1/1 epoch (loss 1.5660): 89%|βββββββββ | 1116/1250 [06:17<00:42, 3.13it/s]
Training 1/1 epoch (loss 1.5660): 89%|βββββββββ | 1117/1250 [06:17<00:42, 3.10it/s]
Training 1/1 epoch (loss 1.6200): 89%|βββββββββ | 1117/1250 [06:17<00:42, 3.10it/s]
Training 1/1 epoch (loss 1.6200): 89%|βββββββββ | 1118/1250 [06:17<00:42, 3.11it/s]
Training 1/1 epoch (loss 1.4577): 89%|βββββββββ | 1118/1250 [06:17<00:42, 3.11it/s]
Training 1/1 epoch (loss 1.4577): 90%|βββββββββ | 1119/1250 [06:17<00:41, 3.12it/s]
Training 1/1 epoch (loss 1.5416): 90%|βββββββββ | 1119/1250 [06:17<00:41, 3.12it/s]
Training 1/1 epoch (loss 1.5416): 90%|βββββββββ | 1120/1250 [06:17<00:41, 3.13it/s]
Training 1/1 epoch (loss 1.5913): 90%|βββββββββ | 1120/1250 [06:18<00:41, 3.13it/s]
Training 1/1 epoch (loss 1.5913): 90%|βββββββββ | 1121/1250 [06:18<00:41, 3.14it/s]
Training 1/1 epoch (loss 1.6258): 90%|βββββββββ | 1121/1250 [06:18<00:41, 3.14it/s]
Training 1/1 epoch (loss 1.6258): 90%|βββββββββ | 1122/1250 [06:18<00:39, 3.24it/s]
Training 1/1 epoch (loss 1.4931): 90%|βββββββββ | 1122/1250 [06:18<00:39, 3.24it/s]
Training 1/1 epoch (loss 1.4931): 90%|βββββββββ | 1123/1250 [06:18<00:40, 3.14it/s]
Training 1/1 epoch (loss 1.6270): 90%|βββββββββ | 1123/1250 [06:19<00:40, 3.14it/s]
Training 1/1 epoch (loss 1.6270): 90%|βββββββββ | 1124/1250 [06:19<00:40, 3.12it/s]
Training 1/1 epoch (loss 1.5280): 90%|βββββββββ | 1124/1250 [06:19<00:40, 3.12it/s]
Training 1/1 epoch (loss 1.5280): 90%|βββββββββ | 1125/1250 [06:19<00:38, 3.21it/s]
Training 1/1 epoch (loss 1.4190): 90%|βββββββββ | 1125/1250 [06:19<00:38, 3.21it/s]
Training 1/1 epoch (loss 1.4190): 90%|βββββββββ | 1126/1250 [06:19<00:39, 3.11it/s]
Training 1/1 epoch (loss 1.4516): 90%|βββββββββ | 1126/1250 [06:20<00:39, 3.11it/s]
Training 1/1 epoch (loss 1.4516): 90%|βββββββββ | 1127/1250 [06:20<00:39, 3.10it/s]
Training 1/1 epoch (loss 1.6464): 90%|βββββββββ | 1127/1250 [06:20<00:39, 3.10it/s]
Training 1/1 epoch (loss 1.6464): 90%|βββββββββ | 1128/1250 [06:20<00:39, 3.06it/s]
Training 1/1 epoch (loss 1.4923): 90%|βββββββββ | 1128/1250 [06:20<00:39, 3.06it/s]
Training 1/1 epoch (loss 1.4923): 90%|βββββββββ | 1129/1250 [06:20<00:38, 3.11it/s]
Training 1/1 epoch (loss 1.5608): 90%|βββββββββ | 1129/1250 [06:21<00:38, 3.11it/s]
Training 1/1 epoch (loss 1.5608): 90%|βββββββββ | 1130/1250 [06:21<00:39, 3.07it/s]
Training 1/1 epoch (loss 1.5772): 90%|βββββββββ | 1130/1250 [06:21<00:39, 3.07it/s]
Training 1/1 epoch (loss 1.5772): 90%|βββββββββ | 1131/1250 [06:21<00:37, 3.15it/s]
Training 1/1 epoch (loss 1.5126): 90%|βββββββββ | 1131/1250 [06:21<00:37, 3.15it/s]
Training 1/1 epoch (loss 1.5126): 91%|βββββββββ | 1132/1250 [06:21<00:36, 3.19it/s]
Training 1/1 epoch (loss 1.5583): 91%|βββββββββ | 1132/1250 [06:22<00:36, 3.19it/s]
Training 1/1 epoch (loss 1.5583): 91%|βββββββββ | 1133/1250 [06:22<00:36, 3.20it/s]
Training 1/1 epoch (loss 1.6000): 91%|βββββββββ | 1133/1250 [06:22<00:36, 3.20it/s]
Training 1/1 epoch (loss 1.6000): 91%|βββββββββ | 1134/1250 [06:22<00:36, 3.18it/s]
Training 1/1 epoch (loss 1.4799): 91%|βββββββββ | 1134/1250 [06:22<00:36, 3.18it/s]
Training 1/1 epoch (loss 1.4799): 91%|βββββββββ | 1135/1250 [06:22<00:35, 3.22it/s]
Training 1/1 epoch (loss 1.5271): 91%|βββββββββ | 1135/1250 [06:23<00:35, 3.22it/s]
Training 1/1 epoch (loss 1.5271): 91%|βββββββββ | 1136/1250 [06:23<00:37, 3.00it/s]
Training 1/1 epoch (loss 1.4374): 91%|βββββββββ | 1136/1250 [06:23<00:37, 3.00it/s]
Training 1/1 epoch (loss 1.4374): 91%|βββββββββ | 1137/1250 [06:23<00:37, 2.99it/s]
Training 1/1 epoch (loss 1.6343): 91%|βββββββββ | 1137/1250 [06:24<00:37, 2.99it/s]
Training 1/1 epoch (loss 1.6343): 91%|βββββββββ | 1138/1250 [06:24<00:47, 2.35it/s]
Training 1/1 epoch (loss 1.5339): 91%|βββββββββ | 1138/1250 [06:24<00:47, 2.35it/s]
Training 1/1 epoch (loss 1.5339): 91%|βββββββββ | 1139/1250 [06:24<00:44, 2.51it/s]
Training 1/1 epoch (loss 1.4878): 91%|βββββββββ | 1139/1250 [06:24<00:44, 2.51it/s]
Training 1/1 epoch (loss 1.4878): 91%|βββββββββ | 1140/1250 [06:24<00:40, 2.71it/s]
Training 1/1 epoch (loss 1.4166): 91%|βββββββββ | 1140/1250 [06:25<00:40, 2.71it/s]
Training 1/1 epoch (loss 1.4166): 91%|ββββββββββ| 1141/1250 [06:25<00:39, 2.73it/s]
Training 1/1 epoch (loss 1.4479): 91%|ββββββββββ| 1141/1250 [06:25<00:39, 2.73it/s]
Training 1/1 epoch (loss 1.4479): 91%|ββββββββββ| 1142/1250 [06:25<00:38, 2.80it/s]
Training 1/1 epoch (loss 1.4248): 91%|ββββββββββ| 1142/1250 [06:25<00:38, 2.80it/s]
Training 1/1 epoch (loss 1.4248): 91%|ββββββββββ| 1143/1250 [06:25<00:35, 2.99it/s]
Training 1/1 epoch (loss 1.6552): 91%|ββββββββββ| 1143/1250 [06:26<00:35, 2.99it/s]
Training 1/1 epoch (loss 1.6552): 92%|ββββββββββ| 1144/1250 [06:26<00:35, 2.99it/s]
Training 1/1 epoch (loss 1.5583): 92%|ββββββββββ| 1144/1250 [06:26<00:35, 2.99it/s]
Training 1/1 epoch (loss 1.5583): 92%|ββββββββββ| 1145/1250 [06:26<00:34, 3.03it/s]
Training 1/1 epoch (loss 1.4931): 92%|ββββββββββ| 1145/1250 [06:26<00:34, 3.03it/s]
Training 1/1 epoch (loss 1.4931): 92%|ββββββββββ| 1146/1250 [06:26<00:33, 3.09it/s]
Training 1/1 epoch (loss 1.5167): 92%|ββββββββββ| 1146/1250 [06:26<00:33, 3.09it/s]
Training 1/1 epoch (loss 1.5167): 92%|ββββββββββ| 1147/1250 [06:26<00:33, 3.05it/s]
Training 1/1 epoch (loss 1.6796): 92%|ββββββββββ| 1147/1250 [06:27<00:33, 3.05it/s]
Training 1/1 epoch (loss 1.6796): 92%|ββββββββββ| 1148/1250 [06:27<00:33, 3.08it/s]
Training 1/1 epoch (loss 1.4733): 92%|ββββββββββ| 1148/1250 [06:27<00:33, 3.08it/s]
Training 1/1 epoch (loss 1.4733): 92%|ββββββββββ| 1149/1250 [06:27<00:32, 3.12it/s]
Training 1/1 epoch (loss 1.5034): 92%|ββββββββββ| 1149/1250 [06:27<00:32, 3.12it/s]
Training 1/1 epoch (loss 1.5034): 92%|ββββββββββ| 1150/1250 [06:27<00:31, 3.17it/s]
Training 1/1 epoch (loss 1.7206): 92%|ββββββββββ| 1150/1250 [06:28<00:31, 3.17it/s]
Training 1/1 epoch (loss 1.7206): 92%|ββββββββββ| 1151/1250 [06:28<00:31, 3.19it/s]
Training 1/1 epoch (loss 1.7148): 92%|ββββββββββ| 1151/1250 [06:28<00:31, 3.19it/s]
Training 1/1 epoch (loss 1.7148): 92%|ββββββββββ| 1152/1250 [06:28<00:31, 3.14it/s]
Training 1/1 epoch (loss 1.5488): 92%|ββββββββββ| 1152/1250 [06:28<00:31, 3.14it/s]
Training 1/1 epoch (loss 1.5488): 92%|ββββββββββ| 1153/1250 [06:28<00:31, 3.12it/s]
Training 1/1 epoch (loss 1.6425): 92%|ββββββββββ| 1153/1250 [06:29<00:31, 3.12it/s]
Training 1/1 epoch (loss 1.6425): 92%|ββββββββββ| 1154/1250 [06:29<00:31, 3.08it/s]
Training 1/1 epoch (loss 1.3657): 92%|ββββββββββ| 1154/1250 [06:29<00:31, 3.08it/s]
Training 1/1 epoch (loss 1.3657): 92%|ββββββββββ| 1155/1250 [06:29<00:30, 3.15it/s]
Training 1/1 epoch (loss 1.4765): 92%|ββββββββββ| 1155/1250 [06:29<00:30, 3.15it/s]
Training 1/1 epoch (loss 1.4765): 92%|ββββββββββ| 1156/1250 [06:29<00:29, 3.15it/s]
Training 1/1 epoch (loss 1.5515): 92%|ββββββββββ| 1156/1250 [06:30<00:29, 3.15it/s]
Training 1/1 epoch (loss 1.5515): 93%|ββββββββββ| 1157/1250 [06:30<00:30, 3.04it/s]
Training 1/1 epoch (loss 1.4198): 93%|ββββββββββ| 1157/1250 [06:30<00:30, 3.04it/s]
Training 1/1 epoch (loss 1.4198): 93%|ββββββββββ| 1158/1250 [06:30<00:30, 3.00it/s]
Training 1/1 epoch (loss 1.4835): 93%|ββββββββββ| 1158/1250 [06:30<00:30, 3.00it/s]
Training 1/1 epoch (loss 1.4835): 93%|ββββββββββ| 1159/1250 [06:30<00:31, 2.85it/s]
Training 1/1 epoch (loss 1.4312): 93%|ββββββββββ| 1159/1250 [06:31<00:31, 2.85it/s]
Training 1/1 epoch (loss 1.4312): 93%|ββββββββββ| 1160/1250 [06:31<00:31, 2.82it/s]
Training 1/1 epoch (loss 1.5838): 93%|ββββββββββ| 1160/1250 [06:31<00:31, 2.82it/s]
Training 1/1 epoch (loss 1.5838): 93%|ββββββββββ| 1161/1250 [06:31<00:30, 2.93it/s]
Training 1/1 epoch (loss 1.5117): 93%|ββββββββββ| 1161/1250 [06:31<00:30, 2.93it/s]
Training 1/1 epoch (loss 1.5117): 93%|ββββββββββ| 1162/1250 [06:31<00:28, 3.05it/s]
Training 1/1 epoch (loss 1.4153): 93%|ββββββββββ| 1162/1250 [06:32<00:28, 3.05it/s]
Training 1/1 epoch (loss 1.4153): 93%|ββββββββββ| 1163/1250 [06:32<00:28, 3.06it/s]
Training 1/1 epoch (loss 1.5018): 93%|ββββββββββ| 1163/1250 [06:32<00:28, 3.06it/s]
Training 1/1 epoch (loss 1.5018): 93%|ββββββββββ| 1164/1250 [06:32<00:27, 3.13it/s]
Training 1/1 epoch (loss 1.3986): 93%|ββββββββββ| 1164/1250 [06:32<00:27, 3.13it/s]
Training 1/1 epoch (loss 1.3986): 93%|ββββββββββ| 1165/1250 [06:32<00:27, 3.10it/s]
Training 1/1 epoch (loss 1.6476): 93%|ββββββββββ| 1165/1250 [06:33<00:27, 3.10it/s]
Training 1/1 epoch (loss 1.6476): 93%|ββββββββββ| 1166/1250 [06:33<00:27, 3.06it/s]
Training 1/1 epoch (loss 1.5272): 93%|ββββββββββ| 1166/1250 [06:33<00:27, 3.06it/s]
Training 1/1 epoch (loss 1.5272): 93%|ββββββββββ| 1167/1250 [06:33<00:27, 3.04it/s]
Training 1/1 epoch (loss 1.6202): 93%|ββββββββββ| 1167/1250 [06:33<00:27, 3.04it/s]
Training 1/1 epoch (loss 1.6202): 93%|ββββββββββ| 1168/1250 [06:33<00:27, 3.02it/s]
Training 1/1 epoch (loss 1.6121): 93%|ββββββββββ| 1168/1250 [06:34<00:27, 3.02it/s]
Training 1/1 epoch (loss 1.6121): 94%|ββββββββββ| 1169/1250 [06:34<00:26, 3.08it/s]
Training 1/1 epoch (loss 1.6579): 94%|ββββββββββ| 1169/1250 [06:34<00:26, 3.08it/s]
Training 1/1 epoch (loss 1.6579): 94%|ββββββββββ| 1170/1250 [06:34<00:26, 3.03it/s]
Training 1/1 epoch (loss 1.4513): 94%|ββββββββββ| 1170/1250 [06:34<00:26, 3.03it/s]
Training 1/1 epoch (loss 1.4513): 94%|ββββββββββ| 1171/1250 [06:34<00:26, 2.93it/s]
Training 1/1 epoch (loss 1.5072): 94%|ββββββββββ| 1171/1250 [06:35<00:26, 2.93it/s]
Training 1/1 epoch (loss 1.5072): 94%|ββββββββββ| 1172/1250 [06:35<00:25, 3.04it/s]
Training 1/1 epoch (loss 1.4343): 94%|ββββββββββ| 1172/1250 [06:35<00:25, 3.04it/s]
Training 1/1 epoch (loss 1.4343): 94%|ββββββββββ| 1173/1250 [06:35<00:25, 2.97it/s]
Training 1/1 epoch (loss 1.5964): 94%|ββββββββββ| 1173/1250 [06:35<00:25, 2.97it/s]
Training 1/1 epoch (loss 1.5964): 94%|ββββββββββ| 1174/1250 [06:35<00:25, 3.03it/s]
Training 1/1 epoch (loss 1.5049): 94%|ββββββββββ| 1174/1250 [06:36<00:25, 3.03it/s]
Training 1/1 epoch (loss 1.5049): 94%|ββββββββββ| 1175/1250 [06:36<00:24, 3.07it/s]
Training 1/1 epoch (loss 1.5689): 94%|ββββββββββ| 1175/1250 [06:36<00:24, 3.07it/s]
Training 1/1 epoch (loss 1.5689): 94%|ββββββββββ| 1176/1250 [06:36<00:24, 3.00it/s]
Training 1/1 epoch (loss 1.5663): 94%|ββββββββββ| 1176/1250 [06:36<00:24, 3.00it/s]
Training 1/1 epoch (loss 1.5663): 94%|ββββββββββ| 1177/1250 [06:36<00:24, 3.01it/s]
Training 1/1 epoch (loss 1.5215): 94%|ββββββββββ| 1177/1250 [06:37<00:24, 3.01it/s]
Training 1/1 epoch (loss 1.5215): 94%|ββββββββββ| 1178/1250 [06:37<00:23, 3.00it/s]
Training 1/1 epoch (loss 1.4202): 94%|ββββββββββ| 1178/1250 [06:37<00:23, 3.00it/s]
Training 1/1 epoch (loss 1.4202): 94%|ββββββββββ| 1179/1250 [06:37<00:22, 3.12it/s]
Training 1/1 epoch (loss 1.5137): 94%|ββββββββββ| 1179/1250 [06:37<00:22, 3.12it/s]
Training 1/1 epoch (loss 1.5137): 94%|ββββββββββ| 1180/1250 [06:37<00:22, 3.14it/s]
Training 1/1 epoch (loss 1.4328): 94%|ββββββββββ| 1180/1250 [06:38<00:22, 3.14it/s]
Training 1/1 epoch (loss 1.4328): 94%|ββββββββββ| 1181/1250 [06:38<00:21, 3.15it/s]
Training 1/1 epoch (loss 1.4496): 94%|ββββββββββ| 1181/1250 [06:38<00:21, 3.15it/s]
Training 1/1 epoch (loss 1.4496): 95%|ββββββββββ| 1182/1250 [06:38<00:21, 3.12it/s]
Training 1/1 epoch (loss 1.5729): 95%|ββββββββββ| 1182/1250 [06:38<00:21, 3.12it/s]
Training 1/1 epoch (loss 1.5729): 95%|ββββββββββ| 1183/1250 [06:38<00:21, 3.16it/s]
Training 1/1 epoch (loss 1.4681): 95%|ββββββββββ| 1183/1250 [06:39<00:21, 3.16it/s]
Training 1/1 epoch (loss 1.4681): 95%|ββββββββββ| 1184/1250 [06:39<00:21, 3.02it/s]
Training 1/1 epoch (loss 1.4797): 95%|ββββββββββ| 1184/1250 [06:39<00:21, 3.02it/s]
Training 1/1 epoch (loss 1.4797): 95%|ββββββββββ| 1185/1250 [06:39<00:21, 3.03it/s]
Training 1/1 epoch (loss 1.4507): 95%|ββββββββββ| 1185/1250 [06:39<00:21, 3.03it/s]
Training 1/1 epoch (loss 1.4507): 95%|ββββββββββ| 1186/1250 [06:39<00:20, 3.11it/s]
Training 1/1 epoch (loss 1.4444): 95%|ββββββββββ| 1186/1250 [06:40<00:20, 3.11it/s]
Training 1/1 epoch (loss 1.4444): 95%|ββββββββββ| 1187/1250 [06:40<00:20, 3.12it/s]
Training 1/1 epoch (loss 1.5227): 95%|ββββββββββ| 1187/1250 [06:40<00:20, 3.12it/s]
Training 1/1 epoch (loss 1.5227): 95%|ββββββββββ| 1188/1250 [06:40<00:19, 3.11it/s]
Training 1/1 epoch (loss 1.4971): 95%|ββββββββββ| 1188/1250 [06:40<00:19, 3.11it/s]
Training 1/1 epoch (loss 1.4971): 95%|ββββββββββ| 1189/1250 [06:40<00:20, 2.99it/s]
Training 1/1 epoch (loss 1.4389): 95%|ββββββββββ| 1189/1250 [06:41<00:20, 2.99it/s]
Training 1/1 epoch (loss 1.4389): 95%|ββββββββββ| 1190/1250 [06:41<00:21, 2.85it/s]
Training 1/1 epoch (loss 1.5841): 95%|ββββββββββ| 1190/1250 [06:41<00:21, 2.85it/s]
Training 1/1 epoch (loss 1.5841): 95%|ββββββββββ| 1191/1250 [06:41<00:20, 2.93it/s]
Training 1/1 epoch (loss 1.5726): 95%|ββββββββββ| 1191/1250 [06:41<00:20, 2.93it/s]
Training 1/1 epoch (loss 1.5726): 95%|ββββββββββ| 1192/1250 [06:41<00:19, 2.97it/s]
Training 1/1 epoch (loss 1.4500): 95%|ββββββββββ| 1192/1250 [06:42<00:19, 2.97it/s]
Training 1/1 epoch (loss 1.4500): 95%|ββββββββββ| 1193/1250 [06:42<00:18, 3.02it/s]
Training 1/1 epoch (loss 1.4855): 95%|ββββββββββ| 1193/1250 [06:42<00:18, 3.02it/s]
Training 1/1 epoch (loss 1.4855): 96%|ββββββββββ| 1194/1250 [06:42<00:18, 3.09it/s]
Training 1/1 epoch (loss 1.5814): 96%|ββββββββββ| 1194/1250 [06:42<00:18, 3.09it/s]
Training 1/1 epoch (loss 1.5814): 96%|ββββββββββ| 1195/1250 [06:42<00:17, 3.11it/s]
Training 1/1 epoch (loss 1.5712): 96%|ββββββββββ| 1195/1250 [06:43<00:17, 3.11it/s]
Training 1/1 epoch (loss 1.5712): 96%|ββββββββββ| 1196/1250 [06:43<00:17, 3.08it/s]
Training 1/1 epoch (loss 1.3277): 96%|ββββββββββ| 1196/1250 [06:43<00:17, 3.08it/s]
Training 1/1 epoch (loss 1.3277): 96%|ββββββββββ| 1197/1250 [06:43<00:17, 3.05it/s]
Training 1/1 epoch (loss 1.5310): 96%|ββββββββββ| 1197/1250 [06:43<00:17, 3.05it/s]
Training 1/1 epoch (loss 1.5310): 96%|ββββββββββ| 1198/1250 [06:43<00:16, 3.09it/s]
Training 1/1 epoch (loss 1.4462): 96%|ββββββββββ| 1198/1250 [06:43<00:16, 3.09it/s]
Training 1/1 epoch (loss 1.4462): 96%|ββββββββββ| 1199/1250 [06:43<00:15, 3.19it/s]
Training 1/1 epoch (loss 1.4144): 96%|ββββββββββ| 1199/1250 [06:44<00:15, 3.19it/s]
Training 1/1 epoch (loss 1.4144): 96%|ββββββββββ| 1200/1250 [06:44<00:15, 3.13it/s]
Training 1/1 epoch (loss 1.4437): 96%|ββββββββββ| 1200/1250 [06:44<00:15, 3.13it/s]
Training 1/1 epoch (loss 1.4437): 96%|ββββββββββ| 1201/1250 [06:44<00:15, 3.12it/s]
Training 1/1 epoch (loss 1.4944): 96%|ββββββββββ| 1201/1250 [06:45<00:15, 3.12it/s]
Training 1/1 epoch (loss 1.4944): 96%|ββββββββββ| 1202/1250 [06:45<00:17, 2.81it/s]
Training 1/1 epoch (loss 1.5560): 96%|ββββββββββ| 1202/1250 [06:45<00:17, 2.81it/s]
Training 1/1 epoch (loss 1.5560): 96%|ββββββββββ| 1203/1250 [06:45<00:16, 2.77it/s]
Training 1/1 epoch (loss 1.4725): 96%|ββββββββββ| 1203/1250 [06:45<00:16, 2.77it/s]
Training 1/1 epoch (loss 1.4725): 96%|ββββββββββ| 1204/1250 [06:45<00:16, 2.78it/s]
Training 1/1 epoch (loss 1.4574): 96%|ββββββββββ| 1204/1250 [06:46<00:16, 2.78it/s]
Training 1/1 epoch (loss 1.4574): 96%|ββββββββββ| 1205/1250 [06:46<00:16, 2.70it/s]
Training 1/1 epoch (loss 1.5634): 96%|ββββββββββ| 1205/1250 [06:46<00:16, 2.70it/s]
Training 1/1 epoch (loss 1.5634): 96%|ββββββββββ| 1206/1250 [06:46<00:15, 2.84it/s]
Training 1/1 epoch (loss 1.4670): 96%|ββββββββββ| 1206/1250 [06:46<00:15, 2.84it/s]
Training 1/1 epoch (loss 1.4670): 97%|ββββββββββ| 1207/1250 [06:46<00:15, 2.81it/s]
Training 1/1 epoch (loss 1.5262): 97%|ββββββββββ| 1207/1250 [06:47<00:15, 2.81it/s]
Training 1/1 epoch (loss 1.5262): 97%|ββββββββββ| 1208/1250 [06:47<00:15, 2.71it/s]
Training 1/1 epoch (loss 1.5171): 97%|ββββββββββ| 1208/1250 [06:47<00:15, 2.71it/s]
Training 1/1 epoch (loss 1.5171): 97%|ββββββββββ| 1209/1250 [06:47<00:14, 2.77it/s]
Training 1/1 epoch (loss 1.5384): 97%|ββββββββββ| 1209/1250 [06:47<00:14, 2.77it/s]
Training 1/1 epoch (loss 1.5384): 97%|ββββββββββ| 1210/1250 [06:47<00:14, 2.77it/s]
Training 1/1 epoch (loss 1.5710): 97%|ββββββββββ| 1210/1250 [06:48<00:14, 2.77it/s]
Training 1/1 epoch (loss 1.5710): 97%|ββββββββββ| 1211/1250 [06:48<00:13, 2.81it/s]
Training 1/1 epoch (loss 1.4317): 97%|ββββββββββ| 1211/1250 [06:48<00:13, 2.81it/s]
Training 1/1 epoch (loss 1.4317): 97%|ββββββββββ| 1212/1250 [06:48<00:13, 2.89it/s]
Training 1/1 epoch (loss 1.4706): 97%|ββββββββββ| 1212/1250 [06:48<00:13, 2.89it/s]
Training 1/1 epoch (loss 1.4706): 97%|ββββββββββ| 1213/1250 [06:48<00:12, 2.90it/s]
Training 1/1 epoch (loss 1.5718): 97%|ββββββββββ| 1213/1250 [06:49<00:12, 2.90it/s]
Training 1/1 epoch (loss 1.5718): 97%|ββββββββββ| 1214/1250 [06:49<00:12, 2.89it/s]
Training 1/1 epoch (loss 1.5113): 97%|ββββββββββ| 1214/1250 [06:49<00:12, 2.89it/s]
Training 1/1 epoch (loss 1.5113): 97%|ββββββββββ| 1215/1250 [06:49<00:11, 2.94it/s]
Training 1/1 epoch (loss 1.5213): 97%|ββββββββββ| 1215/1250 [06:50<00:11, 2.94it/s]
Training 1/1 epoch (loss 1.5213): 97%|ββββββββββ| 1216/1250 [06:50<00:11, 2.90it/s]
Training 1/1 epoch (loss 1.4996): 97%|ββββββββββ| 1216/1250 [06:50<00:11, 2.90it/s]
Training 1/1 epoch (loss 1.4996): 97%|ββββββββββ| 1217/1250 [06:50<00:11, 2.96it/s]
Training 1/1 epoch (loss 1.4384): 97%|ββββββββββ| 1217/1250 [06:50<00:11, 2.96it/s]
Training 1/1 epoch (loss 1.4384): 97%|ββββββββββ| 1218/1250 [06:50<00:10, 3.04it/s]
Training 1/1 epoch (loss 1.5890): 97%|ββββββββββ| 1218/1250 [06:50<00:10, 3.04it/s]
Training 1/1 epoch (loss 1.5890): 98%|ββββββββββ| 1219/1250 [06:50<00:10, 2.99it/s]
Training 1/1 epoch (loss 1.3565): 98%|ββββββββββ| 1219/1250 [06:51<00:10, 2.99it/s]
Training 1/1 epoch (loss 1.3565): 98%|ββββββββββ| 1220/1250 [06:51<00:10, 2.98it/s]
Training 1/1 epoch (loss 1.5299): 98%|ββββββββββ| 1220/1250 [06:51<00:10, 2.98it/s]
Training 1/1 epoch (loss 1.5299): 98%|ββββββββββ| 1221/1250 [06:51<00:09, 3.06it/s]
Training 1/1 epoch (loss 1.4789): 98%|ββββββββββ| 1221/1250 [06:51<00:09, 3.06it/s]
Training 1/1 epoch (loss 1.4789): 98%|ββββββββββ| 1222/1250 [06:51<00:08, 3.15it/s]
Training 1/1 epoch (loss 1.5145): 98%|ββββββββββ| 1222/1250 [06:52<00:08, 3.15it/s]
Training 1/1 epoch (loss 1.5145): 98%|ββββββββββ| 1223/1250 [06:52<00:08, 3.15it/s]
Training 1/1 epoch (loss 1.4646): 98%|ββββββββββ| 1223/1250 [06:52<00:08, 3.15it/s]
Training 1/1 epoch (loss 1.4646): 98%|ββββββββββ| 1224/1250 [06:52<00:08, 3.07it/s]
Training 1/1 epoch (loss 1.4852): 98%|ββββββββββ| 1224/1250 [06:52<00:08, 3.07it/s]
Training 1/1 epoch (loss 1.4852): 98%|ββββββββββ| 1225/1250 [06:52<00:08, 3.12it/s]
Training 1/1 epoch (loss 1.6219): 98%|ββββββββββ| 1225/1250 [06:53<00:08, 3.12it/s]
Training 1/1 epoch (loss 1.6219): 98%|ββββββββββ| 1226/1250 [06:53<00:08, 2.83it/s]
Training 1/1 epoch (loss 1.4843): 98%|ββββββββββ| 1226/1250 [06:53<00:08, 2.83it/s]
Training 1/1 epoch (loss 1.4843): 98%|ββββββββββ| 1227/1250 [06:53<00:07, 2.94it/s]
Training 1/1 epoch (loss 1.3903): 98%|ββββββββββ| 1227/1250 [06:53<00:07, 2.94it/s]
Training 1/1 epoch (loss 1.3903): 98%|ββββββββββ| 1228/1250 [06:53<00:07, 3.05it/s]
Training 1/1 epoch (loss 1.6273): 98%|ββββββββββ| 1228/1250 [06:54<00:07, 3.05it/s]
Training 1/1 epoch (loss 1.6273): 98%|ββββββββββ| 1229/1250 [06:54<00:06, 3.11it/s]
Training 1/1 epoch (loss 1.4770): 98%|ββββββββββ| 1229/1250 [06:54<00:06, 3.11it/s]
Training 1/1 epoch (loss 1.4770): 98%|ββββββββββ| 1230/1250 [06:54<00:06, 3.07it/s]
Training 1/1 epoch (loss 1.5178): 98%|ββββββββββ| 1230/1250 [06:54<00:06, 3.07it/s]
Training 1/1 epoch (loss 1.5178): 98%|ββββββββββ| 1231/1250 [06:54<00:06, 3.00it/s]
Training 1/1 epoch (loss 1.5180): 98%|ββββββββββ| 1231/1250 [06:55<00:06, 3.00it/s]
Training 1/1 epoch (loss 1.5180): 99%|ββββββββββ| 1232/1250 [06:55<00:06, 2.73it/s]
Training 1/1 epoch (loss 1.4576): 99%|ββββββββββ| 1232/1250 [06:55<00:06, 2.73it/s]
Training 1/1 epoch (loss 1.4576): 99%|ββββββββββ| 1233/1250 [06:55<00:06, 2.75it/s]
Training 1/1 epoch (loss 1.5662): 99%|ββββββββββ| 1233/1250 [06:56<00:06, 2.75it/s]
Training 1/1 epoch (loss 1.5662): 99%|ββββββββββ| 1234/1250 [06:56<00:05, 2.78it/s]
Training 1/1 epoch (loss 1.4561): 99%|ββββββββββ| 1234/1250 [06:56<00:05, 2.78it/s]
Training 1/1 epoch (loss 1.4561): 99%|ββββββββββ| 1235/1250 [06:56<00:05, 2.82it/s]
Training 1/1 epoch (loss 1.5677): 99%|ββββββββββ| 1235/1250 [06:56<00:05, 2.82it/s]
Training 1/1 epoch (loss 1.5677): 99%|ββββββββββ| 1236/1250 [06:56<00:05, 2.77it/s]
Training 1/1 epoch (loss 1.4726): 99%|ββββββββββ| 1236/1250 [06:57<00:05, 2.77it/s]
Training 1/1 epoch (loss 1.4726): 99%|ββββββββββ| 1237/1250 [06:57<00:04, 2.73it/s]
Training 1/1 epoch (loss 1.5596): 99%|ββββββββββ| 1237/1250 [06:57<00:04, 2.73it/s]
Training 1/1 epoch (loss 1.5596): 99%|ββββββββββ| 1238/1250 [06:57<00:04, 2.80it/s]
Training 1/1 epoch (loss 1.3780): 99%|ββββββββββ| 1238/1250 [06:57<00:04, 2.80it/s]
Training 1/1 epoch (loss 1.3780): 99%|ββββββββββ| 1239/1250 [06:57<00:03, 2.92it/s]
Training 1/1 epoch (loss 1.6324): 99%|ββββββββββ| 1239/1250 [06:58<00:03, 2.92it/s]
Training 1/1 epoch (loss 1.6324): 99%|ββββββββββ| 1240/1250 [06:58<00:03, 2.93it/s]
Training 1/1 epoch (loss 1.4356): 99%|ββββββββββ| 1240/1250 [06:58<00:03, 2.93it/s]
Training 1/1 epoch (loss 1.4356): 99%|ββββββββββ| 1241/1250 [06:58<00:02, 3.02it/s]
Training 1/1 epoch (loss 1.4644): 99%|ββββββββββ| 1241/1250 [06:58<00:02, 3.02it/s]
Training 1/1 epoch (loss 1.4644): 99%|ββββββββββ| 1242/1250 [06:58<00:02, 3.04it/s]
Training 1/1 epoch (loss 1.3843): 99%|ββββββββββ| 1242/1250 [06:59<00:02, 3.04it/s]
Training 1/1 epoch (loss 1.3843): 99%|ββββββββββ| 1243/1250 [06:59<00:02, 3.07it/s]
Training 1/1 epoch (loss 1.5747): 99%|ββββββββββ| 1243/1250 [06:59<00:02, 3.07it/s]
Training 1/1 epoch (loss 1.5747): 100%|ββββββββββ| 1244/1250 [06:59<00:01, 3.09it/s]
Training 1/1 epoch (loss 1.5588): 100%|ββββββββββ| 1244/1250 [06:59<00:01, 3.09it/s]
Training 1/1 epoch (loss 1.5588): 100%|ββββββββββ| 1245/1250 [06:59<00:01, 3.14it/s]
Training 1/1 epoch (loss 1.5203): 100%|ββββββββββ| 1245/1250 [07:00<00:01, 3.14it/s]
Training 1/1 epoch (loss 1.5203): 100%|ββββββββββ| 1246/1250 [07:00<00:01, 3.07it/s]
Training 1/1 epoch (loss 1.3928): 100%|ββββββββββ| 1246/1250 [07:00<00:01, 3.07it/s]
Training 1/1 epoch (loss 1.3928): 100%|ββββββββββ| 1247/1250 [07:00<00:00, 3.14it/s]
Training 1/1 epoch (loss 1.4681): 100%|ββββββββββ| 1247/1250 [07:00<00:00, 3.14it/s]
Training 1/1 epoch (loss 1.4681): 100%|ββββββββββ| 1248/1250 [07:00<00:00, 2.90it/s]
Training 1/1 epoch (loss 1.5618): 100%|ββββββββββ| 1248/1250 [07:01<00:00, 2.90it/s]
Training 1/1 epoch (loss 1.5618): 100%|ββββββββββ| 1249/1250 [07:01<00:00, 2.76it/s]
Training 1/1 epoch (loss 1.5405): 100%|ββββββββββ| 1249/1250 [07:01<00:00, 2.76it/s]
Training 1/1 epoch (loss 1.5405): 100%|ββββββββββ| 1250/1250 [07:01<00:00, 2.95it/s]
Training 1/1 epoch (loss 1.5405): 100%|ββββββββββ| 1250/1250 [07:01<00:00, 2.97it/s] |