|
Training 1/1 epoch (loss 2.9431): 0%| | 0/1250 [00:05<?, ?it/s]
Training 1/1 epoch (loss 2.9431): 0%| | 1/1250 [00:05<1:53:28, 5.45s/it]
Training 1/1 epoch (loss 2.9469): 0%| | 1/1250 [00:07<1:53:28, 5.45s/it]
Training 1/1 epoch (loss 2.9469): 0%| | 2/1250 [00:07<1:06:30, 3.20s/it]
Training 1/1 epoch (loss 2.8404): 0%| | 2/1250 [00:07<1:06:30, 3.20s/it]
Training 1/1 epoch (loss 2.8404): 0%| | 3/1250 [00:07<39:16, 1.89s/it]
Training 1/1 epoch (loss 3.2492): 0%| | 3/1250 [00:07<39:16, 1.89s/it]
Training 1/1 epoch (loss 3.2492): 0%| | 4/1250 [00:07<26:19, 1.27s/it]
Training 1/1 epoch (loss 2.9856): 0%| | 4/1250 [00:08<26:19, 1.27s/it]
Training 1/1 epoch (loss 2.9856): 0%| | 5/1250 [00:08<19:52, 1.04it/s]
Training 1/1 epoch (loss 3.2084): 0%| | 5/1250 [00:08<19:52, 1.04it/s]
Training 1/1 epoch (loss 3.2084): 0%| | 6/1250 [00:08<15:21, 1.35it/s]
Training 1/1 epoch (loss 2.9686): 0%| | 6/1250 [00:08<15:21, 1.35it/s]
Training 1/1 epoch (loss 2.9686): 1%| | 7/1250 [00:08<12:48, 1.62it/s]
Training 1/1 epoch (loss 2.8229): 1%| | 7/1250 [00:09<12:48, 1.62it/s]
Training 1/1 epoch (loss 2.8229): 1%| | 8/1250 [00:09<11:25, 1.81it/s]
Training 1/1 epoch (loss 2.9485): 1%| | 8/1250 [00:09<11:25, 1.81it/s]
Training 1/1 epoch (loss 2.9485): 1%| | 9/1250 [00:09<10:04, 2.05it/s]
Training 1/1 epoch (loss 2.7161): 1%| | 9/1250 [00:09<10:04, 2.05it/s]
Training 1/1 epoch (loss 2.7161): 1%| | 10/1250 [00:09<08:55, 2.32it/s]
Training 1/1 epoch (loss 3.0050): 1%| | 10/1250 [00:10<08:55, 2.32it/s]
Training 1/1 epoch (loss 3.0050): 1%| | 11/1250 [00:10<08:40, 2.38it/s]
Training 1/1 epoch (loss 2.9368): 1%| | 11/1250 [00:10<08:40, 2.38it/s]
Training 1/1 epoch (loss 2.9368): 1%| | 12/1250 [00:10<08:01, 2.57it/s]
Training 1/1 epoch (loss 2.8264): 1%| | 12/1250 [00:10<08:01, 2.57it/s]
Training 1/1 epoch (loss 2.8264): 1%| | 13/1250 [00:10<07:33, 2.73it/s]
Training 1/1 epoch (loss 2.6714): 1%| | 13/1250 [00:11<07:33, 2.73it/s]
Training 1/1 epoch (loss 2.6714): 1%| | 14/1250 [00:11<07:07, 2.89it/s]
Training 1/1 epoch (loss 3.0056): 1%| | 14/1250 [00:11<07:07, 2.89it/s]
Training 1/1 epoch (loss 3.0056): 1%| | 15/1250 [00:11<07:06, 2.89it/s]
Training 1/1 epoch (loss 2.8702): 1%| | 15/1250 [00:11<07:06, 2.89it/s]
Training 1/1 epoch (loss 2.8702): 1%|▏ | 16/1250 [00:11<06:55, 2.97it/s]
Training 1/1 epoch (loss 2.9951): 1%|▏ | 16/1250 [00:12<06:55, 2.97it/s]
Training 1/1 epoch (loss 2.9951): 1%|▏ | 17/1250 [00:12<07:17, 2.82it/s]
Training 1/1 epoch (loss 3.0666): 1%|▏ | 17/1250 [00:12<07:17, 2.82it/s]
Training 1/1 epoch (loss 3.0666): 1%|▏ | 18/1250 [00:12<07:08, 2.87it/s]
Training 1/1 epoch (loss 2.9809): 1%|▏ | 18/1250 [00:12<07:08, 2.87it/s]
Training 1/1 epoch (loss 2.9809): 2%|▏ | 19/1250 [00:12<06:53, 2.98it/s]
Training 1/1 epoch (loss 2.9604): 2%|▏ | 19/1250 [00:13<06:53, 2.98it/s]
Training 1/1 epoch (loss 2.9604): 2%|▏ | 20/1250 [00:13<06:46, 3.02it/s]
Training 1/1 epoch (loss 2.8485): 2%|▏ | 20/1250 [00:13<06:46, 3.02it/s]
Training 1/1 epoch (loss 2.8485): 2%|▏ | 21/1250 [00:13<06:35, 3.11it/s]
Training 1/1 epoch (loss 3.1385): 2%|▏ | 21/1250 [00:13<06:35, 3.11it/s]
Training 1/1 epoch (loss 3.1385): 2%|▏ | 22/1250 [00:13<07:05, 2.89it/s]
Training 1/1 epoch (loss 3.0077): 2%|▏ | 22/1250 [00:14<07:05, 2.89it/s]
Training 1/1 epoch (loss 3.0077): 2%|▏ | 23/1250 [00:14<07:10, 2.85it/s]
Training 1/1 epoch (loss 2.5717): 2%|▏ | 23/1250 [00:14<07:10, 2.85it/s]
Training 1/1 epoch (loss 2.5717): 2%|▏ | 24/1250 [00:14<07:13, 2.82it/s]
Training 1/1 epoch (loss 2.8567): 2%|▏ | 24/1250 [00:14<07:13, 2.82it/s]
Training 1/1 epoch (loss 2.8567): 2%|▏ | 25/1250 [00:14<07:12, 2.83it/s]
Training 1/1 epoch (loss 2.7675): 2%|▏ | 25/1250 [00:15<07:12, 2.83it/s]
Training 1/1 epoch (loss 2.7675): 2%|▏ | 26/1250 [00:15<06:55, 2.94it/s]
Training 1/1 epoch (loss 3.0234): 2%|▏ | 26/1250 [00:15<06:55, 2.94it/s]
Training 1/1 epoch (loss 3.0234): 2%|▏ | 27/1250 [00:15<07:09, 2.85it/s]
Training 1/1 epoch (loss 3.0146): 2%|▏ | 27/1250 [00:15<07:09, 2.85it/s]
Training 1/1 epoch (loss 3.0146): 2%|▏ | 28/1250 [00:15<06:57, 2.92it/s]
Training 1/1 epoch (loss 2.9705): 2%|▏ | 28/1250 [00:16<06:57, 2.92it/s]
Training 1/1 epoch (loss 2.9705): 2%|▏ | 29/1250 [00:16<06:59, 2.91it/s]
Training 1/1 epoch (loss 2.9443): 2%|▏ | 29/1250 [00:16<06:59, 2.91it/s]
Training 1/1 epoch (loss 2.9443): 2%|▏ | 30/1250 [00:16<07:07, 2.85it/s]
Training 1/1 epoch (loss 3.1438): 2%|▏ | 30/1250 [00:17<07:07, 2.85it/s]
Training 1/1 epoch (loss 3.1438): 2%|▏ | 31/1250 [00:17<07:03, 2.88it/s]
Training 1/1 epoch (loss 2.9828): 2%|▏ | 31/1250 [00:17<07:03, 2.88it/s]
Training 1/1 epoch (loss 2.9828): 3%|▎ | 32/1250 [00:17<06:59, 2.90it/s]
Training 1/1 epoch (loss 3.0442): 3%|▎ | 32/1250 [00:17<06:59, 2.90it/s]
Training 1/1 epoch (loss 3.0442): 3%|▎ | 33/1250 [00:17<07:02, 2.88it/s]
Training 1/1 epoch (loss 2.7946): 3%|▎ | 33/1250 [00:18<07:02, 2.88it/s]
Training 1/1 epoch (loss 2.7946): 3%|▎ | 34/1250 [00:18<07:05, 2.86it/s]
Training 1/1 epoch (loss 2.9816): 3%|▎ | 34/1250 [00:18<07:05, 2.86it/s]
Training 1/1 epoch (loss 2.9816): 3%|▎ | 35/1250 [00:18<07:00, 2.89it/s]
Training 1/1 epoch (loss 2.9471): 3%|▎ | 35/1250 [00:18<07:00, 2.89it/s]
Training 1/1 epoch (loss 2.9471): 3%|▎ | 36/1250 [00:18<06:58, 2.90it/s]
Training 1/1 epoch (loss 2.7825): 3%|▎ | 36/1250 [00:19<06:58, 2.90it/s]
Training 1/1 epoch (loss 2.7825): 3%|▎ | 37/1250 [00:19<06:58, 2.90it/s]
Training 1/1 epoch (loss 2.7514): 3%|▎ | 37/1250 [00:19<06:58, 2.90it/s]
Training 1/1 epoch (loss 2.7514): 3%|▎ | 38/1250 [00:19<06:47, 2.97it/s]
Training 1/1 epoch (loss 3.0962): 3%|▎ | 38/1250 [00:19<06:47, 2.97it/s]
Training 1/1 epoch (loss 3.0962): 3%|▎ | 39/1250 [00:19<06:44, 2.99it/s]
Training 1/1 epoch (loss 2.8145): 3%|▎ | 39/1250 [00:20<06:44, 2.99it/s]
Training 1/1 epoch (loss 2.8145): 3%|▎ | 40/1250 [00:20<07:05, 2.84it/s]
Training 1/1 epoch (loss 2.9322): 3%|▎ | 40/1250 [00:20<07:05, 2.84it/s]
Training 1/1 epoch (loss 2.9322): 3%|▎ | 41/1250 [00:20<06:52, 2.93it/s]
Training 1/1 epoch (loss 2.7641): 3%|▎ | 41/1250 [00:20<06:52, 2.93it/s]
Training 1/1 epoch (loss 2.7641): 3%|▎ | 42/1250 [00:20<07:14, 2.78it/s]
Training 1/1 epoch (loss 2.8599): 3%|▎ | 42/1250 [00:21<07:14, 2.78it/s]
Training 1/1 epoch (loss 2.8599): 3%|▎ | 43/1250 [00:21<06:58, 2.88it/s]
Training 1/1 epoch (loss 2.9119): 3%|▎ | 43/1250 [00:21<06:58, 2.88it/s]
Training 1/1 epoch (loss 2.9119): 4%|▎ | 44/1250 [00:21<06:46, 2.96it/s]
Training 1/1 epoch (loss 2.8669): 4%|▎ | 44/1250 [00:21<06:46, 2.96it/s]
Training 1/1 epoch (loss 2.8669): 4%|▎ | 45/1250 [00:21<06:51, 2.93it/s]
Training 1/1 epoch (loss 2.7548): 4%|▎ | 45/1250 [00:22<06:51, 2.93it/s]
Training 1/1 epoch (loss 2.7548): 4%|▎ | 46/1250 [00:22<06:58, 2.88it/s]
Training 1/1 epoch (loss 2.9779): 4%|▎ | 46/1250 [00:22<06:58, 2.88it/s]
Training 1/1 epoch (loss 2.9779): 4%|▍ | 47/1250 [00:22<06:36, 3.03it/s]
Training 1/1 epoch (loss 2.9614): 4%|▍ | 47/1250 [00:22<06:36, 3.03it/s]
Training 1/1 epoch (loss 2.9614): 4%|▍ | 48/1250 [00:22<06:46, 2.95it/s]
Training 1/1 epoch (loss 2.7782): 4%|▍ | 48/1250 [00:23<06:46, 2.95it/s]
Training 1/1 epoch (loss 2.7782): 4%|▍ | 49/1250 [00:23<08:02, 2.49it/s]
Training 1/1 epoch (loss 2.9420): 4%|▍ | 49/1250 [00:23<08:02, 2.49it/s]
Training 1/1 epoch (loss 2.9420): 4%|▍ | 50/1250 [00:23<07:40, 2.60it/s]
Training 1/1 epoch (loss 2.8061): 4%|▍ | 50/1250 [00:24<07:40, 2.60it/s]
Training 1/1 epoch (loss 2.8061): 4%|▍ | 51/1250 [00:24<07:57, 2.51it/s]
Training 1/1 epoch (loss 2.8840): 4%|▍ | 51/1250 [00:24<07:57, 2.51it/s]
Training 1/1 epoch (loss 2.8840): 4%|▍ | 52/1250 [00:24<07:33, 2.64it/s]
Training 1/1 epoch (loss 2.9800): 4%|▍ | 52/1250 [00:24<07:33, 2.64it/s]
Training 1/1 epoch (loss 2.9800): 4%|▍ | 53/1250 [00:24<07:25, 2.69it/s]
Training 1/1 epoch (loss 2.7454): 4%|▍ | 53/1250 [00:25<07:25, 2.69it/s]
Training 1/1 epoch (loss 2.7454): 4%|▍ | 54/1250 [00:25<07:23, 2.70it/s]
Training 1/1 epoch (loss 2.8337): 4%|▍ | 54/1250 [00:25<07:23, 2.70it/s]
Training 1/1 epoch (loss 2.8337): 4%|▍ | 55/1250 [00:25<07:07, 2.80it/s]
Training 1/1 epoch (loss 2.8162): 4%|▍ | 55/1250 [00:25<07:07, 2.80it/s]
Training 1/1 epoch (loss 2.8162): 4%|▍ | 56/1250 [00:25<07:22, 2.70it/s]
Training 1/1 epoch (loss 2.7496): 4%|▍ | 56/1250 [00:26<07:22, 2.70it/s]
Training 1/1 epoch (loss 2.7496): 5%|▍ | 57/1250 [00:26<07:33, 2.63it/s]
Training 1/1 epoch (loss 2.6391): 5%|▍ | 57/1250 [00:26<07:33, 2.63it/s]
Training 1/1 epoch (loss 2.6391): 5%|▍ | 58/1250 [00:26<07:13, 2.75it/s]
Training 1/1 epoch (loss 2.7713): 5%|▍ | 58/1250 [00:27<07:13, 2.75it/s]
Training 1/1 epoch (loss 2.7713): 5%|▍ | 59/1250 [00:27<06:49, 2.91it/s]
Training 1/1 epoch (loss 2.9023): 5%|▍ | 59/1250 [00:27<06:49, 2.91it/s]
Training 1/1 epoch (loss 2.9023): 5%|▍ | 60/1250 [00:27<06:49, 2.91it/s]
Training 1/1 epoch (loss 2.7720): 5%|▍ | 60/1250 [00:27<06:49, 2.91it/s]
Training 1/1 epoch (loss 2.7720): 5%|▍ | 61/1250 [00:27<06:50, 2.89it/s]
Training 1/1 epoch (loss 2.8423): 5%|▍ | 61/1250 [00:28<06:50, 2.89it/s]
Training 1/1 epoch (loss 2.8423): 5%|▍ | 62/1250 [00:28<07:55, 2.50it/s]
Training 1/1 epoch (loss 2.9824): 5%|▍ | 62/1250 [00:28<07:55, 2.50it/s]
Training 1/1 epoch (loss 2.9824): 5%|▌ | 63/1250 [00:28<08:24, 2.35it/s]
Training 1/1 epoch (loss 3.1130): 5%|▌ | 63/1250 [00:29<08:24, 2.35it/s]
Training 1/1 epoch (loss 3.1130): 5%|▌ | 64/1250 [00:29<07:47, 2.54it/s]
Training 1/1 epoch (loss 2.8913): 5%|▌ | 64/1250 [00:29<07:47, 2.54it/s]
Training 1/1 epoch (loss 2.8913): 5%|▌ | 65/1250 [00:29<07:41, 2.57it/s]
Training 1/1 epoch (loss 2.7743): 5%|▌ | 65/1250 [00:29<07:41, 2.57it/s]
Training 1/1 epoch (loss 2.7743): 5%|▌ | 66/1250 [00:29<07:34, 2.61it/s]
Training 1/1 epoch (loss 2.8412): 5%|▌ | 66/1250 [00:30<07:34, 2.61it/s]
Training 1/1 epoch (loss 2.8412): 5%|▌ | 67/1250 [00:30<07:39, 2.57it/s]
Training 1/1 epoch (loss 2.9112): 5%|▌ | 67/1250 [00:30<07:39, 2.57it/s]
Training 1/1 epoch (loss 2.9112): 5%|▌ | 68/1250 [00:30<07:09, 2.75it/s]
Training 1/1 epoch (loss 2.7371): 5%|▌ | 68/1250 [00:30<07:09, 2.75it/s]
Training 1/1 epoch (loss 2.7371): 6%|▌ | 69/1250 [00:30<07:02, 2.80it/s]
Training 1/1 epoch (loss 2.9480): 6%|▌ | 69/1250 [00:31<07:02, 2.80it/s]
Training 1/1 epoch (loss 2.9480): 6%|▌ | 70/1250 [00:31<07:01, 2.80it/s]
Training 1/1 epoch (loss 2.7706): 6%|▌ | 70/1250 [00:31<07:01, 2.80it/s]
Training 1/1 epoch (loss 2.7706): 6%|▌ | 71/1250 [00:31<06:46, 2.90it/s]
Training 1/1 epoch (loss 2.7700): 6%|▌ | 71/1250 [00:31<06:46, 2.90it/s]
Training 1/1 epoch (loss 2.7700): 6%|▌ | 72/1250 [00:31<06:48, 2.88it/s]
Training 1/1 epoch (loss 2.9774): 6%|▌ | 72/1250 [00:32<06:48, 2.88it/s]
Training 1/1 epoch (loss 2.9774): 6%|▌ | 73/1250 [00:32<06:51, 2.86it/s]
Training 1/1 epoch (loss 2.9490): 6%|▌ | 73/1250 [00:32<06:51, 2.86it/s]
Training 1/1 epoch (loss 2.9490): 6%|▌ | 74/1250 [00:32<06:38, 2.95it/s]
Training 1/1 epoch (loss 2.8017): 6%|▌ | 74/1250 [00:32<06:38, 2.95it/s]
Training 1/1 epoch (loss 2.8017): 6%|▌ | 75/1250 [00:32<06:37, 2.96it/s]
Training 1/1 epoch (loss 2.8868): 6%|▌ | 75/1250 [00:33<06:37, 2.96it/s]
Training 1/1 epoch (loss 2.8868): 6%|▌ | 76/1250 [00:33<06:33, 2.98it/s]
Training 1/1 epoch (loss 2.7817): 6%|▌ | 76/1250 [00:33<06:33, 2.98it/s]
Training 1/1 epoch (loss 2.7817): 6%|▌ | 77/1250 [00:33<06:38, 2.94it/s]
Training 1/1 epoch (loss 2.9865): 6%|▌ | 77/1250 [00:33<06:38, 2.94it/s]
Training 1/1 epoch (loss 2.9865): 6%|▌ | 78/1250 [00:33<06:54, 2.83it/s]
Training 1/1 epoch (loss 2.8642): 6%|▌ | 78/1250 [00:34<06:54, 2.83it/s]
Training 1/1 epoch (loss 2.8642): 6%|▋ | 79/1250 [00:34<07:08, 2.73it/s]
Training 1/1 epoch (loss 2.8344): 6%|▋ | 79/1250 [00:34<07:08, 2.73it/s]
Training 1/1 epoch (loss 2.8344): 6%|▋ | 80/1250 [00:34<06:53, 2.83it/s]
Training 1/1 epoch (loss 2.9445): 6%|▋ | 80/1250 [00:34<06:53, 2.83it/s]
Training 1/1 epoch (loss 2.9445): 6%|▋ | 81/1250 [00:34<06:46, 2.88it/s]
Training 1/1 epoch (loss 2.9105): 6%|▋ | 81/1250 [00:35<06:46, 2.88it/s]
Training 1/1 epoch (loss 2.9105): 7%|▋ | 82/1250 [00:35<06:35, 2.95it/s]
Training 1/1 epoch (loss 2.9879): 7%|▋ | 82/1250 [00:35<06:35, 2.95it/s]
Training 1/1 epoch (loss 2.9879): 7%|▋ | 83/1250 [00:35<06:23, 3.04it/s]
Training 1/1 epoch (loss 2.8390): 7%|▋ | 83/1250 [00:35<06:23, 3.04it/s]
Training 1/1 epoch (loss 2.8390): 7%|▋ | 84/1250 [00:35<06:29, 2.99it/s]
Training 1/1 epoch (loss 2.7115): 7%|▋ | 84/1250 [00:36<06:29, 2.99it/s]
Training 1/1 epoch (loss 2.7115): 7%|▋ | 85/1250 [00:36<06:29, 2.99it/s]
Training 1/1 epoch (loss 2.8002): 7%|▋ | 85/1250 [00:36<06:29, 2.99it/s]
Training 1/1 epoch (loss 2.8002): 7%|▋ | 86/1250 [00:36<06:41, 2.90it/s]
Training 1/1 epoch (loss 2.9690): 7%|▋ | 86/1250 [00:37<06:41, 2.90it/s]
Training 1/1 epoch (loss 2.9690): 7%|▋ | 87/1250 [00:37<06:43, 2.88it/s]
Training 1/1 epoch (loss 2.7454): 7%|▋ | 87/1250 [00:37<06:43, 2.88it/s]
Training 1/1 epoch (loss 2.7454): 7%|▋ | 88/1250 [00:37<06:43, 2.88it/s]
Training 1/1 epoch (loss 2.8628): 7%|▋ | 88/1250 [00:37<06:43, 2.88it/s]
Training 1/1 epoch (loss 2.8628): 7%|▋ | 89/1250 [00:37<06:38, 2.91it/s]
Training 1/1 epoch (loss 3.0633): 7%|▋ | 89/1250 [00:38<06:38, 2.91it/s]
Training 1/1 epoch (loss 3.0633): 7%|▋ | 90/1250 [00:38<06:43, 2.88it/s]
Training 1/1 epoch (loss 2.7011): 7%|▋ | 90/1250 [00:38<06:43, 2.88it/s]
Training 1/1 epoch (loss 2.7011): 7%|▋ | 91/1250 [00:38<06:39, 2.90it/s]
Training 1/1 epoch (loss 2.7780): 7%|▋ | 91/1250 [00:38<06:39, 2.90it/s]
Training 1/1 epoch (loss 2.7780): 7%|▋ | 92/1250 [00:38<06:33, 2.94it/s]
Training 1/1 epoch (loss 2.7543): 7%|▋ | 92/1250 [00:39<06:33, 2.94it/s]
Training 1/1 epoch (loss 2.7543): 7%|▋ | 93/1250 [00:39<06:49, 2.82it/s]
Training 1/1 epoch (loss 2.7619): 7%|▋ | 93/1250 [00:39<06:49, 2.82it/s]
Training 1/1 epoch (loss 2.7619): 8%|▊ | 94/1250 [00:39<07:15, 2.66it/s]
Training 1/1 epoch (loss 2.8318): 8%|▊ | 94/1250 [00:39<07:15, 2.66it/s]
Training 1/1 epoch (loss 2.8318): 8%|▊ | 95/1250 [00:39<07:34, 2.54it/s]
Training 1/1 epoch (loss 2.8768): 8%|▊ | 95/1250 [00:40<07:34, 2.54it/s]
Training 1/1 epoch (loss 2.8768): 8%|▊ | 96/1250 [00:40<08:13, 2.34it/s]
Training 1/1 epoch (loss 2.7662): 8%|▊ | 96/1250 [00:40<08:13, 2.34it/s]
Training 1/1 epoch (loss 2.7662): 8%|▊ | 97/1250 [00:40<07:58, 2.41it/s]
Training 1/1 epoch (loss 2.7607): 8%|▊ | 97/1250 [00:41<07:58, 2.41it/s]
Training 1/1 epoch (loss 2.7607): 8%|▊ | 98/1250 [00:41<07:19, 2.62it/s]
Training 1/1 epoch (loss 2.8731): 8%|▊ | 98/1250 [00:41<07:19, 2.62it/s]
Training 1/1 epoch (loss 2.8731): 8%|▊ | 99/1250 [00:41<06:53, 2.79it/s]
Training 1/1 epoch (loss 3.0974): 8%|▊ | 99/1250 [00:41<06:53, 2.79it/s]
Training 1/1 epoch (loss 3.0974): 8%|▊ | 100/1250 [00:41<07:10, 2.67it/s]
Training 1/1 epoch (loss 2.8144): 8%|▊ | 100/1250 [00:42<07:10, 2.67it/s]
Training 1/1 epoch (loss 2.8144): 8%|▊ | 101/1250 [00:42<06:58, 2.75it/s]
Training 1/1 epoch (loss 2.7572): 8%|▊ | 101/1250 [00:42<06:58, 2.75it/s]
Training 1/1 epoch (loss 2.7572): 8%|▊ | 102/1250 [00:42<06:49, 2.81it/s]
Training 1/1 epoch (loss 2.8391): 8%|▊ | 102/1250 [00:42<06:49, 2.81it/s]
Training 1/1 epoch (loss 2.8391): 8%|▊ | 103/1250 [00:42<06:30, 2.94it/s]
Training 1/1 epoch (loss 2.9753): 8%|▊ | 103/1250 [00:43<06:30, 2.94it/s]
Training 1/1 epoch (loss 2.9753): 8%|▊ | 104/1250 [00:43<06:33, 2.91it/s]
Training 1/1 epoch (loss 2.8194): 8%|▊ | 104/1250 [00:43<06:33, 2.91it/s]
Training 1/1 epoch (loss 2.8194): 8%|▊ | 105/1250 [00:43<06:20, 3.01it/s]
Training 1/1 epoch (loss 2.8027): 8%|▊ | 105/1250 [00:43<06:20, 3.01it/s]
Training 1/1 epoch (loss 2.8027): 8%|▊ | 106/1250 [00:43<06:52, 2.78it/s]
Training 1/1 epoch (loss 2.8189): 8%|▊ | 106/1250 [00:44<06:52, 2.78it/s]
Training 1/1 epoch (loss 2.8189): 9%|▊ | 107/1250 [00:44<06:44, 2.82it/s]
Training 1/1 epoch (loss 2.9953): 9%|▊ | 107/1250 [00:44<06:44, 2.82it/s]
Training 1/1 epoch (loss 2.9953): 9%|▊ | 108/1250 [00:44<06:34, 2.89it/s]
Training 1/1 epoch (loss 2.9417): 9%|▊ | 108/1250 [00:44<06:34, 2.89it/s]
Training 1/1 epoch (loss 2.9417): 9%|▊ | 109/1250 [00:44<06:20, 3.00it/s]
Training 1/1 epoch (loss 2.8546): 9%|▊ | 109/1250 [00:45<06:20, 3.00it/s]
Training 1/1 epoch (loss 2.8546): 9%|▉ | 110/1250 [00:45<06:14, 3.04it/s]
Training 1/1 epoch (loss 2.8867): 9%|▉ | 110/1250 [00:45<06:14, 3.04it/s]
Training 1/1 epoch (loss 2.8867): 9%|▉ | 111/1250 [00:45<06:16, 3.02it/s]
Training 1/1 epoch (loss 2.7350): 9%|▉ | 111/1250 [00:45<06:16, 3.02it/s]
Training 1/1 epoch (loss 2.7350): 9%|▉ | 112/1250 [00:45<06:17, 3.01it/s]
Training 1/1 epoch (loss 2.7250): 9%|▉ | 112/1250 [00:46<06:17, 3.01it/s]
Training 1/1 epoch (loss 2.7250): 9%|▉ | 113/1250 [00:46<06:31, 2.90it/s]
Training 1/1 epoch (loss 2.6639): 9%|▉ | 113/1250 [00:46<06:31, 2.90it/s]
Training 1/1 epoch (loss 2.6639): 9%|▉ | 114/1250 [00:46<06:35, 2.87it/s]
Training 1/1 epoch (loss 2.7430): 9%|▉ | 114/1250 [00:46<06:35, 2.87it/s]
Training 1/1 epoch (loss 2.7430): 9%|▉ | 115/1250 [00:46<06:32, 2.89it/s]
Training 1/1 epoch (loss 2.8149): 9%|▉ | 115/1250 [00:47<06:32, 2.89it/s]
Training 1/1 epoch (loss 2.8149): 9%|▉ | 116/1250 [00:47<06:16, 3.01it/s]
Training 1/1 epoch (loss 2.7659): 9%|▉ | 116/1250 [00:47<06:16, 3.01it/s]
Training 1/1 epoch (loss 2.7659): 9%|▉ | 117/1250 [00:47<06:18, 2.99it/s]
Training 1/1 epoch (loss 2.7451): 9%|▉ | 117/1250 [00:47<06:18, 2.99it/s]
Training 1/1 epoch (loss 2.7451): 9%|▉ | 118/1250 [00:47<06:18, 2.99it/s]
Training 1/1 epoch (loss 2.8699): 9%|▉ | 118/1250 [00:48<06:18, 2.99it/s]
Training 1/1 epoch (loss 2.8699): 10%|▉ | 119/1250 [00:48<06:24, 2.94it/s]
Training 1/1 epoch (loss 2.9826): 10%|▉ | 119/1250 [00:48<06:24, 2.94it/s]
Training 1/1 epoch (loss 2.9826): 10%|▉ | 120/1250 [00:48<06:40, 2.82it/s]
Training 1/1 epoch (loss 2.9322): 10%|▉ | 120/1250 [00:49<06:40, 2.82it/s]
Training 1/1 epoch (loss 2.9322): 10%|▉ | 121/1250 [00:49<06:31, 2.89it/s]
Training 1/1 epoch (loss 2.7240): 10%|▉ | 121/1250 [00:49<06:31, 2.89it/s]
Training 1/1 epoch (loss 2.7240): 10%|▉ | 122/1250 [00:49<06:16, 2.99it/s]
Training 1/1 epoch (loss 2.8411): 10%|▉ | 122/1250 [00:49<06:16, 2.99it/s]
Training 1/1 epoch (loss 2.8411): 10%|▉ | 123/1250 [00:49<06:11, 3.04it/s]
Training 1/1 epoch (loss 2.6664): 10%|▉ | 123/1250 [00:49<06:11, 3.04it/s]
Training 1/1 epoch (loss 2.6664): 10%|▉ | 124/1250 [00:49<05:57, 3.15it/s]
Training 1/1 epoch (loss 2.5935): 10%|▉ | 124/1250 [00:50<05:57, 3.15it/s]
Training 1/1 epoch (loss 2.5935): 10%|█ | 125/1250 [00:50<06:17, 2.98it/s]
Training 1/1 epoch (loss 2.8362): 10%|█ | 125/1250 [00:50<06:17, 2.98it/s]
Training 1/1 epoch (loss 2.8362): 10%|█ | 126/1250 [00:50<06:14, 3.00it/s]
Training 1/1 epoch (loss 2.6994): 10%|█ | 126/1250 [00:50<06:14, 3.00it/s]
Training 1/1 epoch (loss 2.6994): 10%|█ | 127/1250 [00:50<06:10, 3.03it/s]
Training 1/1 epoch (loss 2.8743): 10%|█ | 127/1250 [00:51<06:10, 3.03it/s]
Training 1/1 epoch (loss 2.8743): 10%|█ | 128/1250 [00:51<06:03, 3.09it/s]
Training 1/1 epoch (loss 2.7251): 10%|█ | 128/1250 [00:51<06:03, 3.09it/s]
Training 1/1 epoch (loss 2.7251): 10%|█ | 129/1250 [00:51<05:55, 3.16it/s]
Training 1/1 epoch (loss 2.7631): 10%|█ | 129/1250 [00:51<05:55, 3.16it/s]
Training 1/1 epoch (loss 2.7631): 10%|█ | 130/1250 [00:51<06:10, 3.03it/s]
Training 1/1 epoch (loss 2.9168): 10%|█ | 130/1250 [00:52<06:10, 3.03it/s]
Training 1/1 epoch (loss 2.9168): 10%|█ | 131/1250 [00:52<06:37, 2.81it/s]
Training 1/1 epoch (loss 2.9141): 10%|█ | 131/1250 [00:52<06:37, 2.81it/s]
Training 1/1 epoch (loss 2.9141): 11%|█ | 132/1250 [00:52<06:23, 2.92it/s]
Training 1/1 epoch (loss 2.8750): 11%|█ | 132/1250 [00:52<06:23, 2.92it/s]
Training 1/1 epoch (loss 2.8750): 11%|█ | 133/1250 [00:52<06:16, 2.97it/s]
Training 1/1 epoch (loss 2.9954): 11%|█ | 133/1250 [00:53<06:16, 2.97it/s]
Training 1/1 epoch (loss 2.9954): 11%|█ | 134/1250 [00:53<07:50, 2.37it/s]
Training 1/1 epoch (loss 2.8709): 11%|█ | 134/1250 [00:54<07:50, 2.37it/s]
Training 1/1 epoch (loss 2.8709): 11%|█ | 135/1250 [00:54<07:50, 2.37it/s]
Training 1/1 epoch (loss 2.8426): 11%|█ | 135/1250 [00:54<07:50, 2.37it/s]
Training 1/1 epoch (loss 2.8426): 11%|█ | 136/1250 [00:54<07:35, 2.45it/s]
Training 1/1 epoch (loss 2.7097): 11%|█ | 136/1250 [00:54<07:35, 2.45it/s]
Training 1/1 epoch (loss 2.7097): 11%|█ | 137/1250 [00:54<07:18, 2.54it/s]
Training 1/1 epoch (loss 2.7272): 11%|█ | 137/1250 [00:55<07:18, 2.54it/s]
Training 1/1 epoch (loss 2.7272): 11%|█ | 138/1250 [00:55<06:51, 2.70it/s]
Training 1/1 epoch (loss 2.9721): 11%|█ | 138/1250 [00:55<06:51, 2.70it/s]
Training 1/1 epoch (loss 2.9721): 11%|█ | 139/1250 [00:55<06:36, 2.80it/s]
Training 1/1 epoch (loss 2.8974): 11%|█ | 139/1250 [00:55<06:36, 2.80it/s]
Training 1/1 epoch (loss 2.8974): 11%|█ | 140/1250 [00:55<06:14, 2.97it/s]
Training 1/1 epoch (loss 2.7571): 11%|█ | 140/1250 [00:56<06:14, 2.97it/s]
Training 1/1 epoch (loss 2.7571): 11%|█▏ | 141/1250 [00:56<06:32, 2.83it/s]
Training 1/1 epoch (loss 2.9425): 11%|█▏ | 141/1250 [00:56<06:32, 2.83it/s]
Training 1/1 epoch (loss 2.9425): 11%|█▏ | 142/1250 [00:56<06:29, 2.84it/s]
Training 1/1 epoch (loss 2.7775): 11%|█▏ | 142/1250 [00:56<06:29, 2.84it/s]
Training 1/1 epoch (loss 2.7775): 11%|█▏ | 143/1250 [00:56<06:21, 2.90it/s]
Training 1/1 epoch (loss 2.9933): 11%|█▏ | 143/1250 [00:57<06:21, 2.90it/s]
Training 1/1 epoch (loss 2.9933): 12%|█▏ | 144/1250 [00:57<06:21, 2.90it/s]
Training 1/1 epoch (loss 2.8607): 12%|█▏ | 144/1250 [00:57<06:21, 2.90it/s]
Training 1/1 epoch (loss 2.8607): 12%|█▏ | 145/1250 [00:57<06:27, 2.85it/s]
Training 1/1 epoch (loss 2.9262): 12%|█▏ | 145/1250 [00:57<06:27, 2.85it/s]
Training 1/1 epoch (loss 2.9262): 12%|█▏ | 146/1250 [00:57<06:20, 2.90it/s]
Training 1/1 epoch (loss 2.7681): 12%|█▏ | 146/1250 [00:58<06:20, 2.90it/s]
Training 1/1 epoch (loss 2.7681): 12%|█▏ | 147/1250 [00:58<07:06, 2.58it/s]
Training 1/1 epoch (loss 2.9255): 12%|█▏ | 147/1250 [00:58<07:06, 2.58it/s]
Training 1/1 epoch (loss 2.9255): 12%|█▏ | 148/1250 [00:58<07:47, 2.36it/s]
Training 1/1 epoch (loss 2.7794): 12%|█▏ | 148/1250 [00:59<07:47, 2.36it/s]
Training 1/1 epoch (loss 2.7794): 12%|█▏ | 149/1250 [00:59<07:04, 2.59it/s]
Training 1/1 epoch (loss 3.0724): 12%|█▏ | 149/1250 [00:59<07:04, 2.59it/s]
Training 1/1 epoch (loss 3.0724): 12%|█▏ | 150/1250 [00:59<07:02, 2.61it/s]
Training 1/1 epoch (loss 2.8239): 12%|█▏ | 150/1250 [00:59<07:02, 2.61it/s]
Training 1/1 epoch (loss 2.8239): 12%|█▏ | 151/1250 [00:59<06:41, 2.74it/s]
Training 1/1 epoch (loss 2.8889): 12%|█▏ | 151/1250 [01:00<06:41, 2.74it/s]
Training 1/1 epoch (loss 2.8889): 12%|█▏ | 152/1250 [01:00<07:23, 2.48it/s]
Training 1/1 epoch (loss 2.9986): 12%|█▏ | 152/1250 [01:00<07:23, 2.48it/s]
Training 1/1 epoch (loss 2.9986): 12%|█▏ | 153/1250 [01:00<07:13, 2.53it/s]
Training 1/1 epoch (loss 2.9800): 12%|█▏ | 153/1250 [01:00<07:13, 2.53it/s]
Training 1/1 epoch (loss 2.9800): 12%|█▏ | 154/1250 [01:00<06:37, 2.75it/s]
Training 1/1 epoch (loss 2.8152): 12%|█▏ | 154/1250 [01:01<06:37, 2.75it/s]
Training 1/1 epoch (loss 2.8152): 12%|█▏ | 155/1250 [01:01<06:18, 2.89it/s]
Training 1/1 epoch (loss 3.0026): 12%|█▏ | 155/1250 [01:01<06:18, 2.89it/s]
Training 1/1 epoch (loss 3.0026): 12%|█▏ | 156/1250 [01:01<06:10, 2.96it/s]
Training 1/1 epoch (loss 2.9617): 12%|█▏ | 156/1250 [01:01<06:10, 2.96it/s]
Training 1/1 epoch (loss 2.9617): 13%|█▎ | 157/1250 [01:01<05:53, 3.09it/s]
Training 1/1 epoch (loss 2.9243): 13%|█▎ | 157/1250 [01:02<05:53, 3.09it/s]
Training 1/1 epoch (loss 2.9243): 13%|█▎ | 158/1250 [01:02<05:59, 3.04it/s]
Training 1/1 epoch (loss 3.1867): 13%|█▎ | 158/1250 [01:02<05:59, 3.04it/s]
Training 1/1 epoch (loss 3.1867): 13%|█▎ | 159/1250 [01:02<06:05, 2.99it/s]
Training 1/1 epoch (loss 2.7637): 13%|█▎ | 159/1250 [01:02<06:05, 2.99it/s]
Training 1/1 epoch (loss 2.7637): 13%|█▎ | 160/1250 [01:02<06:18, 2.88it/s]
Training 1/1 epoch (loss 2.8238): 13%|█▎ | 160/1250 [01:03<06:18, 2.88it/s]
Training 1/1 epoch (loss 2.8238): 13%|█▎ | 161/1250 [01:03<06:09, 2.95it/s]
Training 1/1 epoch (loss 2.7362): 13%|█▎ | 161/1250 [01:03<06:09, 2.95it/s]
Training 1/1 epoch (loss 2.7362): 13%|█▎ | 162/1250 [01:03<05:56, 3.05it/s]
Training 1/1 epoch (loss 2.9208): 13%|█▎ | 162/1250 [01:03<05:56, 3.05it/s]
Training 1/1 epoch (loss 2.9208): 13%|█▎ | 163/1250 [01:03<06:06, 2.97it/s]
Training 1/1 epoch (loss 2.7017): 13%|█▎ | 163/1250 [01:04<06:06, 2.97it/s]
Training 1/1 epoch (loss 2.7017): 13%|█▎ | 164/1250 [01:04<06:27, 2.80it/s]
Training 1/1 epoch (loss 2.8445): 13%|█▎ | 164/1250 [01:04<06:27, 2.80it/s]
Training 1/1 epoch (loss 2.8445): 13%|█▎ | 165/1250 [01:04<06:26, 2.81it/s]
Training 1/1 epoch (loss 2.8219): 13%|█▎ | 165/1250 [01:04<06:26, 2.81it/s]
Training 1/1 epoch (loss 2.8219): 13%|█▎ | 166/1250 [01:04<06:10, 2.93it/s]
Training 1/1 epoch (loss 2.7584): 13%|█▎ | 166/1250 [01:05<06:10, 2.93it/s]
Training 1/1 epoch (loss 2.7584): 13%|█▎ | 167/1250 [01:05<05:57, 3.03it/s]
Training 1/1 epoch (loss 2.7869): 13%|█▎ | 167/1250 [01:05<05:57, 3.03it/s]
Training 1/1 epoch (loss 2.7869): 13%|█▎ | 168/1250 [01:05<06:05, 2.96it/s]
Training 1/1 epoch (loss 2.8387): 13%|█▎ | 168/1250 [01:05<06:05, 2.96it/s]
Training 1/1 epoch (loss 2.8387): 14%|█▎ | 169/1250 [01:05<06:08, 2.93it/s]
Training 1/1 epoch (loss 2.9099): 14%|█▎ | 169/1250 [01:06<06:08, 2.93it/s]
Training 1/1 epoch (loss 2.9099): 14%|█▎ | 170/1250 [01:06<05:57, 3.02it/s]
Training 1/1 epoch (loss 2.8050): 14%|█▎ | 170/1250 [01:06<05:57, 3.02it/s]
Training 1/1 epoch (loss 2.8050): 14%|█▎ | 171/1250 [01:06<06:11, 2.90it/s]
Training 1/1 epoch (loss 2.7045): 14%|█▎ | 171/1250 [01:06<06:11, 2.90it/s]
Training 1/1 epoch (loss 2.7045): 14%|█▍ | 172/1250 [01:06<06:01, 2.98it/s]
Training 1/1 epoch (loss 2.6318): 14%|█▍ | 172/1250 [01:07<06:01, 2.98it/s]
Training 1/1 epoch (loss 2.6318): 14%|█▍ | 173/1250 [01:07<05:53, 3.05it/s]
Training 1/1 epoch (loss 2.9167): 14%|█▍ | 173/1250 [01:07<05:53, 3.05it/s]
Training 1/1 epoch (loss 2.9167): 14%|█▍ | 174/1250 [01:07<06:00, 2.99it/s]
Training 1/1 epoch (loss 3.0358): 14%|█▍ | 174/1250 [01:07<06:00, 2.99it/s]
Training 1/1 epoch (loss 3.0358): 14%|█▍ | 175/1250 [01:07<05:59, 2.99it/s]
Training 1/1 epoch (loss 2.7926): 14%|█▍ | 175/1250 [01:08<05:59, 2.99it/s]
Training 1/1 epoch (loss 2.7926): 14%|█▍ | 176/1250 [01:08<06:05, 2.94it/s]
Training 1/1 epoch (loss 3.0718): 14%|█▍ | 176/1250 [01:08<06:05, 2.94it/s]
Training 1/1 epoch (loss 3.0718): 14%|█▍ | 177/1250 [01:08<06:10, 2.89it/s]
Training 1/1 epoch (loss 2.9168): 14%|█▍ | 177/1250 [01:09<06:10, 2.89it/s]
Training 1/1 epoch (loss 2.9168): 14%|█▍ | 178/1250 [01:09<06:25, 2.78it/s]
Training 1/1 epoch (loss 3.0796): 14%|█▍ | 178/1250 [01:09<06:25, 2.78it/s]
Training 1/1 epoch (loss 3.0796): 14%|█▍ | 179/1250 [01:09<06:03, 2.95it/s]
Training 1/1 epoch (loss 2.9889): 14%|█▍ | 179/1250 [01:09<06:03, 2.95it/s]
Training 1/1 epoch (loss 2.9889): 14%|█▍ | 180/1250 [01:09<05:57, 2.99it/s]
Training 1/1 epoch (loss 2.9761): 14%|█▍ | 180/1250 [01:09<05:57, 2.99it/s]
Training 1/1 epoch (loss 2.9761): 14%|█▍ | 181/1250 [01:09<05:46, 3.09it/s]
Training 1/1 epoch (loss 2.8659): 14%|█▍ | 181/1250 [01:10<05:46, 3.09it/s]
Training 1/1 epoch (loss 2.8659): 15%|█▍ | 182/1250 [01:10<05:44, 3.10it/s]
Training 1/1 epoch (loss 2.9024): 15%|█▍ | 182/1250 [01:10<05:44, 3.10it/s]
Training 1/1 epoch (loss 2.9024): 15%|█▍ | 183/1250 [01:10<06:12, 2.86it/s]
Training 1/1 epoch (loss 2.9271): 15%|█▍ | 183/1250 [01:11<06:12, 2.86it/s]
Training 1/1 epoch (loss 2.9271): 15%|█▍ | 184/1250 [01:11<06:02, 2.94it/s]
Training 1/1 epoch (loss 2.8184): 15%|█▍ | 184/1250 [01:11<06:02, 2.94it/s]
Training 1/1 epoch (loss 2.8184): 15%|█▍ | 185/1250 [01:11<05:56, 2.99it/s]
Training 1/1 epoch (loss 2.8956): 15%|█▍ | 185/1250 [01:11<05:56, 2.99it/s]
Training 1/1 epoch (loss 2.8956): 15%|█▍ | 186/1250 [01:11<05:51, 3.03it/s]
Training 1/1 epoch (loss 2.9444): 15%|█▍ | 186/1250 [01:11<05:51, 3.03it/s]
Training 1/1 epoch (loss 2.9444): 15%|█▍ | 187/1250 [01:11<05:43, 3.09it/s]
Training 1/1 epoch (loss 2.6345): 15%|█▍ | 187/1250 [01:12<05:43, 3.09it/s]
Training 1/1 epoch (loss 2.6345): 15%|█▌ | 188/1250 [01:12<05:52, 3.01it/s]
Training 1/1 epoch (loss 2.7368): 15%|█▌ | 188/1250 [01:12<05:52, 3.01it/s]
Training 1/1 epoch (loss 2.7368): 15%|█▌ | 189/1250 [01:12<05:55, 2.99it/s]
Training 1/1 epoch (loss 2.7126): 15%|█▌ | 189/1250 [01:12<05:55, 2.99it/s]
Training 1/1 epoch (loss 2.7126): 15%|█▌ | 190/1250 [01:12<05:44, 3.08it/s]
Training 1/1 epoch (loss 2.9198): 15%|█▌ | 190/1250 [01:13<05:44, 3.08it/s]
Training 1/1 epoch (loss 2.9198): 15%|█▌ | 191/1250 [01:13<06:03, 2.92it/s]
Training 1/1 epoch (loss 3.0116): 15%|█▌ | 191/1250 [01:13<06:03, 2.92it/s]
Training 1/1 epoch (loss 3.0116): 15%|█▌ | 192/1250 [01:13<05:53, 2.99it/s]
Training 1/1 epoch (loss 2.8323): 15%|█▌ | 192/1250 [01:14<05:53, 2.99it/s]
Training 1/1 epoch (loss 2.8323): 15%|█▌ | 193/1250 [01:14<06:04, 2.90it/s]
Training 1/1 epoch (loss 2.9665): 15%|█▌ | 193/1250 [01:14<06:04, 2.90it/s]
Training 1/1 epoch (loss 2.9665): 16%|█▌ | 194/1250 [01:14<06:17, 2.79it/s]
Training 1/1 epoch (loss 2.8471): 16%|█▌ | 194/1250 [01:14<06:17, 2.79it/s]
Training 1/1 epoch (loss 2.8471): 16%|█▌ | 195/1250 [01:14<06:26, 2.73it/s]
Training 1/1 epoch (loss 2.7015): 16%|█▌ | 195/1250 [01:15<06:26, 2.73it/s]
Training 1/1 epoch (loss 2.7015): 16%|█▌ | 196/1250 [01:15<06:20, 2.77it/s]
Training 1/1 epoch (loss 2.8216): 16%|█▌ | 196/1250 [01:15<06:20, 2.77it/s]
Training 1/1 epoch (loss 2.8216): 16%|█▌ | 197/1250 [01:15<06:09, 2.85it/s]
Training 1/1 epoch (loss 2.9587): 16%|█▌ | 197/1250 [01:15<06:09, 2.85it/s]
Training 1/1 epoch (loss 2.9587): 16%|█▌ | 198/1250 [01:15<05:59, 2.93it/s]
Training 1/1 epoch (loss 2.9185): 16%|█▌ | 198/1250 [01:16<05:59, 2.93it/s]
Training 1/1 epoch (loss 2.9185): 16%|█▌ | 199/1250 [01:16<05:46, 3.04it/s]
Training 1/1 epoch (loss 2.8770): 16%|█▌ | 199/1250 [01:16<05:46, 3.04it/s]
Training 1/1 epoch (loss 2.8770): 16%|█▌ | 200/1250 [01:16<06:10, 2.84it/s]
Training 1/1 epoch (loss 2.9282): 16%|█▌ | 200/1250 [01:16<06:10, 2.84it/s]
Training 1/1 epoch (loss 2.9282): 16%|█▌ | 201/1250 [01:16<06:06, 2.87it/s]
Training 1/1 epoch (loss 2.7861): 16%|█▌ | 201/1250 [01:17<06:06, 2.87it/s]
Training 1/1 epoch (loss 2.7861): 16%|█▌ | 202/1250 [01:17<05:57, 2.93it/s]
Training 1/1 epoch (loss 2.7526): 16%|█▌ | 202/1250 [01:17<05:57, 2.93it/s]
Training 1/1 epoch (loss 2.7526): 16%|█▌ | 203/1250 [01:17<05:52, 2.97it/s]
Training 1/1 epoch (loss 2.7059): 16%|█▌ | 203/1250 [01:17<05:52, 2.97it/s]
Training 1/1 epoch (loss 2.7059): 16%|█▋ | 204/1250 [01:17<05:47, 3.01it/s]
Training 1/1 epoch (loss 2.7238): 16%|█▋ | 204/1250 [01:18<05:47, 3.01it/s]
Training 1/1 epoch (loss 2.7238): 16%|█▋ | 205/1250 [01:18<05:42, 3.05it/s]
Training 1/1 epoch (loss 2.9659): 16%|█▋ | 205/1250 [01:18<05:42, 3.05it/s]
Training 1/1 epoch (loss 2.9659): 16%|█▋ | 206/1250 [01:18<06:09, 2.83it/s]
Training 1/1 epoch (loss 2.8513): 16%|█▋ | 206/1250 [01:18<06:09, 2.83it/s]
Training 1/1 epoch (loss 2.8513): 17%|█▋ | 207/1250 [01:18<06:06, 2.84it/s]
Training 1/1 epoch (loss 2.8415): 17%|█▋ | 207/1250 [01:19<06:06, 2.84it/s]
Training 1/1 epoch (loss 2.8415): 17%|█▋ | 208/1250 [01:19<05:56, 2.93it/s]
Training 1/1 epoch (loss 2.8569): 17%|█▋ | 208/1250 [01:19<05:56, 2.93it/s]
Training 1/1 epoch (loss 2.8569): 17%|█▋ | 209/1250 [01:19<06:01, 2.88it/s]
Training 1/1 epoch (loss 2.9371): 17%|█▋ | 209/1250 [01:19<06:01, 2.88it/s]
Training 1/1 epoch (loss 2.9371): 17%|█▋ | 210/1250 [01:19<05:54, 2.94it/s]
Training 1/1 epoch (loss 2.8593): 17%|█▋ | 210/1250 [01:20<05:54, 2.94it/s]
Training 1/1 epoch (loss 2.8593): 17%|█▋ | 211/1250 [01:20<05:43, 3.02it/s]
Training 1/1 epoch (loss 3.0368): 17%|█▋ | 211/1250 [01:20<05:43, 3.02it/s]
Training 1/1 epoch (loss 3.0368): 17%|█▋ | 212/1250 [01:20<05:43, 3.02it/s]
Training 1/1 epoch (loss 2.7631): 17%|█▋ | 212/1250 [01:21<05:43, 3.02it/s]
Training 1/1 epoch (loss 2.7631): 17%|█▋ | 213/1250 [01:21<06:26, 2.68it/s]
Training 1/1 epoch (loss 2.7101): 17%|█▋ | 213/1250 [01:21<06:26, 2.68it/s]
Training 1/1 epoch (loss 2.7101): 17%|█▋ | 214/1250 [01:21<06:06, 2.82it/s]
Training 1/1 epoch (loss 2.6916): 17%|█▋ | 214/1250 [01:21<06:06, 2.82it/s]
Training 1/1 epoch (loss 2.6916): 17%|█▋ | 215/1250 [01:21<05:51, 2.94it/s]
Training 1/1 epoch (loss 2.5994): 17%|█▋ | 215/1250 [01:21<05:51, 2.94it/s]
Training 1/1 epoch (loss 2.5994): 17%|█▋ | 216/1250 [01:21<05:48, 2.96it/s]
Training 1/1 epoch (loss 2.7197): 17%|█▋ | 216/1250 [01:22<05:48, 2.96it/s]
Training 1/1 epoch (loss 2.7197): 17%|█▋ | 217/1250 [01:22<05:52, 2.93it/s]
Training 1/1 epoch (loss 2.9473): 17%|█▋ | 217/1250 [01:22<05:52, 2.93it/s]
Training 1/1 epoch (loss 2.9473): 17%|█▋ | 218/1250 [01:22<05:54, 2.91it/s]
Training 1/1 epoch (loss 2.9564): 17%|█▋ | 218/1250 [01:23<05:54, 2.91it/s]
Training 1/1 epoch (loss 2.9564): 18%|ββ | 219/1250 [01:23<06:08, 2.80it/s]
Training 1/1 epoch (loss 2.6365): 18%|ββ | 219/1250 [01:23<06:08, 2.80it/s]
Training 1/1 epoch (loss 2.6365): 18%|ββ | 220/1250 [01:23<07:19, 2.34it/s]
Training 1/1 epoch (loss 2.7110): 18%|ββ | 220/1250 [01:24<07:19, 2.34it/s]
Training 1/1 epoch (loss 2.7110): 18%|ββ | 221/1250 [01:24<06:53, 2.49it/s]
Training 1/1 epoch (loss 2.9319): 18%|ββ | 221/1250 [01:24<06:53, 2.49it/s]
Training 1/1 epoch (loss 2.9319): 18%|ββ | 222/1250 [01:24<06:32, 2.62it/s]
Training 1/1 epoch (loss 2.7660): 18%|ββ | 222/1250 [01:24<06:32, 2.62it/s]
Training 1/1 epoch (loss 2.7660): 18%|ββ | 223/1250 [01:24<06:32, 2.62it/s]
Training 1/1 epoch (loss 2.7666): 18%|ββ | 223/1250 [01:25<06:32, 2.62it/s]
Training 1/1 epoch (loss 2.7666): 18%|ββ | 224/1250 [01:25<06:43, 2.54it/s]
Training 1/1 epoch (loss 2.6435): 18%|ββ | 224/1250 [01:25<06:43, 2.54it/s]
Training 1/1 epoch (loss 2.6435): 18%|ββ | 225/1250 [01:25<06:18, 2.71it/s]
Training 1/1 epoch (loss 2.9251): 18%|ββ | 225/1250 [01:25<06:18, 2.71it/s]
Training 1/1 epoch (loss 2.9251): 18%|ββ | 226/1250 [01:25<06:00, 2.84it/s]
Training 1/1 epoch (loss 2.9908): 18%|ββ | 226/1250 [01:26<06:00, 2.84it/s]
Training 1/1 epoch (loss 2.9908): 18%|ββ | 227/1250 [01:26<06:02, 2.82it/s]
Training 1/1 epoch (loss 2.8930): 18%|ββ | 227/1250 [01:26<06:02, 2.82it/s]
Training 1/1 epoch (loss 2.8930): 18%|ββ | 228/1250 [01:26<06:08, 2.78it/s]
Training 1/1 epoch (loss 2.8373): 18%|ββ | 228/1250 [01:26<06:08, 2.78it/s]
Training 1/1 epoch (loss 2.8373): 18%|ββ | 229/1250 [01:26<06:03, 2.81it/s]
Training 1/1 epoch (loss 2.7076): 18%|ββ | 229/1250 [01:27<06:03, 2.81it/s]
Training 1/1 epoch (loss 2.7076): 18%|ββ | 230/1250 [01:27<06:05, 2.79it/s]
Training 1/1 epoch (loss 2.9365): 18%|ββ | 230/1250 [01:27<06:05, 2.79it/s]
Training 1/1 epoch (loss 2.9365): 18%|ββ | 231/1250 [01:27<05:55, 2.87it/s]
Training 1/1 epoch (loss 2.9870): 18%|ββ | 231/1250 [01:27<05:55, 2.87it/s]
Training 1/1 epoch (loss 2.9870): 19%|ββ | 232/1250 [01:27<06:09, 2.76it/s]
Training 1/1 epoch (loss 2.8369): 19%|ββ | 232/1250 [01:28<06:09, 2.76it/s]
Training 1/1 epoch (loss 2.8369): 19%|ββ | 233/1250 [01:28<06:28, 2.62it/s]
Training 1/1 epoch (loss 2.8622): 19%|ββ | 233/1250 [01:28<06:28, 2.62it/s]
Training 1/1 epoch (loss 2.8622): 19%|ββ | 234/1250 [01:28<07:43, 2.19it/s]
Training 1/1 epoch (loss 2.8816): 19%|ββ | 234/1250 [01:29<07:43, 2.19it/s]
Training 1/1 epoch (loss 2.8816): 19%|ββ | 235/1250 [01:29<07:24, 2.28it/s]
Training 1/1 epoch (loss 2.9095): 19%|ββ | 235/1250 [01:29<07:24, 2.28it/s]
Training 1/1 epoch (loss 2.9095): 19%|ββ | 236/1250 [01:29<07:22, 2.29it/s]
Training 1/1 epoch (loss 2.6702): 19%|ββ | 236/1250 [01:30<07:22, 2.29it/s]
Training 1/1 epoch (loss 2.6702): 19%|ββ | 237/1250 [01:30<07:38, 2.21it/s]
Training 1/1 epoch (loss 2.8421): 19%|ββ | 237/1250 [01:30<07:38, 2.21it/s]
Training 1/1 epoch (loss 2.8421): 19%|ββ | 238/1250 [01:30<07:39, 2.20it/s]
Training 1/1 epoch (loss 2.7217): 19%|ββ | 238/1250 [01:31<07:39, 2.20it/s]
Training 1/1 epoch (loss 2.7217): 19%|ββ | 239/1250 [01:31<07:31, 2.24it/s]
Training 1/1 epoch (loss 2.7442): 19%|ββ | 239/1250 [01:31<07:31, 2.24it/s]
Training 1/1 epoch (loss 2.7442): 19%|ββ | 240/1250 [01:31<07:23, 2.28it/s]
Training 1/1 epoch (loss 2.6688): 19%|ββ | 240/1250 [01:31<07:23, 2.28it/s]
Training 1/1 epoch (loss 2.6688): 19%|ββ | 241/1250 [01:31<06:47, 2.48it/s]
Training 1/1 epoch (loss 2.7901): 19%|ββ | 241/1250 [01:32<06:47, 2.48it/s]
Training 1/1 epoch (loss 2.7901): 19%|ββ | 242/1250 [01:32<06:16, 2.67it/s]
Training 1/1 epoch (loss 2.6722): 19%|ββ | 242/1250 [01:32<06:16, 2.67it/s]
Training 1/1 epoch (loss 2.6722): 19%|ββ | 243/1250 [01:32<06:04, 2.76it/s]
Training 1/1 epoch (loss 2.6406): 19%|ββ | 243/1250 [01:32<06:04, 2.76it/s]
Training 1/1 epoch (loss 2.6406): 20%|ββ | 244/1250 [01:32<06:02, 2.77it/s]
Training 1/1 epoch (loss 2.7727): 20%|ββ | 244/1250 [01:33<06:02, 2.77it/s]
Training 1/1 epoch (loss 2.7727): 20%|ββ | 245/1250 [01:33<06:04, 2.76it/s]
Training 1/1 epoch (loss 2.8293): 20%|ββ | 245/1250 [01:33<06:04, 2.76it/s]
Training 1/1 epoch (loss 2.8293): 20%|ββ | 246/1250 [01:33<05:44, 2.91it/s]
Training 1/1 epoch (loss 2.7062): 20%|ββ | 246/1250 [01:33<05:44, 2.91it/s]
Training 1/1 epoch (loss 2.7062): 20%|ββ | 247/1250 [01:33<05:52, 2.85it/s]
Training 1/1 epoch (loss 2.7384): 20%|ββ | 247/1250 [01:34<05:52, 2.85it/s]
Training 1/1 epoch (loss 2.7384): 20%|ββ | 248/1250 [01:34<05:54, 2.82it/s]
Training 1/1 epoch (loss 2.9392): 20%|ββ | 248/1250 [01:34<05:54, 2.82it/s]
Training 1/1 epoch (loss 2.9392): 20%|ββ | 249/1250 [01:34<06:01, 2.77it/s]
Training 1/1 epoch (loss 2.8576): 20%|ββ | 249/1250 [01:35<06:01, 2.77it/s]
Training 1/1 epoch (loss 2.8576): 20%|ββ | 250/1250 [01:35<05:48, 2.87it/s]
Training 1/1 epoch (loss 2.7553): 20%|ββ | 250/1250 [01:35<05:48, 2.87it/s]
Training 1/1 epoch (loss 2.7553): 20%|ββ | 251/1250 [01:35<05:39, 2.94it/s]
Training 1/1 epoch (loss 2.8370): 20%|ββ | 251/1250 [01:35<05:39, 2.94it/s]
Training 1/1 epoch (loss 2.8370): 20%|ββ | 252/1250 [01:35<05:29, 3.03it/s]
Training 1/1 epoch (loss 2.7298): 20%|ββ | 252/1250 [01:35<05:29, 3.03it/s]
Training 1/1 epoch (loss 2.7298): 20%|ββ | 253/1250 [01:35<05:29, 3.02it/s]
Training 1/1 epoch (loss 2.8798): 20%|ββ | 253/1250 [01:36<05:29, 3.02it/s]
Training 1/1 epoch (loss 2.8798): 20%|ββ | 254/1250 [01:36<05:39, 2.93it/s]
Training 1/1 epoch (loss 2.7551): 20%|ββ | 254/1250 [01:36<05:39, 2.93it/s]
Training 1/1 epoch (loss 2.7551): 20%|ββ | 255/1250 [01:36<05:45, 2.88it/s]
Training 1/1 epoch (loss 2.8803): 20%|ββ | 255/1250 [01:37<05:45, 2.88it/s]
Training 1/1 epoch (loss 2.8803): 20%|ββ | 256/1250 [01:37<05:54, 2.80it/s]
Training 1/1 epoch (loss 2.8360): 20%|ββ | 256/1250 [01:37<05:54, 2.80it/s]
Training 1/1 epoch (loss 2.8360): 21%|ββ | 257/1250 [01:37<05:55, 2.79it/s]
Training 1/1 epoch (loss 2.8185): 21%|ββ | 257/1250 [01:37<05:55, 2.79it/s]
Training 1/1 epoch (loss 2.8185): 21%|ββ | 258/1250 [01:37<05:44, 2.88it/s]
Training 1/1 epoch (loss 2.7580): 21%|ββ | 258/1250 [01:38<05:44, 2.88it/s]
Training 1/1 epoch (loss 2.7580): 21%|ββ | 259/1250 [01:38<05:36, 2.95it/s]
Training 1/1 epoch (loss 2.8601): 21%|ββ | 259/1250 [01:38<05:36, 2.95it/s]
Training 1/1 epoch (loss 2.8601): 21%|ββ | 260/1250 [01:38<05:45, 2.86it/s]
Training 1/1 epoch (loss 2.7503): 21%|ββ | 260/1250 [01:38<05:45, 2.86it/s]
Training 1/1 epoch (loss 2.7503): 21%|ββ | 261/1250 [01:38<05:52, 2.81it/s]
Training 1/1 epoch (loss 2.7581): 21%|ββ | 261/1250 [01:39<05:52, 2.81it/s]
Training 1/1 epoch (loss 2.7581): 21%|ββ | 262/1250 [01:39<05:57, 2.76it/s]
Training 1/1 epoch (loss 2.7755): 21%|ββ | 262/1250 [01:39<05:57, 2.76it/s]
Training 1/1 epoch (loss 2.7755): 21%|ββ | 263/1250 [01:39<05:53, 2.79it/s]
Training 1/1 epoch (loss 2.5074): 21%|ββ | 263/1250 [01:39<05:53, 2.79it/s]
Training 1/1 epoch (loss 2.5074): 21%|ββ | 264/1250 [01:39<05:48, 2.83it/s]
Training 1/1 epoch (loss 2.8189): 21%|ββ | 264/1250 [01:40<05:48, 2.83it/s]
Training 1/1 epoch (loss 2.8189): 21%|ββ | 265/1250 [01:40<05:41, 2.88it/s]
Training 1/1 epoch (loss 2.8258): 21%|ββ | 265/1250 [01:40<05:41, 2.88it/s]
Training 1/1 epoch (loss 2.8258): 21%|βββ | 266/1250 [01:40<05:34, 2.94it/s]
Training 1/1 epoch (loss 2.9070): 21%|βββ | 266/1250 [01:40<05:34, 2.94it/s]
Training 1/1 epoch (loss 2.9070): 21%|βββ | 267/1250 [01:40<05:32, 2.96it/s]
Training 1/1 epoch (loss 2.8604): 21%|βββ | 267/1250 [01:41<05:32, 2.96it/s]
Training 1/1 epoch (loss 2.8604): 21%|βββ | 268/1250 [01:41<05:32, 2.95it/s]
Training 1/1 epoch (loss 2.8039): 21%|βββ | 268/1250 [01:41<05:32, 2.95it/s]
Training 1/1 epoch (loss 2.8039): 22%|βββ | 269/1250 [01:41<05:44, 2.85it/s]
Training 1/1 epoch (loss 2.7712): 22%|βββ | 269/1250 [01:41<05:44, 2.85it/s]
Training 1/1 epoch (loss 2.7712): 22%|βββ | 270/1250 [01:41<05:31, 2.96it/s]
Training 1/1 epoch (loss 2.6228): 22%|βββ | 270/1250 [01:42<05:31, 2.96it/s]
Training 1/1 epoch (loss 2.6228): 22%|βββ | 271/1250 [01:42<05:20, 3.05it/s]
Training 1/1 epoch (loss 2.5480): 22%|βββ | 271/1250 [01:42<05:20, 3.05it/s]
Training 1/1 epoch (loss 2.5480): 22%|βββ | 272/1250 [01:42<05:26, 2.99it/s]
Training 1/1 epoch (loss 2.7337): 22%|βββ | 272/1250 [01:42<05:26, 2.99it/s]
Training 1/1 epoch (loss 2.7337): 22%|βββ | 273/1250 [01:42<05:42, 2.85it/s]
Training 1/1 epoch (loss 2.8524): 22%|βββ | 273/1250 [01:43<05:42, 2.85it/s]
Training 1/1 epoch (loss 2.8524): 22%|βββ | 274/1250 [01:43<05:35, 2.91it/s]
Training 1/1 epoch (loss 2.9081): 22%|βββ | 274/1250 [01:43<05:35, 2.91it/s]
Training 1/1 epoch (loss 2.9081): 22%|βββ | 275/1250 [01:43<05:29, 2.96it/s]
Training 1/1 epoch (loss 2.8290): 22%|βββ | 275/1250 [01:43<05:29, 2.96it/s]
Training 1/1 epoch (loss 2.8290): 22%|βββ | 276/1250 [01:43<05:40, 2.86it/s]
Training 1/1 epoch (loss 2.7824): 22%|βββ | 276/1250 [01:44<05:40, 2.86it/s]
Training 1/1 epoch (loss 2.7824): 22%|βββ | 277/1250 [01:44<05:52, 2.76it/s]
Training 1/1 epoch (loss 2.9044): 22%|βββ | 277/1250 [01:44<05:52, 2.76it/s]
Training 1/1 epoch (loss 2.9044): 22%|βββ | 278/1250 [01:44<05:56, 2.73it/s]
Training 1/1 epoch (loss 2.9523): 22%|βββ | 278/1250 [01:45<05:56, 2.73it/s]
Training 1/1 epoch (loss 2.9523): 22%|βββ | 279/1250 [01:45<05:59, 2.70it/s]
Training 1/1 epoch (loss 2.7367): 22%|βββ | 279/1250 [01:45<05:59, 2.70it/s]
Training 1/1 epoch (loss 2.7367): 22%|βββ | 280/1250 [01:45<05:48, 2.78it/s]
Training 1/1 epoch (loss 2.9114): 22%|βββ | 280/1250 [01:45<05:48, 2.78it/s]
Training 1/1 epoch (loss 2.9114): 22%|βββ | 281/1250 [01:45<05:44, 2.81it/s]
Training 1/1 epoch (loss 2.7701): 22%|βββ | 281/1250 [01:46<05:44, 2.81it/s]
Training 1/1 epoch (loss 2.7701): 23%|βββ | 282/1250 [01:46<05:31, 2.92it/s]
Training 1/1 epoch (loss 2.8238): 23%|βββ | 282/1250 [01:46<05:31, 2.92it/s]
Training 1/1 epoch (loss 2.8238): 23%|βββ | 283/1250 [01:46<05:32, 2.91it/s]
Training 1/1 epoch (loss 2.9327): 23%|βββ | 283/1250 [01:46<05:32, 2.91it/s]
Training 1/1 epoch (loss 2.9327): 23%|βββ | 284/1250 [01:46<05:31, 2.92it/s]
Training 1/1 epoch (loss 2.7839): 23%|βββ | 284/1250 [01:47<05:31, 2.92it/s]
Training 1/1 epoch (loss 2.7839): 23%|βββ | 285/1250 [01:47<05:22, 2.99it/s]
Training 1/1 epoch (loss 2.6116): 23%|βββ | 285/1250 [01:47<05:22, 2.99it/s]
Training 1/1 epoch (loss 2.6116): 23%|βββ | 286/1250 [01:47<05:19, 3.02it/s]
Training 1/1 epoch (loss 2.7865): 23%|βββ | 286/1250 [01:47<05:19, 3.02it/s]
Training 1/1 epoch (loss 2.7865): 23%|βββ | 287/1250 [01:47<05:16, 3.04it/s]
Training 1/1 epoch (loss 2.7880): 23%|βββ | 287/1250 [01:48<05:16, 3.04it/s]
Training 1/1 epoch (loss 2.7880): 23%|βββ | 288/1250 [01:48<05:13, 3.07it/s]
Training 1/1 epoch (loss 2.6956): 23%|βββ | 288/1250 [01:48<05:13, 3.07it/s]
Training 1/1 epoch (loss 2.6956): 23%|βββ | 289/1250 [01:48<05:19, 3.01it/s]
Training 1/1 epoch (loss 2.6824): 23%|βββ | 289/1250 [01:48<05:19, 3.01it/s]
Training 1/1 epoch (loss 2.6824): 23%|βββ | 290/1250 [01:48<05:34, 2.87it/s]
Training 1/1 epoch (loss 2.7482): 23%|βββ | 290/1250 [01:49<05:34, 2.87it/s]
Training 1/1 epoch (loss 2.7482): 23%|βββ | 291/1250 [01:49<05:36, 2.85it/s]
Training 1/1 epoch (loss 2.9872): 23%|βββ | 291/1250 [01:49<05:36, 2.85it/s]
Training 1/1 epoch (loss 2.9872): 23%|βββ | 292/1250 [01:49<05:33, 2.87it/s]
Training 1/1 epoch (loss 2.9354): 23%|βββ | 292/1250 [01:49<05:33, 2.87it/s]
Training 1/1 epoch (loss 2.9354): 23%|βββ | 293/1250 [01:49<05:44, 2.77it/s]
Training 1/1 epoch (loss 2.7712): 23%|βββ | 293/1250 [01:50<05:44, 2.77it/s]
Training 1/1 epoch (loss 2.7712): 24%|βββ | 294/1250 [01:50<05:29, 2.90it/s]
Training 1/1 epoch (loss 2.7200): 24%|βββ | 294/1250 [01:50<05:29, 2.90it/s]
Training 1/1 epoch (loss 2.7200): 24%|βββ | 295/1250 [01:50<05:24, 2.95it/s]
Training 1/1 epoch (loss 2.8250): 24%|βββ | 295/1250 [01:51<05:24, 2.95it/s]
Training 1/1 epoch (loss 2.8250): 24%|βββ | 296/1250 [01:51<05:54, 2.69it/s]
Training 1/1 epoch (loss 2.7569): 24%|βββ | 296/1250 [01:51<05:54, 2.69it/s]
Training 1/1 epoch (loss 2.7569): 24%|βββ | 297/1250 [01:51<05:47, 2.74it/s]
Training 1/1 epoch (loss 2.8134): 24%|βββ | 297/1250 [01:51<05:47, 2.74it/s]
Training 1/1 epoch (loss 2.8134): 24%|βββ | 298/1250 [01:51<05:46, 2.75it/s]
Training 1/1 epoch (loss 2.6956): 24%|βββ | 298/1250 [01:52<05:46, 2.75it/s]
Training 1/1 epoch (loss 2.6956): 24%|βββ | 299/1250 [01:52<05:31, 2.87it/s]
Training 1/1 epoch (loss 2.8514): 24%|βββ | 299/1250 [01:52<05:31, 2.87it/s]
Training 1/1 epoch (loss 2.8514): 24%|βββ | 300/1250 [01:52<05:23, 2.94it/s]
Training 1/1 epoch (loss 2.7176): 24%|βββ | 300/1250 [01:52<05:23, 2.94it/s]
Training 1/1 epoch (loss 2.7176): 24%|βββ | 301/1250 [01:52<05:16, 3.00it/s]
Training 1/1 epoch (loss 2.6137): 24%|βββ | 301/1250 [01:53<05:16, 3.00it/s]
Training 1/1 epoch (loss 2.6137): 24%|βββ | 302/1250 [01:53<05:39, 2.79it/s]
Training 1/1 epoch (loss 2.4917): 24%|βββ | 302/1250 [01:53<05:39, 2.79it/s]
Training 1/1 epoch (loss 2.4917): 24%|βββ | 303/1250 [01:53<06:32, 2.41it/s]
Training 1/1 epoch (loss 2.9178): 24%|βββ | 303/1250 [01:54<06:32, 2.41it/s]
Training 1/1 epoch (loss 2.9178): 24%|βββ | 304/1250 [01:54<06:23, 2.47it/s]
Training 1/1 epoch (loss 3.0055): 24%|βββ | 304/1250 [01:54<06:23, 2.47it/s]
Training 1/1 epoch (loss 3.0055): 24%|βββ | 305/1250 [01:54<06:11, 2.54it/s]
Training 1/1 epoch (loss 2.9180): 24%|βββ | 305/1250 [01:54<06:11, 2.54it/s]
Training 1/1 epoch (loss 2.9180): 24%|βββ | 306/1250 [01:54<05:56, 2.65it/s]
Training 1/1 epoch (loss 2.8254): 24%|βββ | 306/1250 [01:55<05:56, 2.65it/s]
Training 1/1 epoch (loss 2.8254): 25%|βββ | 307/1250 [01:55<05:46, 2.72it/s]
Training 1/1 epoch (loss 2.7462): 25%|βββ | 307/1250 [01:55<05:46, 2.72it/s]
Training 1/1 epoch (loss 2.7462): 25%|βββ | 308/1250 [01:55<05:27, 2.88it/s]
Training 1/1 epoch (loss 2.6662): 25%|βββ | 308/1250 [01:55<05:27, 2.88it/s]
Training 1/1 epoch (loss 2.6662): 25%|βββ | 309/1250 [01:55<05:15, 2.98it/s]
Training 1/1 epoch (loss 2.6979): 25%|βββ | 309/1250 [01:55<05:15, 2.98it/s]
Training 1/1 epoch (loss 2.6979): 25%|βββ | 310/1250 [01:55<05:12, 3.01it/s]
Training 1/1 epoch (loss 2.8578): 25%|βββ | 310/1250 [01:56<05:12, 3.01it/s]
Training 1/1 epoch (loss 2.8578): 25%|βββ | 311/1250 [01:56<05:15, 2.98it/s]
Training 1/1 epoch (loss 2.7132): 25%|βββ | 311/1250 [01:56<05:15, 2.98it/s]
Training 1/1 epoch (loss 2.7132): 25%|βββ | 312/1250 [01:56<05:26, 2.88it/s]
Training 1/1 epoch (loss 2.5502): 25%|βββ | 312/1250 [01:57<05:26, 2.88it/s]
Training 1/1 epoch (loss 2.5502): 25%|βββ | 313/1250 [01:57<05:39, 2.76it/s]
Training 1/1 epoch (loss 2.6810): 25%|βββ | 313/1250 [01:57<05:39, 2.76it/s]
Training 1/1 epoch (loss 2.6810): 25%|βββ | 314/1250 [01:57<05:31, 2.82it/s]
Training 1/1 epoch (loss 2.8947): 25%|βββ | 314/1250 [01:57<05:31, 2.82it/s]
Training 1/1 epoch (loss 2.8947): 25%|βββ | 315/1250 [01:57<05:23, 2.89it/s]
Training 1/1 epoch (loss 2.7441): 25%|βββ | 315/1250 [01:58<05:23, 2.89it/s]
Training 1/1 epoch (loss 2.7441): 25%|βββ | 316/1250 [01:58<05:49, 2.67it/s]
Training 1/1 epoch (loss 2.8273): 25%|βββ | 316/1250 [01:58<05:49, 2.67it/s]
Training 1/1 epoch (loss 2.8273): 25%|βββ | 317/1250 [01:58<06:10, 2.52it/s]
Training 1/1 epoch (loss 2.9711): 25%|βββ | 317/1250 [01:59<06:10, 2.52it/s]
Training 1/1 epoch (loss 2.9711): 25%|βββ | 318/1250 [01:59<05:57, 2.60it/s]
Training 1/1 epoch (loss 2.7190): 25%|βββ | 318/1250 [01:59<05:57, 2.60it/s]
Training 1/1 epoch (loss 2.7190): 26%|βββ | 319/1250 [01:59<05:32, 2.80it/s]
Training 1/1 epoch (loss 2.8411): 26%|βββ | 319/1250 [01:59<05:32, 2.80it/s]
Training 1/1 epoch (loss 2.8411): 26%|βββ | 320/1250 [01:59<05:31, 2.80it/s]
Training 1/1 epoch (loss 2.7386): 26%|βββ | 320/1250 [02:00<05:31, 2.80it/s]
Training 1/1 epoch (loss 2.7386): 26%|βββ | 321/1250 [02:00<05:38, 2.74it/s]
Training 1/1 epoch (loss 2.8214): 26%|βββ | 321/1250 [02:00<05:38, 2.74it/s]
Training 1/1 epoch (loss 2.8214): 26%|βββ | 322/1250 [02:00<05:39, 2.73it/s]
Training 1/1 epoch (loss 2.6663): 26%|βββ | 322/1250 [02:00<05:39, 2.73it/s]
Training 1/1 epoch (loss 2.6663): 26%|βββ | 323/1250 [02:00<05:31, 2.80it/s]
Training 1/1 epoch (loss 2.9963): 26%|βββ | 323/1250 [02:01<05:31, 2.80it/s]
Training 1/1 epoch (loss 2.9963): 26%|βββ | 324/1250 [02:01<05:16, 2.92it/s]
Training 1/1 epoch (loss 2.5491): 26%|βββ | 324/1250 [02:01<05:16, 2.92it/s]
Training 1/1 epoch (loss 2.5491): 26%|βββ | 325/1250 [02:01<05:09, 2.99it/s]
Training 1/1 epoch (loss 2.8459): 26%|βββ | 325/1250 [02:01<05:09, 2.99it/s]
Training 1/1 epoch (loss 2.8459): 26%|βββ | 326/1250 [02:01<05:08, 2.99it/s]
Training 1/1 epoch (loss 2.6885): 26%|βββ | 326/1250 [02:02<05:08, 2.99it/s]
Training 1/1 epoch (loss 2.6885): 26%|βββ | 327/1250 [02:02<05:13, 2.94it/s]
Training 1/1 epoch (loss 2.4939): 26%|βββ | 327/1250 [02:02<05:13, 2.94it/s]
Training 1/1 epoch (loss 2.4939): 26%|βββ | 328/1250 [02:02<05:06, 3.01it/s]
Training 1/1 epoch (loss 3.0906): 26%|βββ | 328/1250 [02:02<05:06, 3.01it/s]
Training 1/1 epoch (loss 3.0906): 26%|βββ | 329/1250 [02:02<05:08, 2.98it/s]
Training 1/1 epoch (loss 2.6919): 26%|βββ | 329/1250 [02:03<05:08, 2.98it/s]
Training 1/1 epoch (loss 2.6919): 26%|βββ | 330/1250 [02:03<05:10, 2.96it/s]
Training 1/1 epoch (loss 2.7513): 26%|βββ | 330/1250 [02:03<05:10, 2.96it/s]
Training 1/1 epoch (loss 2.7513): 26%|βββ | 331/1250 [02:03<05:08, 2.98it/s]
Training 1/1 epoch (loss 2.9258): 26%|βββ | 331/1250 [02:03<05:08, 2.98it/s]
Training 1/1 epoch (loss 2.9258): 27%|βββ | 332/1250 [02:03<05:06, 3.00it/s]
Training 1/1 epoch (loss 2.6962): 27%|βββ | 332/1250 [02:04<05:06, 3.00it/s]
Training 1/1 epoch (loss 2.6962): 27%|βββ | 333/1250 [02:04<05:04, 3.01it/s]
Training 1/1 epoch (loss 2.8803): 27%|βββ | 333/1250 [02:04<05:04, 3.01it/s]
Training 1/1 epoch (loss 2.8803): 27%|βββ | 334/1250 [02:04<05:04, 3.01it/s]
Training 1/1 epoch (loss 2.7907): 27%|βββ | 334/1250 [02:04<05:04, 3.01it/s]
Training 1/1 epoch (loss 2.7907): 27%|βββ | 335/1250 [02:04<05:00, 3.04it/s]
Training 1/1 epoch (loss 2.8659): 27%|βββ | 335/1250 [02:05<05:00, 3.04it/s]
Training 1/1 epoch (loss 2.8659): 27%|βββ | 336/1250 [02:05<05:14, 2.91it/s]
Training 1/1 epoch (loss 3.0631): 27%|βββ | 336/1250 [02:05<05:14, 2.91it/s]
Training 1/1 epoch (loss 3.0631): 27%|βββ | 337/1250 [02:05<05:12, 2.92it/s]
Training 1/1 epoch (loss 2.8838): 27%|βββ | 337/1250 [02:05<05:12, 2.92it/s]
Training 1/1 epoch (loss 2.8838): 27%|βββ | 338/1250 [02:05<05:23, 2.82it/s]
Training 1/1 epoch (loss 2.8658): 27%|βββ | 338/1250 [02:06<05:23, 2.82it/s]
Training 1/1 epoch (loss 2.8658): 27%|βββ | 339/1250 [02:06<05:25, 2.80it/s]
Training 1/1 epoch (loss 2.8144): 27%|βββ | 339/1250 [02:06<05:25, 2.80it/s]
Training 1/1 epoch (loss 2.8144): 27%|βββ | 340/1250 [02:06<05:21, 2.83it/s]
Training 1/1 epoch (loss 2.6536): 27%|βββ | 340/1250 [02:06<05:21, 2.83it/s]
Training 1/1 epoch (loss 2.6536): 27%|βββ | 341/1250 [02:06<05:15, 2.88it/s]
Training 1/1 epoch (loss 2.7537): 27%|βββ | 341/1250 [02:07<05:15, 2.88it/s]
Training 1/1 epoch (loss 2.7537): 27%|βββ | 342/1250 [02:07<05:16, 2.87it/s]
Training 1/1 epoch (loss 2.6642): 27%|βββ | 342/1250 [02:07<05:16, 2.87it/s]
Training 1/1 epoch (loss 2.6642): 27%|βββ | 343/1250 [02:07<05:03, 2.99it/s]
Training 1/1 epoch (loss 2.8087): 27%|βββ | 343/1250 [02:07<05:03, 2.99it/s]
Training 1/1 epoch (loss 2.8087): 28%|βββ | 344/1250 [02:07<05:07, 2.94it/s]
Training 1/1 epoch (loss 2.7556): 28%|βββ | 344/1250 [02:08<05:07, 2.94it/s]
Training 1/1 epoch (loss 2.7556): 28%|βββ | 345/1250 [02:08<05:10, 2.92it/s]
Training 1/1 epoch (loss 2.7583): 28%|βββ | 345/1250 [02:08<05:10, 2.92it/s]
Training 1/1 epoch (loss 2.7583): 28%|βββ | 346/1250 [02:08<05:06, 2.95it/s]
Training 1/1 epoch (loss 2.8910): 28%|βββ | 346/1250 [02:08<05:06, 2.95it/s]
Training 1/1 epoch (loss 2.8910): 28%|βββ | 347/1250 [02:08<05:03, 2.97it/s]
Training 1/1 epoch (loss 2.9117): 28%|βββ | 347/1250 [02:09<05:03, 2.97it/s]
Training 1/1 epoch (loss 2.9117): 28%|βββ | 348/1250 [02:09<05:04, 2.96it/s]
Training 1/1 epoch (loss 2.7459): 28%|βββ | 348/1250 [02:09<05:04, 2.96it/s]
Training 1/1 epoch (loss 2.7459): 28%|βββ | 349/1250 [02:09<04:57, 3.03it/s]
Training 1/1 epoch (loss 2.8986): 28%|βββ | 349/1250 [02:09<04:57, 3.03it/s]
Training 1/1 epoch (loss 2.8986): 28%|βββ | 350/1250 [02:09<04:50, 3.10it/s]
Training 1/1 epoch (loss 2.7822): 28%|βββ | 350/1250 [02:10<04:50, 3.10it/s]
Training 1/1 epoch (loss 2.7822): 28%|βββ | 351/1250 [02:10<04:46, 3.13it/s]
Training 1/1 epoch (loss 2.9602): 28%|βββ | 351/1250 [02:10<04:46, 3.13it/s]
Training 1/1 epoch (loss 2.9602): 28%|βββ | 352/1250 [02:10<05:07, 2.92it/s]
Training 1/1 epoch (loss 2.9708): 28%|βββ | 352/1250 [02:10<05:07, 2.92it/s]
Training 1/1 epoch (loss 2.9708): 28%|βββ | 353/1250 [02:10<05:13, 2.86it/s]
Training 1/1 epoch (loss 2.7881): 28%|βββ | 353/1250 [02:11<05:13, 2.86it/s]
Training 1/1 epoch (loss 2.7881): 28%|βββ | 354/1250 [02:11<05:11, 2.88it/s]
Training 1/1 epoch (loss 2.6643): 28%|βββ | 354/1250 [02:11<05:11, 2.88it/s]
Training 1/1 epoch (loss 2.6643): 28%|βββ | 355/1250 [02:11<05:05, 2.93it/s]
Training 1/1 epoch (loss 2.7512): 28%|βββ | 355/1250 [02:11<05:05, 2.93it/s]
Training 1/1 epoch (loss 2.7512): 28%|βββ | 356/1250 [02:11<05:02, 2.95it/s]
Training 1/1 epoch (loss 2.9928): 28%|βββ | 356/1250 [02:12<05:02, 2.95it/s]
Training 1/1 epoch (loss 2.9928): 29%|βββ | 357/1250 [02:12<05:14, 2.84it/s]
Training 1/1 epoch (loss 2.9059): 29%|βββ | 357/1250 [02:12<05:14, 2.84it/s]
Training 1/1 epoch (loss 2.9059): 29%|βββ | 358/1250 [02:12<05:16, 2.81it/s]
Training 1/1 epoch (loss 2.7712): 29%|βββ | 358/1250 [02:12<05:16, 2.81it/s]
Training 1/1 epoch (loss 2.7712): 29%|βββ | 359/1250 [02:12<05:05, 2.92it/s]
Training 1/1 epoch (loss 2.6026): 29%|βββ | 359/1250 [02:13<05:05, 2.92it/s]
Training 1/1 epoch (loss 2.6026): 29%|βββ | 360/1250 [02:13<05:10, 2.87it/s]
Training 1/1 epoch (loss 2.6747): 29%|βββ | 360/1250 [02:13<05:10, 2.87it/s]
Training 1/1 epoch (loss 2.6747): 29%|βββ | 361/1250 [02:13<05:08, 2.88it/s]
Training 1/1 epoch (loss 2.7518): 29%|βββ | 361/1250 [02:14<05:08, 2.88it/s]
Training 1/1 epoch (loss 2.7518): 29%|βββ | 362/1250 [02:14<05:09, 2.87it/s]
Training 1/1 epoch (loss 2.8559): 29%|βββ | 362/1250 [02:14<05:09, 2.87it/s]
Training 1/1 epoch (loss 2.8559): 29%|βββ | 363/1250 [02:14<04:57, 2.98it/s]
Training 1/1 epoch (loss 2.9245): 29%|βββ | 363/1250 [02:14<04:57, 2.98it/s]
Training 1/1 epoch (loss 2.9245): 29%|βββ | 364/1250 [02:14<04:57, 2.98it/s]
Training 1/1 epoch (loss 2.6623): 29%|βββ | 364/1250 [02:15<04:57, 2.98it/s]
Training 1/1 epoch (loss 2.6623): 29%|βββ | 365/1250 [02:15<05:18, 2.78it/s]
Training 1/1 epoch (loss 2.5515): 29%|βββ | 365/1250 [02:15<05:18, 2.78it/s]
Training 1/1 epoch (loss 2.5515): 29%|βββ | 366/1250 [02:15<05:12, 2.83it/s]
Training 1/1 epoch (loss 2.6564): 29%|βββ | 366/1250 [02:15<05:12, 2.83it/s]
Training 1/1 epoch (loss 2.6564): 29%|βββ | 367/1250 [02:15<05:03, 2.91it/s]
Training 1/1 epoch (loss 2.7153): 29%|βββ | 367/1250 [02:16<05:03, 2.91it/s]
Training 1/1 epoch (loss 2.7153): 29%|βββ | 368/1250 [02:16<05:03, 2.91it/s]
Training 1/1 epoch (loss 2.8273): 29%|βββ | 368/1250 [02:16<05:03, 2.91it/s]
Training 1/1 epoch (loss 2.8273): 30%|βββ | 369/1250 [02:16<05:03, 2.91it/s]
Training 1/1 epoch (loss 2.8851): 30%|βββ | 369/1250 [02:16<05:03, 2.91it/s]
Training 1/1 epoch (loss 2.8851): 30%|βββ | 370/1250 [02:16<05:02, 2.91it/s]
Training 1/1 epoch (loss 2.6380): 30%|βββ | 370/1250 [02:17<05:02, 2.91it/s]
Training 1/1 epoch (loss 2.6380): 30%|βββ | 371/1250 [02:17<04:59, 2.94it/s]
Training 1/1 epoch (loss 2.7528): 30%|βββ | 371/1250 [02:17<04:59, 2.94it/s]
Training 1/1 epoch (loss 2.7528): 30%|βββ | 372/1250 [02:17<04:56, 2.96it/s]
Training 1/1 epoch (loss 2.6994): 30%|βββ | 372/1250 [02:17<04:56, 2.96it/s]
Training 1/1 epoch (loss 2.6994): 30%|βββ | 373/1250 [02:17<04:48, 3.04it/s]
Training 1/1 epoch (loss 2.5205): 30%|βββ | 373/1250 [02:18<04:48, 3.04it/s]
Training 1/1 epoch (loss 2.5205): 30%|βββ | 374/1250 [02:18<04:40, 3.12it/s]
Training 1/1 epoch (loss 2.7465): 30%|βββ | 374/1250 [02:18<04:40, 3.12it/s]
Training 1/1 epoch (loss 2.7465): 30%|βββ | 375/1250 [02:18<04:40, 3.12it/s]
Training 1/1 epoch (loss 2.9521): 30%|βββ | 375/1250 [02:18<04:40, 3.12it/s]
Training 1/1 epoch (loss 2.9521): 30%|βββ | 376/1250 [02:18<04:54, 2.97it/s]
Training 1/1 epoch (loss 2.9065): 30%|βββ | 376/1250 [02:19<04:54, 2.97it/s]
Training 1/1 epoch (loss 2.9065): 30%|βββ | 377/1250 [02:19<05:09, 2.82it/s]
Training 1/1 epoch (loss 2.7587): 30%|βββ | 377/1250 [02:19<05:09, 2.82it/s]
Training 1/1 epoch (loss 2.7587): 30%|βββ | 378/1250 [02:19<05:01, 2.89it/s]
Training 1/1 epoch (loss 2.6394): 30%|βββ | 378/1250 [02:19<05:01, 2.89it/s]
Training 1/1 epoch (loss 2.6394): 30%|βββ | 379/1250 [02:19<04:50, 3.00it/s]
Training 1/1 epoch (loss 2.8081): 30%|βββ | 379/1250 [02:20<04:50, 3.00it/s]
Training 1/1 epoch (loss 2.8081): 30%|βββ | 380/1250 [02:20<04:44, 3.06it/s]
Training 1/1 epoch (loss 2.8622): 30%|βββ | 380/1250 [02:20<04:44, 3.06it/s]
Training 1/1 epoch (loss 2.8622): 30%|βββ | 381/1250 [02:20<04:43, 3.06it/s]
Training 1/1 epoch (loss 2.9108): 30%|βββ | 381/1250 [02:20<04:43, 3.06it/s]
Training 1/1 epoch (loss 2.9108): 31%|βββ | 382/1250 [02:20<04:50, 2.99it/s]
Training 1/1 epoch (loss 2.7704): 31%|βββ | 382/1250 [02:21<04:50, 2.99it/s]
Training 1/1 epoch (loss 2.7704): 31%|βββ | 383/1250 [02:21<04:47, 3.02it/s]
Training 1/1 epoch (loss 2.6240): 31%|βββ | 383/1250 [02:21<04:47, 3.02it/s]
Training 1/1 epoch (loss 2.6240): 31%|βββ | 384/1250 [02:21<05:05, 2.83it/s]
Training 1/1 epoch (loss 2.7724): 31%|βββ | 384/1250 [02:21<05:05, 2.83it/s]
Training 1/1 epoch (loss 2.7724): 31%|βββ | 385/1250 [02:21<05:03, 2.85it/s]
Training 1/1 epoch (loss 2.5598): 31%|βββ | 385/1250 [02:22<05:03, 2.85it/s]
Training 1/1 epoch (loss 2.5598): 31%|βββ | 386/1250 [02:22<04:47, 3.01it/s]
Training 1/1 epoch (loss 2.6407): 31%|βββ | 386/1250 [02:22<04:47, 3.01it/s]
Training 1/1 epoch (loss 2.6407): 31%|βββ | 387/1250 [02:22<05:09, 2.79it/s]
Training 1/1 epoch (loss 2.7247): 31%|βββ | 387/1250 [02:22<05:09, 2.79it/s]
Training 1/1 epoch (loss 2.7247): 31%|βββ | 388/1250 [02:22<05:02, 2.85it/s]
Training 1/1 epoch (loss 2.7953): 31%|βββ | 388/1250 [02:23<05:02, 2.85it/s]
Training 1/1 epoch (loss 2.7953): 31%|βββ | 389/1250 [02:23<06:21, 2.26it/s]
Training 1/1 epoch (loss 2.7305): 31%|βββ | 389/1250 [02:23<06:21, 2.26it/s]
Training 1/1 epoch (loss 2.7305): 31%|βββ | 390/1250 [02:23<05:47, 2.48it/s]
Training 1/1 epoch (loss 2.9599): 31%|βββ | 390/1250 [02:24<05:47, 2.48it/s]
Training 1/1 epoch (loss 2.9599): 31%|ββββ | 391/1250 [02:24<05:37, 2.55it/s]
Training 1/1 epoch (loss 2.7857): 31%|ββββ | 391/1250 [02:24<05:37, 2.55it/s]
Training 1/1 epoch (loss 2.7857): 31%|ββββ | 392/1250 [02:24<05:29, 2.61it/s]
Training 1/1 epoch (loss 2.8636): 31%|ββββ | 392/1250 [02:24<05:29, 2.61it/s]
Training 1/1 epoch (loss 2.8636): 31%|ββββ | 393/1250 [02:24<05:32, 2.58it/s]
Training 1/1 epoch (loss 2.6808): 31%|ββββ | 393/1250 [02:25<05:32, 2.58it/s]
Training 1/1 epoch (loss 2.6808): 32%|ββββ | 394/1250 [02:25<05:25, 2.63it/s]
Training 1/1 epoch (loss 2.4952): 32%|ββββ | 394/1250 [02:25<05:25, 2.63it/s]
Training 1/1 epoch (loss 2.4952): 32%|ββββ | 395/1250 [02:25<05:27, 2.61it/s]
Training 1/1 epoch (loss 2.7016): 32%|ββββ | 395/1250 [02:26<05:27, 2.61it/s]
Training 1/1 epoch (loss 2.7016): 32%|ββββ | 396/1250 [02:26<05:12, 2.74it/s]
Training 1/1 epoch (loss 2.5014): 32%|ββββ | 396/1250 [02:26<05:12, 2.74it/s]
Training 1/1 epoch (loss 2.5014): 32%|ββββ | 397/1250 [02:26<05:01, 2.83it/s]
Training 1/1 epoch (loss 2.6134): 32%|ββββ | 397/1250 [02:26<05:01, 2.83it/s]
Training 1/1 epoch (loss 2.6134): 32%|ββββ | 398/1250 [02:26<05:19, 2.67it/s]
Training 1/1 epoch (loss 2.8076): 32%|ββββ | 398/1250 [02:27<05:19, 2.67it/s]
Training 1/1 epoch (loss 2.8076): 32%|ββββ | 399/1250 [02:27<05:11, 2.73it/s]
Training 1/1 epoch (loss 2.8822): 32%|ββββ | 399/1250 [02:27<05:11, 2.73it/s]
Training 1/1 epoch (loss 2.8822): 32%|ββββ | 400/1250 [02:27<05:13, 2.71it/s]
Training 1/1 epoch (loss 2.6062): 32%|ββββ | 400/1250 [02:27<05:13, 2.71it/s]
Training 1/1 epoch (loss 2.6062): 32%|ββββ | 401/1250 [02:27<05:22, 2.63it/s]
Training 1/1 epoch (loss 2.9317): 32%|ββββ | 401/1250 [02:28<05:22, 2.63it/s]
Training 1/1 epoch (loss 2.9317): 32%|ββββ | 402/1250 [02:28<05:21, 2.64it/s]
Training 1/1 epoch (loss 2.8921): 32%|ββββ | 402/1250 [02:28<05:21, 2.64it/s]
Training 1/1 epoch (loss 2.8921): 32%|ββββ | 403/1250 [02:28<05:47, 2.43it/s]
Training 1/1 epoch (loss 2.6160): 32%|ββββ | 403/1250 [02:29<05:47, 2.43it/s]
Training 1/1 epoch (loss 2.6160): 32%|ββββ | 404/1250 [02:29<05:39, 2.49it/s]
Training 1/1 epoch (loss 2.7883): 32%|ββββ | 404/1250 [02:29<05:39, 2.49it/s]
Training 1/1 epoch (loss 2.7883): 32%|ββββ | 405/1250 [02:29<05:09, 2.73it/s]
Training 1/1 epoch (loss 2.7512): 32%|ββββ | 405/1250 [02:29<05:09, 2.73it/s]
Training 1/1 epoch (loss 2.7512): 32%|ββββ | 406/1250 [02:29<04:55, 2.85it/s]
Training 1/1 epoch (loss 2.6987): 32%|ββββ | 406/1250 [02:30<04:55, 2.85it/s]
Training 1/1 epoch (loss 2.6987): 33%|ββββ | 407/1250 [02:30<04:50, 2.90it/s]
Training 1/1 epoch (loss 2.7689): 33%|ββββ | 407/1250 [02:30<04:50, 2.90it/s]
Training 1/1 epoch (loss 2.7689): 33%|ββββ | 408/1250 [02:30<04:47, 2.93it/s]
Training 1/1 epoch (loss 2.9144): 33%|ββββ | 408/1250 [02:30<04:47, 2.93it/s]
Training 1/1 epoch (loss 2.9144): 33%|ββββ | 409/1250 [02:30<04:50, 2.89it/s]
Training 1/1 epoch (loss 2.9180): 33%|ββββ | 409/1250 [02:31<04:50, 2.89it/s]
Training 1/1 epoch (loss 2.9180): 33%|ββββ | 410/1250 [02:31<04:59, 2.81it/s]
Training 1/1 epoch (loss 2.6599): 33%|ββββ | 410/1250 [02:31<04:59, 2.81it/s]
Training 1/1 epoch (loss 2.6599): 33%|ββββ | 411/1250 [02:31<04:45, 2.94it/s]
Training 1/1 epoch (loss 2.5595): 33%|ββββ | 411/1250 [02:31<04:45, 2.94it/s]
Training 1/1 epoch (loss 2.5595): 33%|ββββ | 412/1250 [02:31<04:35, 3.04it/s]
Training 1/1 epoch (loss 2.6950): 33%|ββββ | 412/1250 [02:32<04:35, 3.04it/s]
Training 1/1 epoch (loss 2.6950): 33%|ββββ | 413/1250 [02:32<04:36, 3.03it/s]
Training 1/1 epoch (loss 2.8636): 33%|ββββ | 413/1250 [02:32<04:36, 3.03it/s]
Training 1/1 epoch (loss 2.8636): 33%|ββββ | 414/1250 [02:32<04:32, 3.06it/s]
Training 1/1 epoch (loss 2.7607): 33%|ββββ | 414/1250 [02:32<04:32, 3.06it/s]
Training 1/1 epoch (loss 2.7607): 33%|ββββ | 415/1250 [02:32<04:29, 3.09it/s]
Training 1/1 epoch (loss 2.8561): 33%|ββββ | 415/1250 [02:33<04:29, 3.09it/s]
Training 1/1 epoch (loss 2.8561): 33%|ββββ | 416/1250 [02:33<04:43, 2.94it/s]
Training 1/1 epoch (loss 2.8872): 33%|ββββ | 416/1250 [02:33<04:43, 2.94it/s]
Training 1/1 epoch (loss 2.8872): 33%|ββββ | 417/1250 [02:33<04:45, 2.92it/s]
Training 1/1 epoch (loss 2.7830): 33%|ββββ | 417/1250 [02:33<04:45, 2.92it/s]
Training 1/1 epoch (loss 2.7830): 33%|ββββ | 418/1250 [02:33<04:47, 2.89it/s]
Training 1/1 epoch (loss 2.8993): 33%|ββββ | 418/1250 [02:34<04:47, 2.89it/s]
Training 1/1 epoch (loss 2.8993): 34%|ββββ | 419/1250 [02:34<04:59, 2.78it/s]
Training 1/1 epoch (loss 2.7858): 34%|ββββ | 419/1250 [02:34<04:59, 2.78it/s]
Training 1/1 epoch (loss 2.7858): 34%|ββββ | 420/1250 [02:34<04:56, 2.80it/s]
Training 1/1 epoch (loss 2.7142): 34%|ββββ | 420/1250 [02:34<04:56, 2.80it/s]
Training 1/1 epoch (loss 2.7142): 34%|ββββ | 421/1250 [02:34<04:57, 2.79it/s]
Training 1/1 epoch (loss 2.7387): 34%|ββββ | 421/1250 [02:35<04:57, 2.79it/s]
Training 1/1 epoch (loss 2.7387): 34%|ββββ | 422/1250 [02:35<05:13, 2.64it/s]
Training 1/1 epoch (loss 2.8256): 34%|ββββ | 422/1250 [02:35<05:13, 2.64it/s]
Training 1/1 epoch (loss 2.8256): 34%|ββββ | 423/1250 [02:35<05:04, 2.71it/s]
Training 1/1 epoch (loss 2.7409): 34%|ββββ | 423/1250 [02:36<05:04, 2.71it/s]
Training 1/1 epoch (loss 2.7409): 34%|ββββ | 424/1250 [02:36<05:04, 2.71it/s]
Training 1/1 epoch (loss 2.8812): 34%|ββββ | 424/1250 [02:36<05:04, 2.71it/s]
Training 1/1 epoch (loss 2.8812): 34%|ββββ | 425/1250 [02:36<05:06, 2.69it/s]
Training 1/1 epoch (loss 3.0137): 34%|ββββ | 425/1250 [02:36<05:06, 2.69it/s]
Training 1/1 epoch (loss 3.0137): 34%|ββββ | 426/1250 [02:36<04:59, 2.75it/s]
Training 1/1 epoch (loss 2.7769): 34%|ββββ | 426/1250 [02:37<04:59, 2.75it/s]
Training 1/1 epoch (loss 2.7769): 34%|ββββ | 427/1250 [02:37<04:50, 2.83it/s]
Training 1/1 epoch (loss 2.6991): 34%|ββββ | 427/1250 [02:37<04:50, 2.83it/s]
Training 1/1 epoch (loss 2.6991): 34%|ββββ | 428/1250 [02:37<04:59, 2.74it/s]
Training 1/1 epoch (loss 2.7067): 34%|ββββ | 428/1250 [02:37<04:59, 2.74it/s]
Training 1/1 epoch (loss 2.7067): 34%|ββββ | 429/1250 [02:37<04:45, 2.88it/s]
Training 1/1 epoch (loss 2.5678): 34%|ββββ | 429/1250 [02:38<04:45, 2.88it/s]
Training 1/1 epoch (loss 2.5678): 34%|ββββ | 430/1250 [02:38<04:35, 2.98it/s]
Training 1/1 epoch (loss 2.7060): 34%|ββββ | 430/1250 [02:38<04:35, 2.98it/s]
Training 1/1 epoch (loss 2.7060): 34%|ββββ | 431/1250 [02:38<04:31, 3.02it/s]
Training 1/1 epoch (loss 2.8099): 34%|ββββ | 431/1250 [02:38<04:31, 3.02it/s]
Training 1/1 epoch (loss 2.8099): 35%|ββββ | 432/1250 [02:38<04:28, 3.05it/s]
Training 1/1 epoch (loss 2.6792): 35%|ββββ | 432/1250 [02:39<04:28, 3.05it/s]
Training 1/1 epoch (loss 2.6792): 35%|ββββ | 433/1250 [02:39<04:38, 2.93it/s]
Training 1/1 epoch (loss 2.8604): 35%|ββββ | 433/1250 [02:39<04:38, 2.93it/s]
Training 1/1 epoch (loss 2.8604): 35%|ββββ | 434/1250 [02:39<04:31, 3.00it/s]
Training 1/1 epoch (loss 2.8567): 35%|ββββ | 434/1250 [02:39<04:31, 3.00it/s]
Training 1/1 epoch (loss 2.8567): 35%|ββββ | 435/1250 [02:39<04:27, 3.04it/s]
Training 1/1 epoch (loss 2.8899): 35%|ββββ | 435/1250 [02:40<04:27, 3.04it/s]
Training 1/1 epoch (loss 2.8899): 35%|ββββ | 436/1250 [02:40<04:18, 3.14it/s]
Training 1/1 epoch (loss 2.7012): 35%|ββββ | 436/1250 [02:40<04:18, 3.14it/s]
Training 1/1 epoch (loss 2.7012): 35%|ββββ | 437/1250 [02:40<04:17, 3.16it/s]
Training 1/1 epoch (loss 2.6956): 35%|ββββ | 437/1250 [02:40<04:17, 3.16it/s]
Training 1/1 epoch (loss 2.6956): 35%|ββββ | 438/1250 [02:40<04:19, 3.13it/s]
Training 1/1 epoch (loss 2.8203): 35%|ββββ | 438/1250 [02:41<04:19, 3.13it/s]
Training 1/1 epoch (loss 2.8203): 35%|ββββ | 439/1250 [02:41<04:23, 3.07it/s]
Training 1/1 epoch (loss 3.1007): 35%|ββββ | 439/1250 [02:41<04:23, 3.07it/s]
Training 1/1 epoch (loss 3.1007): 35%|ββββ | 440/1250 [02:41<04:37, 2.92it/s]
Training 1/1 epoch (loss 2.6527): 35%|ββββ | 440/1250 [02:41<04:37, 2.92it/s]
Training 1/1 epoch (loss 2.6527): 35%|ββββ | 441/1250 [02:41<04:37, 2.92it/s]
Training 1/1 epoch (loss 2.7218): 35%|ββββ | 441/1250 [02:42<04:37, 2.92it/s]
Training 1/1 epoch (loss 2.7218): 35%|ββββ | 442/1250 [02:42<04:24, 3.06it/s]
Training 1/1 epoch (loss 2.7354): 35%|ββββ | 442/1250 [02:42<04:24, 3.06it/s]
Training 1/1 epoch (loss 2.7354): 35%|ββββ | 443/1250 [02:42<04:19, 3.11it/s]
Training 1/1 epoch (loss 2.9662): 35%|ββββ | 443/1250 [02:42<04:19, 3.11it/s]
Training 1/1 epoch (loss 2.9662): 36%|ββββ | 444/1250 [02:42<04:38, 2.90it/s]
Training 1/1 epoch (loss 2.6802): 36%|ββββ | 444/1250 [02:43<04:38, 2.90it/s]
Training 1/1 epoch (loss 2.6802): 36%|ββββ | 445/1250 [02:43<04:32, 2.95it/s]
Training 1/1 epoch (loss 2.6602): 36%|ββββ | 445/1250 [02:43<04:32, 2.95it/s]
Training 1/1 epoch (loss 2.6602): 36%|ββββ | 446/1250 [02:43<04:34, 2.93it/s]
Training 1/1 epoch (loss 2.7585): 36%|ββββ | 446/1250 [02:43<04:34, 2.93it/s]
Training 1/1 epoch (loss 2.7585): 36%|ββββ | 447/1250 [02:43<04:28, 2.99it/s]
Training 1/1 epoch (loss 2.5878): 36%|ββββ | 447/1250 [02:44<04:28, 2.99it/s]
Training 1/1 epoch (loss 2.5878): 36%|ββββ | 448/1250 [02:44<04:36, 2.91it/s]
Training 1/1 epoch (loss 2.8178): 36%|ββββ | 448/1250 [02:44<04:36, 2.91it/s]
Training 1/1 epoch (loss 2.8178): 36%|ββββ | 449/1250 [02:44<04:31, 2.95it/s]
Training 1/1 epoch (loss 2.7410): 36%|ββββ | 449/1250 [02:44<04:31, 2.95it/s]
Training 1/1 epoch (loss 2.7410): 36%|ββββ | 450/1250 [02:44<04:29, 2.97it/s]
Training 1/1 epoch (loss 2.6744): 36%|ββββ | 450/1250 [02:45<04:29, 2.97it/s]
Training 1/1 epoch (loss 2.6744): 36%|ββββ | 451/1250 [02:45<04:42, 2.83it/s]
Training 1/1 epoch (loss 2.6418): 36%|ββββ | 451/1250 [02:45<04:42, 2.83it/s]
Training 1/1 epoch (loss 2.6418): 36%|ββββ | 452/1250 [02:45<04:32, 2.93it/s]
Training 1/1 epoch (loss 2.8310): 36%|ββββ | 452/1250 [02:45<04:32, 2.93it/s]
Training 1/1 epoch (loss 2.8310): 36%|ββββ | 453/1250 [02:45<04:30, 2.95it/s]
Training 1/1 epoch (loss 2.7421): 36%|ββββ | 453/1250 [02:46<04:30, 2.95it/s]
Training 1/1 epoch (loss 2.7421): 36%|ββββ | 454/1250 [02:46<04:25, 3.00it/s]
Training 1/1 epoch (loss 2.7024): 36%|ββββ | 454/1250 [02:46<04:25, 3.00it/s]
Training 1/1 epoch (loss 2.7024): 36%|ββββ | 455/1250 [02:46<04:31, 2.93it/s]
Training 1/1 epoch (loss 2.8726): 36%|ββββ | 455/1250 [02:46<04:31, 2.93it/s]
Training 1/1 epoch (loss 2.8726): 36%|ββββ | 456/1250 [02:46<04:47, 2.76it/s]
Training 1/1 epoch (loss 2.9177): 36%|ββββ | 456/1250 [02:47<04:47, 2.76it/s]
Training 1/1 epoch (loss 2.9177): 37%|ββββ | 457/1250 [02:47<04:48, 2.75it/s]
Training 1/1 epoch (loss 2.7573): 37%|ββββ | 457/1250 [02:47<04:48, 2.75it/s]
Training 1/1 epoch (loss 2.7573): 37%|ββββ | 458/1250 [02:47<04:36, 2.86it/s]
Training 1/1 epoch (loss 2.9776): 37%|ββββ | 458/1250 [02:47<04:36, 2.86it/s]
Training 1/1 epoch (loss 2.9776): 37%|ββββ | 459/1250 [02:47<04:38, 2.84it/s]
Training 1/1 epoch (loss 2.8129): 37%|ββββ | 459/1250 [02:48<04:38, 2.84it/s]
Training 1/1 epoch (loss 2.8129): 37%|ββββ | 460/1250 [02:48<04:31, 2.91it/s]
Training 1/1 epoch (loss 2.6111): 37%|ββββ | 460/1250 [02:48<04:31, 2.91it/s]
Training 1/1 epoch (loss 2.6111): 37%|ββββ | 461/1250 [02:48<04:19, 3.04it/s]
Training 1/1 epoch (loss 2.6947): 37%|ββββ | 461/1250 [02:48<04:19, 3.04it/s]
Training 1/1 epoch (loss 2.6947): 37%|ββββ | 462/1250 [02:48<04:19, 3.04it/s]
Training 1/1 epoch (loss 2.7731): 37%|ββββ | 462/1250 [02:49<04:19, 3.04it/s]
Training 1/1 epoch (loss 2.7731): 37%|ββββ | 463/1250 [02:49<04:24, 2.97it/s]
Training 1/1 epoch (loss 2.6580): 37%|ββββ | 463/1250 [02:49<04:24, 2.97it/s]
Training 1/1 epoch (loss 2.6580): 37%|ββββ | 464/1250 [02:49<04:29, 2.92it/s]
Training 1/1 epoch (loss 2.4423): 37%|ββββ | 464/1250 [02:49<04:29, 2.92it/s]
Training 1/1 epoch (loss 2.4423): 37%|ββββ | 465/1250 [02:49<04:25, 2.95it/s]
Training 1/1 epoch (loss 2.5912): 37%|ββββ | 465/1250 [02:50<04:25, 2.95it/s]
Training 1/1 epoch (loss 2.5912): 37%|ββββ | 466/1250 [02:50<04:15, 3.07it/s]
Training 1/1 epoch (loss 2.6140): 37%|ββββ | 466/1250 [02:50<04:15, 3.07it/s]
Training 1/1 epoch (loss 2.6140): 37%|ββββ | 467/1250 [02:50<04:13, 3.09it/s]
Training 1/1 epoch (loss 2.7745): 37%|ββββ | 467/1250 [02:50<04:13, 3.09it/s]
Training 1/1 epoch (loss 2.7745): 37%|ββββ | 468/1250 [02:50<04:20, 3.00it/s]
Training 1/1 epoch (loss 2.8650): 37%|ββββ | 468/1250 [02:51<04:20, 3.00it/s]
Training 1/1 epoch (loss 2.8650): 38%|ββββ | 469/1250 [02:51<04:21, 2.98it/s]
Training 1/1 epoch (loss 2.8666): 38%|ββββ | 469/1250 [02:51<04:21, 2.98it/s]
Training 1/1 epoch (loss 2.8666): 38%|ββββ | 470/1250 [02:51<04:18, 3.01it/s]
Training 1/1 epoch (loss 2.8657): 38%|ββββ | 470/1250 [02:51<04:18, 3.01it/s]
Training 1/1 epoch (loss 2.8657): 38%|ββββ | 471/1250 [02:51<04:21, 2.98it/s]
Training 1/1 epoch (loss 2.6816): 38%|ββββ | 471/1250 [02:52<04:21, 2.98it/s]
Training 1/1 epoch (loss 2.6816): 38%|ββββ | 472/1250 [02:52<04:28, 2.90it/s]
Training 1/1 epoch (loss 2.6764): 38%|ββββ | 472/1250 [02:52<04:28, 2.90it/s]
Training 1/1 epoch (loss 2.6764): 38%|ββββ | 473/1250 [02:52<04:24, 2.94it/s]
Training 1/1 epoch (loss 2.6320): 38%|ββββ | 473/1250 [02:52<04:24, 2.94it/s]
Training 1/1 epoch (loss 2.6320): 38%|ββββ | 474/1250 [02:52<04:32, 2.84it/s]
Training 1/1 epoch (loss 2.7111): 38%|ββββ | 474/1250 [02:53<04:32, 2.84it/s]
Training 1/1 epoch (loss 2.7111): 38%|ββββ | 475/1250 [02:53<06:06, 2.11it/s]
Training 1/1 epoch (loss 2.8139): 38%|ββββ | 475/1250 [02:54<06:06, 2.11it/s]
Training 1/1 epoch (loss 2.8139): 38%|ββββ | 476/1250 [02:54<05:56, 2.17it/s]
Training 1/1 epoch (loss 2.7732): 38%|ββββ | 476/1250 [02:54<05:56, 2.17it/s]
Training 1/1 epoch (loss 2.7732): 38%|ββββ | 477/1250 [02:54<05:48, 2.22it/s]
Training 1/1 epoch (loss 2.9237): 38%|ββββ | 477/1250 [02:54<05:48, 2.22it/s]
Training 1/1 epoch (loss 2.9237): 38%|ββββ | 478/1250 [02:54<05:34, 2.31it/s]
Training 1/1 epoch (loss 2.6960): 38%|ββββ | 478/1250 [02:55<05:34, 2.31it/s]
Training 1/1 epoch (loss 2.6960): 38%|ββββ | 479/1250 [02:55<05:26, 2.36it/s]
Training 1/1 epoch (loss 2.7336): 38%|ββββ | 479/1250 [02:56<05:26, 2.36it/s]
Training 1/1 epoch (loss 2.7336): 38%|ββββ | 480/1250 [02:56<06:11, 2.07it/s]
Training 1/1 epoch (loss 2.6960): 38%|ββββ | 480/1250 [02:56<06:11, 2.07it/s]
Training 1/1 epoch (loss 2.6960): 38%|ββββ | 481/1250 [02:56<06:17, 2.04it/s]
Training 1/1 epoch (loss 2.9209): 38%|ββββ | 481/1250 [02:57<06:17, 2.04it/s]
Training 1/1 epoch (loss 2.9209): 39%|ββββ | 482/1250 [02:57<06:52, 1.86it/s]
Training 1/1 epoch (loss 2.6867): 39%|ββββ | 482/1250 [02:57<06:52, 1.86it/s]
Training 1/1 epoch (loss 2.6867): 39%|ββββ | 483/1250 [02:57<07:35, 1.68it/s]
Training 1/1 epoch (loss 2.6091): 39%|ββββ | 483/1250 [02:58<07:35, 1.68it/s]
Training 1/1 epoch (loss 2.6091): 39%|ββββ | 484/1250 [02:58<08:41, 1.47it/s]
Training 1/1 epoch (loss 2.6082): 39%|ββββ | 484/1250 [02:59<08:41, 1.47it/s]
Training 1/1 epoch (loss 2.6082): 39%|ββββ | 485/1250 [02:59<08:49, 1.44it/s]
Training 1/1 epoch (loss 2.7619): 39%|ββββ | 485/1250 [03:00<08:49, 1.44it/s]
Training 1/1 epoch (loss 2.7619): 39%|ββββ | 486/1250 [03:00<08:28, 1.50it/s]
Training 1/1 epoch (loss 2.7679): 39%|ββββ | 486/1250 [03:00<08:28, 1.50it/s]
Training 1/1 epoch (loss 2.7679): 39%|ββββ | 487/1250 [03:00<07:06, 1.79it/s]
Training 1/1 epoch (loss 2.6222): 39%|ββββ | 487/1250 [03:00<07:06, 1.79it/s]
Training 1/1 epoch (loss 2.6222): 39%|ββββ | 488/1250 [03:00<06:15, 2.03it/s]
Training 1/1 epoch (loss 2.7442): 39%|ββββ | 488/1250 [03:01<06:15, 2.03it/s]
Training 1/1 epoch (loss 2.7442): 39%|ββββ | 489/1250 [03:01<05:52, 2.16it/s]
Training 1/1 epoch (loss 2.9563): 39%|ββββ | 489/1250 [03:01<05:52, 2.16it/s]
Training 1/1 epoch (loss 2.9563): 39%|ββββ | 490/1250 [03:01<05:30, 2.30it/s]
Training 1/1 epoch (loss 2.7571): 39%|ββββ | 490/1250 [03:01<05:30, 2.30it/s]
Training 1/1 epoch (loss 2.7571): 39%|ββββ | 491/1250 [03:01<05:01, 2.52it/s]
Training 1/1 epoch (loss 2.6694): 39%|ββββ | 491/1250 [03:02<05:01, 2.52it/s]
Training 1/1 epoch (loss 2.6694): 39%|ββββ | 492/1250 [03:02<04:54, 2.57it/s]
Training 1/1 epoch (loss 2.7813): 39%|ββββ | 492/1250 [03:02<04:54, 2.57it/s]
Training 1/1 epoch (loss 2.7813): 39%|ββββ | 493/1250 [03:02<04:34, 2.76it/s]
Training 1/1 epoch (loss 2.5468): 39%|ββββ | 493/1250 [03:02<04:34, 2.76it/s]
Training 1/1 epoch (loss 2.5468): 40%|ββββ | 494/1250 [03:02<04:30, 2.80it/s]
Training 1/1 epoch (loss 2.8784): 40%|ββββ | 494/1250 [03:03<04:30, 2.80it/s]
Training 1/1 epoch (loss 2.8784): 40%|ββββ | 495/1250 [03:03<04:32, 2.77it/s]
Training 1/1 epoch (loss 2.8191): 40%|ββββ | 495/1250 [03:03<04:32, 2.77it/s]
Training 1/1 epoch (loss 2.8191): 40%|ββββ | 496/1250 [03:03<04:41, 2.68it/s]
Training 1/1 epoch (loss 2.6459): 40%|ββββ | 496/1250 [03:03<04:41, 2.68it/s]
Training 1/1 epoch (loss 2.6459): 40%|ββββ | 497/1250 [03:03<04:44, 2.65it/s]
Training 1/1 epoch (loss 2.8724): 40%|ββββ | 497/1250 [03:04<04:44, 2.65it/s]
Training 1/1 epoch (loss 2.8724): 40%|ββββ | 498/1250 [03:04<04:51, 2.58it/s]
Training 1/1 epoch (loss 2.4903): 40%|ββββ | 498/1250 [03:04<04:51, 2.58it/s]
Training 1/1 epoch (loss 2.4903): 40%|ββββ | 499/1250 [03:04<04:50, 2.58it/s]
Training 1/1 epoch (loss 2.8719): 40%|ββββ | 499/1250 [03:05<04:50, 2.58it/s]
Training 1/1 epoch (loss 2.8719): 40%|ββββ | 500/1250 [03:05<04:35, 2.72it/s]
Training 1/1 epoch (loss 2.7786): 40%|ββββ | 500/1250 [03:05<04:35, 2.72it/s]
Training 1/1 epoch (loss 2.7786): 40%|ββββ | 501/1250 [03:05<04:32, 2.75it/s]
Training 1/1 epoch (loss 2.9849): 40%|ββββ | 501/1250 [03:05<04:32, 2.75it/s]
Training 1/1 epoch (loss 2.9849): 40%|ββββ | 502/1250 [03:05<04:32, 2.74it/s]
Training 1/1 epoch (loss 3.0052): 40%|ββββ | 502/1250 [03:06<04:32, 2.74it/s]
Training 1/1 epoch (loss 3.0052): 40%|ββββ | 503/1250 [03:06<04:31, 2.76it/s]
Training 1/1 epoch (loss 2.5883): 40%|ββββ | 503/1250 [03:06<04:31, 2.76it/s]
Training 1/1 epoch (loss 2.5883): 40%|ββββ | 504/1250 [03:06<04:30, 2.76it/s]
Training 1/1 epoch (loss 2.9588): 40%|ββββ | 504/1250 [03:06<04:30, 2.76it/s]
Training 1/1 epoch (loss 2.9588): 40%|ββββ | 505/1250 [03:06<04:28, 2.78it/s]
Training 1/1 epoch (loss 2.9290): 40%|ββββ | 505/1250 [03:07<04:28, 2.78it/s]
Training 1/1 epoch (loss 2.9290): 40%|ββββ | 506/1250 [03:07<04:28, 2.77it/s]
Training 1/1 epoch (loss 2.7336): 40%|ββββ | 506/1250 [03:07<04:28, 2.77it/s]
Training 1/1 epoch (loss 2.7336): 41%|ββββ | 507/1250 [03:07<04:40, 2.65it/s]
Training 1/1 epoch (loss 2.7791): 41%|ββββ | 507/1250 [03:08<04:40, 2.65it/s]
Training 1/1 epoch (loss 2.7791): 41%|ββββ | 508/1250 [03:08<04:44, 2.60it/s]
Training 1/1 epoch (loss 2.8120): 41%|ββββ | 508/1250 [03:08<04:44, 2.60it/s]
Training 1/1 epoch (loss 2.8120): 41%|ββββ | 509/1250 [03:08<04:41, 2.63it/s]
Training 1/1 epoch (loss 2.6461): 41%|ββββ | 509/1250 [03:08<04:41, 2.63it/s]
Training 1/1 epoch (loss 2.6461): 41%|ββββ | 510/1250 [03:08<04:38, 2.66it/s]
Training 1/1 epoch (loss 2.7195): 41%|ββββ | 510/1250 [03:09<04:38, 2.66it/s]
Training 1/1 epoch (loss 2.7195): 41%|ββββ | 511/1250 [03:09<04:30, 2.74it/s]
Training 1/1 epoch (loss 2.7153): 41%|ββββ | 511/1250 [03:09<04:30, 2.74it/s]
Training 1/1 epoch (loss 2.7153): 41%|ββββ | 512/1250 [03:09<04:42, 2.61it/s]
Training 1/1 epoch (loss 2.7367): 41%|ββββ | 512/1250 [03:10<04:42, 2.61it/s]
Training 1/1 epoch (loss 2.7367): 41%|ββββ | 513/1250 [03:10<04:53, 2.51it/s]
Training 1/1 epoch (loss 2.6231): 41%|ββββ | 513/1250 [03:10<04:53, 2.51it/s]
Training 1/1 epoch (loss 2.6231): 41%|ββββ | 514/1250 [03:10<04:40, 2.62it/s]
Training 1/1 epoch (loss 2.6504): 41%|ββββ | 514/1250 [03:10<04:40, 2.62it/s]
Training 1/1 epoch (loss 2.6504): 41%|ββββ | 515/1250 [03:10<04:28, 2.74it/s]
Training 1/1 epoch (loss 2.7719): 41%|ββββ | 515/1250 [03:11<04:28, 2.74it/s]
Training 1/1 epoch (loss 2.7719): 41%|βββββ | 516/1250 [03:11<04:17, 2.86it/s]
Training 1/1 epoch (loss 2.8487): 41%|βββββ | 516/1250 [03:11<04:17, 2.86it/s]
Training 1/1 epoch (loss 2.8487): 41%|βββββ | 517/1250 [03:11<04:06, 2.97it/s]
Training 1/1 epoch (loss 2.7260): 41%|βββββ | 517/1250 [03:11<04:06, 2.97it/s]
Training 1/1 epoch (loss 2.7260): 41%|βββββ | 518/1250 [03:11<04:09, 2.93it/s]
Training 1/1 epoch (loss 2.6551): 41%|βββββ | 518/1250 [03:12<04:09, 2.93it/s]
Training 1/1 epoch (loss 2.6551): 42%|βββββ | 519/1250 [03:12<04:12, 2.90it/s]
Training 1/1 epoch (loss 2.5670): 42%|βββββ | 519/1250 [03:12<04:12, 2.90it/s]
Training 1/1 epoch (loss 2.5670): 42%|βββββ | 520/1250 [03:12<04:06, 2.96it/s]
Training 1/1 epoch (loss 2.8664): 42%|βββββ | 520/1250 [03:12<04:06, 2.96it/s]
Training 1/1 epoch (loss 2.8664): 42%|βββββ | 521/1250 [03:12<04:04, 2.98it/s]
Training 1/1 epoch (loss 2.8194): 42%|βββββ | 521/1250 [03:13<04:04, 2.98it/s]
Training 1/1 epoch (loss 2.8194): 42%|βββββ | 522/1250 [03:13<04:03, 2.99it/s]
Training 1/1 epoch (loss 2.9336): 42%|βββββ | 522/1250 [03:13<04:03, 2.99it/s]
Training 1/1 epoch (loss 2.9336): 42%|βββββ | 523/1250 [03:13<04:02, 3.00it/s]
Training 1/1 epoch (loss 2.8083): 42%|βββββ | 523/1250 [03:13<04:02, 3.00it/s]
Training 1/1 epoch (loss 2.8083): 42%|βββββ | 524/1250 [03:13<04:00, 3.01it/s]
Training 1/1 epoch (loss 2.7809): 42%|βββββ | 524/1250 [03:14<04:00, 3.01it/s]
Training 1/1 epoch (loss 2.7809): 42%|βββββ | 525/1250 [03:14<04:17, 2.81it/s]
Training 1/1 epoch (loss 2.6987): 42%|βββββ | 525/1250 [03:14<04:17, 2.81it/s]
Training 1/1 epoch (loss 2.6987): 42%|βββββ | 526/1250 [03:14<04:17, 2.82it/s]
Training 1/1 epoch (loss 2.6630): 42%|βββββ | 526/1250 [03:14<04:17, 2.82it/s]
Training 1/1 epoch (loss 2.6630): 42%|βββββ | 527/1250 [03:14<04:18, 2.80it/s]
Training 1/1 epoch (loss 2.7548): 42%|βββββ | 527/1250 [03:15<04:18, 2.80it/s]
Training 1/1 epoch (loss 2.7548): 42%|βββββ | 528/1250 [03:15<04:51, 2.48it/s]
Training 1/1 epoch (loss 2.6078): 42%|βββββ | 528/1250 [03:15<04:51, 2.48it/s]
Training 1/1 epoch (loss 2.6078): 42%|βββββ | 529/1250 [03:15<04:43, 2.54it/s]
Training 1/1 epoch (loss 2.7915): 42%|βββββ | 529/1250 [03:15<04:43, 2.54it/s]
Training 1/1 epoch (loss 2.7915): 42%|βββββ | 530/1250 [03:15<04:24, 2.72it/s]
Training 1/1 epoch (loss 2.5355): 42%|βββββ | 530/1250 [03:16<04:24, 2.72it/s]
Training 1/1 epoch (loss 2.5355): 42%|βββββ | 531/1250 [03:16<04:20, 2.76it/s]
Training 1/1 epoch (loss 2.7242): 42%|βββββ | 531/1250 [03:16<04:20, 2.76it/s]
Training 1/1 epoch (loss 2.7242): 43%|βββββ | 532/1250 [03:16<04:13, 2.83it/s]
Training 1/1 epoch (loss 2.8145): 43%|βββββ | 532/1250 [03:16<04:13, 2.83it/s]
Training 1/1 epoch (loss 2.8145): 43%|βββββ | 533/1250 [03:16<04:01, 2.97it/s]
Training 1/1 epoch (loss 2.8687): 43%|βββββ | 533/1250 [03:17<04:01, 2.97it/s]
Training 1/1 epoch (loss 2.8687): 43%|βββββ | 534/1250 [03:17<04:06, 2.91it/s]
Training 1/1 epoch (loss 2.9662): 43%|βββββ | 534/1250 [03:17<04:06, 2.91it/s]
Training 1/1 epoch (loss 2.9662): 43%|βββββ | 535/1250 [03:17<04:03, 2.93it/s]
Training 1/1 epoch (loss 2.8878): 43%|βββββ | 535/1250 [03:17<04:03, 2.93it/s]
Training 1/1 epoch (loss 2.8878): 43%|βββββ | 536/1250 [03:17<03:59, 2.98it/s]
Training 1/1 epoch (loss 2.9633): 43%|βββββ | 536/1250 [03:18<03:59, 2.98it/s]
Training 1/1 epoch (loss 2.9633): 43%|βββββ | 537/1250 [03:18<04:08, 2.87it/s]
Training 1/1 epoch (loss 2.6094): 43%|βββββ | 537/1250 [03:18<04:08, 2.87it/s]
Training 1/1 epoch (loss 2.6094): 43%|βββββ | 538/1250 [03:18<03:58, 2.99it/s]
Training 1/1 epoch (loss 2.8166): 43%|βββββ | 538/1250 [03:18<03:58, 2.99it/s]
Training 1/1 epoch (loss 2.8166): 43%|βββββ | 539/1250 [03:18<03:55, 3.02it/s]
Training 1/1 epoch (loss 2.8963): 43%|βββββ | 539/1250 [03:19<03:55, 3.02it/s]
Training 1/1 epoch (loss 2.8963): 43%|βββββ | 540/1250 [03:19<03:52, 3.05it/s]
Training 1/1 epoch (loss 2.7442): 43%|βββββ | 540/1250 [03:19<03:52, 3.05it/s]
Training 1/1 epoch (loss 2.7442): 43%|βββββ | 541/1250 [03:19<03:54, 3.02it/s]
Training 1/1 epoch (loss 2.7688): 43%|βββββ | 541/1250 [03:19<03:54, 3.02it/s]
Training 1/1 epoch (loss 2.7688): 43%|βββββ | 542/1250 [03:19<03:47, 3.11it/s]
Training 1/1 epoch (loss 2.7116): 43%|βββββ | 542/1250 [03:20<03:47, 3.11it/s]
Training 1/1 epoch (loss 2.7116): 43%|βββββ | 543/1250 [03:20<03:54, 3.01it/s]
Training 1/1 epoch (loss 2.6899): 43%|βββββ | 543/1250 [03:20<03:54, 3.01it/s]
Training 1/1 epoch (loss 2.6899): 44%|βββββ | 544/1250 [03:20<04:03, 2.90it/s]
Training 1/1 epoch (loss 2.7599): 44%|βββββ | 544/1250 [03:20<04:03, 2.90it/s]
Training 1/1 epoch (loss 2.7599): 44%|βββββ | 545/1250 [03:20<03:59, 2.95it/s]
Training 1/1 epoch (loss 2.8811): 44%|βββββ | 545/1250 [03:21<03:59, 2.95it/s]
Training 1/1 epoch (loss 2.8811): 44%|βββββ | 546/1250 [03:21<03:54, 3.01it/s]
Training 1/1 epoch (loss 2.7996): 44%|βββββ | 546/1250 [03:21<03:54, 3.01it/s]
Training 1/1 epoch (loss 2.7996): 44%|βββββ | 547/1250 [03:21<04:05, 2.86it/s]
Training 1/1 epoch (loss 2.9047): 44%|βββββ | 547/1250 [03:22<04:05, 2.86it/s]
Training 1/1 epoch (loss 2.9047): 44%|βββββ | 548/1250 [03:22<03:59, 2.93it/s]
Training 1/1 epoch (loss 2.9886): 44%|βββββ | 548/1250 [03:22<03:59, 2.93it/s]
Training 1/1 epoch (loss 2.9886): 44%|βββββ | 549/1250 [03:22<04:04, 2.87it/s]
Training 1/1 epoch (loss 2.8122): 44%|βββββ | 549/1250 [03:22<04:04, 2.87it/s]
Training 1/1 epoch (loss 2.8122): 44%|βββββ | 550/1250 [03:22<04:00, 2.91it/s]
Training 1/1 epoch (loss 2.6779): 44%|βββββ | 550/1250 [03:23<04:00, 2.91it/s]
Training 1/1 epoch (loss 2.6779): 44%|βββββ | 551/1250 [03:23<03:48, 3.05it/s]
Training 1/1 epoch (loss 2.5751): 44%|βββββ | 551/1250 [03:23<03:48, 3.05it/s]
Training 1/1 epoch (loss 2.5751): 44%|βββββ | 552/1250 [03:23<04:49, 2.41it/s]
Training 1/1 epoch (loss 2.8549): 44%|βββββ | 552/1250 [03:23<04:49, 2.41it/s]
Training 1/1 epoch (loss 2.8549): 44%|βββββ | 553/1250 [03:23<04:37, 2.51it/s]
Training 1/1 epoch (loss 2.7576): 44%|βββββ | 553/1250 [03:24<04:37, 2.51it/s]
Training 1/1 epoch (loss 2.7576): 44%|βββββ | 554/1250 [03:24<04:51, 2.39it/s]
Training 1/1 epoch (loss 2.5462): 44%|βββββ | 554/1250 [03:24<04:51, 2.39it/s]
Training 1/1 epoch (loss 2.5462): 44%|βββββ | 555/1250 [03:24<04:31, 2.56it/s]
Training 1/1 epoch (loss 2.7674): 44%|βββββ | 555/1250 [03:25<04:31, 2.56it/s]
Training 1/1 epoch (loss 2.7674): 44%|βββββ | 556/1250 [03:25<04:11, 2.76it/s]
Training 1/1 epoch (loss 2.9608): 44%|βββββ | 556/1250 [03:25<04:11, 2.76it/s]
Training 1/1 epoch (loss 2.9608): 45%|βββββ | 557/1250 [03:25<04:06, 2.81it/s]
Training 1/1 epoch (loss 2.7740): 45%|βββββ | 557/1250 [03:25<04:06, 2.81it/s]
Training 1/1 epoch (loss 2.7740): 45%|βββββ | 558/1250 [03:25<04:20, 2.66it/s]
Training 1/1 epoch (loss 2.8548): 45%|βββββ | 558/1250 [03:26<04:20, 2.66it/s]
Training 1/1 epoch (loss 2.8548): 45%|βββββ | 559/1250 [03:26<04:07, 2.79it/s]
Training 1/1 epoch (loss 2.8711): 45%|βββββ | 559/1250 [03:26<04:07, 2.79it/s]
Training 1/1 epoch (loss 2.8711): 45%|βββββ | 560/1250 [03:26<04:03, 2.83it/s]
Training 1/1 epoch (loss 2.6154): 45%|βββββ | 560/1250 [03:26<04:03, 2.83it/s]
Training 1/1 epoch (loss 2.6154): 45%|βββββ | 561/1250 [03:26<04:04, 2.82it/s]
Training 1/1 epoch (loss 2.7962): 45%|βββββ | 561/1250 [03:27<04:04, 2.82it/s]
Training 1/1 epoch (loss 2.7962): 45%|βββββ | 562/1250 [03:27<04:00, 2.87it/s]
Training 1/1 epoch (loss 2.8797): 45%|βββββ | 562/1250 [03:27<04:00, 2.87it/s]
Training 1/1 epoch (loss 2.8797): 45%|βββββ | 563/1250 [03:27<03:49, 2.99it/s]
Training 1/1 epoch (loss 2.8930): 45%|βββββ | 563/1250 [03:27<03:49, 2.99it/s]
Training 1/1 epoch (loss 2.8930): 45%|βββββ | 564/1250 [03:27<03:56, 2.90it/s]
Training 1/1 epoch (loss 2.8607): 45%|βββββ | 564/1250 [03:28<03:56, 2.90it/s]
Training 1/1 epoch (loss 2.8607): 45%|βββββ | 565/1250 [03:28<04:16, 2.67it/s]
Training 1/1 epoch (loss 2.8291): 45%|βββββ | 565/1250 [03:28<04:16, 2.67it/s]
Training 1/1 epoch (loss 2.8291): 45%|βββββ | 566/1250 [03:28<04:03, 2.81it/s]
Training 1/1 epoch (loss 2.7973): 45%|βββββ | 566/1250 [03:29<04:03, 2.81it/s]
Training 1/1 epoch (loss 2.7973): 45%|βββββ | 567/1250 [03:29<04:17, 2.65it/s]
Training 1/1 epoch (loss 2.7298): 45%|βββββ | 567/1250 [03:29<04:17, 2.65it/s]
Training 1/1 epoch (loss 2.7298): 45%|βββββ | 568/1250 [03:29<04:24, 2.58it/s]
Training 1/1 epoch (loss 2.8594): 45%|βββββ | 568/1250 [03:29<04:24, 2.58it/s]
Training 1/1 epoch (loss 2.8594): 46%|βββββ | 569/1250 [03:29<04:44, 2.39it/s]
Training 1/1 epoch (loss 2.7135): 46%|βββββ | 569/1250 [03:30<04:44, 2.39it/s]
Training 1/1 epoch (loss 2.7135): 46%|βββββ | 570/1250 [03:30<04:43, 2.40it/s]
Training 1/1 epoch (loss 2.7890): 46%|βββββ | 570/1250 [03:30<04:43, 2.40it/s]
Training 1/1 epoch (loss 2.7890): 46%|βββββ | 571/1250 [03:30<04:20, 2.61it/s]
Training 1/1 epoch (loss 2.8807): 46%|βββββ | 571/1250 [03:30<04:20, 2.61it/s]
Training 1/1 epoch (loss 2.8807): 46%|βββββ | 572/1250 [03:30<04:07, 2.73it/s]
Training 1/1 epoch (loss 2.7958): 46%|βββββ | 572/1250 [03:31<04:07, 2.73it/s]
Training 1/1 epoch (loss 2.7958): 46%|βββββ | 573/1250 [03:31<04:18, 2.62it/s]
Training 1/1 epoch (loss 2.8241): 46%|βββββ | 573/1250 [03:31<04:18, 2.62it/s]
Training 1/1 epoch (loss 2.8241): 46%|βββββ | 574/1250 [03:31<04:22, 2.58it/s]
Training 1/1 epoch (loss 2.8871): 46%|βββββ | 574/1250 [03:32<04:22, 2.58it/s]
Training 1/1 epoch (loss 2.8871): 46%|βββββ | 575/1250 [03:32<04:16, 2.63it/s]
Training 1/1 epoch (loss 2.6851): 46%|βββββ | 575/1250 [03:32<04:16, 2.63it/s]
Training 1/1 epoch (loss 2.6851): 46%|βββββ | 576/1250 [03:32<04:09, 2.71it/s]
Training 1/1 epoch (loss 2.6616): 46%|βββββ | 576/1250 [03:32<04:09, 2.71it/s]
Training 1/1 epoch (loss 2.6616): 46%|βββββ | 577/1250 [03:32<04:00, 2.80it/s]
Training 1/1 epoch (loss 2.8728): 46%|βββββ | 577/1250 [03:33<04:00, 2.80it/s]
Training 1/1 epoch (loss 2.8728): 46%|βββββ | 578/1250 [03:33<04:00, 2.80it/s]
Training 1/1 epoch (loss 2.9902): 46%|βββββ | 578/1250 [03:33<04:00, 2.80it/s]
Training 1/1 epoch (loss 2.9902): 46%|βββββ | 579/1250 [03:33<03:51, 2.90it/s]
Training 1/1 epoch (loss 2.9352): 46%|βββββ | 579/1250 [03:33<03:51, 2.90it/s]
Training 1/1 epoch (loss 2.9352): 46%|βββββ | 580/1250 [03:33<03:57, 2.82it/s]
Training 1/1 epoch (loss 2.7541): 46%|βββββ | 580/1250 [03:34<03:57, 2.82it/s]
Training 1/1 epoch (loss 2.7541): 46%|βββββ | 581/1250 [03:34<04:01, 2.77it/s]
Training 1/1 epoch (loss 2.6526): 46%|βββββ | 581/1250 [03:34<04:01, 2.77it/s]
Training 1/1 epoch (loss 2.6526): 47%|βββββ | 582/1250 [03:34<04:03, 2.75it/s]
Training 1/1 epoch (loss 2.6189): 47%|βββββ | 582/1250 [03:34<04:03, 2.75it/s]
Training 1/1 epoch (loss 2.6189): 47%|βββββ | 583/1250 [03:34<03:54, 2.85it/s]
Training 1/1 epoch (loss 2.6658): 47%|βββββ | 583/1250 [03:35<03:54, 2.85it/s]
Training 1/1 epoch (loss 2.6658): 47%|βββββ | 584/1250 [03:35<04:00, 2.76it/s]
Training 1/1 epoch (loss 2.8424): 47%|βββββ | 584/1250 [03:35<04:00, 2.76it/s]
Training 1/1 epoch (loss 2.8424): 47%|βββββ | 585/1250 [03:35<04:02, 2.74it/s]
Training 1/1 epoch (loss 2.6215): 47%|βββββ | 585/1250 [03:36<04:02, 2.74it/s]
Training 1/1 epoch (loss 2.6215): 47%|βββββ | 586/1250 [03:36<03:53, 2.84it/s]
Training 1/1 epoch (loss 2.7304): 47%|βββββ | 586/1250 [03:36<03:53, 2.84it/s]
Training 1/1 epoch (loss 2.7304): 47%|βββββ | 587/1250 [03:36<04:00, 2.76it/s]
Training 1/1 epoch (loss 2.5255): 47%|βββββ | 587/1250 [03:36<04:00, 2.76it/s]
Training 1/1 epoch (loss 2.5255): 47%|βββββ | 588/1250 [03:36<03:52, 2.85it/s]
Training 1/1 epoch (loss 2.6795): 47%|βββββ | 588/1250 [03:37<03:52, 2.85it/s]
Training 1/1 epoch (loss 2.6795): 47%|βββββ | 589/1250 [03:37<03:46, 2.91it/s]
Training 1/1 epoch (loss 2.7579): 47%|βββββ | 589/1250 [03:37<03:46, 2.91it/s]
Training 1/1 epoch (loss 2.7579): 47%|βββββ | 590/1250 [03:37<03:46, 2.91it/s]
Training 1/1 epoch (loss 2.6447): 47%|βββββ | 590/1250 [03:37<03:46, 2.91it/s]
Training 1/1 epoch (loss 2.6447): 47%|βββββ | 591/1250 [03:37<03:50, 2.86it/s]
Training 1/1 epoch (loss 2.7672): 47%|βββββ | 591/1250 [03:38<03:50, 2.86it/s]
Training 1/1 epoch (loss 2.7672): 47%|βββββ | 592/1250 [03:38<03:48, 2.88it/s]
Training 1/1 epoch (loss 2.8807): 47%|βββββ | 592/1250 [03:38<03:48, 2.88it/s]
Training 1/1 epoch (loss 2.8807): 47%|βββββ | 593/1250 [03:38<03:49, 2.86it/s]
Training 1/1 epoch (loss 2.6761): 47%|βββββ | 593/1250 [03:38<03:49, 2.86it/s]
Training 1/1 epoch (loss 2.6761): 48%|βββββ | 594/1250 [03:38<03:49, 2.86it/s]
Training 1/1 epoch (loss 2.6565): 48%|βββββ | 594/1250 [03:39<03:49, 2.86it/s]
Training 1/1 epoch (loss 2.6565): 48%|βββββ | 595/1250 [03:39<03:46, 2.89it/s]
Training 1/1 epoch (loss 2.7000): 48%|βββββ | 595/1250 [03:39<03:46, 2.89it/s]
Training 1/1 epoch (loss 2.7000): 48%|βββββ | 596/1250 [03:39<03:35, 3.03it/s]
Training 1/1 epoch (loss 2.7839): 48%|βββββ | 596/1250 [03:39<03:35, 3.03it/s]
Training 1/1 epoch (loss 2.7839): 48%|βββββ | 597/1250 [03:39<03:47, 2.87it/s]
Training 1/1 epoch (loss 2.7255): 48%|βββββ | 597/1250 [03:40<03:47, 2.87it/s]
Training 1/1 epoch (loss 2.7255): 48%|βββββ | 598/1250 [03:40<03:41, 2.94it/s]
Training 1/1 epoch (loss 2.7522): 48%|βββββ | 598/1250 [03:40<03:41, 2.94it/s]
Training 1/1 epoch (loss 2.7522): 48%|βββββ | 599/1250 [03:40<03:43, 2.92it/s]
Training 1/1 epoch (loss 2.7732): 48%|βββββ | 599/1250 [03:40<03:43, 2.92it/s]
Training 1/1 epoch (loss 2.7732): 48%|βββββ | 600/1250 [03:40<03:49, 2.83it/s]
Training 1/1 epoch (loss 2.9322): 48%|βββββ | 600/1250 [03:41<03:49, 2.83it/s]
Training 1/1 epoch (loss 2.9322): 48%|βββββ | 601/1250 [03:41<03:50, 2.82it/s]
Training 1/1 epoch (loss 2.6647): 48%|βββββ | 601/1250 [03:41<03:50, 2.82it/s]
Training 1/1 epoch (loss 2.6647): 48%|βββββ | 602/1250 [03:41<03:44, 2.89it/s]
Training 1/1 epoch (loss 2.6633): 48%|βββββ | 602/1250 [03:41<03:44, 2.89it/s]
Training 1/1 epoch (loss 2.6633): 48%|βββββ | 603/1250 [03:41<03:51, 2.80it/s]
Training 1/1 epoch (loss 2.8071): 48%|βββββ | 603/1250 [03:42<03:51, 2.80it/s]
Training 1/1 epoch (loss 2.8071): 48%|βββββ | 604/1250 [03:42<03:39, 2.95it/s]
Training 1/1 epoch (loss 2.7215): 48%|βββββ | 604/1250 [03:42<03:39, 2.95it/s]
Training 1/1 epoch (loss 2.7215): 48%|βββββ | 605/1250 [03:42<03:34, 3.01it/s]
Training 1/1 epoch (loss 2.8361): 48%|βββββ | 605/1250 [03:42<03:34, 3.01it/s]
Training 1/1 epoch (loss 2.8361): 48%|βββββ | 606/1250 [03:42<03:37, 2.96it/s]
Training 1/1 epoch (loss 2.5946): 48%|βββββ | 606/1250 [03:43<03:37, 2.96it/s]
Training 1/1 epoch (loss 2.5946): 49%|βββββ | 607/1250 [03:43<03:32, 3.02it/s]
Training 1/1 epoch (loss 2.9467): 49%|βββββ | 607/1250 [03:43<03:32, 3.02it/s]
Training 1/1 epoch (loss 2.9467): 49%|βββββ | 608/1250 [03:43<03:36, 2.96it/s]
Training 1/1 epoch (loss 2.6877): 49%|βββββ | 608/1250 [03:43<03:36, 2.96it/s]
Training 1/1 epoch (loss 2.6877): 49%|βββββ | 609/1250 [03:43<03:44, 2.86it/s]
Training 1/1 epoch (loss 2.8047): 49%|βββββ | 609/1250 [03:44<03:44, 2.86it/s]
Training 1/1 epoch (loss 2.8047): 49%|βββββ | 610/1250 [03:44<03:49, 2.79it/s]
Training 1/1 epoch (loss 2.8422): 49%|βββββ | 610/1250 [03:44<03:49, 2.79it/s]
Training 1/1 epoch (loss 2.8422): 49%|βββββ | 611/1250 [03:44<03:43, 2.85it/s]
Training 1/1 epoch (loss 2.6157): 49%|βββββ | 611/1250 [03:45<03:43, 2.85it/s]
Training 1/1 epoch (loss 2.6157): 49%|βββββ | 612/1250 [03:45<03:49, 2.78it/s]
Training 1/1 epoch (loss 2.8673): 49%|βββββ | 612/1250 [03:45<03:49, 2.78it/s]
Training 1/1 epoch (loss 2.8673): 49%|βββββ | 613/1250 [03:45<03:44, 2.84it/s]
Training 1/1 epoch (loss 2.9131): 49%|βββββ | 613/1250 [03:45<03:44, 2.84it/s]
Training 1/1 epoch (loss 2.9131): 49%|βββββ | 614/1250 [03:45<03:39, 2.89it/s]
Training 1/1 epoch (loss 2.7717): 49%|βββββ | 614/1250 [03:46<03:39, 2.89it/s]
Training 1/1 epoch (loss 2.7717): 49%|βββββ | 615/1250 [03:46<03:36, 2.93it/s]
Training 1/1 epoch (loss 2.8407): 49%|βββββ | 615/1250 [03:46<03:36, 2.93it/s]
Training 1/1 epoch (loss 2.8407): 49%|βββββ | 616/1250 [03:46<03:43, 2.84it/s]
Training 1/1 epoch (loss 2.3852): 49%|βββββ | 616/1250 [03:46<03:43, 2.84it/s]
Training 1/1 epoch (loss 2.3852): 49%|βββββ | 617/1250 [03:46<03:50, 2.74it/s]
Training 1/1 epoch (loss 2.7459): 49%|βββββ | 617/1250 [03:47<03:50, 2.74it/s]
Training 1/1 epoch (loss 2.7459): 49%|βββββ | 618/1250 [03:47<03:46, 2.79it/s]
Training 1/1 epoch (loss 2.6517): 49%|βββββ | 618/1250 [03:47<03:46, 2.79it/s]
Training 1/1 epoch (loss 2.6517): 50%|βββββ | 619/1250 [03:47<03:44, 2.80it/s]
Training 1/1 epoch (loss 2.6508): 50%|βββββ | 619/1250 [03:47<03:44, 2.80it/s]
Training 1/1 epoch (loss 2.6508): 50%|βββββ | 620/1250 [03:47<03:39, 2.86it/s]
Training 1/1 epoch (loss 2.5915): 50%|βββββ | 620/1250 [03:48<03:39, 2.86it/s]
Training 1/1 epoch (loss 2.5915): 50%|βββββ | 621/1250 [03:48<03:32, 2.96it/s]
Training 1/1 epoch (loss 2.9703): 50%|βββββ | 621/1250 [03:48<03:32, 2.96it/s]
Training 1/1 epoch (loss 2.9703): 50%|βββββ | 622/1250 [03:48<03:32, 2.95it/s]
Training 1/1 epoch (loss 2.7742): 50%|βββββ | 622/1250 [03:48<03:32, 2.95it/s]
Training 1/1 epoch (loss 2.7742): 50%|βββββ | 623/1250 [03:48<03:29, 3.00it/s]
Training 1/1 epoch (loss 2.7100): 50%|βββββ | 623/1250 [03:49<03:29, 3.00it/s]
Training 1/1 epoch (loss 2.7100): 50%|βββββ | 624/1250 [03:49<03:28, 3.01it/s]
Training 1/1 epoch (loss 2.7804): 50%|βββββ | 624/1250 [03:49<03:28, 3.01it/s]
Training 1/1 epoch (loss 2.7804): 50%|βββββ | 625/1250 [03:49<03:42, 2.81it/s]
Training 1/1 epoch (loss 2.8126): 50%|βββββ | 625/1250 [03:49<03:42, 2.81it/s]
Training 1/1 epoch (loss 2.8126): 50%|βββββ | 626/1250 [03:49<03:44, 2.78it/s]
Training 1/1 epoch (loss 2.5222): 50%|βββββ | 626/1250 [03:50<03:44, 2.78it/s]
Training 1/1 epoch (loss 2.5222): 50%|βββββ | 627/1250 [03:50<03:35, 2.88it/s]
Training 1/1 epoch (loss 2.7319): 50%|βββββ | 627/1250 [03:50<03:35, 2.88it/s]
Training 1/1 epoch (loss 2.7319): 50%|βββββ | 628/1250 [03:50<03:30, 2.96it/s]
Training 1/1 epoch (loss 2.8925): 50%|βββββ | 628/1250 [03:50<03:30, 2.96it/s]
Training 1/1 epoch (loss 2.8925): 50%|βββββ | 629/1250 [03:50<03:26, 3.01it/s]
Training 1/1 epoch (loss 2.4747): 50%|βββββ | 629/1250 [03:51<03:26, 3.01it/s]
Training 1/1 epoch (loss 2.4747): 50%|βββββ | 630/1250 [03:51<03:26, 3.00it/s]
Training 1/1 epoch (loss 2.7060): 50%|βββββ | 630/1250 [03:51<03:26, 3.00it/s]
Training 1/1 epoch (loss 2.7060): 50%|βββββ | 631/1250 [03:51<03:21, 3.07it/s]
Training 1/1 epoch (loss 2.8996): 50%|βββββ | 631/1250 [03:51<03:21, 3.07it/s]
Training 1/1 epoch (loss 2.8996): 51%|βββββ | 632/1250 [03:51<03:38, 2.83it/s]
Training 1/1 epoch (loss 2.7837): 51%|βββββ | 632/1250 [03:52<03:38, 2.83it/s]
Training 1/1 epoch (loss 2.7837): 51%|βββββ | 633/1250 [03:52<03:38, 2.83it/s]
Training 1/1 epoch (loss 2.7224): 51%|βββββ | 633/1250 [03:52<03:38, 2.83it/s]
Training 1/1 epoch (loss 2.7224): 51%|βββββ | 634/1250 [03:52<03:30, 2.93it/s]
Training 1/1 epoch (loss 2.6303): 51%|βββββ | 634/1250 [03:52<03:30, 2.93it/s]
Training 1/1 epoch (loss 2.6303): 51%|βββββ | 635/1250 [03:52<03:31, 2.91it/s]
Training 1/1 epoch (loss 2.5313): 51%|βββββ | 635/1250 [03:53<03:31, 2.91it/s]
Training 1/1 epoch (loss 2.5313): 51%|βββββ | 636/1250 [03:53<04:03, 2.52it/s]
Training 1/1 epoch (loss 2.6430): 51%|βββββ | 636/1250 [03:53<04:03, 2.52it/s]
Training 1/1 epoch (loss 2.6430): 51%|βββββ | 637/1250 [03:53<03:53, 2.62it/s]
Training 1/1 epoch (loss 2.5807): 51%|βββββ | 637/1250 [03:54<03:53, 2.62it/s]
Training 1/1 epoch (loss 2.5807): 51%|βββββ | 638/1250 [03:54<03:48, 2.67it/s]
Training 1/1 epoch (loss 2.9059): 51%|βββββ | 638/1250 [03:54<03:48, 2.67it/s]
Training 1/1 epoch (loss 2.9059): 51%|βββββ | 639/1250 [03:54<03:44, 2.73it/s]
Training 1/1 epoch (loss 2.7133): 51%|βββββ | 639/1250 [03:54<03:44, 2.73it/s]
Training 1/1 epoch (loss 2.7133): 51%|βββββ | 640/1250 [03:54<03:44, 2.71it/s]
Training 1/1 epoch (loss 2.6685): 51%|βββββ | 640/1250 [03:55<03:44, 2.71it/s]
Training 1/1 epoch (loss 2.6685): 51%|ββββββ | 641/1250 [03:55<03:43, 2.72it/s]
Training 1/1 epoch (loss 2.6587): 51%|ββββββ | 641/1250 [03:55<03:43, 2.72it/s]
Training 1/1 epoch (loss 2.6587): 51%|ββββββ | 642/1250 [03:55<03:32, 2.86it/s]
Training 1/1 epoch (loss 2.7331): 51%|ββββββ | 642/1250 [03:56<03:32, 2.86it/s]
Training 1/1 epoch (loss 2.7331): 51%|ββββββ | 643/1250 [03:56<03:45, 2.69it/s]
Training 1/1 epoch (loss 2.7027): 51%|ββββββ | 643/1250 [03:56<03:45, 2.69it/s]
Training 1/1 epoch (loss 2.7027): 52%|ββββββ | 644/1250 [03:56<03:40, 2.75it/s]
Training 1/1 epoch (loss 2.6384): 52%|ββββββ | 644/1250 [03:56<03:40, 2.75it/s]
Training 1/1 epoch (loss 2.6384): 52%|ββββββ | 645/1250 [03:56<03:33, 2.83it/s]
Training 1/1 epoch (loss 2.3503): 52%|ββββββ | 645/1250 [03:57<03:33, 2.83it/s]
Training 1/1 epoch (loss 2.3503): 52%|ββββββ | 646/1250 [03:57<03:38, 2.76it/s]
Training 1/1 epoch (loss 2.6597): 52%|ββββββ | 646/1250 [03:57<03:38, 2.76it/s]
Training 1/1 epoch (loss 2.6597): 52%|ββββββ | 647/1250 [03:57<03:38, 2.76it/s]
Training 1/1 epoch (loss 2.9339): 52%|ββββββ | 647/1250 [03:57<03:38, 2.76it/s]
Training 1/1 epoch (loss 2.9339): 52%|ββββββ | 648/1250 [03:57<03:42, 2.71it/s]
Training 1/1 epoch (loss 2.5646): 52%|ββββββ | 648/1250 [03:58<03:42, 2.71it/s]
Training 1/1 epoch (loss 2.5646): 52%|ββββββ | 649/1250 [03:58<04:15, 2.35it/s]
Training 1/1 epoch (loss 2.8481): 52%|ββββββ | 649/1250 [03:58<04:15, 2.35it/s]
Training 1/1 epoch (loss 2.8481): 52%|ββββββ | 650/1250 [03:58<04:09, 2.41it/s]
Training 1/1 epoch (loss 2.7800): 52%|ββββββ | 650/1250 [03:59<04:09, 2.41it/s]
Training 1/1 epoch (loss 2.7800): 52%|ββββββ | 651/1250 [03:59<03:57, 2.53it/s]
Training 1/1 epoch (loss 2.6242): 52%|ββββββ | 651/1250 [03:59<03:57, 2.53it/s]
Training 1/1 epoch (loss 2.6242): 52%|ββββββ | 652/1250 [03:59<03:42, 2.69it/s]
Training 1/1 epoch (loss 2.6171): 52%|ββββββ | 652/1250 [03:59<03:42, 2.69it/s]
Training 1/1 epoch (loss 2.6171): 52%|ββββββ | 653/1250 [03:59<03:44, 2.66it/s]
Training 1/1 epoch (loss 2.5592): 52%|ββββββ | 653/1250 [04:00<03:44, 2.66it/s]
Training 1/1 epoch (loss 2.5592): 52%|ββββββ | 654/1250 [04:00<03:39, 2.72it/s]
Training 1/1 epoch (loss 2.7348): 52%|ββββββ | 654/1250 [04:00<03:39, 2.72it/s]
Training 1/1 epoch (loss 2.7348): 52%|ββββββ | 655/1250 [04:00<03:32, 2.80it/s]
Training 1/1 epoch (loss 2.9552): 52%|ββββββ | 655/1250 [04:00<03:32, 2.80it/s]
Training 1/1 epoch (loss 2.9552): 52%|ββββββ | 656/1250 [04:00<03:26, 2.87it/s]
Training 1/1 epoch (loss 2.8662): 52%|ββββββ | 656/1250 [04:01<03:26, 2.87it/s]
Training 1/1 epoch (loss 2.8662): 53%|ββββββ | 657/1250 [04:01<03:26, 2.87it/s]
Training 1/1 epoch (loss 2.8978): 53%|ββββββ | 657/1250 [04:01<03:26, 2.87it/s]
Training 1/1 epoch (loss 2.8978): 53%|ββββββ | 658/1250 [04:01<03:20, 2.96it/s]
Training 1/1 epoch (loss 2.6213): 53%|ββββββ | 658/1250 [04:01<03:20, 2.96it/s]
Training 1/1 epoch (loss 2.6213): 53%|ββββββ | 659/1250 [04:01<03:21, 2.93it/s]
Training 1/1 epoch (loss 2.4187): 53%|ββββββ | 659/1250 [04:02<03:21, 2.93it/s]
Training 1/1 epoch (loss 2.4187): 53%|ββββββ | 660/1250 [04:02<03:23, 2.90it/s]
Training 1/1 epoch (loss 2.5700): 53%|ββββββ | 660/1250 [04:02<03:23, 2.90it/s]
Training 1/1 epoch (loss 2.5700): 53%|ββββββ | 661/1250 [04:02<03:28, 2.83it/s]
Training 1/1 epoch (loss 2.6825): 53%|ββββββ | 661/1250 [04:02<03:28, 2.83it/s]
Training 1/1 epoch (loss 2.6825): 53%|ββββββ | 662/1250 [04:02<03:24, 2.88it/s]
Training 1/1 epoch (loss 2.7438): 53%|ββββββ | 662/1250 [04:03<03:24, 2.88it/s]
Training 1/1 epoch (loss 2.7438): 53%|ββββββ | 663/1250 [04:03<03:18, 2.96it/s]
Training 1/1 epoch (loss 2.7994): 53%|ββββββ | 663/1250 [04:03<03:18, 2.96it/s]
Training 1/1 epoch (loss 2.7994): 53%|ββββββ | 664/1250 [04:03<03:20, 2.93it/s]
Training 1/1 epoch (loss 2.7399): 53%|ββββββ | 664/1250 [04:03<03:20, 2.93it/s]
Training 1/1 epoch (loss 2.7399): 53%|ββββββ | 665/1250 [04:03<03:24, 2.86it/s]
Training 1/1 epoch (loss 2.7107): 53%|ββββββ | 665/1250 [04:04<03:24, 2.86it/s]
Training 1/1 epoch (loss 2.7107): 53%|ββββββ | 666/1250 [04:04<03:29, 2.78it/s]
Training 1/1 epoch (loss 2.9298): 53%|ββββββ | 666/1250 [04:04<03:29, 2.78it/s]
Training 1/1 epoch (loss 2.9298): 53%|ββββββ | 667/1250 [04:04<03:28, 2.80it/s]
Training 1/1 epoch (loss 2.7903): 53%|ββββββ | 667/1250 [04:05<03:28, 2.80it/s]
Training 1/1 epoch (loss 2.7903): 53%|ββββββ | 668/1250 [04:05<03:27, 2.81it/s]
Training 1/1 epoch (loss 2.7540): 53%|ββββββ | 668/1250 [04:05<03:27, 2.81it/s]
Training 1/1 epoch (loss 2.7540): 54%|ββββββ | 669/1250 [04:05<03:22, 2.86it/s]
Training 1/1 epoch (loss 3.0198): 54%|ββββββ | 669/1250 [04:05<03:22, 2.86it/s]
Training 1/1 epoch (loss 3.0198): 54%|ββββββ | 670/1250 [04:05<03:15, 2.97it/s]
Training 1/1 epoch (loss 3.0074): 54%|ββββββ | 670/1250 [04:06<03:15, 2.97it/s]
Training 1/1 epoch (loss 3.0074): 54%|ββββββ | 671/1250 [04:06<03:18, 2.92it/s]
Training 1/1 epoch (loss 2.9435): 54%|ββββββ | 671/1250 [04:06<03:18, 2.92it/s]
Training 1/1 epoch (loss 2.9435): 54%|ββββββ | 672/1250 [04:06<03:15, 2.96it/s]
Training 1/1 epoch (loss 2.6271): 54%|ββββββ | 672/1250 [04:06<03:15, 2.96it/s]
Training 1/1 epoch (loss 2.6271): 54%|ββββββ | 673/1250 [04:06<03:17, 2.92it/s]
Training 1/1 epoch (loss 2.8534): 54%|ββββββ | 673/1250 [04:07<03:17, 2.92it/s]
Training 1/1 epoch (loss 2.8534): 54%|ββββββ | 674/1250 [04:07<03:09, 3.04it/s]
Training 1/1 epoch (loss 2.5239): 54%|ββββββ | 674/1250 [04:07<03:09, 3.04it/s]
Training 1/1 epoch (loss 2.5239): 54%|ββββββ | 675/1250 [04:07<03:11, 3.00it/s]
Training 1/1 epoch (loss 2.7671): 54%|ββββββ | 675/1250 [04:07<03:11, 3.00it/s]
Training 1/1 epoch (loss 2.7671): 54%|ββββββ | 676/1250 [04:07<03:15, 2.94it/s]
Training 1/1 epoch (loss 2.5753): 54%|ββββββ | 676/1250 [04:08<03:15, 2.94it/s]
Training 1/1 epoch (loss 2.5753): 54%|ββββββ | 677/1250 [04:08<03:15, 2.93it/s]
Training 1/1 epoch (loss 2.8298): 54%|ββββββ | 677/1250 [04:08<03:15, 2.93it/s]
Training 1/1 epoch (loss 2.8298): 54%|ββββββ | 678/1250 [04:08<03:09, 3.01it/s]
Training 1/1 epoch (loss 2.8314): 54%|ββββββ | 678/1250 [04:08<03:09, 3.01it/s]
Training 1/1 epoch (loss 2.8314): 54%|ββββββ | 679/1250 [04:08<03:08, 3.03it/s]
Training 1/1 epoch (loss 2.7359): 54%|ββββββ | 679/1250 [04:09<03:08, 3.03it/s]
Training 1/1 epoch (loss 2.7359): 54%|ββββββ | 680/1250 [04:09<03:11, 2.98it/s]
Training 1/1 epoch (loss 2.6415): 54%|ββββββ | 680/1250 [04:09<03:11, 2.98it/s]
Training 1/1 epoch (loss 2.6415): 54%|ββββββ | 681/1250 [04:09<03:11, 2.97it/s]
Training 1/1 epoch (loss 2.8631): 54%|ββββββ | 681/1250 [04:09<03:11, 2.97it/s]
Training 1/1 epoch (loss 2.8631): 55%|ββββββ | 682/1250 [04:09<03:09, 3.00it/s]
Training 1/1 epoch (loss 2.8550): 55%|ββββββ | 682/1250 [04:10<03:09, 3.00it/s]
Training 1/1 epoch (loss 2.8550): 55%|ββββββ | 683/1250 [04:10<03:11, 2.96it/s]
Training 1/1 epoch (loss 2.8308): 55%|ββββββ | 683/1250 [04:10<03:11, 2.96it/s]
Training 1/1 epoch (loss 2.8308): 55%|ββββββ | 684/1250 [04:10<03:07, 3.02it/s]
Training 1/1 epoch (loss 2.5500): 55%|ββββββ | 684/1250 [04:10<03:07, 3.02it/s]
Training 1/1 epoch (loss 2.5500): 55%|ββββββ | 685/1250 [04:10<03:06, 3.03it/s]
Training 1/1 epoch (loss 2.8118): 55%|ββββββ | 685/1250 [04:11<03:06, 3.03it/s]
Training 1/1 epoch (loss 2.8118): 55%|ββββββ | 686/1250 [04:11<03:08, 3.00it/s]
Training 1/1 epoch (loss 2.6204): 55%|ββββββ | 686/1250 [04:11<03:08, 3.00it/s]
Training 1/1 epoch (loss 2.6204): 55%|ββββββ | 687/1250 [04:11<02:59, 3.13it/s]
Training 1/1 epoch (loss 2.7879): 55%|ββββββ | 687/1250 [04:11<02:59, 3.13it/s]
Training 1/1 epoch (loss 2.7879): 55%|ββββββ | 688/1250 [04:11<03:08, 2.99it/s]
Training 1/1 epoch (loss 2.8381): 55%|ββββββ | 688/1250 [04:12<03:08, 2.99it/s]
Training 1/1 epoch (loss 2.8381): 55%|ββββββ | 689/1250 [04:12<03:06, 3.00it/s]
Training 1/1 epoch (loss 2.8529): 55%|ββββββ | 689/1250 [04:12<03:06, 3.00it/s]
Training 1/1 epoch (loss 2.8529): 55%|ββββββ | 690/1250 [04:12<03:04, 3.03it/s]
Training 1/1 epoch (loss 2.6622): 55%|ββββββ | 690/1250 [04:12<03:04, 3.03it/s]
Training 1/1 epoch (loss 2.6622): 55%|ββββββ | 691/1250 [04:12<03:01, 3.08it/s]
Training 1/1 epoch (loss 2.7844): 55%|ββββββ | 691/1250 [04:13<03:01, 3.08it/s]
Training 1/1 epoch (loss 2.7844): 55%|ββββββ | 692/1250 [04:13<03:08, 2.96it/s]
Training 1/1 epoch (loss 2.7658): 55%|ββββββ | 692/1250 [04:13<03:08, 2.96it/s]
Training 1/1 epoch (loss 2.7658): 55%|ββββββ | 693/1250 [04:13<03:04, 3.02it/s]
Training 1/1 epoch (loss 2.9522): 55%|ββββββ | 693/1250 [04:13<03:04, 3.02it/s]
Training 1/1 epoch (loss 2.9522): 56%|ββββββ | 694/1250 [04:13<02:59, 3.10it/s]
Training 1/1 epoch (loss 2.7292): 56%|ββββββ | 694/1250 [04:14<02:59, 3.10it/s]
Training 1/1 epoch (loss 2.7292): 56%|ββββββ | 695/1250 [04:14<03:13, 2.87it/s]
Training 1/1 epoch (loss 2.9572): 56%|ββββββ | 695/1250 [04:14<03:13, 2.87it/s]
Training 1/1 epoch (loss 2.9572): 56%|ββββββ | 696/1250 [04:14<03:20, 2.76it/s]
Training 1/1 epoch (loss 2.7399): 56%|ββββββ | 696/1250 [04:14<03:20, 2.76it/s]
Training 1/1 epoch (loss 2.7399): 56%|ββββββ | 697/1250 [04:14<03:25, 2.69it/s]
Training 1/1 epoch (loss 2.6775): 56%|ββββββ | 697/1250 [04:15<03:25, 2.69it/s]
Training 1/1 epoch (loss 2.6775): 56%|ββββββ | 698/1250 [04:15<03:24, 2.70it/s]
Training 1/1 epoch (loss 2.8843): 56%|ββββββ | 698/1250 [04:15<03:24, 2.70it/s]
Training 1/1 epoch (loss 2.8843): 56%|ββββββ | 699/1250 [04:15<03:22, 2.72it/s]
Training 1/1 epoch (loss 3.0212): 56%|ββββββ | 699/1250 [04:15<03:22, 2.72it/s]
Training 1/1 epoch (loss 3.0212): 56%|ββββββ | 700/1250 [04:15<03:14, 2.83it/s]
Training 1/1 epoch (loss 2.9040): 56%|ββββββ | 700/1250 [04:16<03:14, 2.83it/s]
Training 1/1 epoch (loss 2.9040): 56%|ββββββ | 701/1250 [04:16<03:14, 2.83it/s]
Training 1/1 epoch (loss 2.7188): 56%|ββββββ | 701/1250 [04:16<03:14, 2.83it/s]
Training 1/1 epoch (loss 2.7188): 56%|ββββββ | 702/1250 [04:16<03:07, 2.93it/s]
Training 1/1 epoch (loss 2.7964): 56%|ββββββ | 702/1250 [04:16<03:07, 2.93it/s]
Training 1/1 epoch (loss 2.7964): 56%|ββββββ | 703/1250 [04:16<03:05, 2.95it/s]
Training 1/1 epoch (loss 2.5746): 56%|ββββββ | 703/1250 [04:17<03:05, 2.95it/s]
Training 1/1 epoch (loss 2.5746): 56%|ββββββ | 704/1250 [04:17<03:10, 2.86it/s]
Training 1/1 epoch (loss 2.7986): 56%|ββββββ | 704/1250 [04:17<03:10, 2.86it/s]
Training 1/1 epoch (loss 2.7986): 56%|ββββββ | 705/1250 [04:17<03:06, 2.93it/s]
Training 1/1 epoch (loss 2.8114): 56%|ββββββ | 705/1250 [04:17<03:06, 2.93it/s]
Training 1/1 epoch (loss 2.8114): 56%|ββββββ | 706/1250 [04:17<03:07, 2.90it/s]
Training 1/1 epoch (loss 2.7998): 56%|ββββββ | 706/1250 [04:18<03:07, 2.90it/s]
Training 1/1 epoch (loss 2.7998): 57%|ββββββ | 707/1250 [04:18<03:19, 2.72it/s]
Training 1/1 epoch (loss 2.8607): 57%|ββββββ | 707/1250 [04:18<03:19, 2.72it/s]
Training 1/1 epoch (loss 2.8607): 57%|ββββββ | 708/1250 [04:18<03:12, 2.81it/s]
Training 1/1 epoch (loss 2.8848): 57%|ββββββ | 708/1250 [04:18<03:12, 2.81it/s]
Training 1/1 epoch (loss 2.8848): 57%|ββββββ | 709/1250 [04:18<03:02, 2.97it/s]
Training 1/1 epoch (loss 2.7264): 57%|ββββββ | 709/1250 [04:19<03:02, 2.97it/s]
Training 1/1 epoch (loss 2.7264): 57%|ββββββ | 710/1250 [04:19<03:00, 2.99it/s]
Training 1/1 epoch (loss 2.8575): 57%|ββββββ | 710/1250 [04:19<03:00, 2.99it/s]
Training 1/1 epoch (loss 2.8575): 57%|ββββββ | 711/1250 [04:19<02:57, 3.04it/s]
Training 1/1 epoch (loss 2.8407): 57%|ββββββ | 711/1250 [04:19<02:57, 3.04it/s]
Training 1/1 epoch (loss 2.8407): 57%|ββββββ | 712/1250 [04:19<02:59, 3.00it/s]
Training 1/1 epoch (loss 2.6309): 57%|ββββββ | 712/1250 [04:20<02:59, 3.00it/s]
Training 1/1 epoch (loss 2.6309): 57%|ββββββ | 713/1250 [04:20<03:01, 2.95it/s]
Training 1/1 epoch (loss 2.9797): 57%|ββββββ | 713/1250 [04:20<03:01, 2.95it/s]
Training 1/1 epoch (loss 2.9797): 57%|ββββββ | 714/1250 [04:20<03:01, 2.95it/s]
Training 1/1 epoch (loss 2.8169): 57%|ββββββ | 714/1250 [04:20<03:01, 2.95it/s]
Training 1/1 epoch (loss 2.8169): 57%|ββββββ | 715/1250 [04:20<02:57, 3.02it/s]
Training 1/1 epoch (loss 2.4872): 57%|ββββββ | 715/1250 [04:21<02:57, 3.02it/s]
Training 1/1 epoch (loss 2.4872): 57%|ββββββ | 716/1250 [04:21<03:02, 2.92it/s]
Training 1/1 epoch (loss 2.8422): 57%|ββββββ | 716/1250 [04:21<03:02, 2.92it/s]
Training 1/1 epoch (loss 2.8422): 57%|ββββββ | 717/1250 [04:21<02:57, 3.00it/s]
Training 1/1 epoch (loss 2.8244): 57%|ββββββ | 717/1250 [04:21<02:57, 3.00it/s]
Training 1/1 epoch (loss 2.8244): 57%|ββββββ | 718/1250 [04:21<02:56, 3.02it/s]
Training 1/1 epoch (loss 2.8057): 57%|ββββββ | 718/1250 [04:22<02:56, 3.02it/s]
Training 1/1 epoch (loss 2.8057): 58%|ββββββ | 719/1250 [04:22<02:59, 2.96it/s]
Training 1/1 epoch (loss 2.9178): 58%|ββββββ | 719/1250 [04:22<02:59, 2.96it/s]
Training 1/1 epoch (loss 2.9178): 58%|ββββββ | 720/1250 [04:22<03:02, 2.90it/s]
Training 1/1 epoch (loss 2.6634): 58%|ββββββ | 720/1250 [04:23<03:02, 2.90it/s]
Training 1/1 epoch (loss 2.6634): 58%|ββββββ | 721/1250 [04:23<03:03, 2.88it/s]
Training 1/1 epoch (loss 2.6638): 58%|ββββββ | 721/1250 [04:23<03:03, 2.88it/s]
Training 1/1 epoch (loss 2.6638): 58%|ββββββ | 722/1250 [04:23<03:52, 2.27it/s]
Training 1/1 epoch (loss 2.5081): 58%|ββββββ | 722/1250 [04:24<03:52, 2.27it/s]
Training 1/1 epoch (loss 2.5081): 58%|ββββββ | 723/1250 [04:24<03:41, 2.38it/s]
Training 1/1 epoch (loss 2.6696): 58%|ββββββ | 723/1250 [04:24<03:41, 2.38it/s]
Training 1/1 epoch (loss 2.6696): 58%|ββββββ | 724/1250 [04:24<03:27, 2.53it/s]
Training 1/1 epoch (loss 3.0889): 58%|ββββββ | 724/1250 [04:24<03:27, 2.53it/s]
Training 1/1 epoch (loss 3.0889): 58%|ββββββ | 725/1250 [04:24<03:22, 2.60it/s]
Training 1/1 epoch (loss 2.6679): 58%|ββββββ | 725/1250 [04:25<03:22, 2.60it/s]
Training 1/1 epoch (loss 2.6679): 58%|ββββββ | 726/1250 [04:25<03:14, 2.70it/s]
Training 1/1 epoch (loss 2.6324): 58%|ββββββ | 726/1250 [04:25<03:14, 2.70it/s]
Training 1/1 epoch (loss 2.6324): 58%|ββββββ | 727/1250 [04:25<03:06, 2.80it/s]
Training 1/1 epoch (loss 2.6400): 58%|ββββββ | 727/1250 [04:25<03:06, 2.80it/s]
Training 1/1 epoch (loss 2.6400): 58%|ββββββ | 728/1250 [04:25<03:14, 2.69it/s]
Training 1/1 epoch (loss 2.8310): 58%|ββββββ | 728/1250 [04:26<03:14, 2.69it/s]
Training 1/1 epoch (loss 2.8310): 58%|ββββββ | 729/1250 [04:26<03:12, 2.70it/s]
Training 1/1 epoch (loss 2.8372): 58%|ββββββ | 729/1250 [04:26<03:12, 2.70it/s]
Training 1/1 epoch (loss 2.8372): 58%|ββββββ | 730/1250 [04:26<03:07, 2.77it/s]
Training 1/1 epoch (loss 2.7676): 58%|ββββββ | 730/1250 [04:26<03:07, 2.77it/s]
Training 1/1 epoch (loss 2.7676): 58%|ββββββ | 731/1250 [04:26<03:06, 2.78it/s]
Training 1/1 epoch (loss 2.4912): 58%|ββββββ | 731/1250 [04:27<03:06, 2.78it/s]
Training 1/1 epoch (loss 2.4912): 59%|ββββββ | 732/1250 [04:27<02:56, 2.93it/s]
Training 1/1 epoch (loss 2.6205): 59%|ββββββ | 732/1250 [04:27<02:56, 2.93it/s]
Training 1/1 epoch (loss 2.6205): 59%|ββββββ | 733/1250 [04:27<02:55, 2.94it/s]
Training 1/1 epoch (loss 2.7251): 59%|ββββββ | 733/1250 [04:27<02:55, 2.94it/s]
Training 1/1 epoch (loss 2.7251): 59%|ββββββ | 734/1250 [04:27<02:55, 2.94it/s]
Training 1/1 epoch (loss 2.6939): 59%|ββββββ | 734/1250 [04:28<02:55, 2.94it/s]
Training 1/1 epoch (loss 2.6939): 59%|ββββββ | 735/1250 [04:28<03:11, 2.69it/s]
Training 1/1 epoch (loss 2.9324): 59%|ββββββ | 735/1250 [04:28<03:11, 2.69it/s]
Training 1/1 epoch (loss 2.9324): 59%|ββββββ | 736/1250 [04:28<03:26, 2.49it/s]
Training 1/1 epoch (loss 2.5819): 59%|ββββββ | 736/1250 [04:29<03:26, 2.49it/s]
Training 1/1 epoch (loss 2.5819): 59%|ββββββ | 737/1250 [04:29<03:30, 2.44it/s]
Training 1/1 epoch (loss 2.7011): 59%|ββββββ | 737/1250 [04:29<03:30, 2.44it/s]
Training 1/1 epoch (loss 2.7011): 59%|ββββββ | 738/1250 [04:29<03:21, 2.55it/s]
Training 1/1 epoch (loss 2.7576): 59%|ββββββ | 738/1250 [04:29<03:21, 2.55it/s]
Training 1/1 epoch (loss 2.7576): 59%|ββββββ | 739/1250 [04:29<03:12, 2.66it/s]
Training 1/1 epoch (loss 2.6903): 59%|ββββββ | 739/1250 [04:30<03:12, 2.66it/s]
Training 1/1 epoch (loss 2.6903): 59%|ββββββ | 740/1250 [04:30<03:05, 2.75it/s]
Training 1/1 epoch (loss 2.6679): 59%|ββββββ | 740/1250 [04:30<03:05, 2.75it/s]
Training 1/1 epoch (loss 2.6679): 59%|ββββββ | 741/1250 [04:30<02:59, 2.83it/s]
Training 1/1 epoch (loss 2.8213): 59%|ββββββ | 741/1250 [04:30<02:59, 2.83it/s]
Training 1/1 epoch (loss 2.8213): 59%|ββββββ | 742/1250 [04:30<02:54, 2.91it/s]
Training 1/1 epoch (loss 2.7489): 59%|ββββββ | 742/1250 [04:31<02:54, 2.91it/s]
Training 1/1 epoch (loss 2.7489): 59%|ββββββ | 743/1250 [04:31<02:54, 2.90it/s]
Training 1/1 epoch (loss 2.8107): 59%|ββββββ | 743/1250 [04:31<02:54, 2.90it/s]
Training 1/1 epoch (loss 2.8107): 60%|ββββββ | 744/1250 [04:31<02:56, 2.87it/s]
Training 1/1 epoch (loss 2.7591): 60%|ββββββ | 744/1250 [04:31<02:56, 2.87it/s]
Training 1/1 epoch (loss 2.7591): 60%|ββββββ | 745/1250 [04:31<02:59, 2.81it/s]
Training 1/1 epoch (loss 2.8344): 60%|ββββββ | 745/1250 [04:32<02:59, 2.81it/s]
Training 1/1 epoch (loss 2.8344): 60%|ββββββ | 746/1250 [04:32<03:01, 2.78it/s]
Training 1/1 epoch (loss 2.5994): 60%|ββββββ | 746/1250 [04:32<03:01, 2.78it/s]
Training 1/1 epoch (loss 2.5994): 60%|ββββββ | 747/1250 [04:32<02:59, 2.80it/s]
Training 1/1 epoch (loss 2.7995): 60%|ββββββ | 747/1250 [04:33<02:59, 2.80it/s]
Training 1/1 epoch (loss 2.7995): 60%|ββββββ | 748/1250 [04:33<02:57, 2.83it/s]
Training 1/1 epoch (loss 2.5950): 60%|ββββββ | 748/1250 [04:33<02:57, 2.83it/s]
Training 1/1 epoch (loss 2.5950): 60%|ββββββ | 749/1250 [04:33<02:52, 2.91it/s]
Training 1/1 epoch (loss 2.7940): 60%|ββββββ | 749/1250 [04:33<02:52, 2.91it/s]
Training 1/1 epoch (loss 2.7940): 60%|ββββββ | 750/1250 [04:33<02:49, 2.95it/s]
Training 1/1 epoch (loss 2.5636): 60%|ββββββ | 750/1250 [04:34<02:49, 2.95it/s]
Training 1/1 epoch (loss 2.5636): 60%|ββββββ | 751/1250 [04:34<02:48, 2.96it/s]
Training 1/1 epoch (loss 2.8083): 60%|ββββββ | 751/1250 [04:34<02:48, 2.96it/s]
Training 1/1 epoch (loss 2.8083): 60%|ββββββ | 752/1250 [04:34<03:07, 2.65it/s]
Training 1/1 epoch (loss 2.9822): 60%|ββββββ | 752/1250 [04:34<03:07, 2.65it/s]
Training 1/1 epoch (loss 2.9822): 60%|ββββββ | 753/1250 [04:34<03:04, 2.69it/s]
Training 1/1 epoch (loss 2.9166): 60%|ββββββ | 753/1250 [04:35<03:04, 2.69it/s]
Training 1/1 epoch (loss 2.9166): 60%|ββββββ | 754/1250 [04:35<02:58, 2.79it/s]
Training 1/1 epoch (loss 2.7191): 60%|ββββββ | 754/1250 [04:35<02:58, 2.79it/s]
Training 1/1 epoch (loss 2.7191): 60%|ββββββ | 755/1250 [04:35<02:53, 2.85it/s]
Training 1/1 epoch (loss 2.9729): 60%|ββββββ | 755/1250 [04:35<02:53, 2.85it/s]
Training 1/1 epoch (loss 2.9729): 60%|ββββββ | 756/1250 [04:35<02:50, 2.89it/s]
Training 1/1 epoch (loss 2.7689): 60%|ββββββ | 756/1250 [04:36<02:50, 2.89it/s]
Training 1/1 epoch (loss 2.7689): 61%|ββββββ | 757/1250 [04:36<02:51, 2.87it/s]
Training 1/1 epoch (loss 2.6778): 61%|ββββββ | 757/1250 [04:36<02:51, 2.87it/s]
Training 1/1 epoch (loss 2.6778): 61%|ββββββ | 758/1250 [04:36<02:48, 2.91it/s]
Training 1/1 epoch (loss 2.7340): 61%|ββββββ | 758/1250 [04:36<02:48, 2.91it/s]
Training 1/1 epoch (loss 2.7340): 61%|ββββββ | 759/1250 [04:36<02:48, 2.92it/s]
Training 1/1 epoch (loss 2.7932): 61%|ββββββ | 759/1250 [04:37<02:48, 2.92it/s]
Training 1/1 epoch (loss 2.7932): 61%|ββββββ | 760/1250 [04:37<02:46, 2.95it/s]
Training 1/1 epoch (loss 2.8299): 61%|ββββββ | 760/1250 [04:37<02:46, 2.95it/s]
Training 1/1 epoch (loss 2.8299): 61%|ββββββ | 761/1250 [04:37<02:45, 2.95it/s]
Training 1/1 epoch (loss 2.6074): 61%|ββββββ | 761/1250 [04:37<02:45, 2.95it/s]
Training 1/1 epoch (loss 2.6074): 61%|ββββββ | 762/1250 [04:37<02:46, 2.93it/s]
Training 1/1 epoch (loss 2.6011): 61%|ββββββ | 762/1250 [04:38<02:46, 2.93it/s]
Training 1/1 epoch (loss 2.6011): 61%|ββββββ | 763/1250 [04:38<02:45, 2.94it/s]
Training 1/1 epoch (loss 2.7645): 61%|ββββββ | 763/1250 [04:38<02:45, 2.94it/s]
Training 1/1 epoch (loss 2.7645): 61%|ββββββ | 764/1250 [04:38<02:39, 3.05it/s]
Training 1/1 epoch (loss 2.7985): 61%|ββββββ | 764/1250 [04:38<02:39, 3.05it/s]
Training 1/1 epoch (loss 2.7985): 61%|ββββββ | 765/1250 [04:38<02:44, 2.94it/s]
Training 1/1 epoch (loss 2.5274): 61%|ββββββ | 765/1250 [04:39<02:44, 2.94it/s]
Training 1/1 epoch (loss 2.5274): 61%|βββββββ | 766/1250 [04:39<02:43, 2.97it/s]
Training 1/1 epoch (loss 2.5422): 61%|βββββββ | 766/1250 [04:39<02:43, 2.97it/s]
Training 1/1 epoch (loss 2.5422): 61%|βββββββ | 767/1250 [04:39<02:44, 2.93it/s]
Training 1/1 epoch (loss 2.7259): 61%|βββββββ | 767/1250 [04:39<02:44, 2.93it/s]
Training 1/1 epoch (loss 2.7259): 61%|βββββββ | 768/1250 [04:39<02:49, 2.84it/s]
Training 1/1 epoch (loss 2.9017): 61%|βββββββ | 768/1250 [04:40<02:49, 2.84it/s]
Training 1/1 epoch (loss 2.9017): 62%|βββββββ | 769/1250 [04:40<02:46, 2.89it/s]
Training 1/1 epoch (loss 2.7449): 62%|βββββββ | 769/1250 [04:40<02:46, 2.89it/s]
Training 1/1 epoch (loss 2.7449): 62%|βββββββ | 770/1250 [04:40<02:48, 2.85it/s]
Training 1/1 epoch (loss 2.8049): 62%|βββββββ | 770/1250 [04:41<02:48, 2.85it/s]
Training 1/1 epoch (loss 2.8049): 62%|βββββββ | 771/1250 [04:41<02:48, 2.85it/s]
Training 1/1 epoch (loss 2.8361): 62%|βββββββ | 771/1250 [04:41<02:48, 2.85it/s]
Training 1/1 epoch (loss 2.8361): 62%|βββββββ | 772/1250 [04:41<02:42, 2.94it/s]
Training 1/1 epoch (loss 2.6996): 62%|βββββββ | 772/1250 [04:41<02:42, 2.94it/s]
Training 1/1 epoch (loss 2.6996): 62%|βββββββ | 773/1250 [04:41<02:39, 2.99it/s]
Training 1/1 epoch (loss 2.6951): 62%|βββββββ | 773/1250 [04:41<02:39, 2.99it/s]
Training 1/1 epoch (loss 2.6951): 62%|βββββββ | 774/1250 [04:41<02:38, 3.01it/s]
Training 1/1 epoch (loss 2.7918): 62%|βββββββ | 774/1250 [04:42<02:38, 3.01it/s]
Training 1/1 epoch (loss 2.7918): 62%|βββββββ | 775/1250 [04:42<02:38, 3.00it/s]
Training 1/1 epoch (loss 2.6249): 62%|βββββββ | 775/1250 [04:42<02:38, 3.00it/s]
Training 1/1 epoch (loss 2.6249): 62%|βββββββ | 776/1250 [04:42<02:40, 2.95it/s]
Training 1/1 epoch (loss 2.7974): 62%|βββββββ | 776/1250 [04:43<02:40, 2.95it/s]
Training 1/1 epoch (loss 2.7974): 62%|βββββββ | 777/1250 [04:43<02:44, 2.88it/s]
Training 1/1 epoch (loss 2.8290): 62%|βββββββ | 777/1250 [04:43<02:44, 2.88it/s]
Training 1/1 epoch (loss 2.8290): 62%|βββββββ | 778/1250 [04:43<02:37, 3.00it/s]
Training 1/1 epoch (loss 2.6965): 62%|βββββββ | 778/1250 [04:43<02:37, 3.00it/s]
Training 1/1 epoch (loss 2.6965): 62%|βββββββ | 779/1250 [04:43<02:36, 3.01it/s]
Training 1/1 epoch (loss 2.9995): 62%|βββββββ | 779/1250 [04:43<02:36, 3.01it/s]
Training 1/1 epoch (loss 2.9995): 62%|βββββββ | 780/1250 [04:43<02:33, 3.06it/s]
Training 1/1 epoch (loss 2.5981): 62%|βββββββ | 780/1250 [04:44<02:33, 3.06it/s]
Training 1/1 epoch (loss 2.5981): 62%|βββββββ | 781/1250 [04:44<02:40, 2.93it/s]
Training 1/1 epoch (loss 2.8203): 62%|βββββββ | 781/1250 [04:44<02:40, 2.93it/s]
Training 1/1 epoch (loss 2.8203): 63%|βββββββ | 782/1250 [04:44<02:49, 2.76it/s]
Training 1/1 epoch (loss 2.7977): 63%|βββββββ | 782/1250 [04:45<02:49, 2.76it/s]
Training 1/1 epoch (loss 2.7977): 63%|βββββββ | 783/1250 [04:45<02:52, 2.71it/s]
Training 1/1 epoch (loss 2.5840): 63%|βββββββ | 783/1250 [04:45<02:52, 2.71it/s]
Training 1/1 epoch (loss 2.5840): 63%|βββββββ | 784/1250 [04:45<02:46, 2.80it/s]
Training 1/1 epoch (loss 2.9107): 63%|βββββββ | 784/1250 [04:45<02:46, 2.80it/s]
Training 1/1 epoch (loss 2.9107): 63%|βββββββ | 785/1250 [04:45<02:43, 2.84it/s]
Training 1/1 epoch (loss 2.8426): 63%|βββββββ | 785/1250 [04:46<02:43, 2.84it/s]
Training 1/1 epoch (loss 2.8426): 63%|βββββββ | 786/1250 [04:46<02:44, 2.82it/s]
Training 1/1 epoch (loss 2.8196): 63%|βββββββ | 786/1250 [04:46<02:44, 2.82it/s]
Training 1/1 epoch (loss 2.8196): 63%|βββββββ | 787/1250 [04:46<03:43, 2.07it/s]
Training 1/1 epoch (loss 2.5697): 63%|βββββββ | 787/1250 [04:47<03:43, 2.07it/s]
Training 1/1 epoch (loss 2.5697): 63%|βββββββ | 788/1250 [04:47<03:25, 2.24it/s]
Training 1/1 epoch (loss 2.5427): 63%|βββββββ | 788/1250 [04:47<03:25, 2.24it/s]
Training 1/1 epoch (loss 2.5427): 63%|βββββββ | 789/1250 [04:47<03:04, 2.49it/s]
Training 1/1 epoch (loss 2.6282): 63%|βββββββ | 789/1250 [04:47<03:04, 2.49it/s]
Training 1/1 epoch (loss 2.6282): 63%|βββββββ | 790/1250 [04:47<02:56, 2.60it/s]
Training 1/1 epoch (loss 2.7595): 63%|βββββββ | 790/1250 [04:48<02:56, 2.60it/s]
Training 1/1 epoch (loss 2.7595): 63%|βββββββ | 791/1250 [04:48<02:51, 2.68it/s]
Training 1/1 epoch (loss 2.7224): 63%|βββββββ | 791/1250 [04:48<02:51, 2.68it/s]
Training 1/1 epoch (loss 2.7224): 63%|βββββββ | 792/1250 [04:48<02:45, 2.77it/s]
Training 1/1 epoch (loss 2.5571): 63%|βββββββ | 792/1250 [04:48<02:45, 2.77it/s]
Training 1/1 epoch (loss 2.5571): 63%|βββββββ | 793/1250 [04:48<02:44, 2.78it/s]
Training 1/1 epoch (loss 2.7249): 63%|βββββββ | 793/1250 [04:49<02:44, 2.78it/s]
Training 1/1 epoch (loss 2.7249): 64%|βββββββ | 794/1250 [04:49<02:44, 2.78it/s]
Training 1/1 epoch (loss 2.8927): 64%|βββββββ | 794/1250 [04:49<02:44, 2.78it/s]
Training 1/1 epoch (loss 2.8927): 64%|βββββββ | 795/1250 [04:49<02:36, 2.91it/s]
Training 1/1 epoch (loss 2.6904): 64%|βββββββ | 795/1250 [04:49<02:36, 2.91it/s]
Training 1/1 epoch (loss 2.6904): 64%|βββββββ | 796/1250 [04:49<02:30, 3.01it/s]
Training 1/1 epoch (loss 2.6662): 64%|βββββββ | 796/1250 [04:50<02:30, 3.01it/s]
Training 1/1 epoch (loss 2.6662): 64%|βββββββ | 797/1250 [04:50<02:33, 2.95it/s]
Training 1/1 epoch (loss 2.8839): 64%|βββββββ | 797/1250 [04:50<02:33, 2.95it/s]
Training 1/1 epoch (loss 2.8839): 64%|βββββββ | 798/1250 [04:50<03:00, 2.51it/s]
Training 1/1 epoch (loss 2.8420): 64%|βββββββ | 798/1250 [04:51<03:00, 2.51it/s]
Training 1/1 epoch (loss 2.8420): 64%|βββββββ | 799/1250 [04:51<02:51, 2.63it/s]
Training 1/1 epoch (loss 2.7810): 64%|βββββββ | 799/1250 [04:51<02:51, 2.63it/s]
Training 1/1 epoch (loss 2.7810): 64%|βββββββ | 800/1250 [04:51<02:48, 2.67it/s]
Training 1/1 epoch (loss 2.7131): 64%|βββββββ | 800/1250 [04:51<02:48, 2.67it/s]
Training 1/1 epoch (loss 2.7131): 64%|βββββββ | 801/1250 [04:51<02:42, 2.76it/s]
Training 1/1 epoch (loss 2.8420): 64%|βββββββ | 801/1250 [04:52<02:42, 2.76it/s]
Training 1/1 epoch (loss 2.8420): 64%|βββββββ | 802/1250 [04:52<02:39, 2.81it/s]
Training 1/1 epoch (loss 2.7272): 64%|βββββββ | 802/1250 [04:52<02:39, 2.81it/s]
Training 1/1 epoch (loss 2.7272): 64%|βββββββ | 803/1250 [04:52<02:34, 2.90it/s]
Training 1/1 epoch (loss 2.6088): 64%|βββββββ | 803/1250 [04:52<02:34, 2.90it/s]
Training 1/1 epoch (loss 2.6088): 64%|βββββββ | 804/1250 [04:52<02:31, 2.95it/s]
Training 1/1 epoch (loss 2.7235): 64%|βββββββ | 804/1250 [04:53<02:31, 2.95it/s]
Training 1/1 epoch (loss 2.7235): 64%|βββββββ | 805/1250 [04:53<02:30, 2.95it/s]
Training 1/1 epoch (loss 2.7137): 64%|βββββββ | 805/1250 [04:53<02:30, 2.95it/s]
Training 1/1 epoch (loss 2.7137): 64%|βββββββ | 806/1250 [04:53<03:05, 2.40it/s]
Training 1/1 epoch (loss 2.7162): 64%|βββββββ | 806/1250 [04:54<03:05, 2.40it/s]
Training 1/1 epoch (loss 2.7162): 65%|βββββββ | 807/1250 [04:54<02:57, 2.49it/s]
Training 1/1 epoch (loss 2.6596): 65%|βββββββ | 807/1250 [04:54<02:57, 2.49it/s]
Training 1/1 epoch (loss 2.6596): 65%|βββββββ | 808/1250 [04:54<02:54, 2.53it/s]
Training 1/1 epoch (loss 2.6618): 65%|βββββββ | 808/1250 [04:54<02:54, 2.53it/s]
Training 1/1 epoch (loss 2.6618): 65%|βββββββ | 809/1250 [04:54<02:56, 2.49it/s]
Training 1/1 epoch (loss 2.8156): 65%|βββββββ | 809/1250 [04:55<02:56, 2.49it/s]
Training 1/1 epoch (loss 2.8156): 65%|βββββββ | 810/1250 [04:55<02:49, 2.59it/s]
Training 1/1 epoch (loss 2.7768): 65%|βββββββ | 810/1250 [04:55<02:49, 2.59it/s]
Training 1/1 epoch (loss 2.7768): 65%|βββββββ | 811/1250 [04:55<02:42, 2.71it/s]
Training 1/1 epoch (loss 2.7296): 65%|βββββββ | 811/1250 [04:55<02:42, 2.71it/s]
Training 1/1 epoch (loss 2.7296): 65%|βββββββ | 812/1250 [04:55<02:38, 2.76it/s]
Training 1/1 epoch (loss 2.6777): 65%|βββββββ | 812/1250 [04:56<02:38, 2.76it/s]
Training 1/1 epoch (loss 2.6777): 65%|βββββββ | 813/1250 [04:56<02:32, 2.87it/s]
Training 1/1 epoch (loss 2.7758): 65%|βββββββ | 813/1250 [04:56<02:32, 2.87it/s]
Training 1/1 epoch (loss 2.7758): 65%|βββββββ | 814/1250 [04:56<02:43, 2.66it/s]
Training 1/1 epoch (loss 2.8336): 65%|βββββββ | 814/1250 [04:57<02:43, 2.66it/s]
Training 1/1 epoch (loss 2.8336): 65%|βββββββ | 815/1250 [04:57<02:40, 2.71it/s]
Training 1/1 epoch (loss 2.8850): 65%|βββββββ | 815/1250 [04:57<02:40, 2.71it/s]
Training 1/1 epoch (loss 2.8850): 65%|βββββββ | 816/1250 [04:57<02:34, 2.80it/s]
Training 1/1 epoch (loss 2.8073): 65%|βββββββ | 816/1250 [04:57<02:34, 2.80it/s]
Training 1/1 epoch (loss 2.8073): 65%|βββββββ | 817/1250 [04:57<02:31, 2.86it/s]
Training 1/1 epoch (loss 2.6364): 65%|βββββββ | 817/1250 [04:58<02:31, 2.86it/s]
Training 1/1 epoch (loss 2.6364): 65%|βββββββ | 818/1250 [04:58<02:52, 2.50it/s]
Training 1/1 epoch (loss 2.9309): 65%|βββββββ | 818/1250 [04:58<02:52, 2.50it/s]
Training 1/1 epoch (loss 2.9309): 66%|βββββββ | 819/1250 [04:58<02:54, 2.47it/s]
Training 1/1 epoch (loss 2.6973): 66%|βββββββ | 819/1250 [04:59<02:54, 2.47it/s]
Training 1/1 epoch (loss 2.6973): 66%|βββββββ | 820/1250 [04:59<03:02, 2.36it/s]
Training 1/1 epoch (loss 2.6018): 66%|βββββββ | 820/1250 [04:59<03:02, 2.36it/s]
Training 1/1 epoch (loss 2.6018): 66%|βββββββ | 821/1250 [04:59<02:56, 2.43it/s]
Training 1/1 epoch (loss 2.5811): 66%|βββββββ | 821/1250 [04:59<02:56, 2.43it/s]
Training 1/1 epoch (loss 2.5811): 66%|βββββββ | 822/1250 [04:59<02:48, 2.54it/s]
Training 1/1 epoch (loss 2.6145): 66%|βββββββ | 822/1250 [05:00<02:48, 2.54it/s]
Training 1/1 epoch (loss 2.6145): 66%|βββββββ | 823/1250 [05:00<02:44, 2.59it/s]
Training 1/1 epoch (loss 2.6305): 66%|βββββββ | 823/1250 [05:00<02:44, 2.59it/s]
Training 1/1 epoch (loss 2.6305): 66%|βββββββ | 824/1250 [05:00<02:44, 2.58it/s]
Training 1/1 epoch (loss 2.7132): 66%|βββββββ | 824/1250 [05:01<02:44, 2.58it/s]
Training 1/1 epoch (loss 2.7132): 66%|βββββββ | 825/1250 [05:01<02:38, 2.68it/s]
Training 1/1 epoch (loss 2.8757): 66%|βββββββ | 825/1250 [05:01<02:38, 2.68it/s]
Training 1/1 epoch (loss 2.8757): 66%|βββββββ | 826/1250 [05:01<02:31, 2.80it/s]
Training 1/1 epoch (loss 2.7950): 66%|βββββββ | 826/1250 [05:01<02:31, 2.80it/s]
Training 1/1 epoch (loss 2.7950): 66%|βββββββ | 827/1250 [05:01<02:32, 2.78it/s]
Training 1/1 epoch (loss 2.6937): 66%|βββββββ | 827/1250 [05:02<02:32, 2.78it/s]
Training 1/1 epoch (loss 2.6937): 66%|βββββββ | 828/1250 [05:02<02:27, 2.87it/s]
Training 1/1 epoch (loss 2.7467): 66%|βββββββ | 828/1250 [05:02<02:27, 2.87it/s]
Training 1/1 epoch (loss 2.7467): 66%|βββββββ | 829/1250 [05:02<02:26, 2.87it/s]
Training 1/1 epoch (loss 2.6299): 66%|βββββββ | 829/1250 [05:02<02:26, 2.87it/s]
Training 1/1 epoch (loss 2.6299): 66%|βββββββ | 830/1250 [05:02<02:29, 2.80it/s]
Training 1/1 epoch (loss 2.7127): 66%|βββββββ | 830/1250 [05:03<02:29, 2.80it/s]
Training 1/1 epoch (loss 2.7127): 66%|βββββββ | 831/1250 [05:03<02:24, 2.90it/s]
Training 1/1 epoch (loss 2.8268): 66%|βββββββ | 831/1250 [05:03<02:24, 2.90it/s]
Training 1/1 epoch (loss 2.8268): 67%|βββββββ | 832/1250 [05:03<02:30, 2.77it/s]
Training 1/1 epoch (loss 2.6800): 67%|βββββββ | 832/1250 [05:03<02:30, 2.77it/s]
Training 1/1 epoch (loss 2.6800): 67%|βββββββ | 833/1250 [05:03<02:25, 2.86it/s]
Training 1/1 epoch (loss 2.8238): 67%|βββββββ | 833/1250 [05:04<02:25, 2.86it/s]
Training 1/1 epoch (loss 2.8238): 67%|βββββββ | 834/1250 [05:04<02:27, 2.82it/s]
Training 1/1 epoch (loss 2.9829): 67%|βββββββ | 834/1250 [05:04<02:27, 2.82it/s]
Training 1/1 epoch (loss 2.9829): 67%|βββββββ | 835/1250 [05:04<02:33, 2.71it/s]
Training 1/1 epoch (loss 2.7706): 67%|βββββββ | 835/1250 [05:04<02:33, 2.71it/s]
Training 1/1 epoch (loss 2.7706): 67%|βββββββ | 836/1250 [05:04<02:27, 2.81it/s]
Training 1/1 epoch (loss 2.7210): 67%|βββββββ | 836/1250 [05:05<02:27, 2.81it/s]
Training 1/1 epoch (loss 2.7210): 67%|βββββββ | 837/1250 [05:05<02:21, 2.93it/s]
Training 1/1 epoch (loss 2.8198): 67%|βββββββ | 837/1250 [05:05<02:21, 2.93it/s]
Training 1/1 epoch (loss 2.8198): 67%|βββββββ | 838/1250 [05:05<02:19, 2.95it/s]
Training 1/1 epoch (loss 2.8230): 67%|βββββββ | 838/1250 [05:05<02:19, 2.95it/s]
Training 1/1 epoch (loss 2.8230): 67%|βββββββ | 839/1250 [05:05<02:25, 2.83it/s]
Training 1/1 epoch (loss 2.5288): 67%|βββββββ | 839/1250 [05:06<02:25, 2.83it/s]
Training 1/1 epoch (loss 2.5288): 67%|βββββββ | 840/1250 [05:06<02:21, 2.90it/s]
Training 1/1 epoch (loss 2.7285): 67%|βββββββ | 840/1250 [05:06<02:21, 2.90it/s]
Training 1/1 epoch (loss 2.7285): 67%|βββββββ | 841/1250 [05:06<02:26, 2.79it/s]
Training 1/1 epoch (loss 2.8483): 67%|βββββββ | 841/1250 [05:06<02:26, 2.79it/s]
Training 1/1 epoch (loss 2.8483): 67%|βββββββ | 842/1250 [05:06<02:28, 2.75it/s]
Training 1/1 epoch (loss 2.6609): 67%|βββββββ | 842/1250 [05:07<02:28, 2.75it/s]
Training 1/1 epoch (loss 2.6609): 67%|βββββββ | 843/1250 [05:07<02:20, 2.89it/s]
Training 1/1 epoch (loss 2.8179): 67%|βββββββ | 843/1250 [05:07<02:20, 2.89it/s]
Training 1/1 epoch (loss 2.8179): 68%|βββββββ | 844/1250 [05:07<02:23, 2.83it/s]
Training 1/1 epoch (loss 2.7405): 68%|βββββββ | 844/1250 [05:08<02:23, 2.83it/s]
Training 1/1 epoch (loss 2.7405): 68%|βββββββ | 845/1250 [05:08<02:21, 2.86it/s]
Training 1/1 epoch (loss 2.8293): 68%|βββββββ | 845/1250 [05:08<02:21, 2.86it/s]
Training 1/1 epoch (loss 2.8293): 68%|βββββββ | 846/1250 [05:08<02:16, 2.96it/s]
Training 1/1 epoch (loss 2.6338): 68%|βββββββ | 846/1250 [05:08<02:16, 2.96it/s]
Training 1/1 epoch (loss 2.6338): 68%|βββββββ | 847/1250 [05:08<02:13, 3.02it/s]
Training 1/1 epoch (loss 2.7195): 68%|βββββββ | 847/1250 [05:08<02:13, 3.02it/s]
Training 1/1 epoch (loss 2.7195): 68%|βββββββ | 848/1250 [05:08<02:13, 3.01it/s]
Training 1/1 epoch (loss 2.7580): 68%|βββββββ | 848/1250 [05:09<02:13, 3.01it/s]
Training 1/1 epoch (loss 2.7580): 68%|βββββββ | 849/1250 [05:09<02:16, 2.93it/s]
Training 1/1 epoch (loss 2.8976): 68%|βββββββ | 849/1250 [05:09<02:16, 2.93it/s]
Training 1/1 epoch (loss 2.8976): 68%|βββββββ | 850/1250 [05:09<02:15, 2.95it/s]
Training 1/1 epoch (loss 2.6515): 68%|βββββββ | 850/1250 [05:09<02:15, 2.95it/s]
Training 1/1 epoch (loss 2.6515): 68%|βββββββ | 851/1250 [05:09<02:14, 2.96it/s]
Training 1/1 epoch (loss 2.5806): 68%|βββββββ | 851/1250 [05:10<02:14, 2.96it/s]
Training 1/1 epoch (loss 2.5806): 68%|βββββββ | 852/1250 [05:10<02:11, 3.02it/s]
Training 1/1 epoch (loss 2.9495): 68%|βββββββ | 852/1250 [05:10<02:11, 3.02it/s]
Training 1/1 epoch (loss 2.9495): 68%|βββββββ | 853/1250 [05:10<02:09, 3.07it/s]
Training 1/1 epoch (loss 2.6416): 68%|βββββββ | 853/1250 [05:10<02:09, 3.07it/s]
Training 1/1 epoch (loss 2.6416): 68%|βββββββ | 854/1250 [05:10<02:14, 2.94it/s]
Training 1/1 epoch (loss 2.8636): 68%|βββββββ | 854/1250 [05:11<02:14, 2.94it/s]
Training 1/1 epoch (loss 2.8636): 68%|βββββββ | 855/1250 [05:11<02:10, 3.03it/s]
Training 1/1 epoch (loss 2.4451): 68%|βββββββ | 855/1250 [05:11<02:10, 3.03it/s]
Training 1/1 epoch (loss 2.4451): 68%|βββββββ | 856/1250 [05:11<02:13, 2.95it/s]
Training 1/1 epoch (loss 2.7285): 68%|βββββββ | 856/1250 [05:12<02:13, 2.95it/s]
Training 1/1 epoch (loss 2.7285): 69%|βββββββ | 857/1250 [05:12<02:18, 2.83it/s]
Training 1/1 epoch (loss 2.6827): 69%|βββββββ | 857/1250 [05:12<02:18, 2.83it/s]
Training 1/1 epoch (loss 2.6827): 69%|βββββββ | 858/1250 [05:12<02:13, 2.95it/s]
Training 1/1 epoch (loss 2.6798): 69%|βββββββ | 858/1250 [05:12<02:13, 2.95it/s]
Training 1/1 epoch (loss 2.6798): 69%|βββββββ | 859/1250 [05:12<02:14, 2.90it/s]
Training 1/1 epoch (loss 2.7341): 69%|βββββββ | 859/1250 [05:13<02:14, 2.90it/s]
Training 1/1 epoch (loss 2.7341): 69%|βββββββ | 860/1250 [05:13<02:12, 2.93it/s]
Training 1/1 epoch (loss 2.8041): 69%|βββββββ | 860/1250 [05:13<02:12, 2.93it/s]
Training 1/1 epoch (loss 2.8041): 69%|βββββββ | 861/1250 [05:13<02:10, 2.97it/s]
Training 1/1 epoch (loss 2.7210): 69%|βββββββ | 861/1250 [05:13<02:10, 2.97it/s]
Training 1/1 epoch (loss 2.7210): 69%|βββββββ | 862/1250 [05:13<02:10, 2.96it/s]
Training 1/1 epoch (loss 2.7602): 69%|βββββββ | 862/1250 [05:14<02:10, 2.96it/s]
Training 1/1 epoch (loss 2.7602): 69%|βββββββ | 863/1250 [05:14<02:12, 2.92it/s]
Training 1/1 epoch (loss 2.7064): 69%|βββββββ | 863/1250 [05:14<02:12, 2.92it/s]
Training 1/1 epoch (loss 2.7064): 69%|βββββββ | 864/1250 [05:14<02:14, 2.87it/s]
Training 1/1 epoch (loss 2.6887): 69%|βββββββ | 864/1250 [05:14<02:14, 2.87it/s]
Training 1/1 epoch (loss 2.6887): 69%|βββββββ | 865/1250 [05:14<02:18, 2.78it/s]
Training 1/1 epoch (loss 2.6178): 69%|βββββββ | 865/1250 [05:15<02:18, 2.78it/s]
Training 1/1 epoch (loss 2.6178): 69%|βββββββ | 866/1250 [05:15<02:20, 2.74it/s]
Training 1/1 epoch (loss 2.7494): 69%|βββββββ | 866/1250 [05:15<02:20, 2.74it/s]
Training 1/1 epoch (loss 2.7494): 69%|βββββββ | 867/1250 [05:15<02:14, 2.85it/s]
Training 1/1 epoch (loss 2.9886): 69%|βββββββ | 867/1250 [05:15<02:14, 2.85it/s]
Training 1/1 epoch (loss 2.9886): 69%|βββββββ | 868/1250 [05:15<02:08, 2.98it/s]
Training 1/1 epoch (loss 2.5916): 69%|βββββββ | 868/1250 [05:16<02:08, 2.98it/s]
Training 1/1 epoch (loss 2.5916): 70%|βββββββ | 869/1250 [05:16<02:11, 2.89it/s]
Training 1/1 epoch (loss 2.5591): 70%|βββββββ | 869/1250 [05:16<02:11, 2.89it/s]
Training 1/1 epoch (loss 2.5591): 70%|βββββββ | 870/1250 [05:16<02:08, 2.95it/s]
Training 1/1 epoch (loss 2.6214): 70%|βββββββ | 870/1250 [05:16<02:08, 2.95it/s]
Training 1/1 epoch (loss 2.6214): 70%|βββββββ | 871/1250 [05:16<02:04, 3.05it/s]
Training 1/1 epoch (loss 2.7169): 70%|βββββββ | 871/1250 [05:17<02:04, 3.05it/s]
Training 1/1 epoch (loss 2.7169): 70%|βββββββ | 872/1250 [05:17<02:13, 2.83it/s]
Training 1/1 epoch (loss 2.9122): 70%|βββββββ | 872/1250 [05:17<02:13, 2.83it/s]
Training 1/1 epoch (loss 2.9122): 70%|βββββββ | 873/1250 [05:17<02:15, 2.79it/s]
Training 1/1 epoch (loss 2.6920): 70%|βββββββ | 873/1250 [05:17<02:15, 2.79it/s]
Training 1/1 epoch (loss 2.6920): 70%|βββββββ | 874/1250 [05:17<02:11, 2.85it/s]
Training 1/1 epoch (loss 2.7603): 70%|βββββββ | 874/1250 [05:18<02:11, 2.85it/s]
Training 1/1 epoch (loss 2.7603): 70%|βββββββ | 875/1250 [05:18<02:21, 2.66it/s]
Training 1/1 epoch (loss 2.8047): 70%|βββββββ | 875/1250 [05:18<02:21, 2.66it/s]
Training 1/1 epoch (loss 2.8047): 70%|βββββββ | 876/1250 [05:18<02:16, 2.73it/s]
Training 1/1 epoch (loss 2.8655): 70%|βββββββ | 876/1250 [05:19<02:16, 2.73it/s]
Training 1/1 epoch (loss 2.8655): 70%|βββββββ | 877/1250 [05:19<02:16, 2.74it/s]
Training 1/1 epoch (loss 2.9794): 70%|βββββββ | 877/1250 [05:19<02:16, 2.74it/s]
Training 1/1 epoch (loss 2.9794): 70%|βββββββ | 878/1250 [05:19<02:08, 2.90it/s]
Training 1/1 epoch (loss 2.7757): 70%|βββββββ | 878/1250 [05:19<02:08, 2.90it/s]
Training 1/1 epoch (loss 2.7757): 70%|βββββββ | 879/1250 [05:19<02:10, 2.84it/s]
Training 1/1 epoch (loss 2.8580): 70%|βββββββ | 879/1250 [05:20<02:10, 2.84it/s]
Training 1/1 epoch (loss 2.8580): 70%|βββββββ | 880/1250 [05:20<02:13, 2.78it/s]
Training 1/1 epoch (loss 2.6340): 70%|βββββββ | 880/1250 [05:20<02:13, 2.78it/s]
Training 1/1 epoch (loss 2.6340): 70%|βββββββ | 881/1250 [05:20<02:17, 2.68it/s]
Training 1/1 epoch (loss 2.7345): 70%|βββββββ | 881/1250 [05:20<02:17, 2.68it/s]
Training 1/1 epoch (loss 2.7345): 71%|βββββββ | 882/1250 [05:20<02:14, 2.73it/s]
Training 1/1 epoch (loss 2.8713): 71%|βββββββ | 882/1250 [05:21<02:14, 2.73it/s]
Training 1/1 epoch (loss 2.8713): 71%|βββββββ | 883/1250 [05:21<02:12, 2.78it/s]
Training 1/1 epoch (loss 2.8173): 71%|βββββββ | 883/1250 [05:21<02:12, 2.78it/s]
Training 1/1 epoch (loss 2.8173): 71%|βββββββ | 884/1250 [05:21<02:08, 2.86it/s]
Training 1/1 epoch (loss 2.6738): 71%|βββββββ | 884/1250 [05:21<02:08, 2.86it/s]
Training 1/1 epoch (loss 2.6738): 71%|βββββββ | 885/1250 [05:21<02:04, 2.92it/s]
Training 1/1 epoch (loss 2.7872): 71%|βββββββ | 885/1250 [05:22<02:04, 2.92it/s]
Training 1/1 epoch (loss 2.7872): 71%|βββββββ | 886/1250 [05:22<02:02, 2.96it/s]
Training 1/1 epoch (loss 2.7190): 71%|βββββββ | 886/1250 [05:22<02:02, 2.96it/s]
Training 1/1 epoch (loss 2.7190): 71%|βββββββ | 887/1250 [05:22<02:00, 3.01it/s]
Training 1/1 epoch (loss 2.6603): 71%|βββββββ | 887/1250 [05:22<02:00, 3.01it/s]
Training 1/1 epoch (loss 2.6603): 71%|βββββββ | 888/1250 [05:22<02:03, 2.92it/s]
Training 1/1 epoch (loss 2.7479): 71%|βββββββ | 888/1250 [05:23<02:03, 2.92it/s]
Training 1/1 epoch (loss 2.7479): 71%|βββββββ | 889/1250 [05:23<02:02, 2.94it/s]
Training 1/1 epoch (loss 2.7355): 71%|βββββββ | 889/1250 [05:23<02:02, 2.94it/s]
Training 1/1 epoch (loss 2.7355): 71%|βββββββ | 890/1250 [05:23<02:11, 2.73it/s]
Training 1/1 epoch (loss 2.8962): 71%|βββββββ | 890/1250 [05:23<02:11, 2.73it/s]
Training 1/1 epoch (loss 2.8962): 71%|ββββββββ | 891/1250 [05:23<02:09, 2.76it/s]
Training 1/1 epoch (loss 2.7229): 71%|ββββββββ | 891/1250 [05:24<02:09, 2.76it/s]
Training 1/1 epoch (loss 2.7229): 71%|ββββββββ | 892/1250 [05:24<02:13, 2.68it/s]
Training 1/1 epoch (loss 2.5737): 71%|ββββββββ | 892/1250 [05:24<02:13, 2.68it/s]
Training 1/1 epoch (loss 2.5737): 71%|ββββββββ | 893/1250 [05:24<02:14, 2.66it/s]
Training 1/1 epoch (loss 2.5473): 71%|ββββββββ | 893/1250 [05:25<02:14, 2.66it/s]
Training 1/1 epoch (loss 2.5473): 72%|ββββββββ | 894/1250 [05:25<02:13, 2.66it/s]
Training 1/1 epoch (loss 3.0634): 72%|ββββββββ | 894/1250 [05:25<02:13, 2.66it/s]
Training 1/1 epoch (loss 3.0634): 72%|ββββββββ | 895/1250 [05:25<02:08, 2.76it/s]
Training 1/1 epoch (loss 2.4401): 72%|ββββββββ | 895/1250 [05:25<02:08, 2.76it/s]
Training 1/1 epoch (loss 2.4401): 72%|ββββββββ | 896/1250 [05:25<02:03, 2.87it/s]
Training 1/1 epoch (loss 2.6763): 72%|ββββββββ | 896/1250 [05:26<02:03, 2.87it/s]
Training 1/1 epoch (loss 2.6763): 72%|ββββββββ | 897/1250 [05:26<02:04, 2.84it/s]
Training 1/1 epoch (loss 2.8189): 72%|ββββββββ | 897/1250 [05:26<02:04, 2.84it/s]
Training 1/1 epoch (loss 2.8189): 72%|ββββββββ | 898/1250 [05:26<02:05, 2.82it/s]
Training 1/1 epoch (loss 2.7895): 72%|ββββββββ | 898/1250 [05:26<02:05, 2.82it/s]
Training 1/1 epoch (loss 2.7895): 72%|ββββββββ | 899/1250 [05:26<02:07, 2.74it/s]
Training 1/1 epoch (loss 2.9235): 72%|ββββββββ | 899/1250 [05:27<02:07, 2.74it/s]
Training 1/1 epoch (loss 2.9235): 72%|ββββββββ | 900/1250 [05:27<02:10, 2.67it/s]
Training 1/1 epoch (loss 2.8328): 72%|ββββββββ | 900/1250 [05:27<02:10, 2.67it/s]
Training 1/1 epoch (loss 2.8328): 72%|ββββββββ | 901/1250 [05:27<02:05, 2.77it/s]
Training 1/1 epoch (loss 2.7728): 72%|ββββββββ | 901/1250 [05:27<02:05, 2.77it/s]
Training 1/1 epoch (loss 2.7728): 72%|ββββββββ | 902/1250 [05:27<01:58, 2.93it/s]
Training 1/1 epoch (loss 2.7290): 72%|ββββββββ | 902/1250 [05:28<01:58, 2.93it/s]
Training 1/1 epoch (loss 2.7290): 72%|ββββββββ | 903/1250 [05:28<02:13, 2.60it/s]
Training 1/1 epoch (loss 2.7892): 72%|ββββββββ | 903/1250 [05:28<02:13, 2.60it/s]
Training 1/1 epoch (loss 2.7892): 72%|ββββββββ | 904/1250 [05:28<02:28, 2.33it/s]
Training 1/1 epoch (loss 2.6401): 72%|ββββββββ | 904/1250 [05:29<02:28, 2.33it/s]
Training 1/1 epoch (loss 2.6401): 72%|ββββββββ | 905/1250 [05:29<02:19, 2.48it/s]
Training 1/1 epoch (loss 2.6162): 72%|ββββββββ | 905/1250 [05:29<02:19, 2.48it/s]
Training 1/1 epoch (loss 2.6162): 72%|ββββββββ | 906/1250 [05:29<02:11, 2.62it/s]
Training 1/1 epoch (loss 2.5979): 72%|ββββββββ | 906/1250 [05:29<02:11, 2.62it/s]
Training 1/1 epoch (loss 2.5979): 73%|ββββββββ | 907/1250 [05:29<02:09, 2.64it/s]
Training 1/1 epoch (loss 2.5807): 73%|ββββββββ | 907/1250 [05:30<02:09, 2.64it/s]
Training 1/1 epoch (loss 2.5807): 73%|ββββββββ | 908/1250 [05:30<02:06, 2.69it/s]
Training 1/1 epoch (loss 2.8714): 73%|ββββββββ | 908/1250 [05:30<02:06, 2.69it/s]
Training 1/1 epoch (loss 2.8714): 73%|ββββββββ | 909/1250 [05:30<02:03, 2.76it/s]
Training 1/1 epoch (loss 2.6463): 73%|ββββββββ | 909/1250 [05:30<02:03, 2.76it/s]
Training 1/1 epoch (loss 2.6463): 73%|ββββββββ | 910/1250 [05:30<01:58, 2.87it/s]
Training 1/1 epoch (loss 2.5270): 73%|ββββββββ | 910/1250 [05:31<01:58, 2.87it/s]
Training 1/1 epoch (loss 2.5270): 73%|ββββββββ | 911/1250 [05:31<01:58, 2.85it/s]
Training 1/1 epoch (loss 2.7916): 73%|ββββββββ | 911/1250 [05:31<01:58, 2.85it/s]
Training 1/1 epoch (loss 2.7916): 73%|ββββββββ | 912/1250 [05:31<01:58, 2.86it/s]
Training 1/1 epoch (loss 2.5609): 73%|ββββββββ | 912/1250 [05:32<01:58, 2.86it/s]
Training 1/1 epoch (loss 2.5609): 73%|ββββββββ | 913/1250 [05:32<02:04, 2.70it/s]
Training 1/1 epoch (loss 2.6735): 73%|ββββββββ | 913/1250 [05:32<02:04, 2.70it/s]
Training 1/1 epoch (loss 2.6735): 73%|ββββββββ | 914/1250 [05:32<02:01, 2.77it/s]
Training 1/1 epoch (loss 2.8230): 73%|ββββββββ | 914/1250 [05:32<02:01, 2.77it/s]
Training 1/1 epoch (loss 2.8230): 73%|ββββββββ | 915/1250 [05:32<02:03, 2.71it/s]
Training 1/1 epoch (loss 2.7885): 73%|ββββββββ | 915/1250 [05:33<02:03, 2.71it/s]
Training 1/1 epoch (loss 2.7885): 73%|ββββββββ | 916/1250 [05:33<02:05, 2.66it/s]
Training 1/1 epoch (loss 2.5354): 73%|ββββββββ | 916/1250 [05:33<02:05, 2.66it/s]
Training 1/1 epoch (loss 2.5354): 73%|ββββββββ | 917/1250 [05:33<02:02, 2.71it/s]
Training 1/1 epoch (loss 2.9062): 73%|ββββββββ | 917/1250 [05:33<02:02, 2.71it/s]
Training 1/1 epoch (loss 2.9062): 73%|ββββββββ | 918/1250 [05:33<01:56, 2.85it/s]
Training 1/1 epoch (loss 2.6448): 73%|ββββββββ | 918/1250 [05:34<01:56, 2.85it/s]
Training 1/1 epoch (loss 2.6448): 74%|ββββββββ | 919/1250 [05:34<01:58, 2.79it/s]
Training 1/1 epoch (loss 2.8303): 74%|ββββββββ | 919/1250 [05:34<01:58, 2.79it/s]
Training 1/1 epoch (loss 2.8303): 74%|ββββββββ | 920/1250 [05:34<02:04, 2.65it/s]
Training 1/1 epoch (loss 2.7136): 74%|ββββββββ | 920/1250 [05:35<02:04, 2.65it/s]
Training 1/1 epoch (loss 2.7136): 74%|ββββββββ | 921/1250 [05:35<02:05, 2.62it/s]
Training 1/1 epoch (loss 2.6686): 74%|ββββββββ | 921/1250 [05:35<02:05, 2.62it/s]
Training 1/1 epoch (loss 2.6686): 74%|ββββββββ | 922/1250 [05:35<01:59, 2.75it/s]
Training 1/1 epoch (loss 2.6389): 74%|ββββββββ | 922/1250 [05:35<01:59, 2.75it/s]
Training 1/1 epoch (loss 2.6389): 74%|ββββββββ | 923/1250 [05:35<01:58, 2.76it/s]
Training 1/1 epoch (loss 2.9130): 74%|ββββββββ | 923/1250 [05:36<01:58, 2.76it/s]
Training 1/1 epoch (loss 2.9130): 74%|ββββββββ | 924/1250 [05:36<01:53, 2.88it/s]
Training 1/1 epoch (loss 2.7648): 74%|ββββββββ | 924/1250 [05:36<01:53, 2.88it/s]
Training 1/1 epoch (loss 2.7648): 74%|ββββββββ | 925/1250 [05:36<01:50, 2.93it/s]
Training 1/1 epoch (loss 2.7511): 74%|ββββββββ | 925/1250 [05:36<01:50, 2.93it/s]
Training 1/1 epoch (loss 2.7511): 74%|ββββββββ | 926/1250 [05:36<01:52, 2.87it/s]
Training 1/1 epoch (loss 2.6432): 74%|ββββββββ | 926/1250 [05:37<01:52, 2.87it/s]
Training 1/1 epoch (loss 2.6432): 74%|ββββββββ | 927/1250 [05:37<01:51, 2.90it/s]
Training 1/1 epoch (loss 2.9608): 74%|ββββββββ | 927/1250 [05:37<01:51, 2.90it/s]
Training 1/1 epoch (loss 2.9608): 74%|ββββββββ | 928/1250 [05:37<01:59, 2.69it/s]
Training 1/1 epoch (loss 2.8336): 74%|ββββββββ | 928/1250 [05:37<01:59, 2.69it/s]
Training 1/1 epoch (loss 2.8336): 74%|ββββββββ | 929/1250 [05:37<02:01, 2.65it/s]
Training 1/1 epoch (loss 2.9421): 74%|ββββββββ | 929/1250 [05:38<02:01, 2.65it/s]
Training 1/1 epoch (loss 2.9421): 74%|ββββββββ | 930/1250 [05:38<01:55, 2.76it/s]
Training 1/1 epoch (loss 2.8252): 74%|ββββββββ | 930/1250 [05:38<01:55, 2.76it/s]
Training 1/1 epoch (loss 2.8252): 74%|ββββββββ | 931/1250 [05:38<01:52, 2.83it/s]
Training 1/1 epoch (loss 2.8008): 74%|ββββββββ | 931/1250 [05:38<01:52, 2.83it/s]
Training 1/1 epoch (loss 2.8008): 75%|ββββββββ | 932/1250 [05:38<01:51, 2.86it/s]
Training 1/1 epoch (loss 2.7679): 75%|ββββββββ | 932/1250 [05:39<01:51, 2.86it/s]
Training 1/1 epoch (loss 2.7679): 75%|ββββββββ | 933/1250 [05:39<01:51, 2.84it/s]
Training 1/1 epoch (loss 2.6153): 75%|ββββββββ | 933/1250 [05:39<01:51, 2.84it/s]
Training 1/1 epoch (loss 2.6153): 75%|ββββββββ | 934/1250 [05:39<01:49, 2.89it/s]
Training 1/1 epoch (loss 2.6405): 75%|ββββββββ | 934/1250 [05:39<01:49, 2.89it/s]
Training 1/1 epoch (loss 2.6405): 75%|ββββββββ | 935/1250 [05:39<01:49, 2.88it/s]
Training 1/1 epoch (loss 2.7853): 75%|ββββββββ | 935/1250 [05:40<01:49, 2.88it/s]
Training 1/1 epoch (loss 2.7853): 75%|ββββββββ | 936/1250 [05:40<01:50, 2.84it/s]
Training 1/1 epoch (loss 2.5449): 75%|ββββββββ | 936/1250 [05:40<01:50, 2.84it/s]
Training 1/1 epoch (loss 2.5449): 75%|ββββββββ | 937/1250 [05:40<01:49, 2.86it/s]
Training 1/1 epoch (loss 2.6389): 75%|ββββββββ | 937/1250 [05:41<01:49, 2.86it/s]
Training 1/1 epoch (loss 2.6389): 75%|ββββββββ | 938/1250 [05:41<01:52, 2.77it/s]
Training 1/1 epoch (loss 2.7939): 75%|ββββββββ | 938/1250 [05:41<01:52, 2.77it/s]
Training 1/1 epoch (loss 2.7939): 75%|ββββββββ | 939/1250 [05:41<01:46, 2.91it/s]
Training 1/1 epoch (loss 2.7351): 75%|ββββββββ | 939/1250 [05:41<01:46, 2.91it/s]
Training 1/1 epoch (loss 2.7351): 75%|ββββββββ | 940/1250 [05:41<01:45, 2.93it/s]
Training 1/1 epoch (loss 2.7076): 75%|ββββββββ | 940/1250 [05:42<01:45, 2.93it/s]
Training 1/1 epoch (loss 2.7076): 75%|ββββββββ | 941/1250 [05:42<01:47, 2.89it/s]
Training 1/1 epoch (loss 2.7065): 75%|ββββββββ | 941/1250 [05:42<01:47, 2.89it/s]
Training 1/1 epoch (loss 2.7065): 75%|ββββββββ | 942/1250 [05:42<01:48, 2.85it/s]
Training 1/1 epoch (loss 2.5334): 75%|ββββββββ | 942/1250 [05:42<01:48, 2.85it/s]
Training 1/1 epoch (loss 2.5334): 75%|ββββββββ | 943/1250 [05:42<01:51, 2.75it/s]
Training 1/1 epoch (loss 2.4604): 75%|ββββββββ | 943/1250 [05:43<01:51, 2.75it/s]
Training 1/1 epoch (loss 2.4604): 76%|ββββββββ | 944/1250 [05:43<01:50, 2.77it/s]
Training 1/1 epoch (loss 2.7037): 76%|ββββββββ | 944/1250 [05:43<01:50, 2.77it/s]
Training 1/1 epoch (loss 2.7037): 76%|ββββββββ | 945/1250 [05:43<01:49, 2.79it/s]
Training 1/1 epoch (loss 2.7576): 76%|ββββββββ | 945/1250 [05:43<01:49, 2.79it/s]
Training 1/1 epoch (loss 2.7576): 76%|ββββββββ | 946/1250 [05:43<01:49, 2.78it/s]
Training 1/1 epoch (loss 2.6594): 76%|ββββββββ | 946/1250 [05:44<01:49, 2.78it/s]
Training 1/1 epoch (loss 2.6594): 76%|ββββββββ | 947/1250 [05:44<01:51, 2.73it/s]
Training 1/1 epoch (loss 2.6088): 76%|ββββββββ | 947/1250 [05:44<01:51, 2.73it/s]
Training 1/1 epoch (loss 2.6088): 76%|ββββββββ | 948/1250 [05:44<01:48, 2.79it/s]
Training 1/1 epoch (loss 2.7690): 76%|ββββββββ | 948/1250 [05:45<01:48, 2.79it/s]
Training 1/1 epoch (loss 2.7690): 76%|ββββββββ | 949/1250 [05:45<01:53, 2.65it/s]
Training 1/1 epoch (loss 2.5939): 76%|ββββββββ | 949/1250 [05:45<01:53, 2.65it/s]
Training 1/1 epoch (loss 2.5939): 76%|ββββββββ | 950/1250 [05:45<01:52, 2.66it/s]
Training 1/1 epoch (loss 2.5322): 76%|ββββββββ | 950/1250 [05:45<01:52, 2.66it/s]
Training 1/1 epoch (loss 2.5322): 76%|ββββββββ | 951/1250 [05:45<01:47, 2.78it/s]
Training 1/1 epoch (loss 2.8348): 76%|ββββββββ | 951/1250 [05:46<01:47, 2.78it/s]
Training 1/1 epoch (loss 2.8348): 76%|ββββββββ | 952/1250 [05:46<01:46, 2.79it/s]
Training 1/1 epoch (loss 2.5795): 76%|ββββββββ | 952/1250 [05:46<01:46, 2.79it/s]
Training 1/1 epoch (loss 2.5795): 76%|ββββββββ | 953/1250 [05:46<01:46, 2.78it/s]
Training 1/1 epoch (loss 2.6889): 76%|ββββββββ | 953/1250 [05:46<01:46, 2.78it/s]
Training 1/1 epoch (loss 2.6889): 76%|ββββββββ | 954/1250 [05:46<01:47, 2.76it/s]
Training 1/1 epoch (loss 2.6685): 76%|ββββββββ | 954/1250 [05:47<01:47, 2.76it/s]
Training 1/1 epoch (loss 2.6685): 76%|ββββββββ | 955/1250 [05:47<01:46, 2.76it/s]
Training 1/1 epoch (loss 2.7345): 76%|ββββββββ | 955/1250 [05:47<01:46, 2.76it/s]
Training 1/1 epoch (loss 2.7345): 76%|ββββββββ | 956/1250 [05:47<01:41, 2.91it/s]
Training 1/1 epoch (loss 2.9136): 76%|ββββββββ | 956/1250 [05:47<01:41, 2.91it/s]
Training 1/1 epoch (loss 2.9136): 77%|ββββββββ | 957/1250 [05:47<01:47, 2.73it/s]
Training 1/1 epoch (loss 2.6790): 77%|ββββββββ | 957/1250 [05:48<01:47, 2.73it/s]
Training 1/1 epoch (loss 2.6790): 77%|ββββββββ | 958/1250 [05:48<01:47, 2.72it/s]
Training 1/1 epoch (loss 2.9678): 77%|ββββββββ | 958/1250 [05:48<01:47, 2.72it/s]
Training 1/1 epoch (loss 2.9678): 77%|ββββββββ | 959/1250 [05:48<01:44, 2.78it/s]
Training 1/1 epoch (loss 2.4710): 77%|ββββββββ | 959/1250 [05:48<01:44, 2.78it/s]
Training 1/1 epoch (loss 2.4710): 77%|ββββββββ | 960/1250 [05:48<01:44, 2.79it/s]
Training 1/1 epoch (loss 2.4875): 77%|ββββββββ | 960/1250 [05:49<01:44, 2.79it/s]
Training 1/1 epoch (loss 2.4875): 77%|ββββββββ | 961/1250 [05:49<01:43, 2.80it/s]
Training 1/1 epoch (loss 2.4818): 77%|ββββββββ | 961/1250 [05:49<01:43, 2.80it/s]
Training 1/1 epoch (loss 2.4818): 77%|ββββββββ | 962/1250 [05:49<01:41, 2.85it/s]
Training 1/1 epoch (loss 2.8613): 77%|ββββββββ | 962/1250 [05:49<01:41, 2.85it/s]
Training 1/1 epoch (loss 2.8613): 77%|ββββββββ | 963/1250 [05:49<01:39, 2.90it/s]
Training 1/1 epoch (loss 2.8231): 77%|ββββββββ | 963/1250 [05:50<01:39, 2.90it/s]
Training 1/1 epoch (loss 2.8231): 77%|ββββββββ | 964/1250 [05:50<01:38, 2.90it/s]
Training 1/1 epoch (loss 2.4747): 77%|ββββββββ | 964/1250 [05:50<01:38, 2.90it/s]
Training 1/1 epoch (loss 2.4747): 77%|ββββββββ | 965/1250 [05:50<01:36, 2.95it/s]
Training 1/1 epoch (loss 2.6964): 77%|ββββββββ | 965/1250 [05:50<01:36, 2.95it/s]
Training 1/1 epoch (loss 2.6964): 77%|ββββββββ | 966/1250 [05:51<01:36, 2.93it/s]
Training 1/1 epoch (loss 2.6514): 77%|ββββββββ | 966/1250 [05:51<01:36, 2.93it/s]
Training 1/1 epoch (loss 2.6514): 77%|ββββββββ | 967/1250 [05:51<01:33, 3.02it/s]
Training 1/1 epoch (loss 2.7059): 77%|ββββββββ | 967/1250 [05:51<01:33, 3.02it/s]
Training 1/1 epoch (loss 2.7059): 77%|ββββββββ | 968/1250 [05:51<01:35, 2.95it/s]
Training 1/1 epoch (loss 3.0112): 77%|ββββββββ | 968/1250 [05:52<01:35, 2.95it/s]
Training 1/1 epoch (loss 3.0112): 78%|ββββββββ | 969/1250 [05:52<01:37, 2.87it/s]
Training 1/1 epoch (loss 2.6039): 78%|ββββββββ | 969/1250 [05:52<01:37, 2.87it/s]
Training 1/1 epoch (loss 2.6039): 78%|ββββββββ | 970/1250 [05:52<01:35, 2.94it/s]
Training 1/1 epoch (loss 3.0167): 78%|ββββββββ | 970/1250 [05:52<01:35, 2.94it/s]
Training 1/1 epoch (loss 3.0167): 78%|ββββββββ | 971/1250 [05:52<01:35, 2.91it/s]
Training 1/1 epoch (loss 2.7090): 78%|ββββββββ | 971/1250 [05:53<01:35, 2.91it/s]
Training 1/1 epoch (loss 2.7090): 78%|ββββββββ | 972/1250 [05:53<01:39, 2.79it/s]
Training 1/1 epoch (loss 2.7744): 78%|ββββββββ | 972/1250 [05:53<01:39, 2.79it/s]
Training 1/1 epoch (loss 2.7744): 78%|ββββββββ | 973/1250 [05:53<01:55, 2.39it/s]
Training 1/1 epoch (loss 2.5755): 78%|ββββββββ | 973/1250 [05:54<01:55, 2.39it/s]
Training 1/1 epoch (loss 2.5755): 78%|ββββββββ | 974/1250 [05:54<01:51, 2.47it/s]
Training 1/1 epoch (loss 2.6999): 78%|ββββββββ | 974/1250 [05:54<01:51, 2.47it/s]
Training 1/1 epoch (loss 2.6999): 78%|ββββββββ | 975/1250 [05:54<01:48, 2.52it/s]
Training 1/1 epoch (loss 2.5906): 78%|ββββββββ | 975/1250 [05:54<01:48, 2.52it/s]
Training 1/1 epoch (loss 2.5906): 78%|ββββββββ | 976/1250 [05:54<01:50, 2.48it/s]
Training 1/1 epoch (loss 2.7943): 78%|ββββββββ | 976/1250 [05:55<01:50, 2.48it/s]
Training 1/1 epoch (loss 2.7943): 78%|ββββββββ | 977/1250 [05:55<01:47, 2.54it/s]
Training 1/1 epoch (loss 2.7930): 78%|ββββββββ | 977/1250 [05:55<01:47, 2.54it/s]
Training 1/1 epoch (loss 2.7930): 78%|ββββββββ | 978/1250 [05:55<01:43, 2.63it/s]
Training 1/1 epoch (loss 2.9794): 78%|ββββββββ | 978/1250 [05:55<01:43, 2.63it/s]
Training 1/1 epoch (loss 2.9794): 78%|ββββββββ | 979/1250 [05:55<01:40, 2.71it/s]
Training 1/1 epoch (loss 2.7873): 78%|ββββββββ | 979/1250 [05:56<01:40, 2.71it/s]
Training 1/1 epoch (loss 2.7873): 78%|ββββββββ | 980/1250 [05:56<01:39, 2.73it/s]
Training 1/1 epoch (loss 2.7252): 78%|ββββββββ | 980/1250 [05:56<01:39, 2.73it/s]
Training 1/1 epoch (loss 2.7252): 78%|ββββββββ | 981/1250 [05:56<01:34, 2.83it/s]
Training 1/1 epoch (loss 2.7612): 78%|ββββββββ | 981/1250 [05:56<01:34, 2.83it/s]
Training 1/1 epoch (loss 2.7612): 79%|ββββββββ | 982/1250 [05:56<01:35, 2.82it/s]
Training 1/1 epoch (loss 2.7707): 79%|ββββββββ | 982/1250 [05:57<01:35, 2.82it/s]
Training 1/1 epoch (loss 2.7707): 79%|ββββββββ | 983/1250 [05:57<01:34, 2.81it/s]
Training 1/1 epoch (loss 2.7729): 79%|ββββββββ | 983/1250 [05:57<01:34, 2.81it/s]
Training 1/1 epoch (loss 2.7729): 79%|ββββββββ | 984/1250 [05:57<01:31, 2.89it/s]
Training 1/1 epoch (loss 2.8742): 79%|ββββββββ | 984/1250 [05:57<01:31, 2.89it/s]
Training 1/1 epoch (loss 2.8742): 79%|ββββββββ | 985/1250 [05:57<01:34, 2.81it/s]
Training 1/1 epoch (loss 2.5919): 79%|ββββββββ | 985/1250 [05:58<01:34, 2.81it/s]
Training 1/1 epoch (loss 2.5919): 79%|ββββββββ | 986/1250 [05:58<01:53, 2.32it/s]
Training 1/1 epoch (loss 2.7192): 79%|ββββββββ | 986/1250 [05:59<01:53, 2.32it/s]
Training 1/1 epoch (loss 2.7192): 79%|ββββββββ | 987/1250 [05:59<01:51, 2.35it/s]
Training 1/1 epoch (loss 2.6991): 79%|ββββββββ | 987/1250 [05:59<01:51, 2.35it/s]
Training 1/1 epoch (loss 2.6991): 79%|ββββββββ | 988/1250 [05:59<01:44, 2.51it/s]
Training 1/1 epoch (loss 2.9396): 79%|ββββββββ | 988/1250 [05:59<01:44, 2.51it/s]
Training 1/1 epoch (loss 2.9396): 79%|ββββββββ | 989/1250 [05:59<01:35, 2.72it/s]
Training 1/1 epoch (loss 2.6126): 79%|ββββββββ | 989/1250 [06:00<01:35, 2.72it/s]
Training 1/1 epoch (loss 2.6126): 79%|ββββββββ | 990/1250 [06:00<01:35, 2.72it/s]
Training 1/1 epoch (loss 2.6447): 79%|ββββββββ | 990/1250 [06:00<01:35, 2.72it/s]
Training 1/1 epoch (loss 2.6447): 79%|ββββββββ | 991/1250 [06:00<01:32, 2.79it/s]
Training 1/1 epoch (loss 2.3736): 79%|ββββββββ | 991/1250 [06:00<01:32, 2.79it/s]
Training 1/1 epoch (loss 2.3736): 79%|ββββββββ | 992/1250 [06:00<01:31, 2.82it/s]
Training 1/1 epoch (loss 2.6367): 79%|ββββββββ | 992/1250 [06:01<01:31, 2.82it/s]
Training 1/1 epoch (loss 2.6367): 79%|ββββββββ | 993/1250 [06:01<01:29, 2.88it/s]
Training 1/1 epoch (loss 2.5600): 79%|ββββββββ | 993/1250 [06:01<01:29, 2.88it/s]
Training 1/1 epoch (loss 2.5600): 80%|ββββββββ | 994/1250 [06:01<01:27, 2.94it/s]
Training 1/1 epoch (loss 2.4748): 80%|ββββββββ | 994/1250 [06:01<01:27, 2.94it/s]
Training 1/1 epoch (loss 2.4748): 80%|ββββββββ | 995/1250 [06:01<01:26, 2.94it/s]
Training 1/1 epoch (loss 2.6628): 80%|ββββββββ | 995/1250 [06:02<01:26, 2.94it/s]
Training 1/1 epoch (loss 2.6628): 80%|ββββββββ | 996/1250 [06:02<01:25, 2.98it/s]
Training 1/1 epoch (loss 2.8017): 80%|ββββββββ | 996/1250 [06:02<01:25, 2.98it/s]
Training 1/1 epoch (loss 2.8017): 80%|ββββββββ | 997/1250 [06:02<01:24, 2.99it/s]
Training 1/1 epoch (loss 2.7231): 80%|ββββββββ | 997/1250 [06:02<01:24, 2.99it/s]
Training 1/1 epoch (loss 2.7231): 80%|ββββββββ | 998/1250 [06:02<01:25, 2.95it/s]
Training 1/1 epoch (loss 2.6633): 80%|ββββββββ | 998/1250 [06:03<01:25, 2.95it/s]
Training 1/1 epoch (loss 2.6633): 80%|ββββββββ | 999/1250 [06:03<01:26, 2.90it/s]
Training 1/1 epoch (loss 2.6786): 80%|ββββββββ | 999/1250 [06:03<01:26, 2.90it/s]
Training 1/1 epoch (loss 2.6786): 80%|ββββββββ | 1000/1250 [06:03<01:26, 2.88it/s]
Training 1/1 epoch (loss 2.7935): 80%|ββββββββ | 1000/1250 [06:03<01:26, 2.88it/s]
Training 1/1 epoch (loss 2.7935): 80%|ββββββββ | 1001/1250 [06:03<01:34, 2.63it/s]
Training 1/1 epoch (loss 2.5360): 80%|ββββββββ | 1001/1250 [06:04<01:34, 2.63it/s]
Training 1/1 epoch (loss 2.5360): 80%|ββββββββ | 1002/1250 [06:04<01:30, 2.76it/s]
Training 1/1 epoch (loss 2.6398): 80%|ββββββββ | 1002/1250 [06:04<01:30, 2.76it/s]
Training 1/1 epoch (loss 2.6398): 80%|ββββββββ | 1003/1250 [06:04<01:29, 2.77it/s]
Training 1/1 epoch (loss 2.5722): 80%|ββββββββ | 1003/1250 [06:04<01:29, 2.77it/s]
Training 1/1 epoch (loss 2.5722): 80%|ββββββββ | 1004/1250 [06:04<01:30, 2.73it/s]
Training 1/1 epoch (loss 2.8345): 80%|ββββββββ | 1004/1250 [06:05<01:30, 2.73it/s]
Training 1/1 epoch (loss 2.8345): 80%|ββββββββ | 1005/1250 [06:05<01:27, 2.79it/s]
Training 1/1 epoch (loss 2.7330): 80%|ββββββββ | 1005/1250 [06:05<01:27, 2.79it/s]
Training 1/1 epoch (loss 2.7330): 80%|ββββββββ | 1006/1250 [06:05<01:26, 2.83it/s]
Training 1/1 epoch (loss 2.7918): 80%|ββββββββ | 1006/1250 [06:05<01:26, 2.83it/s]
Training 1/1 epoch (loss 2.7918): 81%|ββββββββ | 1007/1250 [06:05<01:24, 2.87it/s]
Training 1/1 epoch (loss 2.7334): 81%|ββββββββ | 1007/1250 [06:06<01:24, 2.87it/s]
Training 1/1 epoch (loss 2.7334): 81%|ββββββββ | 1008/1250 [06:06<01:22, 2.92it/s]
Training 1/1 epoch (loss 2.8525): 81%|ββββββββ | 1008/1250 [06:06<01:22, 2.92it/s]
Training 1/1 epoch (loss 2.8525): 81%|ββββββββ | 1009/1250 [06:06<01:21, 2.94it/s]
Training 1/1 epoch (loss 2.7950): 81%|ββββββββ | 1009/1250 [06:06<01:21, 2.94it/s]
Training 1/1 epoch (loss 2.7950): 81%|ββββββββ | 1010/1250 [06:06<01:21, 2.94it/s]
Training 1/1 epoch (loss 2.8791): 81%|ββββββββ | 1010/1250 [06:07<01:21, 2.94it/s]
Training 1/1 epoch (loss 2.8791): 81%|ββββββββ | 1011/1250 [06:07<01:22, 2.90it/s]
Training 1/1 epoch (loss 2.7453): 81%|ββββββββ | 1011/1250 [06:07<01:22, 2.90it/s]
Training 1/1 epoch (loss 2.7453): 81%|ββββββββ | 1012/1250 [06:07<01:21, 2.91it/s]
Training 1/1 epoch (loss 2.5717): 81%|ββββββββ | 1012/1250 [06:07<01:21, 2.91it/s]
Training 1/1 epoch (loss 2.5717): 81%|ββββββββ | 1013/1250 [06:07<01:21, 2.90it/s]
Training 1/1 epoch (loss 2.5567): 81%|ββββββββ | 1013/1250 [06:08<01:21, 2.90it/s]
Training 1/1 epoch (loss 2.5567): 81%|ββββββββ | 1014/1250 [06:08<01:23, 2.82it/s]
Training 1/1 epoch (loss 2.6503): 81%|ββββββββ | 1014/1250 [06:08<01:23, 2.82it/s]
Training 1/1 epoch (loss 2.6503): 81%|ββββββββ | 1015/1250 [06:08<01:22, 2.85it/s]
Training 1/1 epoch (loss 2.7187): 81%|ββββββββ | 1015/1250 [06:09<01:22, 2.85it/s]
Training 1/1 epoch (loss 2.7187): 81%|βββββββββ | 1016/1250 [06:09<01:27, 2.67it/s]
Training 1/1 epoch (loss 2.7622): 81%|βββββββββ | 1016/1250 [06:09<01:27, 2.67it/s]
Training 1/1 epoch (loss 2.7622): 81%|βββββββββ | 1017/1250 [06:09<01:24, 2.75it/s]
Training 1/1 epoch (loss 2.7984): 81%|βββββββββ | 1017/1250 [06:09<01:24, 2.75it/s]
Training 1/1 epoch (loss 2.7984): 81%|βββββββββ | 1018/1250 [06:09<01:20, 2.88it/s]
Training 1/1 epoch (loss 2.7525): 81%|βββββββββ | 1018/1250 [06:10<01:20, 2.88it/s]
Training 1/1 epoch (loss 2.7525): 82%|βββββββββ | 1019/1250 [06:10<01:21, 2.84it/s]
Training 1/1 epoch (loss 2.9155): 82%|βββββββββ | 1019/1250 [06:10<01:21, 2.84it/s]
Training 1/1 epoch (loss 2.9155): 82%|βββββββββ | 1020/1250 [06:10<01:22, 2.78it/s]
Training 1/1 epoch (loss 2.6640): 82%|βββββββββ | 1020/1250 [06:10<01:22, 2.78it/s]
Training 1/1 epoch (loss 2.6640): 82%|βββββββββ | 1021/1250 [06:10<01:21, 2.81it/s]
Training 1/1 epoch (loss 2.9314): 82%|βββββββββ | 1021/1250 [06:11<01:21, 2.81it/s]
Training 1/1 epoch (loss 2.9314): 82%|βββββββββ | 1022/1250 [06:11<01:24, 2.69it/s]
Training 1/1 epoch (loss 2.6192): 82%|βββββββββ | 1022/1250 [06:11<01:24, 2.69it/s]
Training 1/1 epoch (loss 2.6192): 82%|βββββββββ | 1023/1250 [06:11<01:22, 2.74it/s]
Training 1/1 epoch (loss 2.6696): 82%|βββββββββ | 1023/1250 [06:12<01:22, 2.74it/s]
Training 1/1 epoch (loss 2.6696): 82%|βββββββββ | 1024/1250 [06:12<01:23, 2.71it/s]
Training 1/1 epoch (loss 2.5941): 82%|βββββββββ | 1024/1250 [06:12<01:23, 2.71it/s]
Training 1/1 epoch (loss 2.5941): 82%|βββββββββ | 1025/1250 [06:12<01:24, 2.67it/s]
Training 1/1 epoch (loss 2.7293): 82%|βββββββββ | 1025/1250 [06:12<01:24, 2.67it/s]
Training 1/1 epoch (loss 2.7293): 82%|βββββββββ | 1026/1250 [06:12<01:24, 2.66it/s]
Training 1/1 epoch (loss 2.6182): 82%|βββββββββ | 1026/1250 [06:13<01:24, 2.66it/s]
Training 1/1 epoch (loss 2.6182): 82%|βββββββββ | 1027/1250 [06:13<01:23, 2.67it/s]
Training 1/1 epoch (loss 2.6089): 82%|βββββββββ | 1027/1250 [06:13<01:23, 2.67it/s]
Training 1/1 epoch (loss 2.6089): 82%|βββββββββ | 1028/1250 [06:13<01:23, 2.66it/s]
Training 1/1 epoch (loss 2.7368): 82%|βββββββββ | 1028/1250 [06:13<01:23, 2.66it/s]
Training 1/1 epoch (loss 2.7368): 82%|βββββββββ | 1029/1250 [06:13<01:25, 2.60it/s]
Training 1/1 epoch (loss 2.7922): 82%|βββββββββ | 1029/1250 [06:14<01:25, 2.60it/s]
Training 1/1 epoch (loss 2.7922): 82%|βββββββββ | 1030/1250 [06:14<01:27, 2.51it/s]
Training 1/1 epoch (loss 2.7183): 82%|βββββββββ | 1030/1250 [06:14<01:27, 2.51it/s]
Training 1/1 epoch (loss 2.7183): 82%|βββββββββ | 1031/1250 [06:14<01:25, 2.55it/s]
Training 1/1 epoch (loss 2.7416): 82%|βββββββββ | 1031/1250 [06:15<01:25, 2.55it/s]
Training 1/1 epoch (loss 2.7416): 83%|βββββββββ | 1032/1250 [06:15<01:29, 2.44it/s]
Training 1/1 epoch (loss 2.5828): 83%|βββββββββ | 1032/1250 [06:15<01:29, 2.44it/s]
Training 1/1 epoch (loss 2.5828): 83%|βββββββββ | 1033/1250 [06:15<01:28, 2.44it/s]
Training 1/1 epoch (loss 2.5607): 83%|βββββββββ | 1033/1250 [06:15<01:28, 2.44it/s]
Training 1/1 epoch (loss 2.5607): 83%|βββββββββ | 1034/1250 [06:15<01:25, 2.54it/s]
Training 1/1 epoch (loss 2.5121): 83%|βββββββββ | 1034/1250 [06:16<01:25, 2.54it/s]
Training 1/1 epoch (loss 2.5121): 83%|βββββββββ | 1035/1250 [06:16<01:20, 2.68it/s]
Training 1/1 epoch (loss 2.6123): 83%|βββββββββ | 1035/1250 [06:16<01:20, 2.68it/s]
Training 1/1 epoch (loss 2.6123): 83%|βββββββββ | 1036/1250 [06:16<01:16, 2.81it/s]
Training 1/1 epoch (loss 2.8178): 83%|βββββββββ | 1036/1250 [06:16<01:16, 2.81it/s]
Training 1/1 epoch (loss 2.8178): 83%|βββββββββ | 1037/1250 [06:16<01:12, 2.95it/s]
Training 1/1 epoch (loss 2.5709): 83%|βββββββββ | 1037/1250 [06:17<01:12, 2.95it/s]
Training 1/1 epoch (loss 2.5709): 83%|βββββββββ | 1038/1250 [06:17<01:14, 2.86it/s]
Training 1/1 epoch (loss 2.4821): 83%|βββββββββ | 1038/1250 [06:17<01:14, 2.86it/s]
Training 1/1 epoch (loss 2.4821): 83%|βββββββββ | 1039/1250 [06:17<01:13, 2.87it/s]
Training 1/1 epoch (loss 2.5646): 83%|βββββββββ | 1039/1250 [06:18<01:13, 2.87it/s]
Training 1/1 epoch (loss 2.5646): 83%|βββββββββ | 1040/1250 [06:18<01:15, 2.77it/s]
Training 1/1 epoch (loss 2.4441): 83%|βββββββββ | 1040/1250 [06:18<01:15, 2.77it/s]
Training 1/1 epoch (loss 2.4441): 83%|βββββββββ | 1041/1250 [06:18<01:17, 2.69it/s]
Training 1/1 epoch (loss 2.8763): 83%|βββββββββ | 1041/1250 [06:18<01:17, 2.69it/s]
Training 1/1 epoch (loss 2.8763): 83%|βββββββββ | 1042/1250 [06:18<01:12, 2.85it/s]
Training 1/1 epoch (loss 2.7807): 83%|βββββββββ | 1042/1250 [06:19<01:12, 2.85it/s]
Training 1/1 epoch (loss 2.7807): 83%|βββββββββ | 1043/1250 [06:19<01:14, 2.76it/s]
Training 1/1 epoch (loss 2.7071): 83%|βββββββββ | 1043/1250 [06:19<01:14, 2.76it/s]
Training 1/1 epoch (loss 2.7071): 84%|βββββββββ | 1044/1250 [06:19<01:16, 2.71it/s]
Training 1/1 epoch (loss 2.8064): 84%|βββββββββ | 1044/1250 [06:19<01:16, 2.71it/s]
Training 1/1 epoch (loss 2.8064): 84%|βββββββββ | 1045/1250 [06:19<01:14, 2.74it/s]
Training 1/1 epoch (loss 2.7755): 84%|βββββββββ | 1045/1250 [06:20<01:14, 2.74it/s]
Training 1/1 epoch (loss 2.7755): 84%|βββββββββ | 1046/1250 [06:20<01:09, 2.93it/s]
Training 1/1 epoch (loss 2.7773): 84%|βββββββββ | 1046/1250 [06:20<01:09, 2.93it/s]
Training 1/1 epoch (loss 2.7773): 84%|βββββββββ | 1047/1250 [06:20<01:09, 2.91it/s]
Training 1/1 epoch (loss 2.7112): 84%|βββββββββ | 1047/1250 [06:20<01:09, 2.91it/s]
Training 1/1 epoch (loss 2.7112): 84%|βββββββββ | 1048/1250 [06:20<01:10, 2.85it/s]
Training 1/1 epoch (loss 2.6231): 84%|βββββββββ | 1048/1250 [06:21<01:10, 2.85it/s]
Training 1/1 epoch (loss 2.6231): 84%|βββββββββ | 1049/1250 [06:21<01:12, 2.78it/s]
Training 1/1 epoch (loss 2.8800): 84%|βββββββββ | 1049/1250 [06:21<01:12, 2.78it/s]
Training 1/1 epoch (loss 2.8800): 84%|βββββββββ | 1050/1250 [06:21<01:09, 2.87it/s]
Training 1/1 epoch (loss 2.7105): 84%|βββββββββ | 1050/1250 [06:21<01:09, 2.87it/s]
Training 1/1 epoch (loss 2.7105): 84%|βββββββββ | 1051/1250 [06:21<01:10, 2.80it/s]
Training 1/1 epoch (loss 2.7094): 84%|βββββββββ | 1051/1250 [06:22<01:10, 2.80it/s]
Training 1/1 epoch (loss 2.7094): 84%|βββββββββ | 1052/1250 [06:22<01:09, 2.86it/s]
Training 1/1 epoch (loss 2.5874): 84%|βββββββββ | 1052/1250 [06:22<01:09, 2.86it/s]
Training 1/1 epoch (loss 2.5874): 84%|βββββββββ | 1053/1250 [06:22<01:06, 2.98it/s]
Training 1/1 epoch (loss 2.5746): 84%|βββββββββ | 1053/1250 [06:22<01:06, 2.98it/s]
Training 1/1 epoch (loss 2.5746): 84%|βββββββββ | 1054/1250 [06:22<01:05, 3.02it/s]
Training 1/1 epoch (loss 2.5310): 84%|βββββββββ | 1054/1250 [06:23<01:05, 3.02it/s]
Training 1/1 epoch (loss 2.5310): 84%|βββββββββ | 1055/1250 [06:23<01:13, 2.66it/s]
Training 1/1 epoch (loss 2.6827): 84%|βββββββββ | 1055/1250 [06:23<01:13, 2.66it/s]
Training 1/1 epoch (loss 2.6827): 84%|βββββββββ | 1056/1250 [06:23<01:23, 2.33it/s]
Training 1/1 epoch (loss 2.7653): 84%|βββββββββ | 1056/1250 [06:24<01:23, 2.33it/s]
Training 1/1 epoch (loss 2.7653): 85%|βββββββββ | 1057/1250 [06:24<01:17, 2.48it/s]
Training 1/1 epoch (loss 2.7596): 85%|βββββββββ | 1057/1250 [06:24<01:17, 2.48it/s]
Training 1/1 epoch (loss 2.7596): 85%|βββββββββ | 1058/1250 [06:24<01:13, 2.60it/s]
Training 1/1 epoch (loss 2.7429): 85%|βββββββββ | 1058/1250 [06:24<01:13, 2.60it/s]
Training 1/1 epoch (loss 2.7429): 85%|βββββββββ | 1059/1250 [06:24<01:13, 2.59it/s]
Training 1/1 epoch (loss 2.8435): 85%|βββββββββ | 1059/1250 [06:25<01:13, 2.59it/s]
Training 1/1 epoch (loss 2.8435): 85%|βββββββββ | 1060/1250 [06:25<01:11, 2.67it/s]
Training 1/1 epoch (loss 2.5270): 85%|βββββββββ | 1060/1250 [06:25<01:11, 2.67it/s]
Training 1/1 epoch (loss 2.5270): 85%|βββββββββ | 1061/1250 [06:25<01:07, 2.78it/s]
Training 1/1 epoch (loss 2.9176): 85%|βββββββββ | 1061/1250 [06:25<01:07, 2.78it/s]
Training 1/1 epoch (loss 2.9176): 85%|βββββββββ | 1062/1250 [06:25<01:05, 2.88it/s]
Training 1/1 epoch (loss 2.8435): 85%|βββββββββ | 1062/1250 [06:26<01:05, 2.88it/s]
Training 1/1 epoch (loss 2.8435): 85%|βββββββββ | 1063/1250 [06:26<01:04, 2.89it/s]
Training 1/1 epoch (loss 2.7510): 85%|βββββββββ | 1063/1250 [06:26<01:04, 2.89it/s]
Training 1/1 epoch (loss 2.7510): 85%|βββββββββ | 1064/1250 [06:26<01:05, 2.84it/s]
Training 1/1 epoch (loss 2.6658): 85%|βββββββββ | 1064/1250 [06:27<01:05, 2.84it/s]
Training 1/1 epoch (loss 2.6658): 85%|βββββββββ | 1065/1250 [06:27<01:04, 2.85it/s]
Training 1/1 epoch (loss 2.8722): 85%|βββββββββ | 1065/1250 [06:27<01:04, 2.85it/s]
Training 1/1 epoch (loss 2.8722): 85%|βββββββββ | 1066/1250 [06:27<01:05, 2.80it/s]
Training 1/1 epoch (loss 2.6778): 85%|βββββββββ | 1066/1250 [06:27<01:05, 2.80it/s]
Training 1/1 epoch (loss 2.6778): 85%|βββββββββ | 1067/1250 [06:27<01:04, 2.83it/s]
Training 1/1 epoch (loss 2.6186): 85%|βββββββββ | 1067/1250 [06:28<01:04, 2.83it/s]
Training 1/1 epoch (loss 2.6186): 85%|βββββββββ | 1068/1250 [06:28<01:05, 2.78it/s]
Training 1/1 epoch (loss 2.6356): 85%|βββββββββ | 1068/1250 [06:28<01:05, 2.78it/s]
Training 1/1 epoch (loss 2.6356): 86%|βββββββββ | 1069/1250 [06:28<01:07, 2.68it/s]
Training 1/1 epoch (loss 2.7030): 86%|βββββββββ | 1069/1250 [06:29<01:07, 2.68it/s]
Training 1/1 epoch (loss 2.7030): 86%|βββββββββ | 1070/1250 [06:29<01:16, 2.35it/s]
Training 1/1 epoch (loss 2.7921): 86%|βββββββββ | 1070/1250 [06:29<01:16, 2.35it/s]
Training 1/1 epoch (loss 2.7921): 86%|βββββββββ | 1071/1250 [06:29<01:11, 2.52it/s]
Training 1/1 epoch (loss 2.7193): 86%|βββββββββ | 1071/1250 [06:29<01:11, 2.52it/s]
Training 1/1 epoch (loss 2.7193): 86%|βββββββββ | 1072/1250 [06:29<01:09, 2.57it/s]
Training 1/1 epoch (loss 2.6913): 86%|βββββββββ | 1072/1250 [06:30<01:09, 2.57it/s]
Training 1/1 epoch (loss 2.6913): 86%|βββββββββ | 1073/1250 [06:30<01:13, 2.41it/s]
Training 1/1 epoch (loss 2.7797): 86%|βββββββββ | 1073/1250 [06:30<01:13, 2.41it/s]
Training 1/1 epoch (loss 2.7797): 86%|βββββββββ | 1074/1250 [06:30<01:08, 2.55it/s]
Training 1/1 epoch (loss 2.7718): 86%|βββββββββ | 1074/1250 [06:30<01:08, 2.55it/s]
Training 1/1 epoch (loss 2.7718): 86%|βββββββββ | 1075/1250 [06:30<01:05, 2.69it/s]
Training 1/1 epoch (loss 2.7740): 86%|βββββββββ | 1075/1250 [06:31<01:05, 2.69it/s]
Training 1/1 epoch (loss 2.7740): 86%|βββββββββ | 1076/1250 [06:31<01:01, 2.81it/s]
Training 1/1 epoch (loss 2.7415): 86%|βββββββββ | 1076/1250 [06:31<01:01, 2.81it/s]
Training 1/1 epoch (loss 2.7415): 86%|βββββββββ | 1077/1250 [06:31<01:02, 2.75it/s]
Training 1/1 epoch (loss 2.7145): 86%|βββββββββ | 1077/1250 [06:31<01:02, 2.75it/s]
Training 1/1 epoch (loss 2.7145): 86%|βββββββββ | 1078/1250 [06:31<01:01, 2.80it/s]
Training 1/1 epoch (loss 2.7126): 86%|βββββββββ | 1078/1250 [06:32<01:01, 2.80it/s]
Training 1/1 epoch (loss 2.7126): 86%|βββββββββ | 1079/1250 [06:32<01:00, 2.84it/s]
Training 1/1 epoch (loss 2.8995): 86%|βββββββββ | 1079/1250 [06:32<01:00, 2.84it/s]
Training 1/1 epoch (loss 2.8995): 86%|βββββββββ | 1080/1250 [06:32<01:00, 2.81it/s]
Training 1/1 epoch (loss 2.8633): 86%|βββββββββ | 1080/1250 [06:32<01:00, 2.81it/s]
Training 1/1 epoch (loss 2.8633): 86%|βββββββββ | 1081/1250 [06:32<00:59, 2.85it/s]
Training 1/1 epoch (loss 2.4616): 86%|βββββββββ | 1081/1250 [06:33<00:59, 2.85it/s]
Training 1/1 epoch (loss 2.4616): 87%|βββββββββ | 1082/1250 [06:33<01:01, 2.74it/s]
Training 1/1 epoch (loss 2.7447): 87%|βββββββββ | 1082/1250 [06:33<01:01, 2.74it/s]
Training 1/1 epoch (loss 2.7447): 87%|βββββββββ | 1083/1250 [06:33<01:00, 2.78it/s]
Training 1/1 epoch (loss 2.8334): 87%|βββββββββ | 1083/1250 [06:34<01:00, 2.78it/s]
Training 1/1 epoch (loss 2.8334): 87%|βββββββββ | 1084/1250 [06:34<00:57, 2.88it/s]
Training 1/1 epoch (loss 2.6201): 87%|βββββββββ | 1084/1250 [06:34<00:57, 2.88it/s]
Training 1/1 epoch (loss 2.6201): 87%|βββββββββ | 1085/1250 [06:34<00:57, 2.86it/s]
Training 1/1 epoch (loss 2.6761): 87%|βββββββββ | 1085/1250 [06:34<00:57, 2.86it/s]
Training 1/1 epoch (loss 2.6761): 87%|βββββββββ | 1086/1250 [06:34<01:00, 2.72it/s]
Training 1/1 epoch (loss 2.8060): 87%|βββββββββ | 1086/1250 [06:35<01:00, 2.72it/s]
Training 1/1 epoch (loss 2.8060): 87%|βββββββββ | 1087/1250 [06:35<00:57, 2.83it/s]
Training 1/1 epoch (loss 2.7887): 87%|βββββββββ | 1087/1250 [06:35<00:57, 2.83it/s]
Training 1/1 epoch (loss 2.7887): 87%|βββββββββ | 1088/1250 [06:35<00:58, 2.75it/s]
Training 1/1 epoch (loss 2.5960): 87%|βββββββββ | 1088/1250 [06:35<00:58, 2.75it/s]
Training 1/1 epoch (loss 2.5960): 87%|βββββββββ | 1089/1250 [06:35<00:58, 2.76it/s]
Training 1/1 epoch (loss 2.7048): 87%|βββββββββ | 1089/1250 [06:36<00:58, 2.76it/s]
Training 1/1 epoch (loss 2.7048): 87%|βββββββββ | 1090/1250 [06:36<00:56, 2.83it/s]
Training 1/1 epoch (loss 2.7767): 87%|βββββββββ | 1090/1250 [06:36<00:56, 2.83it/s]
Training 1/1 epoch (loss 2.7767): 87%|βββββββββ | 1091/1250 [06:36<00:54, 2.91it/s]
Training 1/1 epoch (loss 2.5785): 87%|βββββββββ | 1091/1250 [06:36<00:54, 2.91it/s]
Training 1/1 epoch (loss 2.5785): 87%|βββββββββ | 1092/1250 [06:36<00:52, 3.03it/s]
Training 1/1 epoch (loss 2.6283): 87%|βββββββββ | 1092/1250 [06:37<00:52, 3.03it/s]
Training 1/1 epoch (loss 2.6283): 87%|βββββββββ | 1093/1250 [06:37<00:53, 2.93it/s]
Training 1/1 epoch (loss 2.6387): 87%|βββββββββ | 1093/1250 [06:37<00:53, 2.93it/s]
Training 1/1 epoch (loss 2.6387): 88%|βββββββββ | 1094/1250 [06:37<00:52, 2.97it/s]
Training 1/1 epoch (loss 2.7890): 88%|βββββββββ | 1094/1250 [06:37<00:52, 2.97it/s]
Training 1/1 epoch (loss 2.7890): 88%|βββββββββ | 1095/1250 [06:37<00:51, 2.99it/s]
Training 1/1 epoch (loss 2.6037): 88%|βββββββββ | 1095/1250 [06:38<00:51, 2.99it/s]
Training 1/1 epoch (loss 2.6037): 88%|βββββββββ | 1096/1250 [06:38<00:50, 3.03it/s]
Training 1/1 epoch (loss 2.9103): 88%|βββββββββ | 1096/1250 [06:38<00:50, 3.03it/s]
Training 1/1 epoch (loss 2.9103): 88%|βββββββββ | 1097/1250 [06:38<00:52, 2.91it/s]
Training 1/1 epoch (loss 2.8141): 88%|βββββββββ | 1097/1250 [06:38<00:52, 2.91it/s]
Training 1/1 epoch (loss 2.8141): 88%|βββββββββ | 1098/1250 [06:38<00:50, 3.03it/s]
Training 1/1 epoch (loss 2.8171): 88%|βββββββββ | 1098/1250 [06:39<00:50, 3.03it/s]
Training 1/1 epoch (loss 2.8171): 88%|βββββββββ | 1099/1250 [06:39<00:52, 2.88it/s]
Training 1/1 epoch (loss 2.8308): 88%|βββββββββ | 1099/1250 [06:39<00:52, 2.88it/s]
Training 1/1 epoch (loss 2.8308): 88%|βββββββββ | 1100/1250 [06:39<00:51, 2.92it/s]
Training 1/1 epoch (loss 2.5585): 88%|βββββββββ | 1100/1250 [06:39<00:51, 2.92it/s]
Training 1/1 epoch (loss 2.5585): 88%|βββββββββ | 1101/1250 [06:39<00:50, 2.95it/s]
Training 1/1 epoch (loss 2.8185): 88%|βββββββββ | 1101/1250 [06:40<00:50, 2.95it/s]
Training 1/1 epoch (loss 2.8185): 88%|βββββββββ | 1102/1250 [06:40<00:53, 2.78it/s]
Training 1/1 epoch (loss 2.7450): 88%|βββββββββ | 1102/1250 [06:40<00:53, 2.78it/s]
Training 1/1 epoch (loss 2.7450): 88%|βββββββββ | 1103/1250 [06:40<00:52, 2.78it/s]
Training 1/1 epoch (loss 2.5875): 88%|βββββββββ | 1103/1250 [06:40<00:52, 2.78it/s]
Training 1/1 epoch (loss 2.5875): 88%|βββββββββ | 1104/1250 [06:40<00:50, 2.89it/s]
Training 1/1 epoch (loss 2.6279): 88%|βββββββββ | 1104/1250 [06:41<00:50, 2.89it/s]
Training 1/1 epoch (loss 2.6279): 88%|βββββββββ | 1105/1250 [06:41<00:52, 2.76it/s]
Training 1/1 epoch (loss 2.6561): 88%|βββββββββ | 1105/1250 [06:41<00:52, 2.76it/s]
Training 1/1 epoch (loss 2.6561): 88%|βββββββββ | 1106/1250 [06:41<00:50, 2.88it/s]
Training 1/1 epoch (loss 2.5517): 88%|βββββββββ | 1106/1250 [06:42<00:50, 2.88it/s]
Training 1/1 epoch (loss 2.5517): 89%|βββββββββ | 1107/1250 [06:42<00:49, 2.91it/s]
Training 1/1 epoch (loss 2.8169): 89%|βββββββββ | 1107/1250 [06:42<00:49, 2.91it/s]
Training 1/1 epoch (loss 2.8169): 89%|βββββββββ | 1108/1250 [06:42<00:48, 2.91it/s]
Training 1/1 epoch (loss 2.7932): 89%|βββββββββ | 1108/1250 [06:42<00:48, 2.91it/s]
Training 1/1 epoch (loss 2.7932): 89%|βββββββββ | 1109/1250 [06:42<00:47, 2.94it/s]
Training 1/1 epoch (loss 2.8644): 89%|βββββββββ | 1109/1250 [06:43<00:47, 2.94it/s]
Training 1/1 epoch (loss 2.8644): 89%|βββββββββ | 1110/1250 [06:43<00:48, 2.88it/s]
Training 1/1 epoch (loss 2.8404): 89%|βββββββββ | 1110/1250 [06:43<00:48, 2.88it/s]
Training 1/1 epoch (loss 2.8404): 89%|βββββββββ | 1111/1250 [06:43<00:46, 2.98it/s]
Training 1/1 epoch (loss 2.7183): 89%|βββββββββ | 1111/1250 [06:43<00:46, 2.98it/s]
Training 1/1 epoch (loss 2.7183): 89%|βββββββββ | 1112/1250 [06:43<00:47, 2.90it/s]
Training 1/1 epoch (loss 2.7744): 89%|βββββββββ | 1112/1250 [06:44<00:47, 2.90it/s]
Training 1/1 epoch (loss 2.7744): 89%|βββββββββ | 1113/1250 [06:44<00:46, 2.92it/s]
Training 1/1 epoch (loss 2.6834): 89%|βββββββββ | 1113/1250 [06:44<00:46, 2.92it/s]
Training 1/1 epoch (loss 2.6834): 89%|βββββββββ | 1114/1250 [06:44<00:46, 2.95it/s]
Training 1/1 epoch (loss 2.5831): 89%|βββββββββ | 1114/1250 [06:44<00:46, 2.95it/s]
Training 1/1 epoch (loss 2.5831): 89%|βββββββββ | 1115/1250 [06:44<00:47, 2.85it/s]
Training 1/1 epoch (loss 2.6497): 89%|βββββββββ | 1115/1250 [06:45<00:47, 2.85it/s]
Training 1/1 epoch (loss 2.6497): 89%|βββββββββ | 1116/1250 [06:45<00:48, 2.75it/s]
Training 1/1 epoch (loss 2.7793): 89%|βββββββββ | 1116/1250 [06:45<00:48, 2.75it/s]
Training 1/1 epoch (loss 2.7793): 89%|βββββββββ | 1117/1250 [06:45<00:47, 2.79it/s]
Training 1/1 epoch (loss 2.5217): 89%|βββββββββ | 1117/1250 [06:45<00:47, 2.79it/s]
Training 1/1 epoch (loss 2.5217): 89%|βββββββββ | 1118/1250 [06:45<00:49, 2.68it/s]
Training 1/1 epoch (loss 2.5990): 89%|βββββββββ | 1118/1250 [06:46<00:49, 2.68it/s]
Training 1/1 epoch (loss 2.5990): 90%|βββββββββ | 1119/1250 [06:46<00:47, 2.78it/s]
Training 1/1 epoch (loss 2.5312): 90%|βββββββββ | 1119/1250 [06:46<00:47, 2.78it/s]
Training 1/1 epoch (loss 2.5312): 90%|βββββββββ | 1120/1250 [06:46<00:46, 2.78it/s]
Training 1/1 epoch (loss 2.7128): 90%|βββββββββ | 1120/1250 [06:46<00:46, 2.78it/s]
Training 1/1 epoch (loss 2.7128): 90%|βββββββββ | 1121/1250 [06:46<00:45, 2.83it/s]
Training 1/1 epoch (loss 2.6886): 90%|βββββββββ | 1121/1250 [06:47<00:45, 2.83it/s]
Training 1/1 epoch (loss 2.6886): 90%|βββββββββ | 1122/1250 [06:47<00:43, 2.92it/s]
Training 1/1 epoch (loss 2.7570): 90%|βββββββββ | 1122/1250 [06:47<00:43, 2.92it/s]
Training 1/1 epoch (loss 2.7570): 90%|βββββββββ | 1123/1250 [06:47<00:43, 2.92it/s]
Training 1/1 epoch (loss 2.6605): 90%|βββββββββ | 1123/1250 [06:47<00:43, 2.92it/s]
Training 1/1 epoch (loss 2.6605): 90%|βββββββββ | 1124/1250 [06:47<00:42, 2.96it/s]
Training 1/1 epoch (loss 2.8243): 90%|βββββββββ | 1124/1250 [06:48<00:42, 2.96it/s]
Training 1/1 epoch (loss 2.8243): 90%|βββββββββ | 1125/1250 [06:48<00:41, 3.02it/s]
Training 1/1 epoch (loss 2.8103): 90%|βββββββββ | 1125/1250 [06:48<00:41, 3.02it/s]
Training 1/1 epoch (loss 2.8103): 90%|βββββββββ | 1126/1250 [06:48<00:40, 3.06it/s]
Training 1/1 epoch (loss 2.6230): 90%|βββββββββ | 1126/1250 [06:48<00:40, 3.06it/s]
Training 1/1 epoch (loss 2.6230): 90%|βββββββββ | 1127/1250 [06:48<00:40, 3.03it/s]
Training 1/1 epoch (loss 2.7427): 90%|βββββββββ | 1127/1250 [06:49<00:40, 3.03it/s]
Training 1/1 epoch (loss 2.7427): 90%|βββββββββ | 1128/1250 [06:49<00:40, 2.99it/s]
Training 1/1 epoch (loss 2.6496): 90%|βββββββββ | 1128/1250 [06:49<00:40, 2.99it/s]
Training 1/1 epoch (loss 2.6496): 90%|βββββββββ | 1129/1250 [06:49<00:41, 2.90it/s]
Training 1/1 epoch (loss 2.7239): 90%|βββββββββ | 1129/1250 [06:49<00:41, 2.90it/s]
Training 1/1 epoch (loss 2.7239): 90%|βββββββββ | 1130/1250 [06:49<00:41, 2.91it/s]
Training 1/1 epoch (loss 2.5236): 90%|βββββββββ | 1130/1250 [06:50<00:41, 2.91it/s]
Training 1/1 epoch (loss 2.5236): 90%|βββββββββ | 1131/1250 [06:50<00:40, 2.96it/s]
Training 1/1 epoch (loss 2.6549): 90%|βββββββββ | 1131/1250 [06:50<00:40, 2.96it/s]
Training 1/1 epoch (loss 2.6549): 91%|βββββββββ | 1132/1250 [06:50<00:38, 3.06it/s]
Training 1/1 epoch (loss 2.5997): 91%|βββββββββ | 1132/1250 [06:51<00:38, 3.06it/s]
Training 1/1 epoch (loss 2.5997): 91%|βββββββββ | 1133/1250 [06:51<00:40, 2.86it/s]
Training 1/1 epoch (loss 2.7109): 91%|βββββββββ | 1133/1250 [06:51<00:40, 2.86it/s]
Training 1/1 epoch (loss 2.7109): 91%|βββββββββ | 1134/1250 [06:51<00:40, 2.85it/s]
Training 1/1 epoch (loss 2.7163): 91%|βββββββββ | 1134/1250 [06:51<00:40, 2.85it/s]
Training 1/1 epoch (loss 2.7163): 91%|βββββββββ | 1135/1250 [06:51<00:38, 2.97it/s]
Training 1/1 epoch (loss 2.6398): 91%|βββββββββ | 1135/1250 [06:52<00:38, 2.97it/s]
Training 1/1 epoch (loss 2.6398): 91%|βββββββββ | 1136/1250 [06:52<00:39, 2.85it/s]
Training 1/1 epoch (loss 2.6974): 91%|βββββββββ | 1136/1250 [06:52<00:39, 2.85it/s]
Training 1/1 epoch (loss 2.6974): 91%|βββββββββ | 1137/1250 [06:52<00:39, 2.87it/s]
Training 1/1 epoch (loss 2.7106): 91%|βββββββββ | 1137/1250 [06:52<00:39, 2.87it/s]
Training 1/1 epoch (loss 2.7106): 91%|βββββββββ | 1138/1250 [06:52<00:38, 2.92it/s]
Training 1/1 epoch (loss 2.5012): 91%|βββββββββ | 1138/1250 [06:53<00:38, 2.92it/s]
Training 1/1 epoch (loss 2.5012): 91%|βββββββββ | 1139/1250 [06:53<00:37, 2.96it/s]
Training 1/1 epoch (loss 2.8785): 91%|βββββββββ | 1139/1250 [06:53<00:37, 2.96it/s]
Training 1/1 epoch (loss 2.8785): 91%|βββββββββ | 1140/1250 [06:53<00:42, 2.61it/s]
Training 1/1 epoch (loss 2.5789): 91%|βββββββββ | 1140/1250 [06:53<00:42, 2.61it/s]
Training 1/1 epoch (loss 2.5789): 91%|ββββββββββ| 1141/1250 [06:53<00:43, 2.51it/s]
Training 1/1 epoch (loss 2.6396): 91%|ββββββββββ| 1141/1250 [06:54<00:43, 2.51it/s]
Training 1/1 epoch (loss 2.6396): 91%|ββββββββββ| 1142/1250 [06:54<00:41, 2.61it/s]
Training 1/1 epoch (loss 2.7244): 91%|ββββββββββ| 1142/1250 [06:54<00:41, 2.61it/s]
Training 1/1 epoch (loss 2.7244): 91%|ββββββββββ| 1143/1250 [06:54<00:41, 2.59it/s]
Training 1/1 epoch (loss 2.5499): 91%|ββββββββββ| 1143/1250 [06:55<00:41, 2.59it/s]
Training 1/1 epoch (loss 2.5499): 92%|ββββββββββ| 1144/1250 [06:55<00:41, 2.54it/s]
Training 1/1 epoch (loss 2.6689): 92%|ββββββββββ| 1144/1250 [06:55<00:41, 2.54it/s]
Training 1/1 epoch (loss 2.6689): 92%|ββββββββββ| 1145/1250 [06:55<00:40, 2.60it/s]
Training 1/1 epoch (loss 2.7423): 92%|ββββββββββ| 1145/1250 [06:55<00:40, 2.60it/s]
Training 1/1 epoch (loss 2.7423): 92%|ββββββββββ| 1146/1250 [06:55<00:37, 2.77it/s]
Training 1/1 epoch (loss 2.5494): 92%|ββββββββββ| 1146/1250 [06:56<00:37, 2.77it/s]
Training 1/1 epoch (loss 2.5494): 92%|ββββββββββ| 1147/1250 [06:56<00:36, 2.83it/s]
Training 1/1 epoch (loss 2.7938): 92%|ββββββββββ| 1147/1250 [06:56<00:36, 2.83it/s]
Training 1/1 epoch (loss 2.7938): 92%|ββββββββββ| 1148/1250 [06:56<00:36, 2.78it/s]
Training 1/1 epoch (loss 2.5668): 92%|ββββββββββ| 1148/1250 [06:56<00:36, 2.78it/s]
Training 1/1 epoch (loss 2.5668): 92%|ββββββββββ| 1149/1250 [06:56<00:35, 2.84it/s]
Training 1/1 epoch (loss 2.5124): 92%|ββββββββββ| 1149/1250 [06:57<00:35, 2.84it/s]
Training 1/1 epoch (loss 2.5124): 92%|ββββββββββ| 1150/1250 [06:57<00:35, 2.81it/s]
Training 1/1 epoch (loss 2.8845): 92%|ββββββββββ| 1150/1250 [06:57<00:35, 2.81it/s]
Training 1/1 epoch (loss 2.8845): 92%|ββββββββββ| 1151/1250 [06:57<00:35, 2.77it/s]
Training 1/1 epoch (loss 2.7269): 92%|ββββββββββ| 1151/1250 [06:57<00:35, 2.77it/s]
Training 1/1 epoch (loss 2.7269): 92%|ββββββββββ| 1152/1250 [06:57<00:35, 2.80it/s]
Training 1/1 epoch (loss 2.7051): 92%|ββββββββββ| 1152/1250 [06:58<00:35, 2.80it/s]
Training 1/1 epoch (loss 2.7051): 92%|ββββββββββ| 1153/1250 [06:58<00:37, 2.56it/s]
Training 1/1 epoch (loss 2.6192): 92%|ββββββββββ| 1153/1250 [06:58<00:37, 2.56it/s]
Training 1/1 epoch (loss 2.6192): 92%|ββββββββββ| 1154/1250 [06:58<00:39, 2.44it/s]
Training 1/1 epoch (loss 2.6444): 92%|ββββββββββ| 1154/1250 [06:59<00:39, 2.44it/s]
Training 1/1 epoch (loss 2.6444): 92%|ββββββββββ| 1155/1250 [06:59<00:37, 2.56it/s]
Training 1/1 epoch (loss 2.6967): 92%|ββββββββββ| 1155/1250 [06:59<00:37, 2.56it/s]
Training 1/1 epoch (loss 2.6967): 92%|ββββββββββ| 1156/1250 [06:59<00:35, 2.66it/s]
Training 1/1 epoch (loss 2.6281): 92%|ββββββββββ| 1156/1250 [06:59<00:35, 2.66it/s]
Training 1/1 epoch (loss 2.6281): 93%|ββββββββββ| 1157/1250 [06:59<00:34, 2.73it/s]
Training 1/1 epoch (loss 2.6783): 93%|ββββββββββ| 1157/1250 [07:00<00:34, 2.73it/s]
Training 1/1 epoch (loss 2.6783): 93%|ββββββββββ| 1158/1250 [07:00<00:33, 2.76it/s]
Training 1/1 epoch (loss 2.6472): 93%|ββββββββββ| 1158/1250 [07:00<00:33, 2.76it/s]
Training 1/1 epoch (loss 2.6472): 93%|ββββββββββ| 1159/1250 [07:00<00:31, 2.88it/s]
Training 1/1 epoch (loss 2.6331): 93%|ββββββββββ| 1159/1250 [07:00<00:31, 2.88it/s]
Training 1/1 epoch (loss 2.6331): 93%|ββββββββββ| 1160/1250 [07:00<00:31, 2.88it/s]
Training 1/1 epoch (loss 2.5052): 93%|ββββββββββ| 1160/1250 [07:01<00:31, 2.88it/s]
Training 1/1 epoch (loss 2.5052): 93%|ββββββββββ| 1161/1250 [07:01<00:33, 2.69it/s]
Training 1/1 epoch (loss 2.6407): 93%|ββββββββββ| 1161/1250 [07:01<00:33, 2.69it/s]
Training 1/1 epoch (loss 2.6407): 93%|ββββββββββ| 1162/1250 [07:01<00:31, 2.83it/s]
Training 1/1 epoch (loss 2.5739): 93%|ββββββββββ| 1162/1250 [07:01<00:31, 2.83it/s]
Training 1/1 epoch (loss 2.5739): 93%|ββββββββββ| 1163/1250 [07:01<00:30, 2.86it/s]
Training 1/1 epoch (loss 2.4780): 93%|ββββββββββ| 1163/1250 [07:02<00:30, 2.86it/s]
Training 1/1 epoch (loss 2.4780): 93%|ββββββββββ| 1164/1250 [07:02<00:28, 3.00it/s]
Training 1/1 epoch (loss 2.6826): 93%|ββββββββββ| 1164/1250 [07:02<00:28, 3.00it/s]
Training 1/1 epoch (loss 2.6826): 93%|ββββββββββ| 1165/1250 [07:02<00:28, 3.02it/s]
Training 1/1 epoch (loss 2.7977): 93%|ββββββββββ| 1165/1250 [07:02<00:28, 3.02it/s]
Training 1/1 epoch (loss 2.7977): 93%|ββββββββββ| 1166/1250 [07:02<00:27, 3.06it/s]
Training 1/1 epoch (loss 2.8533): 93%|ββββββββββ| 1166/1250 [07:03<00:27, 3.06it/s]
Training 1/1 epoch (loss 2.8533): 93%|ββββββββββ| 1167/1250 [07:03<00:27, 2.99it/s]
Training 1/1 epoch (loss 2.7859): 93%|ββββββββββ| 1167/1250 [07:03<00:27, 2.99it/s]
Training 1/1 epoch (loss 2.7859): 93%|ββββββββββ| 1168/1250 [07:03<00:30, 2.72it/s]
Training 1/1 epoch (loss 2.5100): 93%|ββββββββββ| 1168/1250 [07:04<00:30, 2.72it/s]
Training 1/1 epoch (loss 2.5100): 94%|ββββββββββ| 1169/1250 [07:04<00:29, 2.73it/s]
Training 1/1 epoch (loss 2.6529): 94%|ββββββββββ| 1169/1250 [07:04<00:29, 2.73it/s]
Training 1/1 epoch (loss 2.6529): 94%|ββββββββββ| 1170/1250 [07:04<00:28, 2.79it/s]
Training 1/1 epoch (loss 2.6889): 94%|ββββββββββ| 1170/1250 [07:04<00:28, 2.79it/s]
Training 1/1 epoch (loss 2.6889): 94%|ββββββββββ| 1171/1250 [07:04<00:27, 2.84it/s]
Training 1/1 epoch (loss 2.8419): 94%|ββββββββββ| 1171/1250 [07:05<00:27, 2.84it/s]
Training 1/1 epoch (loss 2.8419): 94%|ββββββββββ| 1172/1250 [07:05<00:27, 2.87it/s]
Training 1/1 epoch (loss 2.8788): 94%|ββββββββββ| 1172/1250 [07:05<00:27, 2.87it/s]
Training 1/1 epoch (loss 2.8788): 94%|ββββββββββ| 1173/1250 [07:05<00:27, 2.84it/s]
Training 1/1 epoch (loss 2.7401): 94%|ββββββββββ| 1173/1250 [07:05<00:27, 2.84it/s]
Training 1/1 epoch (loss 2.7401): 94%|ββββββββββ| 1174/1250 [07:05<00:25, 2.95it/s]
Training 1/1 epoch (loss 2.7389): 94%|ββββββββββ| 1174/1250 [07:06<00:25, 2.95it/s]
Training 1/1 epoch (loss 2.7389): 94%|ββββββββββ| 1175/1250 [07:06<00:25, 2.97it/s]
Training 1/1 epoch (loss 2.7504): 94%|ββββββββββ| 1175/1250 [07:06<00:25, 2.97it/s]
Training 1/1 epoch (loss 2.7504): 94%|ββββββββββ| 1176/1250 [07:06<00:24, 2.99it/s]
Training 1/1 epoch (loss 2.7947): 94%|ββββββββββ| 1176/1250 [07:06<00:24, 2.99it/s]
Training 1/1 epoch (loss 2.7947): 94%|ββββββββββ| 1177/1250 [07:06<00:26, 2.79it/s]
Training 1/1 epoch (loss 2.7881): 94%|ββββββββββ| 1177/1250 [07:07<00:26, 2.79it/s]
Training 1/1 epoch (loss 2.7881): 94%|ββββββββββ| 1178/1250 [07:07<00:26, 2.70it/s]
Training 1/1 epoch (loss 2.6393): 94%|ββββββββββ| 1178/1250 [07:07<00:26, 2.70it/s]
Training 1/1 epoch (loss 2.6393): 94%|ββββββββββ| 1179/1250 [07:07<00:25, 2.73it/s]
Training 1/1 epoch (loss 2.7907): 94%|ββββββββββ| 1179/1250 [07:07<00:25, 2.73it/s]
Training 1/1 epoch (loss 2.7907): 94%|ββββββββββ| 1180/1250 [07:07<00:24, 2.81it/s]
Training 1/1 epoch (loss 2.7758): 94%|ββββββββββ| 1180/1250 [07:08<00:24, 2.81it/s]
Training 1/1 epoch (loss 2.7758): 94%|ββββββββββ| 1181/1250 [07:08<00:24, 2.87it/s]
Training 1/1 epoch (loss 2.7540): 94%|ββββββββββ| 1181/1250 [07:08<00:24, 2.87it/s]
Training 1/1 epoch (loss 2.7540): 95%|ββββββββββ| 1182/1250 [07:08<00:23, 2.95it/s]
Training 1/1 epoch (loss 2.7213): 95%|ββββββββββ| 1182/1250 [07:08<00:23, 2.95it/s]
Training 1/1 epoch (loss 2.7213): 95%|ββββββββββ| 1183/1250 [07:08<00:22, 3.02it/s]
Training 1/1 epoch (loss 2.6067): 95%|ββββββββββ| 1183/1250 [07:09<00:22, 3.02it/s]
Training 1/1 epoch (loss 2.6067): 95%|ββββββββββ| 1184/1250 [07:09<00:23, 2.84it/s]
Training 1/1 epoch (loss 2.7273): 95%|ββββββββββ| 1184/1250 [07:09<00:23, 2.84it/s]
Training 1/1 epoch (loss 2.7273): 95%|ββββββββββ| 1185/1250 [07:09<00:22, 2.84it/s]
Training 1/1 epoch (loss 2.7254): 95%|ββββββββββ| 1185/1250 [07:10<00:22, 2.84it/s]
Training 1/1 epoch (loss 2.7254): 95%|ββββββββββ| 1186/1250 [07:10<00:23, 2.67it/s]
Training 1/1 epoch (loss 2.8663): 95%|ββββββββββ| 1186/1250 [07:10<00:23, 2.67it/s]
Training 1/1 epoch (loss 2.8663): 95%|ββββββββββ| 1187/1250 [07:10<00:22, 2.80it/s]
Training 1/1 epoch (loss 2.4725): 95%|ββββββββββ| 1187/1250 [07:10<00:22, 2.80it/s]
Training 1/1 epoch (loss 2.4725): 95%|ββββββββββ| 1188/1250 [07:10<00:21, 2.93it/s]
Training 1/1 epoch (loss 2.5905): 95%|ββββββββββ| 1188/1250 [07:10<00:21, 2.93it/s]
Training 1/1 epoch (loss 2.5905): 95%|ββββββββββ| 1189/1250 [07:10<00:20, 3.02it/s]
Training 1/1 epoch (loss 2.9385): 95%|ββββββββββ| 1189/1250 [07:11<00:20, 3.02it/s]
Training 1/1 epoch (loss 2.9385): 95%|ββββββββββ| 1190/1250 [07:11<00:19, 3.03it/s]
Training 1/1 epoch (loss 2.8558): 95%|ββββββββββ| 1190/1250 [07:11<00:19, 3.03it/s]
Training 1/1 epoch (loss 2.8558): 95%|ββββββββββ| 1191/1250 [07:11<00:20, 2.95it/s]
Training 1/1 epoch (loss 2.7171): 95%|ββββββββββ| 1191/1250 [07:12<00:20, 2.95it/s]
Training 1/1 epoch (loss 2.7171): 95%|ββββββββββ| 1192/1250 [07:12<00:21, 2.75it/s]
Training 1/1 epoch (loss 2.7325): 95%|ββββββββββ| 1192/1250 [07:12<00:21, 2.75it/s]
Training 1/1 epoch (loss 2.7325): 95%|ββββββββββ| 1193/1250 [07:12<00:20, 2.76it/s]
Training 1/1 epoch (loss 2.7073): 95%|ββββββββββ| 1193/1250 [07:12<00:20, 2.76it/s]
Training 1/1 epoch (loss 2.7073): 96%|ββββββββββ| 1194/1250 [07:12<00:19, 2.88it/s]
Training 1/1 epoch (loss 2.7972): 96%|ββββββββββ| 1194/1250 [07:13<00:19, 2.88it/s]
Training 1/1 epoch (loss 2.7972): 96%|ββββββββββ| 1195/1250 [07:13<00:18, 2.97it/s]
Training 1/1 epoch (loss 2.9516): 96%|ββββββββββ| 1195/1250 [07:13<00:18, 2.97it/s]
Training 1/1 epoch (loss 2.9516): 96%|ββββββββββ| 1196/1250 [07:13<00:18, 2.99it/s]
Training 1/1 epoch (loss 2.7760): 96%|ββββββββββ| 1196/1250 [07:13<00:18, 2.99it/s]
Training 1/1 epoch (loss 2.7760): 96%|ββββββββββ| 1197/1250 [07:13<00:17, 3.03it/s]
Training 1/1 epoch (loss 2.6236): 96%|ββββββββββ| 1197/1250 [07:14<00:17, 3.03it/s]
Training 1/1 epoch (loss 2.6236): 96%|ββββββββββ| 1198/1250 [07:14<00:17, 3.00it/s]
Training 1/1 epoch (loss 2.6223): 96%|ββββββββββ| 1198/1250 [07:14<00:17, 3.00it/s]
Training 1/1 epoch (loss 2.6223): 96%|ββββββββββ| 1199/1250 [07:14<00:17, 2.95it/s]
Training 1/1 epoch (loss 2.6392): 96%|ββββββββββ| 1199/1250 [07:14<00:17, 2.95it/s]
Training 1/1 epoch (loss 2.6392): 96%|ββββββββββ| 1200/1250 [07:14<00:18, 2.64it/s]
Training 1/1 epoch (loss 2.5985): 96%|ββββββββββ| 1200/1250 [07:15<00:18, 2.64it/s]
Training 1/1 epoch (loss 2.5985): 96%|ββββββββββ| 1201/1250 [07:15<00:18, 2.67it/s]
Training 1/1 epoch (loss 2.5396): 96%|ββββββββββ| 1201/1250 [07:15<00:18, 2.67it/s]
Training 1/1 epoch (loss 2.5396): 96%|ββββββββββ| 1202/1250 [07:15<00:18, 2.65it/s]
Training 1/1 epoch (loss 2.7128): 96%|█████████▌| 1202/1250 [07:15<00:18, 2.65it/s]
Training 1/1 epoch (loss 2.7128): 96%|█████████▌| 1203/1250 [07:15<00:17, 2.75it/s]
Training 1/1 epoch (loss 2.5919): 96%|█████████▌| 1203/1250 [07:16<00:17, 2.75it/s]
Training 1/1 epoch (loss 2.5919): 96%|█████████▋| 1204/1250 [07:16<00:15, 2.91it/s]
Training 1/1 epoch (loss 2.9678): 96%|█████████▋| 1204/1250 [07:16<00:15, 2.91it/s]
Training 1/1 epoch (loss 2.9678): 96%|█████████▋| 1205/1250 [07:16<00:15, 2.99it/s]
Training 1/1 epoch (loss 2.6221): 96%|█████████▋| 1205/1250 [07:16<00:15, 2.99it/s]
Training 1/1 epoch (loss 2.6221): 96%|█████████▋| 1206/1250 [07:16<00:14, 3.03it/s]
Training 1/1 epoch (loss 2.6957): 96%|█████████▋| 1206/1250 [07:17<00:14, 3.03it/s]
Training 1/1 epoch (loss 2.6957): 97%|█████████▋| 1207/1250 [07:17<00:15, 2.84it/s]
Training 1/1 epoch (loss 2.6710): 97%|█████████▋| 1207/1250 [07:17<00:15, 2.84it/s]
Training 1/1 epoch (loss 2.6710): 97%|█████████▋| 1208/1250 [07:17<00:14, 2.86it/s]
Training 1/1 epoch (loss 2.7345): 97%|█████████▋| 1208/1250 [07:18<00:14, 2.86it/s]
Training 1/1 epoch (loss 2.7345): 97%|█████████▋| 1209/1250 [07:18<00:14, 2.78it/s]
Training 1/1 epoch (loss 2.8447): 97%|█████████▋| 1209/1250 [07:18<00:14, 2.78it/s]
Training 1/1 epoch (loss 2.8447): 97%|█████████▋| 1210/1250 [07:18<00:14, 2.83it/s]
Training 1/1 epoch (loss 2.7053): 97%|█████████▋| 1210/1250 [07:18<00:14, 2.83it/s]
Training 1/1 epoch (loss 2.7053): 97%|█████████▋| 1211/1250 [07:18<00:13, 2.90it/s]
Training 1/1 epoch (loss 2.6631): 97%|█████████▋| 1211/1250 [07:18<00:13, 2.90it/s]
Training 1/1 epoch (loss 2.6631): 97%|█████████▋| 1212/1250 [07:18<00:12, 3.02it/s]
Training 1/1 epoch (loss 2.7798): 97%|█████████▋| 1212/1250 [07:19<00:12, 3.02it/s]
Training 1/1 epoch (loss 2.7798): 97%|█████████▋| 1213/1250 [07:19<00:12, 2.97it/s]
Training 1/1 epoch (loss 2.5544): 97%|█████████▋| 1213/1250 [07:19<00:12, 2.97it/s]
Training 1/1 epoch (loss 2.5544): 97%|█████████▋| 1214/1250 [07:19<00:12, 2.92it/s]
Training 1/1 epoch (loss 2.6673): 97%|█████████▋| 1214/1250 [07:20<00:12, 2.92it/s]
Training 1/1 epoch (loss 2.6673): 97%|█████████▋| 1215/1250 [07:20<00:12, 2.81it/s]
Training 1/1 epoch (loss 3.0018): 97%|█████████▋| 1215/1250 [07:20<00:12, 2.81it/s]
Training 1/1 epoch (loss 3.0018): 97%|█████████▋| 1216/1250 [07:20<00:11, 2.87it/s]
Training 1/1 epoch (loss 2.5760): 97%|█████████▋| 1216/1250 [07:20<00:11, 2.87it/s]
Training 1/1 epoch (loss 2.5760): 97%|█████████▋| 1217/1250 [07:20<00:11, 2.90it/s]
Training 1/1 epoch (loss 2.6415): 97%|█████████▋| 1217/1250 [07:21<00:11, 2.90it/s]
Training 1/1 epoch (loss 2.6415): 97%|█████████▋| 1218/1250 [07:21<00:10, 2.95it/s]
Training 1/1 epoch (loss 2.7668): 97%|█████████▋| 1218/1250 [07:21<00:10, 2.95it/s]
Training 1/1 epoch (loss 2.7668): 98%|█████████▊| 1219/1250 [07:21<00:10, 2.98it/s]
Training 1/1 epoch (loss 2.6507): 98%|█████████▊| 1219/1250 [07:21<00:10, 2.98it/s]
Training 1/1 epoch (loss 2.6507): 98%|█████████▊| 1220/1250 [07:21<00:10, 2.90it/s]
Training 1/1 epoch (loss 2.5687): 98%|█████████▊| 1220/1250 [07:22<00:10, 2.90it/s]
Training 1/1 epoch (loss 2.5687): 98%|█████████▊| 1221/1250 [07:22<00:10, 2.89it/s]
Training 1/1 epoch (loss 2.5386): 98%|█████████▊| 1221/1250 [07:22<00:10, 2.89it/s]
Training 1/1 epoch (loss 2.5386): 98%|█████████▊| 1222/1250 [07:22<00:09, 2.81it/s]
Training 1/1 epoch (loss 2.5924): 98%|█████████▊| 1222/1250 [07:22<00:09, 2.81it/s]
Training 1/1 epoch (loss 2.5924): 98%|█████████▊| 1223/1250 [07:22<00:09, 2.88it/s]
Training 1/1 epoch (loss 2.7281): 98%|█████████▊| 1223/1250 [07:23<00:09, 2.88it/s]
Training 1/1 epoch (loss 2.7281): 98%|█████████▊| 1224/1250 [07:23<00:09, 2.73it/s]
Training 1/1 epoch (loss 2.6974): 98%|█████████▊| 1224/1250 [07:23<00:09, 2.73it/s]
Training 1/1 epoch (loss 2.6974): 98%|█████████▊| 1225/1250 [07:23<00:10, 2.36it/s]
Training 1/1 epoch (loss 2.7985): 98%|█████████▊| 1225/1250 [07:24<00:10, 2.36it/s]
Training 1/1 epoch (loss 2.7985): 98%|█████████▊| 1226/1250 [07:24<00:09, 2.52it/s]
Training 1/1 epoch (loss 2.6792): 98%|█████████▊| 1226/1250 [07:24<00:09, 2.52it/s]
Training 1/1 epoch (loss 2.6792): 98%|█████████▊| 1227/1250 [07:24<00:08, 2.61it/s]
Training 1/1 epoch (loss 2.8331): 98%|█████████▊| 1227/1250 [07:24<00:08, 2.61it/s]
Training 1/1 epoch (loss 2.8331): 98%|█████████▊| 1228/1250 [07:24<00:08, 2.62it/s]
Training 1/1 epoch (loss 2.4746): 98%|█████████▊| 1228/1250 [07:25<00:08, 2.62it/s]
Training 1/1 epoch (loss 2.4746): 98%|█████████▊| 1229/1250 [07:25<00:07, 2.68it/s]
Training 1/1 epoch (loss 2.7689): 98%|█████████▊| 1229/1250 [07:25<00:07, 2.68it/s]
Training 1/1 epoch (loss 2.7689): 98%|█████████▊| 1230/1250 [07:25<00:07, 2.79it/s]
Training 1/1 epoch (loss 2.6478): 98%|█████████▊| 1230/1250 [07:25<00:07, 2.79it/s]
Training 1/1 epoch (loss 2.6478): 98%|█████████▊| 1231/1250 [07:25<00:06, 2.73it/s]
Training 1/1 epoch (loss 2.7677): 98%|█████████▊| 1231/1250 [07:26<00:06, 2.73it/s]
Training 1/1 epoch (loss 2.7677): 99%|█████████▊| 1232/1250 [07:26<00:06, 2.78it/s]
Training 1/1 epoch (loss 2.7679): 99%|█████████▊| 1232/1250 [07:26<00:06, 2.78it/s]
Training 1/1 epoch (loss 2.7679): 99%|█████████▊| 1233/1250 [07:26<00:06, 2.81it/s]
Training 1/1 epoch (loss 2.7595): 99%|█████████▊| 1233/1250 [07:26<00:06, 2.81it/s]
Training 1/1 epoch (loss 2.7595): 99%|█████████▊| 1234/1250 [07:26<00:05, 2.98it/s]
Training 1/1 epoch (loss 2.6532): 99%|█████████▊| 1234/1250 [07:27<00:05, 2.98it/s]
Training 1/1 epoch (loss 2.6532): 99%|█████████▉| 1235/1250 [07:27<00:05, 2.81it/s]
Training 1/1 epoch (loss 2.5893): 99%|█████████▉| 1235/1250 [07:27<00:05, 2.81it/s]
Training 1/1 epoch (loss 2.5893): 99%|█████████▉| 1236/1250 [07:27<00:05, 2.79it/s]
Training 1/1 epoch (loss 2.6252): 99%|█████████▉| 1236/1250 [07:28<00:05, 2.79it/s]
Training 1/1 epoch (loss 2.6252): 99%|█████████▉| 1237/1250 [07:28<00:04, 2.75it/s]
Training 1/1 epoch (loss 2.6661): 99%|█████████▉| 1237/1250 [07:28<00:04, 2.75it/s]
Training 1/1 epoch (loss 2.6661): 99%|█████████▉| 1238/1250 [07:28<00:04, 2.42it/s]
Training 1/1 epoch (loss 2.8040): 99%|█████████▉| 1238/1250 [07:29<00:04, 2.42it/s]
Training 1/1 epoch (loss 2.8040): 99%|█████████▉| 1239/1250 [07:29<00:04, 2.36it/s]
Training 1/1 epoch (loss 2.6784): 99%|█████████▉| 1239/1250 [07:29<00:04, 2.36it/s]
Training 1/1 epoch (loss 2.6784): 99%|█████████▉| 1240/1250 [07:29<00:03, 2.55it/s]
Training 1/1 epoch (loss 2.7301): 99%|█████████▉| 1240/1250 [07:29<00:03, 2.55it/s]
Training 1/1 epoch (loss 2.7301): 99%|█████████▉| 1241/1250 [07:29<00:03, 2.54it/s]
Training 1/1 epoch (loss 2.7583): 99%|█████████▉| 1241/1250 [07:30<00:03, 2.54it/s]
Training 1/1 epoch (loss 2.7583): 99%|█████████▉| 1242/1250 [07:30<00:03, 2.46it/s]
Training 1/1 epoch (loss 2.8828): 99%|█████████▉| 1242/1250 [07:30<00:03, 2.46it/s]
Training 1/1 epoch (loss 2.8828): 99%|█████████▉| 1243/1250 [07:30<00:02, 2.53it/s]
Training 1/1 epoch (loss 2.7951): 99%|█████████▉| 1243/1250 [07:30<00:02, 2.53it/s]
Training 1/1 epoch (loss 2.7951): 100%|█████████▉| 1244/1250 [07:30<00:02, 2.60it/s]
Training 1/1 epoch (loss 2.9376): 100%|█████████▉| 1244/1250 [07:31<00:02, 2.60it/s]
Training 1/1 epoch (loss 2.9376): 100%|█████████▉| 1245/1250 [07:31<00:01, 2.76it/s]
Training 1/1 epoch (loss 2.5708): 100%|█████████▉| 1245/1250 [07:31<00:01, 2.76it/s]
Training 1/1 epoch (loss 2.5708): 100%|█████████▉| 1246/1250 [07:31<00:01, 2.87it/s]
Training 1/1 epoch (loss 2.8824): 100%|█████████▉| 1246/1250 [07:31<00:01, 2.87it/s]
Training 1/1 epoch (loss 2.8824): 100%|█████████▉| 1247/1250 [07:31<00:01, 2.82it/s]
Training 1/1 epoch (loss 2.7327): 100%|█████████▉| 1247/1250 [07:32<00:01, 2.82it/s]
Training 1/1 epoch (loss 2.7327): 100%|█████████▉| 1248/1250 [07:32<00:00, 2.78it/s]
Training 1/1 epoch (loss 2.8448): 100%|█████████▉| 1248/1250 [07:32<00:00, 2.78it/s]
Training 1/1 epoch (loss 2.8448): 100%|█████████▉| 1249/1250 [07:32<00:00, 2.86it/s]
Training 1/1 epoch (loss 2.5225): 100%|█████████▉| 1249/1250 [07:33<00:00, 2.86it/s]
Training 1/1 epoch (loss 2.5225): 100%|██████████| 1250/1250 [07:33<00:00, 2.64it/s]
Training 1/1 epoch (loss 2.5225): 100%|██████████| 1250/1250 [07:33<00:00, 2.76it/s] |