Training 1/3 epoch (loss 6.0409): 0%| | 0/2046 [00:03<?, ?it/s]
Training 1/3 epoch (loss 6.0409): 0%| | 1/2046 [00:03<2:08:04, 3.76s/it]
Training 1/3 epoch (loss 5.8384): 0%| | 1/2046 [00:05<2:08:04, 3.76s/it]
Training 1/3 epoch (loss 5.8384): 0%| | 2/2046 [00:05<1:20:03, 2.35s/it]
Training 1/3 epoch (loss 5.7309): 0%| | 2/2046 [00:05<1:20:03, 2.35s/it]
Training 1/3 epoch (loss 5.7309): 0%| | 3/2046 [00:05<53:43, 1.58s/it]
Training 1/3 epoch (loss 5.7818): 0%| | 3/2046 [00:06<53:43, 1.58s/it]
Training 1/3 epoch (loss 5.7818): 0%| | 4/2046 [00:06<40:38, 1.19s/it]
Training 1/3 epoch (loss 5.8005): 0%| | 4/2046 [00:07<40:38, 1.19s/it]
Training 1/3 epoch (loss 5.8005): 0%| | 5/2046 [00:07<33:57, 1.00it/s]
Training 1/3 epoch (loss 6.2337): 0%| | 5/2046 [00:07<33:57, 1.00it/s]
Training 1/3 epoch (loss 6.2337): 0%| | 6/2046 [00:07<30:01, 1.13it/s]
Training 1/3 epoch (loss 5.4436): 0%| | 6/2046 [00:08<30:01, 1.13it/s]
Training 1/3 epoch (loss 5.4436): 0%| | 7/2046 [00:08<27:12, 1.25it/s]
Training 1/3 epoch (loss 6.1080): 0%| | 7/2046 [00:09<27:12, 1.25it/s]
Training 1/3 epoch (loss 6.1080): 0%| | 8/2046 [00:09<28:24, 1.20it/s]
Training 1/3 epoch (loss 5.6866): 0%| | 8/2046 [00:10<28:24, 1.20it/s]
Training 1/3 epoch (loss 5.6866): 0%| | 9/2046 [00:10<27:37, 1.23it/s]
Training 1/3 epoch (loss 6.1017): 0%| | 9/2046 [00:10<27:37, 1.23it/s]
Training 1/3 epoch (loss 6.1017): 0%| | 10/2046 [00:10<25:36, 1.33it/s]
Training 1/3 epoch (loss 5.8457): 0%| | 10/2046 [00:11<25:36, 1.33it/s]
Training 1/3 epoch (loss 5.8457): 1%| | 11/2046 [00:11<24:45, 1.37it/s]
Training 1/3 epoch (loss 5.5554): 1%| | 11/2046 [00:12<24:45, 1.37it/s]
Training 1/3 epoch (loss 5.5554): 1%| | 12/2046 [00:12<26:48, 1.26it/s]
Training 1/3 epoch (loss 6.0976): 1%| | 12/2046 [00:12<26:48, 1.26it/s]
Training 1/3 epoch (loss 6.0976): 1%| | 13/2046 [00:12<26:32, 1.28it/s]
Training 1/3 epoch (loss 5.9682): 1%| | 13/2046 [00:13<26:32, 1.28it/s]
Training 1/3 epoch (loss 5.9682): 1%| | 14/2046 [00:13<26:50, 1.26it/s]
Training 1/3 epoch (loss 6.1717): 1%| | 14/2046 [00:14<26:50, 1.26it/s]
Training 1/3 epoch (loss 6.1717): 1%| | 15/2046 [00:14<25:47, 1.31it/s]
Training 1/3 epoch (loss 5.9000): 1%| | 15/2046 [00:15<25:47, 1.31it/s]
Training 1/3 epoch (loss 5.9000): 1%| | 16/2046 [00:15<25:48, 1.31it/s]
Training 1/3 epoch (loss 5.8836): 1%| | 16/2046 [00:15<25:48, 1.31it/s]
Training 1/3 epoch (loss 5.8836): 1%| | 17/2046 [00:15<24:27, 1.38it/s]
Training 1/3 epoch (loss 6.2627): 1%| | 17/2046 [00:16<24:27, 1.38it/s]
Training 1/3 epoch (loss 6.2627): 1%| | 18/2046 [00:16<24:21, 1.39it/s]
Training 1/3 epoch (loss 5.6787): 1%| | 18/2046 [00:17<24:21, 1.39it/s]
Training 1/3 epoch (loss 5.6787): 1%| | 19/2046 [00:17<23:32, 1.43it/s]
Training 1/3 epoch (loss 5.9802): 1%| | 19/2046 [00:17<23:32, 1.43it/s]
Training 1/3 epoch (loss 5.9802): 1%| | 20/2046 [00:17<23:12, 1.45it/s]
Training 1/3 epoch (loss 5.3702): 1%| | 20/2046 [00:18<23:12, 1.45it/s]
Training 1/3 epoch (loss 5.3702): 1%| | 21/2046 [00:18<22:39, 1.49it/s]
Training 1/3 epoch (loss 5.2702): 1%| | 21/2046 [00:19<22:39, 1.49it/s]
Training 1/3 epoch (loss 5.2702): 1%| | 22/2046 [00:19<22:35, 1.49it/s]
Training 1/3 epoch (loss 5.4074): 1%| | 22/2046 [00:19<22:35, 1.49it/s]
Training 1/3 epoch (loss 5.4074): 1%| | 23/2046 [00:19<23:12, 1.45it/s]
Training 1/3 epoch (loss 5.7723): 1%| | 23/2046 [00:20<23:12, 1.45it/s]
Training 1/3 epoch (loss 5.7723): 1%| | 24/2046 [00:20<24:34, 1.37it/s]
Training 1/3 epoch (loss 4.7893): 1%| | 24/2046 [00:21<24:34, 1.37it/s]
Training 1/3 epoch (loss 4.7893): 1%| | 25/2046 [00:21<23:47, 1.42it/s]
Training 1/3 epoch (loss 4.8407): 1%| | 25/2046 [00:22<23:47, 1.42it/s]
Training 1/3 epoch (loss 4.8407): 1%|β | 26/2046 [00:22<24:22, 1.38it/s]
Training 1/3 epoch (loss 5.2082): 1%|β | 26/2046 [00:22<24:22, 1.38it/s]
Training 1/3 epoch (loss 5.2082): 1%|β | 27/2046 [00:22<24:23, 1.38it/s]
Training 1/3 epoch (loss 4.6411): 1%|β | 27/2046 [00:23<24:23, 1.38it/s]
Training 1/3 epoch (loss 4.6411): 1%|β | 28/2046 [00:23<24:29, 1.37it/s]
Training 1/3 epoch (loss 4.8341): 1%|β | 28/2046 [00:24<24:29, 1.37it/s]
Training 1/3 epoch (loss 4.8341): 1%|β | 29/2046 [00:24<25:09, 1.34it/s]
Training 1/3 epoch (loss 4.6791): 1%|β | 29/2046 [00:25<25:09, 1.34it/s]
Training 1/3 epoch (loss 4.6791): 1%|β | 30/2046 [00:25<23:50, 1.41it/s]
Training 1/3 epoch (loss 4.9067): 1%|β | 30/2046 [00:25<23:50, 1.41it/s]
Training 1/3 epoch (loss 4.9067): 2%|β | 31/2046 [00:25<24:01, 1.40it/s]
Training 1/3 epoch (loss 4.7987): 2%|β | 31/2046 [00:26<24:01, 1.40it/s]
Training 1/3 epoch (loss 4.7987): 2%|β | 32/2046 [00:26<25:15, 1.33it/s]
Training 1/3 epoch (loss 3.3664): 2%|β | 32/2046 [00:27<25:15, 1.33it/s]
Training 1/3 epoch (loss 3.3664): 2%|β | 33/2046 [00:27<24:50, 1.35it/s]
Training 1/3 epoch (loss 3.5121): 2%|β | 33/2046 [00:27<24:50, 1.35it/s]
Training 1/3 epoch (loss 3.5121): 2%|β | 34/2046 [00:27<23:49, 1.41it/s]
Training 1/3 epoch (loss 3.4773): 2%|β | 34/2046 [00:28<23:49, 1.41it/s]
Training 1/3 epoch (loss 3.4773): 2%|β | 35/2046 [00:28<23:42, 1.41it/s]
Training 1/3 epoch (loss 3.3874): 2%|β | 35/2046 [00:29<23:42, 1.41it/s]
Training 1/3 epoch (loss 3.3874): 2%|β | 36/2046 [00:29<23:58, 1.40it/s]
Training 1/3 epoch (loss 3.4193): 2%|β | 36/2046 [00:30<23:58, 1.40it/s]
Training 1/3 epoch (loss 3.4193): 2%|β | 37/2046 [00:30<23:30, 1.42it/s]
Training 1/3 epoch (loss 3.4827): 2%|β | 37/2046 [00:30<23:30, 1.42it/s]
Training 1/3 epoch (loss 3.4827): 2%|β | 38/2046 [00:30<22:42, 1.47it/s]
Training 1/3 epoch (loss 3.7202): 2%|β | 38/2046 [00:31<22:42, 1.47it/s]
Training 1/3 epoch (loss 3.7202): 2%|β | 39/2046 [00:31<22:14, 1.50it/s]
Training 1/3 epoch (loss 3.4805): 2%|β | 39/2046 [00:32<22:14, 1.50it/s]
Training 1/3 epoch (loss 3.4805): 2%|β | 40/2046 [00:32<24:35, 1.36it/s]
Training 1/3 epoch (loss 3.1132): 2%|β | 40/2046 [00:33<24:35, 1.36it/s]
Training 1/3 epoch (loss 3.1132): 2%|β | 41/2046 [00:33<24:42, 1.35it/s]
Training 1/3 epoch (loss 3.0531): 2%|β | 41/2046 [00:33<24:42, 1.35it/s]
Training 1/3 epoch (loss 3.0531): 2%|β | 42/2046 [00:33<23:49, 1.40it/s]
Training 1/3 epoch (loss 3.2395): 2%|β | 42/2046 [00:34<23:49, 1.40it/s]
Training 1/3 epoch (loss 3.2395): 2%|β | 43/2046 [00:34<23:20, 1.43it/s]
Training 1/3 epoch (loss 3.0151): 2%|β | 43/2046 [00:34<23:20, 1.43it/s]
Training 1/3 epoch (loss 3.0151): 2%|β | 44/2046 [00:34<22:45, 1.47it/s]
Training 1/3 epoch (loss 3.2464): 2%|β | 44/2046 [00:35<22:45, 1.47it/s]
Training 1/3 epoch (loss 3.2464): 2%|β | 45/2046 [00:35<23:28, 1.42it/s]
Training 1/3 epoch (loss 3.1543): 2%|β | 45/2046 [00:36<23:28, 1.42it/s]
Training 1/3 epoch (loss 3.1543): 2%|β | 46/2046 [00:36<23:56, 1.39it/s]
Training 1/3 epoch (loss 3.1509): 2%|β | 46/2046 [00:37<23:56, 1.39it/s]
Training 1/3 epoch (loss 3.1509): 2%|β | 47/2046 [00:37<25:02, 1.33it/s]
Training 1/3 epoch (loss 2.7276): 2%|β | 47/2046 [00:38<25:02, 1.33it/s]
Training 1/3 epoch (loss 2.7276): 2%|β | 48/2046 [00:38<25:53, 1.29it/s]
Training 1/3 epoch (loss 2.1258): 2%|β | 48/2046 [00:38<25:53, 1.29it/s]
Training 1/3 epoch (loss 2.1258): 2%|β | 49/2046 [00:38<25:16, 1.32it/s]
Training 1/3 epoch (loss 2.2470): 2%|β | 49/2046 [00:39<25:16, 1.32it/s]
Training 1/3 epoch (loss 2.2470): 2%|β | 50/2046 [00:39<24:16, 1.37it/s]
Training 1/3 epoch (loss 2.3416): 2%|β | 50/2046 [00:40<24:16, 1.37it/s]
Training 1/3 epoch (loss 2.3416): 2%|β | 51/2046 [00:40<23:52, 1.39it/s]
Training 1/3 epoch (loss 2.2003): 2%|β | 51/2046 [00:41<23:52, 1.39it/s]
Training 1/3 epoch (loss 2.2003): 3%|β | 52/2046 [00:41<25:14, 1.32it/s]
Training 1/3 epoch (loss 2.2306): 3%|β | 52/2046 [00:41<25:14, 1.32it/s]
Training 1/3 epoch (loss 2.2306): 3%|β | 53/2046 [00:41<23:54, 1.39it/s]
Training 1/3 epoch (loss 2.5335): 3%|β | 53/2046 [00:42<23:54, 1.39it/s]
Training 1/3 epoch (loss 2.5335): 3%|β | 54/2046 [00:42<23:00, 1.44it/s]
Training 1/3 epoch (loss 2.2166): 3%|β | 54/2046 [00:42<23:00, 1.44it/s]
Training 1/3 epoch (loss 2.2166): 3%|β | 55/2046 [00:42<22:17, 1.49it/s]
Training 1/3 epoch (loss 2.1811): 3%|β | 55/2046 [00:43<22:17, 1.49it/s]
Training 1/3 epoch (loss 2.1811): 3%|β | 56/2046 [00:43<25:14, 1.31it/s]
Training 1/3 epoch (loss 1.9834): 3%|β | 56/2046 [00:44<25:14, 1.31it/s]
Training 1/3 epoch (loss 1.9834): 3%|β | 57/2046 [00:44<23:52, 1.39it/s]
Training 1/3 epoch (loss 2.0621): 3%|β | 57/2046 [00:45<23:52, 1.39it/s]
Training 1/3 epoch (loss 2.0621): 3%|β | 58/2046 [00:45<22:59, 1.44it/s]
Training 1/3 epoch (loss 2.1041): 3%|β | 58/2046 [00:45<22:59, 1.44it/s]
Training 1/3 epoch (loss 2.1041): 3%|β | 59/2046 [00:45<22:36, 1.46it/s]
Training 1/3 epoch (loss 2.1798): 3%|β | 59/2046 [00:46<22:36, 1.46it/s]
Training 1/3 epoch (loss 2.1798): 3%|β | 60/2046 [00:46<22:42, 1.46it/s]
Training 1/3 epoch (loss 2.1110): 3%|β | 60/2046 [00:47<22:42, 1.46it/s]
Training 1/3 epoch (loss 2.1110): 3%|β | 61/2046 [00:47<22:04, 1.50it/s]
Training 1/3 epoch (loss 2.1582): 3%|β | 61/2046 [00:47<22:04, 1.50it/s]
Training 1/3 epoch (loss 2.1582): 3%|β | 62/2046 [00:47<21:26, 1.54it/s]
Training 1/3 epoch (loss 2.3361): 3%|β | 62/2046 [00:48<21:26, 1.54it/s]
Training 1/3 epoch (loss 2.3361): 3%|β | 63/2046 [00:48<21:21, 1.55it/s]
Training 1/3 epoch (loss 2.3463): 3%|β | 63/2046 [00:49<21:21, 1.55it/s]
Training 1/3 epoch (loss 2.3463): 3%|β | 64/2046 [00:49<25:11, 1.31it/s]
Training 1/3 epoch (loss 2.1533): 3%|β | 64/2046 [00:50<25:11, 1.31it/s]
Training 1/3 epoch (loss 2.1533): 3%|β | 65/2046 [00:50<26:24, 1.25it/s]
Training 1/3 epoch (loss 1.9320): 3%|β | 65/2046 [00:51<26:24, 1.25it/s]
Training 1/3 epoch (loss 1.9320): 3%|β | 66/2046 [00:51<26:10, 1.26it/s]
Training 1/3 epoch (loss 2.1937): 3%|β | 66/2046 [00:51<26:10, 1.26it/s]
Training 1/3 epoch (loss 2.1937): 3%|β | 67/2046 [00:51<25:10, 1.31it/s]
Training 1/3 epoch (loss 1.9125): 3%|β | 67/2046 [00:52<25:10, 1.31it/s]
Training 1/3 epoch (loss 1.9125): 3%|β | 68/2046 [00:52<23:52, 1.38it/s]
Training 1/3 epoch (loss 2.1394): 3%|β | 68/2046 [00:53<23:52, 1.38it/s]
Training 1/3 epoch (loss 2.1394): 3%|β | 69/2046 [00:53<24:14, 1.36it/s]
Training 1/3 epoch (loss 1.8619): 3%|β | 69/2046 [00:53<24:14, 1.36it/s]
Training 1/3 epoch (loss 1.8619): 3%|β | 70/2046 [00:53<24:12, 1.36it/s]
Training 1/3 epoch (loss 1.9095): 3%|β | 70/2046 [00:54<24:12, 1.36it/s]
Training 1/3 epoch (loss 1.9095): 3%|β | 71/2046 [00:54<24:43, 1.33it/s]
Training 1/3 epoch (loss 1.8992): 3%|β | 71/2046 [00:55<24:43, 1.33it/s]
Training 1/3 epoch (loss 1.8992): 4%|β | 72/2046 [00:55<27:11, 1.21it/s]
Training 1/3 epoch (loss 1.8518): 4%|β | 72/2046 [00:56<27:11, 1.21it/s]
Training 1/3 epoch (loss 1.8518): 4%|β | 73/2046 [00:56<25:24, 1.29it/s]
Training 1/3 epoch (loss 2.0078): 4%|β | 73/2046 [00:57<25:24, 1.29it/s]
Training 1/3 epoch (loss 2.0078): 4%|β | 74/2046 [00:57<24:24, 1.35it/s]
Training 1/3 epoch (loss 1.5901): 4%|β | 74/2046 [00:57<24:24, 1.35it/s]
Training 1/3 epoch (loss 1.5901): 4%|β | 75/2046 [00:57<23:21, 1.41it/s]
Training 1/3 epoch (loss 1.6983): 4%|β | 75/2046 [00:58<23:21, 1.41it/s]
Training 1/3 epoch (loss 1.6983): 4%|β | 76/2046 [00:58<22:34, 1.45it/s]
Training 1/3 epoch (loss 2.0514): 4%|β | 76/2046 [00:58<22:34, 1.45it/s]
Training 1/3 epoch (loss 2.0514): 4%|β | 77/2046 [00:58<22:42, 1.45it/s]
Training 1/3 epoch (loss 1.8548): 4%|β | 77/2046 [00:59<22:42, 1.45it/s]
Training 1/3 epoch (loss 1.8548): 4%|β | 78/2046 [00:59<23:49, 1.38it/s]
Training 1/3 epoch (loss 1.9297): 4%|β | 78/2046 [01:00<23:49, 1.38it/s]
Training 1/3 epoch (loss 1.9297): 4%|β | 79/2046 [01:00<24:20, 1.35it/s]
Training 1/3 epoch (loss 1.5682): 4%|β | 79/2046 [01:01<24:20, 1.35it/s]
Training 1/3 epoch (loss 1.5682): 4%|β | 80/2046 [01:01<25:18, 1.29it/s]
Training 1/3 epoch (loss 1.5060): 4%|β | 80/2046 [01:02<25:18, 1.29it/s]
Training 1/3 epoch (loss 1.5060): 4%|β | 81/2046 [01:02<25:20, 1.29it/s]
Training 1/3 epoch (loss 1.9127): 4%|β | 81/2046 [01:02<25:20, 1.29it/s]
Training 1/3 epoch (loss 1.9127): 4%|β | 82/2046 [01:02<24:01, 1.36it/s]
Training 1/3 epoch (loss 1.6528): 4%|β | 82/2046 [01:03<24:01, 1.36it/s]
Training 1/3 epoch (loss 1.6528): 4%|β | 83/2046 [01:03<24:16, 1.35it/s]
Training 1/3 epoch (loss 1.4918): 4%|β | 83/2046 [01:04<24:16, 1.35it/s]
Training 1/3 epoch (loss 1.4918): 4%|β | 84/2046 [01:04<24:05, 1.36it/s]
Training 1/3 epoch (loss 1.6846): 4%|β | 84/2046 [01:04<24:05, 1.36it/s]
Training 1/3 epoch (loss 1.6846): 4%|β | 85/2046 [01:04<22:50, 1.43it/s]
Training 1/3 epoch (loss 1.9137): 4%|β | 85/2046 [01:05<22:50, 1.43it/s]
Training 1/3 epoch (loss 1.9137): 4%|β | 86/2046 [01:05<22:14, 1.47it/s]
Training 1/3 epoch (loss 1.6747): 4%|β | 86/2046 [01:06<22:14, 1.47it/s]
Training 1/3 epoch (loss 1.6747): 4%|β | 87/2046 [01:06<22:40, 1.44it/s]
Training 1/3 epoch (loss 1.8170): 4%|β | 87/2046 [01:07<22:40, 1.44it/s]
Training 1/3 epoch (loss 1.8170): 4%|β | 88/2046 [01:07<23:37, 1.38it/s]
Training 1/3 epoch (loss 1.5688): 4%|β | 88/2046 [01:07<23:37, 1.38it/s]
Training 1/3 epoch (loss 1.5688): 4%|β | 89/2046 [01:07<22:41, 1.44it/s]
Training 1/3 epoch (loss 1.6788): 4%|β | 89/2046 [01:08<22:41, 1.44it/s]
Training 1/3 epoch (loss 1.6788): 4%|β | 90/2046 [01:08<23:27, 1.39it/s]
Training 1/3 epoch (loss 1.5623): 4%|β | 90/2046 [01:09<23:27, 1.39it/s]
Training 1/3 epoch (loss 1.5623): 4%|β | 91/2046 [01:09<23:04, 1.41it/s]
Training 1/3 epoch (loss 1.9324): 4%|β | 91/2046 [01:09<23:04, 1.41it/s]
Training 1/3 epoch (loss 1.9324): 4%|β | 92/2046 [01:09<22:10, 1.47it/s]
Training 1/3 epoch (loss 1.8396): 4%|β | 92/2046 [01:10<22:10, 1.47it/s]
Training 1/3 epoch (loss 1.8396): 5%|β | 93/2046 [01:10<22:47, 1.43it/s]
Training 1/3 epoch (loss 1.5482): 5%|β | 93/2046 [01:11<22:47, 1.43it/s]
Training 1/3 epoch (loss 1.5482): 5%|β | 94/2046 [01:11<22:01, 1.48it/s]
Training 1/3 epoch (loss 1.7794): 5%|β | 94/2046 [01:11<22:01, 1.48it/s]
Training 1/3 epoch (loss 1.7794): 5%|β | 95/2046 [01:11<21:30, 1.51it/s]
Training 1/3 epoch (loss 1.5769): 5%|β | 95/2046 [01:12<21:30, 1.51it/s]
Training 1/3 epoch (loss 1.5769): 5%|β | 96/2046 [01:12<22:22, 1.45it/s]
Training 1/3 epoch (loss 1.5153): 5%|β | 96/2046 [01:13<22:22, 1.45it/s]
Training 1/3 epoch (loss 1.5153): 5%|β | 97/2046 [01:13<22:00, 1.48it/s]
Training 1/3 epoch (loss 1.4026): 5%|β | 97/2046 [01:13<22:00, 1.48it/s]
Training 1/3 epoch (loss 1.4026): 5%|β | 98/2046 [01:13<22:12, 1.46it/s]
Training 1/3 epoch (loss 1.7407): 5%|β | 98/2046 [01:14<22:12, 1.46it/s]
Training 1/3 epoch (loss 1.7407): 5%|β | 99/2046 [01:14<22:20, 1.45it/s]
Training 1/3 epoch (loss 1.7623): 5%|β | 99/2046 [01:15<22:20, 1.45it/s]
Training 1/3 epoch (loss 1.7623): 5%|β | 100/2046 [01:15<21:39, 1.50it/s]
Training 1/3 epoch (loss 1.6184): 5%|β | 100/2046 [01:15<21:39, 1.50it/s]
Training 1/3 epoch (loss 1.6184): 5%|β | 101/2046 [01:15<21:15, 1.52it/s]
Training 1/3 epoch (loss 1.5424): 5%|β | 101/2046 [01:16<21:15, 1.52it/s]
Training 1/3 epoch (loss 1.5424): 5%|β | 102/2046 [01:16<21:23, 1.51it/s]
Training 1/3 epoch (loss 1.4348): 5%|β | 102/2046 [01:17<21:23, 1.51it/s]
Training 1/3 epoch (loss 1.4348): 5%|β | 103/2046 [01:17<20:53, 1.55it/s]
Training 1/3 epoch (loss 1.9613): 5%|β | 103/2046 [01:17<20:53, 1.55it/s]
Training 1/3 epoch (loss 1.9613): 5%|β | 104/2046 [01:17<21:48, 1.48it/s]
Training 1/3 epoch (loss 1.6269): 5%|β | 104/2046 [01:18<21:48, 1.48it/s]
Training 1/3 epoch (loss 1.6269): 5%|β | 105/2046 [01:18<21:25, 1.51it/s]
Training 1/3 epoch (loss 1.3781): 5%|β | 105/2046 [01:19<21:25, 1.51it/s]
Training 1/3 epoch (loss 1.3781): 5%|β | 106/2046 [01:19<23:41, 1.37it/s]
Training 1/3 epoch (loss 1.6137): 5%|β | 106/2046 [01:20<23:41, 1.37it/s]
Training 1/3 epoch (loss 1.6137): 5%|β | 107/2046 [01:20<24:18, 1.33it/s]
Training 1/3 epoch (loss 1.7572): 5%|β | 107/2046 [01:20<24:18, 1.33it/s]
Training 1/3 epoch (loss 1.7572): 5%|β | 108/2046 [01:20<23:55, 1.35it/s]
Training 1/3 epoch (loss 1.5703): 5%|β | 108/2046 [01:21<23:55, 1.35it/s]
Training 1/3 epoch (loss 1.5703): 5%|β | 109/2046 [01:21<26:13, 1.23it/s]
Training 1/3 epoch (loss 1.4569): 5%|β | 109/2046 [01:22<26:13, 1.23it/s]
Training 1/3 epoch (loss 1.4569): 5%|β | 110/2046 [01:22<24:27, 1.32it/s]
Training 1/3 epoch (loss 1.4305): 5%|β | 110/2046 [01:23<24:27, 1.32it/s]
Training 1/3 epoch (loss 1.4305): 5%|β | 111/2046 [01:23<24:43, 1.30it/s]
Training 1/3 epoch (loss 1.5009): 5%|β | 111/2046 [01:24<24:43, 1.30it/s]
Training 1/3 epoch (loss 1.5009): 5%|β | 112/2046 [01:24<29:48, 1.08it/s]
Training 1/3 epoch (loss 1.3970): 5%|β | 112/2046 [01:25<29:48, 1.08it/s]
Training 1/3 epoch (loss 1.3970): 6%|β | 113/2046 [01:25<29:32, 1.09it/s]
Training 1/3 epoch (loss 1.4121): 6%|β | 113/2046 [01:26<29:32, 1.09it/s]
Training 1/3 epoch (loss 1.4121): 6%|β | 114/2046 [01:26<29:06, 1.11it/s]
Training 1/3 epoch (loss 1.4365): 6%|β | 114/2046 [01:27<29:06, 1.11it/s]
Training 1/3 epoch (loss 1.4365): 6%|β | 115/2046 [01:27<26:34, 1.21it/s]
Training 1/3 epoch (loss 1.3798): 6%|β | 115/2046 [01:27<26:34, 1.21it/s]
Training 1/3 epoch (loss 1.3798): 6%|β | 116/2046 [01:27<24:27, 1.31it/s]
Training 1/3 epoch (loss 1.5766): 6%|β | 116/2046 [01:28<24:27, 1.31it/s]
Training 1/3 epoch (loss 1.5766): 6%|β | 117/2046 [01:28<23:34, 1.36it/s]
Training 1/3 epoch (loss 1.5908): 6%|β | 117/2046 [01:28<23:34, 1.36it/s]
Training 1/3 epoch (loss 1.5908): 6%|β | 118/2046 [01:28<22:34, 1.42it/s]
Training 1/3 epoch (loss 1.7120): 6%|β | 118/2046 [01:29<22:34, 1.42it/s]
Training 1/3 epoch (loss 1.7120): 6%|β | 119/2046 [01:29<21:49, 1.47it/s]
Training 1/3 epoch (loss 1.6359): 6%|β | 119/2046 [01:30<21:49, 1.47it/s]
Training 1/3 epoch (loss 1.6359): 6%|β | 120/2046 [01:30<22:50, 1.41it/s]
Training 1/3 epoch (loss 1.4749): 6%|β | 120/2046 [01:30<22:50, 1.41it/s]
Training 1/3 epoch (loss 1.4749): 6%|β | 121/2046 [01:30<22:17, 1.44it/s]
Training 1/3 epoch (loss 1.3340): 6%|β | 121/2046 [01:31<22:17, 1.44it/s]
Training 1/3 epoch (loss 1.3340): 6%|β | 122/2046 [01:31<24:04, 1.33it/s]
Training 1/3 epoch (loss 1.7749): 6%|β | 122/2046 [01:32<24:04, 1.33it/s]
Training 1/3 epoch (loss 1.7749): 6%|β | 123/2046 [01:32<23:12, 1.38it/s]
Training 1/3 epoch (loss 1.5335): 6%|β | 123/2046 [01:33<23:12, 1.38it/s]
Training 1/3 epoch (loss 1.5335): 6%|β | 124/2046 [01:33<22:16, 1.44it/s]
Training 1/3 epoch (loss 1.4331): 6%|β | 124/2046 [01:33<22:16, 1.44it/s]
Training 1/3 epoch (loss 1.4331): 6%|β | 125/2046 [01:33<21:44, 1.47it/s]
Training 1/3 epoch (loss 1.5367): 6%|β | 125/2046 [01:34<21:44, 1.47it/s]
Training 1/3 epoch (loss 1.5367): 6%|β | 126/2046 [01:34<21:56, 1.46it/s]
Training 1/3 epoch (loss 1.5226): 6%|β | 126/2046 [01:35<21:56, 1.46it/s]
Training 1/3 epoch (loss 1.5226): 6%|β | 127/2046 [01:35<21:54, 1.46it/s]
Training 1/3 epoch (loss 1.4910): 6%|β | 127/2046 [01:36<21:54, 1.46it/s]
Training 1/3 epoch (loss 1.4910): 6%|β | 128/2046 [01:36<23:30, 1.36it/s]
Training 1/3 epoch (loss 1.7307): 6%|β | 128/2046 [01:36<23:30, 1.36it/s]
Training 1/3 epoch (loss 1.7307): 6%|β | 129/2046 [01:36<23:05, 1.38it/s]
Training 1/3 epoch (loss 1.3358): 6%|β | 129/2046 [01:37<23:05, 1.38it/s]
Training 1/3 epoch (loss 1.3358): 6%|β | 130/2046 [01:37<21:56, 1.45it/s]
Training 1/3 epoch (loss 1.5300): 6%|β | 130/2046 [01:38<21:56, 1.45it/s]
Training 1/3 epoch (loss 1.5300): 6%|β | 131/2046 [01:38<21:50, 1.46it/s]
Training 1/3 epoch (loss 1.3557): 6%|β | 131/2046 [01:38<21:50, 1.46it/s]
Training 1/3 epoch (loss 1.3557): 6%|β | 132/2046 [01:38<22:16, 1.43it/s]
Training 1/3 epoch (loss 1.3808): 6%|β | 132/2046 [01:39<22:16, 1.43it/s]
Training 1/3 epoch (loss 1.3808): 7%|β | 133/2046 [01:39<21:43, 1.47it/s]
Training 1/3 epoch (loss 1.3888): 7%|β | 133/2046 [01:40<21:43, 1.47it/s]
Training 1/3 epoch (loss 1.3888): 7%|β | 134/2046 [01:40<21:17, 1.50it/s]
Training 1/3 epoch (loss 1.4713): 7%|β | 134/2046 [01:40<21:17, 1.50it/s]
Training 1/3 epoch (loss 1.4713): 7%|β | 135/2046 [01:40<21:00, 1.52it/s]
Training 1/3 epoch (loss 1.4416): 7%|β | 135/2046 [01:41<21:00, 1.52it/s]
Training 1/3 epoch (loss 1.4416): 7%|β | 136/2046 [01:41<22:41, 1.40it/s]
Training 1/3 epoch (loss 1.5809): 7%|β | 136/2046 [01:42<22:41, 1.40it/s]
Training 1/3 epoch (loss 1.5809): 7%|β | 137/2046 [01:42<23:28, 1.36it/s]
Training 1/3 epoch (loss 1.5188): 7%|β | 137/2046 [01:42<23:28, 1.36it/s]
Training 1/3 epoch (loss 1.5188): 7%|β | 138/2046 [01:42<22:22, 1.42it/s]
Training 1/3 epoch (loss 1.5678): 7%|β | 138/2046 [01:43<22:22, 1.42it/s]
Training 1/3 epoch (loss 1.5678): 7%|β | 139/2046 [01:43<23:26, 1.36it/s]
Training 1/3 epoch (loss 1.5842): 7%|β | 139/2046 [01:44<23:26, 1.36it/s]
Training 1/3 epoch (loss 1.5842): 7%|β | 140/2046 [01:44<22:50, 1.39it/s]
Training 1/3 epoch (loss 1.4174): 7%|β | 140/2046 [01:45<22:50, 1.39it/s]
Training 1/3 epoch (loss 1.4174): 7%|β | 141/2046 [01:45<22:37, 1.40it/s]
Training 1/3 epoch (loss 1.5955): 7%|β | 141/2046 [01:45<22:37, 1.40it/s]
Training 1/3 epoch (loss 1.5955): 7%|β | 142/2046 [01:45<22:28, 1.41it/s]
Training 1/3 epoch (loss 1.5384): 7%|β | 142/2046 [01:46<22:28, 1.41it/s]
Training 1/3 epoch (loss 1.5384): 7%|β | 143/2046 [01:46<21:52, 1.45it/s]
Training 1/3 epoch (loss 1.3263): 7%|β | 143/2046 [01:47<21:52, 1.45it/s]
Training 1/3 epoch (loss 1.3263): 7%|β | 144/2046 [01:47<22:23, 1.42it/s]
Training 1/3 epoch (loss 1.6251): 7%|β | 144/2046 [01:47<22:23, 1.42it/s]
Training 1/3 epoch (loss 1.6251): 7%|β | 145/2046 [01:47<22:05, 1.43it/s]
Training 1/3 epoch (loss 1.4112): 7%|β | 145/2046 [01:48<22:05, 1.43it/s]
Training 1/3 epoch (loss 1.4112): 7%|β | 146/2046 [01:48<22:16, 1.42it/s]
Training 1/3 epoch (loss 1.0504): 7%|β | 146/2046 [01:49<22:16, 1.42it/s]
Training 1/3 epoch (loss 1.0504): 7%|β | 147/2046 [01:49<21:28, 1.47it/s]
Training 1/3 epoch (loss 1.5499): 7%|β | 147/2046 [01:50<21:28, 1.47it/s]
Training 1/3 epoch (loss 1.5499): 7%|β | 148/2046 [01:50<23:26, 1.35it/s]
Training 1/3 epoch (loss 1.3259): 7%|β | 148/2046 [01:50<23:26, 1.35it/s]
Training 1/3 epoch (loss 1.3259): 7%|β | 149/2046 [01:50<24:04, 1.31it/s]
Training 1/3 epoch (loss 1.3605): 7%|β | 149/2046 [01:51<24:04, 1.31it/s]
Training 1/3 epoch (loss 1.3605): 7%|β | 150/2046 [01:51<22:40, 1.39it/s]
Training 1/3 epoch (loss 1.5252): 7%|β | 150/2046 [01:52<22:40, 1.39it/s]
Training 1/3 epoch (loss 1.5252): 7%|β | 151/2046 [01:52<22:23, 1.41it/s]
Training 1/3 epoch (loss 1.5486): 7%|β | 151/2046 [01:53<22:23, 1.41it/s]
Training 1/3 epoch (loss 1.5486): 7%|β | 152/2046 [01:53<23:21, 1.35it/s]
Training 1/3 epoch (loss 1.1668): 7%|β | 152/2046 [01:53<23:21, 1.35it/s]
Training 1/3 epoch (loss 1.1668): 7%|β | 153/2046 [01:53<23:41, 1.33it/s]
Training 1/3 epoch (loss 1.4318): 7%|β | 153/2046 [01:54<23:41, 1.33it/s]
Training 1/3 epoch (loss 1.4318): 8%|β | 154/2046 [01:54<25:54, 1.22it/s]
Training 1/3 epoch (loss 1.3173): 8%|β | 154/2046 [01:55<25:54, 1.22it/s]
Training 1/3 epoch (loss 1.3173): 8%|β | 155/2046 [01:55<25:18, 1.25it/s]
Training 1/3 epoch (loss 1.5000): 8%|β | 155/2046 [01:56<25:18, 1.25it/s]
Training 1/3 epoch (loss 1.5000): 8%|β | 156/2046 [01:56<25:42, 1.22it/s]
Training 1/3 epoch (loss 1.4903): 8%|β | 156/2046 [01:57<25:42, 1.22it/s]
Training 1/3 epoch (loss 1.4903): 8%|β | 157/2046 [01:57<24:29, 1.29it/s]
Training 1/3 epoch (loss 1.3344): 8%|β | 157/2046 [01:57<24:29, 1.29it/s]
Training 1/3 epoch (loss 1.3344): 8%|β | 158/2046 [01:57<23:23, 1.35it/s]
Training 1/3 epoch (loss 1.5798): 8%|β | 158/2046 [01:58<23:23, 1.35it/s]
Training 1/3 epoch (loss 1.5798): 8%|β | 159/2046 [01:58<22:20, 1.41it/s]
Training 1/3 epoch (loss 1.2561): 8%|β | 159/2046 [01:59<22:20, 1.41it/s]
Training 1/3 epoch (loss 1.2561): 8%|β | 160/2046 [01:59<24:12, 1.30it/s]
Training 1/3 epoch (loss 1.4311): 8%|β | 160/2046 [01:59<24:12, 1.30it/s]
Training 1/3 epoch (loss 1.4311): 8%|β | 161/2046 [01:59<23:05, 1.36it/s]
Training 1/3 epoch (loss 1.3758): 8%|β | 161/2046 [02:00<23:05, 1.36it/s]
Training 1/3 epoch (loss 1.3758): 8%|β | 162/2046 [02:00<22:03, 1.42it/s]
Training 1/3 epoch (loss 1.4214): 8%|β | 162/2046 [02:01<22:03, 1.42it/s]
Training 1/3 epoch (loss 1.4214): 8%|β | 163/2046 [02:01<24:09, 1.30it/s]
Training 1/3 epoch (loss 1.5814): 8%|β | 163/2046 [02:02<24:09, 1.30it/s]
Training 1/3 epoch (loss 1.5814): 8%|β | 164/2046 [02:02<23:27, 1.34it/s]
Training 1/3 epoch (loss 1.3266): 8%|β | 164/2046 [02:02<23:27, 1.34it/s]
Training 1/3 epoch (loss 1.3266): 8%|β | 165/2046 [02:02<22:14, 1.41it/s]
Training 1/3 epoch (loss 1.4413): 8%|β | 165/2046 [02:03<22:14, 1.41it/s]
Training 1/3 epoch (loss 1.4413): 8%|β | 166/2046 [02:03<21:19, 1.47it/s]
Training 1/3 epoch (loss 1.3155): 8%|β | 166/2046 [02:04<21:19, 1.47it/s]
Training 1/3 epoch (loss 1.3155): 8%|β | 167/2046 [02:04<22:11, 1.41it/s]
Training 1/3 epoch (loss 1.3063): 8%|β | 167/2046 [02:05<22:11, 1.41it/s]
Training 1/3 epoch (loss 1.3063): 8%|β | 168/2046 [02:05<23:02, 1.36it/s]
Training 1/3 epoch (loss 1.3395): 8%|β | 168/2046 [02:05<23:02, 1.36it/s]
Training 1/3 epoch (loss 1.3395): 8%|β | 169/2046 [02:05<23:31, 1.33it/s]
Training 1/3 epoch (loss 1.3829): 8%|β | 169/2046 [02:06<23:31, 1.33it/s]
Training 1/3 epoch (loss 1.3829): 8%|β | 170/2046 [02:06<23:21, 1.34it/s]
Training 1/3 epoch (loss 1.4747): 8%|β | 170/2046 [02:07<23:21, 1.34it/s]
Training 1/3 epoch (loss 1.4747): 8%|β | 171/2046 [02:07<22:08, 1.41it/s]
Training 1/3 epoch (loss 1.1424): 8%|β | 171/2046 [02:07<22:08, 1.41it/s]
Training 1/3 epoch (loss 1.1424): 8%|β | 172/2046 [02:07<21:33, 1.45it/s]
Training 1/3 epoch (loss 1.5238): 8%|β | 172/2046 [02:08<21:33, 1.45it/s]
Training 1/3 epoch (loss 1.5238): 8%|β | 173/2046 [02:08<21:47, 1.43it/s]
Training 1/3 epoch (loss 1.3076): 8%|β | 173/2046 [02:09<21:47, 1.43it/s]
Training 1/3 epoch (loss 1.3076): 9%|β | 174/2046 [02:09<21:09, 1.48it/s]
Training 1/3 epoch (loss 1.4967): 9%|β | 174/2046 [02:09<21:09, 1.48it/s]
Training 1/3 epoch (loss 1.4967): 9%|β | 175/2046 [02:09<20:56, 1.49it/s]
Training 1/3 epoch (loss 1.4796): 9%|β | 175/2046 [02:10<20:56, 1.49it/s]
Training 1/3 epoch (loss 1.4796): 9%|β | 176/2046 [02:10<21:46, 1.43it/s]
Training 1/3 epoch (loss 1.3885): 9%|β | 176/2046 [02:11<21:46, 1.43it/s]
Training 1/3 epoch (loss 1.3885): 9%|β | 177/2046 [02:11<21:10, 1.47it/s]
Training 1/3 epoch (loss 1.2703): 9%|β | 177/2046 [02:11<21:10, 1.47it/s]
Training 1/3 epoch (loss 1.2703): 9%|β | 178/2046 [02:11<20:41, 1.50it/s]
Training 1/3 epoch (loss 1.3816): 9%|β | 178/2046 [02:12<20:41, 1.50it/s]
Training 1/3 epoch (loss 1.3816): 9%|β | 179/2046 [02:12<21:05, 1.48it/s]
Training 1/3 epoch (loss 1.5188): 9%|β | 179/2046 [02:13<21:05, 1.48it/s]
Training 1/3 epoch (loss 1.5188): 9%|β | 180/2046 [02:13<21:02, 1.48it/s]
Training 1/3 epoch (loss 1.3911): 9%|β | 180/2046 [02:13<21:02, 1.48it/s]
Training 1/3 epoch (loss 1.3911): 9%|β | 181/2046 [02:13<20:42, 1.50it/s]
Training 1/3 epoch (loss 1.4642): 9%|β | 181/2046 [02:14<20:42, 1.50it/s]
Training 1/3 epoch (loss 1.4642): 9%|β | 182/2046 [02:14<20:30, 1.51it/s]
Training 1/3 epoch (loss 1.4255): 9%|β | 182/2046 [02:15<20:30, 1.51it/s]
Training 1/3 epoch (loss 1.4255): 9%|β | 183/2046 [02:15<21:03, 1.47it/s]
Training 1/3 epoch (loss 1.6850): 9%|β | 183/2046 [02:16<21:03, 1.47it/s]
Training 1/3 epoch (loss 1.6850): 9%|β | 184/2046 [02:16<23:07, 1.34it/s]
Training 1/3 epoch (loss 1.4697): 9%|β | 184/2046 [02:16<23:07, 1.34it/s]
Training 1/3 epoch (loss 1.4697): 9%|β | 185/2046 [02:16<22:03, 1.41it/s]
Training 1/3 epoch (loss 1.5004): 9%|β | 185/2046 [02:17<22:03, 1.41it/s]
Training 1/3 epoch (loss 1.5004): 9%|β | 186/2046 [02:17<21:07, 1.47it/s]
Training 1/3 epoch (loss 1.2026): 9%|β | 186/2046 [02:18<21:07, 1.47it/s]
Training 1/3 epoch (loss 1.2026): 9%|β | 187/2046 [02:18<20:46, 1.49it/s]
Training 1/3 epoch (loss 1.3547): 9%|β | 187/2046 [02:18<20:46, 1.49it/s]
Training 1/3 epoch (loss 1.3547): 9%|β | 188/2046 [02:18<22:41, 1.36it/s]
Training 1/3 epoch (loss 1.5992): 9%|β | 188/2046 [02:19<22:41, 1.36it/s]
Training 1/3 epoch (loss 1.5992): 9%|β | 189/2046 [02:19<22:45, 1.36it/s]
Training 1/3 epoch (loss 1.2417): 9%|β | 189/2046 [02:20<22:45, 1.36it/s]
Training 1/3 epoch (loss 1.2417): 9%|β | 190/2046 [02:20<24:57, 1.24it/s]
Training 1/3 epoch (loss 1.2647): 9%|β | 190/2046 [02:21<24:57, 1.24it/s]
Training 1/3 epoch (loss 1.2647): 9%|β | 191/2046 [02:21<23:17, 1.33it/s]
Training 1/3 epoch (loss 1.3083): 9%|β | 191/2046 [02:21<23:17, 1.33it/s]
Training 1/3 epoch (loss 1.3083): 9%|β | 192/2046 [02:21<23:16, 1.33it/s]
Training 1/3 epoch (loss 1.5443): 9%|β | 192/2046 [02:22<23:16, 1.33it/s]
Training 1/3 epoch (loss 1.5443): 9%|β | 193/2046 [02:22<22:56, 1.35it/s]
Training 1/3 epoch (loss 1.4399): 9%|β | 193/2046 [02:23<22:56, 1.35it/s]
Training 1/3 epoch (loss 1.4399): 9%|β | 194/2046 [02:23<24:15, 1.27it/s]
Training 1/3 epoch (loss 1.7163): 9%|β | 194/2046 [02:24<24:15, 1.27it/s]
Training 1/3 epoch (loss 1.7163): 10%|β | 195/2046 [02:24<23:39, 1.30it/s]
Training 1/3 epoch (loss 1.4852): 10%|β | 195/2046 [02:24<23:39, 1.30it/s]
Training 1/3 epoch (loss 1.4852): 10%|β | 196/2046 [02:24<22:11, 1.39it/s]
Training 1/3 epoch (loss 1.4968): 10%|β | 196/2046 [02:25<22:11, 1.39it/s]
Training 1/3 epoch (loss 1.4968): 10%|β | 197/2046 [02:25<21:30, 1.43it/s]
Training 1/3 epoch (loss 1.5864): 10%|β | 197/2046 [02:26<21:30, 1.43it/s]
Training 1/3 epoch (loss 1.5864): 10%|β | 198/2046 [02:26<20:42, 1.49it/s]
Training 1/3 epoch (loss 1.4679): 10%|β | 198/2046 [02:27<20:42, 1.49it/s]
Training 1/3 epoch (loss 1.4679): 10%|β | 199/2046 [02:27<22:18, 1.38it/s]
Training 1/3 epoch (loss 1.4650): 10%|β | 199/2046 [02:27<22:18, 1.38it/s]
Training 1/3 epoch (loss 1.4650): 10%|β | 200/2046 [02:27<22:44, 1.35it/s]
Training 1/3 epoch (loss 1.3490): 10%|β | 200/2046 [02:28<22:44, 1.35it/s]
Training 1/3 epoch (loss 1.3490): 10%|β | 201/2046 [02:28<22:22, 1.37it/s]
Training 1/3 epoch (loss 1.4909): 10%|β | 201/2046 [02:29<22:22, 1.37it/s]
Training 1/3 epoch (loss 1.4909): 10%|β | 202/2046 [02:29<24:01, 1.28it/s]
Training 1/3 epoch (loss 1.3978): 10%|β | 202/2046 [02:30<24:01, 1.28it/s]
Training 1/3 epoch (loss 1.3978): 10%|β | 203/2046 [02:30<22:28, 1.37it/s]
Training 1/3 epoch (loss 1.3879): 10%|β | 203/2046 [02:30<22:28, 1.37it/s]
Training 1/3 epoch (loss 1.3879): 10%|β | 204/2046 [02:30<21:59, 1.40it/s]
Training 1/3 epoch (loss 1.3053): 10%|β | 204/2046 [02:31<21:59, 1.40it/s]
Training 1/3 epoch (loss 1.3053): 10%|β | 205/2046 [02:31<22:12, 1.38it/s]
Training 1/3 epoch (loss 1.5173): 10%|β | 205/2046 [02:32<22:12, 1.38it/s]
Training 1/3 epoch (loss 1.5173): 10%|β | 206/2046 [02:32<21:48, 1.41it/s]
Training 1/3 epoch (loss 1.4618): 10%|β | 206/2046 [02:32<21:48, 1.41it/s]
Training 1/3 epoch (loss 1.4618): 10%|β | 207/2046 [02:32<21:18, 1.44it/s]
Training 1/3 epoch (loss 1.2698): 10%|β | 207/2046 [02:33<21:18, 1.44it/s]
Training 1/3 epoch (loss 1.2698): 10%|β | 208/2046 [02:33<23:33, 1.30it/s]
Training 1/3 epoch (loss 1.4725): 10%|β | 208/2046 [02:34<23:33, 1.30it/s]
Training 1/3 epoch (loss 1.4725): 10%|β | 209/2046 [02:34<23:04, 1.33it/s]
Training 1/3 epoch (loss 1.2364): 10%|β | 209/2046 [02:35<23:04, 1.33it/s]
Training 1/3 epoch (loss 1.2364): 10%|β | 210/2046 [02:35<23:04, 1.33it/s]
Training 1/3 epoch (loss 1.3489): 10%|β | 210/2046 [02:35<23:04, 1.33it/s]
Training 1/3 epoch (loss 1.3489): 10%|β | 211/2046 [02:35<23:29, 1.30it/s]
Training 1/3 epoch (loss 1.5059): 10%|β | 211/2046 [02:36<23:29, 1.30it/s]
Training 1/3 epoch (loss 1.5059): 10%|β | 212/2046 [02:36<22:47, 1.34it/s]
Training 1/3 epoch (loss 1.4334): 10%|β | 212/2046 [02:37<22:47, 1.34it/s]
Training 1/3 epoch (loss 1.4334): 10%|β | 213/2046 [02:37<21:36, 1.41it/s]
Training 1/3 epoch (loss 1.2644): 10%|β | 213/2046 [02:37<21:36, 1.41it/s]
Training 1/3 epoch (loss 1.2644): 10%|β | 214/2046 [02:37<20:43, 1.47it/s]
Training 1/3 epoch (loss 1.4753): 10%|β | 214/2046 [02:38<20:43, 1.47it/s]
Training 1/3 epoch (loss 1.4753): 11%|β | 215/2046 [02:38<21:42, 1.41it/s]
Training 1/3 epoch (loss 1.3028): 11%|β | 215/2046 [02:39<21:42, 1.41it/s]
Training 1/3 epoch (loss 1.3028): 11%|β | 216/2046 [02:39<22:03, 1.38it/s]
Training 1/3 epoch (loss 1.3925): 11%|β | 216/2046 [02:40<22:03, 1.38it/s]
Training 1/3 epoch (loss 1.3925): 11%|β | 217/2046 [02:40<21:04, 1.45it/s]
Training 1/3 epoch (loss 1.3674): 11%|β | 217/2046 [02:40<21:04, 1.45it/s]
Training 1/3 epoch (loss 1.3674): 11%|β | 218/2046 [02:40<20:59, 1.45it/s]
Training 1/3 epoch (loss 1.2668): 11%|β | 218/2046 [02:41<20:59, 1.45it/s]
Training 1/3 epoch (loss 1.2668): 11%|β | 219/2046 [02:41<20:23, 1.49it/s]
Training 1/3 epoch (loss 1.1741): 11%|β | 219/2046 [02:42<20:23, 1.49it/s]
Training 1/3 epoch (loss 1.1741): 11%|β | 220/2046 [02:42<20:09, 1.51it/s]
Training 1/3 epoch (loss 1.3965): 11%|β | 220/2046 [02:42<20:09, 1.51it/s]
Training 1/3 epoch (loss 1.3965): 11%|β | 221/2046 [02:42<20:20, 1.49it/s]
Training 1/3 epoch (loss 1.4373): 11%|β | 221/2046 [02:43<20:20, 1.49it/s]
Training 1/3 epoch (loss 1.4373): 11%|β | 222/2046 [02:43<20:24, 1.49it/s]
Training 1/3 epoch (loss 1.5393): 11%|β | 222/2046 [02:44<20:24, 1.49it/s]
Training 1/3 epoch (loss 1.5393): 11%|β | 223/2046 [02:44<20:19, 1.49it/s]
Training 1/3 epoch (loss 1.3118): 11%|β | 223/2046 [02:44<20:19, 1.49it/s]
Training 1/3 epoch (loss 1.3118): 11%|β | 224/2046 [02:44<21:59, 1.38it/s]
Training 1/3 epoch (loss 1.4575): 11%|β | 224/2046 [02:45<21:59, 1.38it/s]
Training 1/3 epoch (loss 1.4575): 11%|β | 225/2046 [02:45<21:30, 1.41it/s]
Training 1/3 epoch (loss 1.4277): 11%|β | 225/2046 [02:46<21:30, 1.41it/s]
Training 1/3 epoch (loss 1.4277): 11%|β | 226/2046 [02:46<20:38, 1.47it/s]
Training 1/3 epoch (loss 1.3808): 11%|β | 226/2046 [02:46<20:38, 1.47it/s]
Training 1/3 epoch (loss 1.3808): 11%|β | 227/2046 [02:46<20:12, 1.50it/s]
Training 1/3 epoch (loss 1.4045): 11%|β | 227/2046 [02:47<20:12, 1.50it/s]
Training 1/3 epoch (loss 1.4045): 11%|β | 228/2046 [02:47<19:41, 1.54it/s]
Training 1/3 epoch (loss 1.5163): 11%|β | 228/2046 [02:48<19:41, 1.54it/s]
Training 1/3 epoch (loss 1.5163): 11%|β | 229/2046 [02:48<19:17, 1.57it/s]
Training 1/3 epoch (loss 1.4929): 11%|β | 229/2046 [02:48<19:17, 1.57it/s]
Training 1/3 epoch (loss 1.4929): 11%|β | 230/2046 [02:48<20:22, 1.49it/s]
Training 1/3 epoch (loss 1.3176): 11%|β | 230/2046 [02:49<20:22, 1.49it/s]
Training 1/3 epoch (loss 1.3176): 11%|ββ | 231/2046 [02:49<21:32, 1.40it/s]
Training 1/3 epoch (loss 1.3424): 11%|ββ | 231/2046 [02:50<21:32, 1.40it/s]
Training 1/3 epoch (loss 1.3424): 11%|ββ | 232/2046 [02:50<22:17, 1.36it/s]
Training 1/3 epoch (loss 1.5590): 11%|ββ | 232/2046 [02:51<22:17, 1.36it/s]
Training 1/3 epoch (loss 1.5590): 11%|ββ | 233/2046 [02:51<22:01, 1.37it/s]
Training 1/3 epoch (loss 1.2534): 11%|ββ | 233/2046 [02:51<22:01, 1.37it/s]
Training 1/3 epoch (loss 1.2534): 11%|ββ | 234/2046 [02:51<20:58, 1.44it/s]
Training 1/3 epoch (loss 1.3382): 11%|ββ | 234/2046 [02:52<20:58, 1.44it/s]
Training 1/3 epoch (loss 1.3382): 11%|ββ | 235/2046 [02:52<20:14, 1.49it/s]
Training 1/3 epoch (loss 1.3808): 11%|ββ | 235/2046 [02:53<20:14, 1.49it/s]
Training 1/3 epoch (loss 1.3808): 12%|ββ | 236/2046 [02:53<21:06, 1.43it/s]
Training 1/3 epoch (loss 1.3804): 12%|ββ | 236/2046 [02:53<21:06, 1.43it/s]
Training 1/3 epoch (loss 1.3804): 12%|ββ | 237/2046 [02:53<22:43, 1.33it/s]
Training 1/3 epoch (loss 1.4012): 12%|ββ | 237/2046 [02:54<22:43, 1.33it/s]
Training 1/3 epoch (loss 1.4012): 12%|ββ | 238/2046 [02:54<23:13, 1.30it/s]
Training 1/3 epoch (loss 1.5018): 12%|ββ | 238/2046 [02:55<23:13, 1.30it/s]
Training 1/3 epoch (loss 1.5018): 12%|ββ | 239/2046 [02:55<23:20, 1.29it/s]
Training 1/3 epoch (loss 1.3556): 12%|ββ | 239/2046 [02:56<23:20, 1.29it/s]
Training 1/3 epoch (loss 1.3556): 12%|ββ | 240/2046 [02:56<23:17, 1.29it/s]
Training 1/3 epoch (loss 1.4416): 12%|ββ | 240/2046 [02:57<23:17, 1.29it/s]
Training 1/3 epoch (loss 1.4416): 12%|ββ | 241/2046 [02:57<22:21, 1.35it/s]
Training 1/3 epoch (loss 1.3613): 12%|ββ | 241/2046 [02:57<22:21, 1.35it/s]
Training 1/3 epoch (loss 1.3613): 12%|ββ | 242/2046 [02:57<21:17, 1.41it/s]
Training 1/3 epoch (loss 1.3293): 12%|ββ | 242/2046 [02:58<21:17, 1.41it/s]
Training 1/3 epoch (loss 1.3293): 12%|ββ | 243/2046 [02:58<20:38, 1.46it/s]
Training 1/3 epoch (loss 1.4934): 12%|ββ | 243/2046 [02:58<20:38, 1.46it/s]
Training 1/3 epoch (loss 1.4934): 12%|ββ | 244/2046 [02:58<20:07, 1.49it/s]
Training 1/3 epoch (loss 1.4480): 12%|ββ | 244/2046 [02:59<20:07, 1.49it/s]
Training 1/3 epoch (loss 1.4480): 12%|ββ | 245/2046 [02:59<19:32, 1.54it/s]
Training 1/3 epoch (loss 1.3208): 12%|ββ | 245/2046 [03:00<19:32, 1.54it/s]
Training 1/3 epoch (loss 1.3208): 12%|ββ | 246/2046 [03:00<20:16, 1.48it/s]
Training 1/3 epoch (loss 1.3165): 12%|ββ | 246/2046 [03:00<20:16, 1.48it/s]
Training 1/3 epoch (loss 1.3165): 12%|ββ | 247/2046 [03:00<20:50, 1.44it/s]
Training 1/3 epoch (loss 1.4752): 12%|ββ | 247/2046 [03:01<20:50, 1.44it/s]
Training 1/3 epoch (loss 1.4752): 12%|ββ | 248/2046 [03:01<22:13, 1.35it/s]
Training 1/3 epoch (loss 1.1899): 12%|ββ | 248/2046 [03:02<22:13, 1.35it/s]
Training 1/3 epoch (loss 1.1899): 12%|ββ | 249/2046 [03:02<21:17, 1.41it/s]
Training 1/3 epoch (loss 1.1815): 12%|ββ | 249/2046 [03:03<21:17, 1.41it/s]
Training 1/3 epoch (loss 1.1815): 12%|ββ | 250/2046 [03:03<21:24, 1.40it/s]
Training 1/3 epoch (loss 1.4894): 12%|ββ | 250/2046 [03:03<21:24, 1.40it/s]
Training 1/3 epoch (loss 1.4894): 12%|ββ | 251/2046 [03:03<21:21, 1.40it/s]
Training 1/3 epoch (loss 1.3250): 12%|ββ | 251/2046 [03:04<21:21, 1.40it/s]
Training 1/3 epoch (loss 1.3250): 12%|ββ | 252/2046 [03:04<21:33, 1.39it/s]
Training 1/3 epoch (loss 1.4761): 12%|ββ | 252/2046 [03:05<21:33, 1.39it/s]
Training 1/3 epoch (loss 1.4761): 12%|ββ | 253/2046 [03:05<22:31, 1.33it/s]
Training 1/3 epoch (loss 1.1840): 12%|ββ | 253/2046 [03:06<22:31, 1.33it/s]
Training 1/3 epoch (loss 1.1840): 12%|ββ | 254/2046 [03:06<22:55, 1.30it/s]
Training 1/3 epoch (loss 1.2828): 12%|ββ | 254/2046 [03:06<22:55, 1.30it/s]
Training 1/3 epoch (loss 1.2828): 12%|ββ | 255/2046 [03:06<21:50, 1.37it/s]
Training 1/3 epoch (loss 1.5240): 12%|ββ | 255/2046 [03:07<21:50, 1.37it/s]
Training 1/3 epoch (loss 1.5240): 13%|ββ | 256/2046 [03:07<22:19, 1.34it/s]
Training 1/3 epoch (loss 1.5387): 13%|ββ | 256/2046 [03:08<22:19, 1.34it/s]
Training 1/3 epoch (loss 1.5387): 13%|ββ | 257/2046 [03:08<24:09, 1.23it/s]
Training 1/3 epoch (loss 1.3839): 13%|ββ | 257/2046 [03:09<24:09, 1.23it/s]
Training 1/3 epoch (loss 1.3839): 13%|ββ | 258/2046 [03:09<23:04, 1.29it/s]
Training 1/3 epoch (loss 1.2123): 13%|ββ | 258/2046 [03:10<23:04, 1.29it/s]
Training 1/3 epoch (loss 1.2123): 13%|ββ | 259/2046 [03:10<22:00, 1.35it/s]
Training 1/3 epoch (loss 1.5085): 13%|ββ | 259/2046 [03:10<22:00, 1.35it/s]
Training 1/3 epoch (loss 1.5085): 13%|ββ | 260/2046 [03:10<21:18, 1.40it/s]
Training 1/3 epoch (loss 1.2938): 13%|ββ | 260/2046 [03:11<21:18, 1.40it/s]
Training 1/3 epoch (loss 1.2938): 13%|ββ | 261/2046 [03:11<20:25, 1.46it/s]
Training 1/3 epoch (loss 1.3105): 13%|ββ | 261/2046 [03:11<20:25, 1.46it/s]
Training 1/3 epoch (loss 1.3105): 13%|ββ | 262/2046 [03:11<19:58, 1.49it/s]
Training 1/3 epoch (loss 1.3323): 13%|ββ | 262/2046 [03:12<19:58, 1.49it/s]
Training 1/3 epoch (loss 1.3323): 13%|ββ | 263/2046 [03:12<19:52, 1.50it/s]
Training 1/3 epoch (loss 1.5146): 13%|ββ | 263/2046 [03:13<19:52, 1.50it/s]
Training 1/3 epoch (loss 1.5146): 13%|ββ | 264/2046 [03:13<20:32, 1.45it/s]
Training 1/3 epoch (loss 1.3531): 13%|ββ | 264/2046 [03:13<20:32, 1.45it/s]
Training 1/3 epoch (loss 1.3531): 13%|ββ | 265/2046 [03:13<19:51, 1.49it/s]
Training 1/3 epoch (loss 1.2726): 13%|ββ | 265/2046 [03:14<19:51, 1.49it/s]
Training 1/3 epoch (loss 1.2726): 13%|ββ | 266/2046 [03:14<20:07, 1.47it/s]
Training 1/3 epoch (loss 1.2886): 13%|ββ | 266/2046 [03:15<20:07, 1.47it/s]
Training 1/3 epoch (loss 1.2886): 13%|ββ | 267/2046 [03:15<19:42, 1.51it/s]
Training 1/3 epoch (loss 1.3412): 13%|ββ | 267/2046 [03:16<19:42, 1.51it/s]
Training 1/3 epoch (loss 1.3412): 13%|ββ | 268/2046 [03:16<20:48, 1.42it/s]
Training 1/3 epoch (loss 1.4051): 13%|ββ | 268/2046 [03:16<20:48, 1.42it/s]
Training 1/3 epoch (loss 1.4051): 13%|ββ | 269/2046 [03:16<19:56, 1.49it/s]
Training 1/3 epoch (loss 1.1556): 13%|ββ | 269/2046 [03:17<19:56, 1.49it/s]
Training 1/3 epoch (loss 1.1556): 13%|ββ | 270/2046 [03:17<20:17, 1.46it/s]
Training 1/3 epoch (loss 1.5297): 13%|ββ | 270/2046 [03:18<20:17, 1.46it/s]
Training 1/3 epoch (loss 1.5297): 13%|ββ | 271/2046 [03:18<21:04, 1.40it/s]
Training 1/3 epoch (loss 1.3867): 13%|ββ | 271/2046 [03:19<21:04, 1.40it/s]
Training 1/3 epoch (loss 1.3867): 13%|ββ | 272/2046 [03:19<22:27, 1.32it/s]
Training 1/3 epoch (loss 1.5779): 13%|ββ | 272/2046 [03:19<22:27, 1.32it/s]
Training 1/3 epoch (loss 1.5779): 13%|ββ | 273/2046 [03:19<22:39, 1.30it/s]
Training 1/3 epoch (loss 1.5891): 13%|ββ | 273/2046 [03:20<22:39, 1.30it/s]
Training 1/3 epoch (loss 1.5891): 13%|ββ | 274/2046 [03:20<21:52, 1.35it/s]
Training 1/3 epoch (loss 1.2218): 13%|ββ | 274/2046 [03:21<21:52, 1.35it/s]
Training 1/3 epoch (loss 1.2218): 13%|ββ | 275/2046 [03:21<21:00, 1.40it/s]
Training 1/3 epoch (loss 1.3062): 13%|ββ | 275/2046 [03:21<21:00, 1.40it/s]
Training 1/3 epoch (loss 1.3062): 13%|ββ | 276/2046 [03:21<20:17, 1.45it/s]
Training 1/3 epoch (loss 1.5834): 13%|ββ | 276/2046 [03:22<20:17, 1.45it/s]
Training 1/3 epoch (loss 1.5834): 14%|ββ | 277/2046 [03:22<19:56, 1.48it/s]
Training 1/3 epoch (loss 1.1943): 14%|ββ | 277/2046 [03:23<19:56, 1.48it/s]
Training 1/3 epoch (loss 1.1943): 14%|ββ | 278/2046 [03:23<21:31, 1.37it/s]
Training 1/3 epoch (loss 1.3029): 14%|ββ | 278/2046 [03:24<21:31, 1.37it/s]
Training 1/3 epoch (loss 1.3029): 14%|ββ | 279/2046 [03:24<21:14, 1.39it/s]
Training 1/3 epoch (loss 1.3493): 14%|ββ | 279/2046 [03:24<21:14, 1.39it/s]
Training 1/3 epoch (loss 1.3493): 14%|ββ | 280/2046 [03:24<22:47, 1.29it/s]
Training 1/3 epoch (loss 1.3884): 14%|ββ | 280/2046 [03:25<22:47, 1.29it/s]
Training 1/3 epoch (loss 1.3884): 14%|ββ | 281/2046 [03:25<21:20, 1.38it/s]
Training 1/3 epoch (loss 1.6855): 14%|ββ | 281/2046 [03:26<21:20, 1.38it/s]
Training 1/3 epoch (loss 1.6855): 14%|ββ | 282/2046 [03:26<21:43, 1.35it/s]
Training 1/3 epoch (loss 1.1628): 14%|ββ | 282/2046 [03:26<21:43, 1.35it/s]
Training 1/3 epoch (loss 1.1628): 14%|ββ | 283/2046 [03:26<20:43, 1.42it/s]
Training 1/3 epoch (loss 1.2662): 14%|ββ | 283/2046 [03:27<20:43, 1.42it/s]
Training 1/3 epoch (loss 1.2662): 14%|ββ | 284/2046 [03:27<20:15, 1.45it/s]
Training 1/3 epoch (loss 1.3457): 14%|ββ | 284/2046 [03:28<20:15, 1.45it/s]
Training 1/3 epoch (loss 1.3457): 14%|ββ | 285/2046 [03:28<19:38, 1.49it/s]
Training 1/3 epoch (loss 1.3838): 14%|ββ | 285/2046 [03:28<19:38, 1.49it/s]
Training 1/3 epoch (loss 1.3838): 14%|ββ | 286/2046 [03:28<20:01, 1.47it/s]
Training 1/3 epoch (loss 1.2694): 14%|ββ | 286/2046 [03:29<20:01, 1.47it/s]
Training 1/3 epoch (loss 1.2694): 14%|ββ | 287/2046 [03:29<20:00, 1.47it/s]
Training 1/3 epoch (loss 1.3388): 14%|ββ | 287/2046 [03:30<20:00, 1.47it/s]
Training 1/3 epoch (loss 1.3388): 14%|ββ | 288/2046 [03:30<20:18, 1.44it/s]
Training 1/3 epoch (loss 1.3729): 14%|ββ | 288/2046 [03:31<20:18, 1.44it/s]
Training 1/3 epoch (loss 1.3729): 14%|ββ | 289/2046 [03:31<21:03, 1.39it/s]
Training 1/3 epoch (loss 1.4200): 14%|ββ | 289/2046 [03:31<21:03, 1.39it/s]
Training 1/3 epoch (loss 1.4200): 14%|ββ | 290/2046 [03:31<21:59, 1.33it/s]
Training 1/3 epoch (loss 1.5499): 14%|ββ | 290/2046 [03:32<21:59, 1.33it/s]
Training 1/3 epoch (loss 1.5499): 14%|ββ | 291/2046 [03:32<22:20, 1.31it/s]
Training 1/3 epoch (loss 1.3694): 14%|ββ | 291/2046 [03:33<22:20, 1.31it/s]
Training 1/3 epoch (loss 1.3694): 14%|ββ | 292/2046 [03:33<22:44, 1.29it/s]
Training 1/3 epoch (loss 1.5587): 14%|ββ | 292/2046 [03:34<22:44, 1.29it/s]
Training 1/3 epoch (loss 1.5587): 14%|ββ | 293/2046 [03:34<22:04, 1.32it/s]
Training 1/3 epoch (loss 1.4697): 14%|ββ | 293/2046 [03:34<22:04, 1.32it/s]
Training 1/3 epoch (loss 1.4697): 14%|ββ | 294/2046 [03:34<21:02, 1.39it/s]
Training 1/3 epoch (loss 1.1234): 14%|ββ | 294/2046 [03:35<21:02, 1.39it/s]
Training 1/3 epoch (loss 1.1234): 14%|ββ | 295/2046 [03:35<20:06, 1.45it/s]
Training 1/3 epoch (loss 1.3687): 14%|ββ | 295/2046 [03:36<20:06, 1.45it/s]
Training 1/3 epoch (loss 1.3687): 14%|ββ | 296/2046 [03:36<21:32, 1.35it/s]
Training 1/3 epoch (loss 1.5823): 14%|ββ | 296/2046 [03:37<21:32, 1.35it/s]
Training 1/3 epoch (loss 1.5823): 15%|ββ | 297/2046 [03:37<22:03, 1.32it/s]
Training 1/3 epoch (loss 1.4892): 15%|ββ | 297/2046 [03:37<22:03, 1.32it/s]
Training 1/3 epoch (loss 1.4892): 15%|ββ | 298/2046 [03:37<21:22, 1.36it/s]
Training 1/3 epoch (loss 1.1859): 15%|ββ | 298/2046 [03:38<21:22, 1.36it/s]
Training 1/3 epoch (loss 1.1859): 15%|ββ | 299/2046 [03:38<22:43, 1.28it/s]
Training 1/3 epoch (loss 1.3410): 15%|ββ | 299/2046 [03:39<22:43, 1.28it/s]
Training 1/3 epoch (loss 1.3410): 15%|ββ | 300/2046 [03:39<21:51, 1.33it/s]
Training 1/3 epoch (loss 1.2706): 15%|ββ | 300/2046 [03:40<21:51, 1.33it/s]
Training 1/3 epoch (loss 1.2706): 15%|ββ | 301/2046 [03:40<20:51, 1.39it/s]
Training 1/3 epoch (loss 1.3946): 15%|ββ | 301/2046 [03:40<20:51, 1.39it/s]
Training 1/3 epoch (loss 1.3946): 15%|ββ | 302/2046 [03:40<19:57, 1.46it/s]
Training 1/3 epoch (loss 1.0967): 15%|ββ | 302/2046 [03:41<19:57, 1.46it/s]
Training 1/3 epoch (loss 1.0967): 15%|ββ | 303/2046 [03:41<19:27, 1.49it/s]
Training 1/3 epoch (loss 1.5047): 15%|ββ | 303/2046 [03:42<19:27, 1.49it/s]
Training 1/3 epoch (loss 1.5047): 15%|ββ | 304/2046 [03:42<21:01, 1.38it/s]
Training 1/3 epoch (loss 1.4282): 15%|ββ | 304/2046 [03:42<21:01, 1.38it/s]
Training 1/3 epoch (loss 1.4282): 15%|ββ | 305/2046 [03:42<20:48, 1.39it/s]
Training 1/3 epoch (loss 1.1580): 15%|ββ | 305/2046 [03:43<20:48, 1.39it/s]
Training 1/3 epoch (loss 1.1580): 15%|ββ | 306/2046 [03:43<19:50, 1.46it/s]
Training 1/3 epoch (loss 1.2863): 15%|ββ | 306/2046 [03:44<19:50, 1.46it/s]
Training 1/3 epoch (loss 1.2863): 15%|ββ | 307/2046 [03:44<19:48, 1.46it/s]
Training 1/3 epoch (loss 1.1715): 15%|ββ | 307/2046 [03:44<19:48, 1.46it/s]
Training 1/3 epoch (loss 1.1715): 15%|ββ | 308/2046 [03:44<20:09, 1.44it/s]
Training 1/3 epoch (loss 1.3179): 15%|ββ | 308/2046 [03:45<20:09, 1.44it/s]
Training 1/3 epoch (loss 1.3179): 15%|ββ | 309/2046 [03:45<19:28, 1.49it/s]
Training 1/3 epoch (loss 1.1771): 15%|ββ | 309/2046 [03:46<19:28, 1.49it/s]
Training 1/3 epoch (loss 1.1771): 15%|ββ | 310/2046 [03:46<19:25, 1.49it/s]
Training 1/3 epoch (loss 1.2849): 15%|ββ | 310/2046 [03:46<19:25, 1.49it/s]
Training 1/3 epoch (loss 1.2849): 15%|ββ | 311/2046 [03:46<18:53, 1.53it/s]
Training 1/3 epoch (loss 1.2553): 15%|ββ | 311/2046 [03:47<18:53, 1.53it/s]
Training 1/3 epoch (loss 1.2553): 15%|ββ | 312/2046 [03:47<19:35, 1.47it/s]
Training 1/3 epoch (loss 1.3379): 15%|ββ | 312/2046 [03:48<19:35, 1.47it/s]
Training 1/3 epoch (loss 1.3379): 15%|ββ | 313/2046 [03:48<19:24, 1.49it/s]
Training 1/3 epoch (loss 1.3481): 15%|ββ | 313/2046 [03:48<19:24, 1.49it/s]
Training 1/3 epoch (loss 1.3481): 15%|ββ | 314/2046 [03:48<20:19, 1.42it/s]
Training 1/3 epoch (loss 1.4191): 15%|ββ | 314/2046 [03:49<20:19, 1.42it/s]
Training 1/3 epoch (loss 1.4191): 15%|ββ | 315/2046 [03:49<20:27, 1.41it/s]
Training 1/3 epoch (loss 1.3981): 15%|ββ | 315/2046 [03:50<20:27, 1.41it/s]
Training 1/3 epoch (loss 1.3981): 15%|ββ | 316/2046 [03:50<21:03, 1.37it/s]
Training 1/3 epoch (loss 1.4324): 15%|ββ | 316/2046 [03:51<21:03, 1.37it/s]
Training 1/3 epoch (loss 1.4324): 15%|ββ | 317/2046 [03:51<21:53, 1.32it/s]
Training 1/3 epoch (loss 1.4879): 15%|ββ | 317/2046 [03:51<21:53, 1.32it/s]
Training 1/3 epoch (loss 1.4879): 16%|ββ | 318/2046 [03:51<20:40, 1.39it/s]
Training 1/3 epoch (loss 1.4688): 16%|ββ | 318/2046 [03:52<20:40, 1.39it/s]
Training 1/3 epoch (loss 1.4688): 16%|ββ | 319/2046 [03:52<20:18, 1.42it/s]
Training 1/3 epoch (loss 1.4564): 16%|ββ | 319/2046 [03:53<20:18, 1.42it/s]
Training 1/3 epoch (loss 1.4564): 16%|ββ | 320/2046 [03:53<22:30, 1.28it/s]
Training 1/3 epoch (loss 1.4361): 16%|ββ | 320/2046 [03:54<22:30, 1.28it/s]
Training 1/3 epoch (loss 1.4361): 16%|ββ | 321/2046 [03:54<22:38, 1.27it/s]
Training 1/3 epoch (loss 1.2474): 16%|ββ | 321/2046 [03:54<22:38, 1.27it/s]
Training 1/3 epoch (loss 1.2474): 16%|ββ | 322/2046 [03:54<21:21, 1.35it/s]
Training 1/3 epoch (loss 1.3316): 16%|ββ | 322/2046 [03:55<21:21, 1.35it/s]
Training 1/3 epoch (loss 1.3316): 16%|ββ | 323/2046 [03:55<21:09, 1.36it/s]
Training 1/3 epoch (loss 1.3133): 16%|ββ | 323/2046 [03:56<21:09, 1.36it/s]
Training 1/3 epoch (loss 1.3133): 16%|ββ | 324/2046 [03:56<20:32, 1.40it/s]
Training 1/3 epoch (loss 1.2134): 16%|ββ | 324/2046 [03:57<20:32, 1.40it/s]
Training 1/3 epoch (loss 1.2134): 16%|ββ | 325/2046 [03:57<21:02, 1.36it/s]
Training 1/3 epoch (loss 1.4047): 16%|ββ | 325/2046 [03:57<21:02, 1.36it/s]
Training 1/3 epoch (loss 1.4047): 16%|ββ | 326/2046 [03:57<20:11, 1.42it/s]
Training 1/3 epoch (loss 1.4613): 16%|ββ | 326/2046 [03:58<20:11, 1.42it/s]
Training 1/3 epoch (loss 1.4613): 16%|ββ | 327/2046 [03:58<20:00, 1.43it/s]
Training 1/3 epoch (loss 1.3142): 16%|ββ | 327/2046 [03:59<20:00, 1.43it/s]
Training 1/3 epoch (loss 1.3142): 16%|ββ | 328/2046 [03:59<21:05, 1.36it/s]
Training 1/3 epoch (loss 1.1978): 16%|ββ | 328/2046 [03:59<21:05, 1.36it/s]
Training 1/3 epoch (loss 1.1978): 16%|ββ | 329/2046 [03:59<20:44, 1.38it/s]
Training 1/3 epoch (loss 1.5019): 16%|ββ | 329/2046 [04:00<20:44, 1.38it/s]
Training 1/3 epoch (loss 1.5019): 16%|ββ | 330/2046 [04:00<20:13, 1.41it/s]
Training 1/3 epoch (loss 1.2616): 16%|ββ | 330/2046 [04:01<20:13, 1.41it/s]
Training 1/3 epoch (loss 1.2616): 16%|ββ | 331/2046 [04:01<20:06, 1.42it/s]
Training 1/3 epoch (loss 1.2281): 16%|ββ | 331/2046 [04:02<20:06, 1.42it/s]
Training 1/3 epoch (loss 1.2281): 16%|ββ | 332/2046 [04:02<21:00, 1.36it/s]
Training 1/3 epoch (loss 1.2244): 16%|ββ | 332/2046 [04:02<21:00, 1.36it/s]
Training 1/3 epoch (loss 1.2244): 16%|ββ | 333/2046 [04:02<20:37, 1.38it/s]
Training 1/3 epoch (loss 1.5817): 16%|ββ | 333/2046 [04:03<20:37, 1.38it/s]
Training 1/3 epoch (loss 1.5817): 16%|ββ | 334/2046 [04:03<19:49, 1.44it/s]
Training 1/3 epoch (loss 1.4320): 16%|ββ | 334/2046 [04:04<19:49, 1.44it/s]
Training 1/3 epoch (loss 1.4320): 16%|ββ | 335/2046 [04:04<20:18, 1.40it/s]
Training 1/3 epoch (loss 1.2720): 16%|ββ | 335/2046 [04:04<20:18, 1.40it/s]
Training 1/3 epoch (loss 1.2720): 16%|ββ | 336/2046 [04:04<21:15, 1.34it/s]
Training 1/3 epoch (loss 1.4353): 16%|ββ | 336/2046 [04:05<21:15, 1.34it/s]
Training 1/3 epoch (loss 1.4353): 16%|ββ | 337/2046 [04:05<21:08, 1.35it/s]
Training 1/3 epoch (loss 1.4782): 16%|ββ | 337/2046 [04:06<21:08, 1.35it/s]
Training 1/3 epoch (loss 1.4782): 17%|ββ | 338/2046 [04:06<20:10, 1.41it/s]
Training 1/3 epoch (loss 1.5457): 17%|ββ | 338/2046 [04:06<20:10, 1.41it/s]
Training 1/3 epoch (loss 1.5457): 17%|ββ | 339/2046 [04:06<19:33, 1.45it/s]
Training 1/3 epoch (loss 1.3054): 17%|ββ | 339/2046 [04:07<19:33, 1.45it/s]
Training 1/3 epoch (loss 1.3054): 17%|ββ | 340/2046 [04:07<19:41, 1.44it/s]
Training 1/3 epoch (loss 1.2173): 17%|ββ | 340/2046 [04:08<19:41, 1.44it/s]
Training 1/3 epoch (loss 1.2173): 17%|ββ | 341/2046 [04:08<20:12, 1.41it/s]
Training 1/3 epoch (loss 1.4227): 17%|ββ | 341/2046 [04:09<20:12, 1.41it/s]
Training 1/3 epoch (loss 1.4227): 17%|ββ | 342/2046 [04:09<19:25, 1.46it/s]
Training 1/3 epoch (loss 1.2346): 17%|ββ | 342/2046 [04:09<19:25, 1.46it/s]
Training 1/3 epoch (loss 1.2346): 17%|ββ | 343/2046 [04:09<18:55, 1.50it/s]
Training 1/3 epoch (loss 1.2469): 17%|ββ | 343/2046 [04:10<18:55, 1.50it/s]
Training 1/3 epoch (loss 1.2469): 17%|ββ | 344/2046 [04:10<19:28, 1.46it/s]
Training 1/3 epoch (loss 1.2111): 17%|ββ | 344/2046 [04:11<19:28, 1.46it/s]
Training 1/3 epoch (loss 1.2111): 17%|ββ | 345/2046 [04:11<18:56, 1.50it/s]
Training 1/3 epoch (loss 1.5433): 17%|ββ | 345/2046 [04:11<18:56, 1.50it/s]
Training 1/3 epoch (loss 1.5433): 17%|ββ | 346/2046 [04:11<18:26, 1.54it/s]
Training 1/3 epoch (loss 1.1502): 17%|ββ | 346/2046 [04:12<18:26, 1.54it/s]
Training 1/3 epoch (loss 1.1502): 17%|ββ | 347/2046 [04:12<18:15, 1.55it/s]
Training 1/3 epoch (loss 1.2867): 17%|ββ | 347/2046 [04:13<18:15, 1.55it/s]
Training 1/3 epoch (loss 1.2867): 17%|ββ | 348/2046 [04:13<18:48, 1.50it/s]
Training 1/3 epoch (loss 1.3927): 17%|ββ | 348/2046 [04:13<18:48, 1.50it/s]
Training 1/3 epoch (loss 1.3927): 17%|ββ | 349/2046 [04:13<18:54, 1.50it/s]
Training 1/3 epoch (loss 1.3471): 17%|ββ | 349/2046 [04:14<18:54, 1.50it/s]
Training 1/3 epoch (loss 1.3471): 17%|ββ | 350/2046 [04:14<19:04, 1.48it/s]
Training 1/3 epoch (loss 1.2358): 17%|ββ | 350/2046 [04:15<19:04, 1.48it/s]
Training 1/3 epoch (loss 1.2358): 17%|ββ | 351/2046 [04:15<19:56, 1.42it/s]
Training 1/3 epoch (loss 1.4829): 17%|ββ | 351/2046 [04:16<19:56, 1.42it/s]
Training 1/3 epoch (loss 1.4829): 17%|ββ | 352/2046 [04:16<21:16, 1.33it/s]
Training 1/3 epoch (loss 1.2856): 17%|ββ | 352/2046 [04:16<21:16, 1.33it/s]
Training 1/3 epoch (loss 1.2856): 17%|ββ | 353/2046 [04:16<22:23, 1.26it/s]
Training 1/3 epoch (loss 1.2136): 17%|ββ | 353/2046 [04:17<22:23, 1.26it/s]
Training 1/3 epoch (loss 1.2136): 17%|ββ | 354/2046 [04:17<21:25, 1.32it/s]
Training 1/3 epoch (loss 1.3638): 17%|ββ | 354/2046 [04:18<21:25, 1.32it/s]
Training 1/3 epoch (loss 1.3638): 17%|ββ | 355/2046 [04:18<20:55, 1.35it/s]
Training 1/3 epoch (loss 1.3975): 17%|ββ | 355/2046 [04:19<20:55, 1.35it/s]
Training 1/3 epoch (loss 1.3975): 17%|ββ | 356/2046 [04:19<21:31, 1.31it/s]
Training 1/3 epoch (loss 1.2919): 17%|ββ | 356/2046 [04:19<21:31, 1.31it/s]
Training 1/3 epoch (loss 1.2919): 17%|ββ | 357/2046 [04:19<21:41, 1.30it/s]
Training 1/3 epoch (loss 1.5007): 17%|ββ | 357/2046 [04:20<21:41, 1.30it/s]
Training 1/3 epoch (loss 1.5007): 17%|ββ | 358/2046 [04:20<22:01, 1.28it/s]
Training 1/3 epoch (loss 1.3464): 17%|ββ | 358/2046 [04:21<22:01, 1.28it/s]
Training 1/3 epoch (loss 1.3464): 18%|ββ | 359/2046 [04:21<20:50, 1.35it/s]
Training 1/3 epoch (loss 1.5226): 18%|ββ | 359/2046 [04:22<20:50, 1.35it/s]
Training 1/3 epoch (loss 1.5226): 18%|ββ | 360/2046 [04:22<21:34, 1.30it/s]
Training 1/3 epoch (loss 1.4504): 18%|ββ | 360/2046 [04:22<21:34, 1.30it/s]
Training 1/3 epoch (loss 1.4504): 18%|ββ | 361/2046 [04:22<21:36, 1.30it/s]
Training 1/3 epoch (loss 1.3457): 18%|ββ | 361/2046 [04:23<21:36, 1.30it/s]
Training 1/3 epoch (loss 1.3457): 18%|ββ | 362/2046 [04:23<22:18, 1.26it/s]
Training 1/3 epoch (loss 1.6159): 18%|ββ | 362/2046 [04:24<22:18, 1.26it/s]
Training 1/3 epoch (loss 1.6159): 18%|ββ | 363/2046 [04:24<21:24, 1.31it/s]
Training 1/3 epoch (loss 1.4400): 18%|ββ | 363/2046 [04:25<21:24, 1.31it/s]
Training 1/3 epoch (loss 1.4400): 18%|ββ | 364/2046 [04:25<22:38, 1.24it/s]
Training 1/3 epoch (loss 1.1858): 18%|ββ | 364/2046 [04:26<22:38, 1.24it/s]
Training 1/3 epoch (loss 1.1858): 18%|ββ | 365/2046 [04:26<21:42, 1.29it/s]
Training 1/3 epoch (loss 1.4507): 18%|ββ | 365/2046 [04:26<21:42, 1.29it/s]
Training 1/3 epoch (loss 1.4507): 18%|ββ | 366/2046 [04:26<20:30, 1.37it/s]
Training 1/3 epoch (loss 1.3477): 18%|ββ | 366/2046 [04:27<20:30, 1.37it/s]
Training 1/3 epoch (loss 1.3477): 18%|ββ | 367/2046 [04:27<19:38, 1.43it/s]
Training 1/3 epoch (loss 1.3803): 18%|ββ | 367/2046 [04:28<19:38, 1.43it/s]
Training 1/3 epoch (loss 1.3803): 18%|ββ | 368/2046 [04:28<20:06, 1.39it/s]
Training 1/3 epoch (loss 1.5209): 18%|ββ | 368/2046 [04:28<20:06, 1.39it/s]
Training 1/3 epoch (loss 1.5209): 18%|ββ | 369/2046 [04:28<19:50, 1.41it/s]
Training 1/3 epoch (loss 1.4846): 18%|ββ | 369/2046 [04:29<19:50, 1.41it/s]
Training 1/3 epoch (loss 1.4846): 18%|ββ | 370/2046 [04:29<20:27, 1.37it/s]
Training 1/3 epoch (loss 1.2104): 18%|ββ | 370/2046 [04:30<20:27, 1.37it/s]
Training 1/3 epoch (loss 1.2104): 18%|ββ | 371/2046 [04:30<19:28, 1.43it/s]
Training 1/3 epoch (loss 1.3847): 18%|ββ | 371/2046 [04:30<19:28, 1.43it/s]
Training 1/3 epoch (loss 1.3847): 18%|ββ | 372/2046 [04:30<18:58, 1.47it/s]
Training 1/3 epoch (loss 1.4046): 18%|ββ | 372/2046 [04:31<18:58, 1.47it/s]
Training 1/3 epoch (loss 1.4046): 18%|ββ | 373/2046 [04:31<19:34, 1.42it/s]
Training 1/3 epoch (loss 1.2649): 18%|ββ | 373/2046 [04:32<19:34, 1.42it/s]
Training 1/3 epoch (loss 1.2649): 18%|ββ | 374/2046 [04:32<18:59, 1.47it/s]
Training 1/3 epoch (loss 1.4142): 18%|ββ | 374/2046 [04:32<18:59, 1.47it/s]
Training 1/3 epoch (loss 1.4142): 18%|ββ | 375/2046 [04:32<18:24, 1.51it/s]
Training 1/3 epoch (loss 1.4007): 18%|ββ | 375/2046 [04:33<18:24, 1.51it/s]
Training 1/3 epoch (loss 1.4007): 18%|ββ | 376/2046 [04:33<20:01, 1.39it/s]
Training 1/3 epoch (loss 1.4162): 18%|ββ | 376/2046 [04:34<20:01, 1.39it/s]
Training 1/3 epoch (loss 1.4162): 18%|ββ | 377/2046 [04:34<19:57, 1.39it/s]
Training 1/3 epoch (loss 1.3231): 18%|ββ | 377/2046 [04:35<19:57, 1.39it/s]
Training 1/3 epoch (loss 1.3231): 18%|ββ | 378/2046 [04:35<19:14, 1.44it/s]
Training 1/3 epoch (loss 1.4924): 18%|ββ | 378/2046 [04:35<19:14, 1.44it/s]
Training 1/3 epoch (loss 1.4924): 19%|ββ | 379/2046 [04:35<20:28, 1.36it/s]
Training 1/3 epoch (loss 1.1553): 19%|ββ | 379/2046 [04:36<20:28, 1.36it/s]
Training 1/3 epoch (loss 1.1553): 19%|ββ | 380/2046 [04:36<19:53, 1.40it/s]
Training 1/3 epoch (loss 1.3906): 19%|ββ | 380/2046 [04:37<19:53, 1.40it/s]
Training 1/3 epoch (loss 1.3906): 19%|ββ | 381/2046 [04:37<19:06, 1.45it/s]
Training 1/3 epoch (loss 1.4412): 19%|ββ | 381/2046 [04:37<19:06, 1.45it/s]
Training 1/3 epoch (loss 1.4412): 19%|ββ | 382/2046 [04:37<18:56, 1.46it/s]
Training 1/3 epoch (loss 1.2699): 19%|ββ | 382/2046 [04:38<18:56, 1.46it/s]
Training 1/3 epoch (loss 1.2699): 19%|ββ | 383/2046 [04:38<18:45, 1.48it/s]
Training 1/3 epoch (loss 1.1897): 19%|ββ | 383/2046 [04:39<18:45, 1.48it/s]
Training 1/3 epoch (loss 1.1897): 19%|ββ | 384/2046 [04:39<19:13, 1.44it/s]
Training 1/3 epoch (loss 1.4282): 19%|ββ | 384/2046 [04:39<19:13, 1.44it/s]
Training 1/3 epoch (loss 1.4282): 19%|ββ | 385/2046 [04:39<18:52, 1.47it/s]
Training 1/3 epoch (loss 1.3923): 19%|ββ | 385/2046 [04:40<18:52, 1.47it/s]
Training 1/3 epoch (loss 1.3923): 19%|ββ | 386/2046 [04:40<18:52, 1.47it/s]
Training 1/3 epoch (loss 1.3418): 19%|ββ | 386/2046 [04:41<18:52, 1.47it/s]
Training 1/3 epoch (loss 1.3418): 19%|ββ | 387/2046 [04:41<18:45, 1.47it/s]
Training 1/3 epoch (loss 1.3158): 19%|ββ | 387/2046 [04:41<18:45, 1.47it/s]
Training 1/3 epoch (loss 1.3158): 19%|ββ | 388/2046 [04:41<18:29, 1.49it/s]
Training 1/3 epoch (loss 1.2793): 19%|ββ | 388/2046 [04:42<18:29, 1.49it/s]
Training 1/3 epoch (loss 1.2793): 19%|ββ | 389/2046 [04:42<18:37, 1.48it/s]
Training 1/3 epoch (loss 1.5465): 19%|ββ | 389/2046 [04:43<18:37, 1.48it/s]
Training 1/3 epoch (loss 1.5465): 19%|ββ | 390/2046 [04:43<18:24, 1.50it/s]
Training 1/3 epoch (loss 1.1814): 19%|ββ | 390/2046 [04:43<18:24, 1.50it/s]
Training 1/3 epoch (loss 1.1814): 19%|ββ | 391/2046 [04:43<18:04, 1.53it/s]
Training 1/3 epoch (loss 1.1125): 19%|ββ | 391/2046 [04:44<18:04, 1.53it/s]
Training 1/3 epoch (loss 1.1125): 19%|ββ | 392/2046 [04:44<21:43, 1.27it/s]
Training 1/3 epoch (loss 1.2198): 19%|ββ | 392/2046 [04:45<21:43, 1.27it/s]
Training 1/3 epoch (loss 1.2198): 19%|ββ | 393/2046 [04:45<20:30, 1.34it/s]
Training 1/3 epoch (loss 1.1995): 19%|ββ | 393/2046 [04:46<20:30, 1.34it/s]
Training 1/3 epoch (loss 1.1995): 19%|ββ | 394/2046 [04:46<19:36, 1.40it/s]
Training 1/3 epoch (loss 1.2747): 19%|ββ | 394/2046 [04:46<19:36, 1.40it/s]
Training 1/3 epoch (loss 1.2747): 19%|ββ | 395/2046 [04:46<18:44, 1.47it/s]
Training 1/3 epoch (loss 1.3997): 19%|ββ | 395/2046 [04:47<18:44, 1.47it/s]
Training 1/3 epoch (loss 1.3997): 19%|ββ | 396/2046 [04:47<19:15, 1.43it/s]
Training 1/3 epoch (loss 1.3830): 19%|ββ | 396/2046 [04:48<19:15, 1.43it/s]
Training 1/3 epoch (loss 1.3830): 19%|ββ | 397/2046 [04:48<20:04, 1.37it/s]
Training 1/3 epoch (loss 1.4717): 19%|ββ | 397/2046 [04:49<20:04, 1.37it/s]
Training 1/3 epoch (loss 1.4717): 19%|ββ | 398/2046 [04:49<20:55, 1.31it/s]
Training 1/3 epoch (loss 1.2905): 19%|ββ | 398/2046 [04:50<20:55, 1.31it/s]
Training 1/3 epoch (loss 1.2905): 20%|ββ | 399/2046 [04:50<22:24, 1.23it/s]
Training 1/3 epoch (loss 1.1400): 20%|ββ | 399/2046 [04:50<22:24, 1.23it/s]
Training 1/3 epoch (loss 1.1400): 20%|ββ | 400/2046 [04:50<21:53, 1.25it/s]
Training 1/3 epoch (loss 1.3691): 20%|ββ | 400/2046 [04:51<21:53, 1.25it/s]
Training 1/3 epoch (loss 1.3691): 20%|ββ | 401/2046 [04:51<20:24, 1.34it/s]
Training 1/3 epoch (loss 1.4493): 20%|ββ | 401/2046 [04:52<20:24, 1.34it/s]
Training 1/3 epoch (loss 1.4493): 20%|ββ | 402/2046 [04:52<19:28, 1.41it/s]
Training 1/3 epoch (loss 1.5006): 20%|ββ | 402/2046 [04:52<19:28, 1.41it/s]
Training 1/3 epoch (loss 1.5006): 20%|ββ | 403/2046 [04:52<20:09, 1.36it/s]
Training 1/3 epoch (loss 1.1730): 20%|ββ | 403/2046 [04:53<20:09, 1.36it/s]
Training 1/3 epoch (loss 1.1730): 20%|ββ | 404/2046 [04:53<19:59, 1.37it/s]
Training 1/3 epoch (loss 1.3271): 20%|ββ | 404/2046 [04:54<19:59, 1.37it/s]
Training 1/3 epoch (loss 1.3271): 20%|ββ | 405/2046 [04:54<20:32, 1.33it/s]
Training 1/3 epoch (loss 1.2248): 20%|ββ | 405/2046 [04:55<20:32, 1.33it/s]
Training 1/3 epoch (loss 1.2248): 20%|ββ | 406/2046 [04:55<20:53, 1.31it/s]
Training 1/3 epoch (loss 1.2692): 20%|ββ | 406/2046 [04:55<20:53, 1.31it/s]
Training 1/3 epoch (loss 1.2692): 20%|ββ | 407/2046 [04:55<20:14, 1.35it/s]
Training 1/3 epoch (loss 1.3499): 20%|ββ | 407/2046 [04:56<20:14, 1.35it/s]
Training 1/3 epoch (loss 1.3499): 20%|ββ | 408/2046 [04:56<21:01, 1.30it/s]
Training 1/3 epoch (loss 1.2457): 20%|ββ | 408/2046 [04:57<21:01, 1.30it/s]
Training 1/3 epoch (loss 1.2457): 20%|ββ | 409/2046 [04:57<21:59, 1.24it/s]
Training 1/3 epoch (loss 1.5245): 20%|ββ | 409/2046 [04:58<21:59, 1.24it/s]
Training 1/3 epoch (loss 1.5245): 20%|ββ | 410/2046 [04:58<20:43, 1.32it/s]
Training 1/3 epoch (loss 1.3297): 20%|ββ | 410/2046 [04:59<20:43, 1.32it/s]
Training 1/3 epoch (loss 1.3297): 20%|ββ | 411/2046 [04:59<20:14, 1.35it/s]
Training 1/3 epoch (loss 1.2966): 20%|ββ | 411/2046 [04:59<20:14, 1.35it/s]
Training 1/3 epoch (loss 1.2966): 20%|ββ | 412/2046 [04:59<19:11, 1.42it/s]
Training 1/3 epoch (loss 1.2991): 20%|ββ | 412/2046 [05:00<19:11, 1.42it/s]
Training 1/3 epoch (loss 1.2991): 20%|ββ | 413/2046 [05:00<19:24, 1.40it/s]
Training 1/3 epoch (loss 1.1489): 20%|ββ | 413/2046 [05:01<19:24, 1.40it/s]
Training 1/3 epoch (loss 1.1489): 20%|ββ | 414/2046 [05:01<18:57, 1.43it/s]
Training 1/3 epoch (loss 1.4380): 20%|ββ | 414/2046 [05:01<18:57, 1.43it/s]
Training 1/3 epoch (loss 1.4380): 20%|ββ | 415/2046 [05:01<19:47, 1.37it/s]
Training 1/3 epoch (loss 1.2123): 20%|ββ | 415/2046 [05:02<19:47, 1.37it/s]
Training 1/3 epoch (loss 1.2123): 20%|ββ | 416/2046 [05:02<20:21, 1.33it/s]
Training 1/3 epoch (loss 1.5458): 20%|ββ | 416/2046 [05:03<20:21, 1.33it/s]
Training 1/3 epoch (loss 1.5458): 20%|ββ | 417/2046 [05:03<20:56, 1.30it/s]
Training 1/3 epoch (loss 1.3524): 20%|ββ | 417/2046 [05:04<20:56, 1.30it/s]
Training 1/3 epoch (loss 1.3524): 20%|ββ | 418/2046 [05:04<20:12, 1.34it/s]
Training 1/3 epoch (loss 1.4763): 20%|ββ | 418/2046 [05:04<20:12, 1.34it/s]
Training 1/3 epoch (loss 1.4763): 20%|ββ | 419/2046 [05:04<19:53, 1.36it/s]
Training 1/3 epoch (loss 1.4282): 20%|ββ | 419/2046 [05:05<19:53, 1.36it/s]
Training 1/3 epoch (loss 1.4282): 21%|ββ | 420/2046 [05:05<18:53, 1.43it/s]
Training 1/3 epoch (loss 1.1966): 21%|ββ | 420/2046 [05:06<18:53, 1.43it/s]
Training 1/3 epoch (loss 1.1966): 21%|ββ | 421/2046 [05:06<20:59, 1.29it/s]
Training 1/3 epoch (loss 1.3284): 21%|ββ | 421/2046 [05:07<20:59, 1.29it/s]
Training 1/3 epoch (loss 1.3284): 21%|ββ | 422/2046 [05:07<20:20, 1.33it/s]
Training 1/3 epoch (loss 1.2711): 21%|ββ | 422/2046 [05:07<20:20, 1.33it/s]
Training 1/3 epoch (loss 1.2711): 21%|ββ | 423/2046 [05:07<20:32, 1.32it/s]
Training 1/3 epoch (loss 1.4265): 21%|ββ | 423/2046 [05:08<20:32, 1.32it/s]
Training 1/3 epoch (loss 1.4265): 21%|ββ | 424/2046 [05:08<21:24, 1.26it/s]
Training 1/3 epoch (loss 1.2731): 21%|ββ | 424/2046 [05:09<21:24, 1.26it/s]
Training 1/3 epoch (loss 1.2731): 21%|ββ | 425/2046 [05:09<20:03, 1.35it/s]
Training 1/3 epoch (loss 1.0972): 21%|ββ | 425/2046 [05:10<20:03, 1.35it/s]
Training 1/3 epoch (loss 1.0972): 21%|ββ | 426/2046 [05:10<19:01, 1.42it/s]
Training 1/3 epoch (loss 1.2602): 21%|ββ | 426/2046 [05:10<19:01, 1.42it/s]
Training 1/3 epoch (loss 1.2602): 21%|ββ | 427/2046 [05:10<18:28, 1.46it/s]
Training 1/3 epoch (loss 1.3077): 21%|ββ | 427/2046 [05:11<18:28, 1.46it/s]
Training 1/3 epoch (loss 1.3077): 21%|ββ | 428/2046 [05:11<18:13, 1.48it/s]
Training 1/3 epoch (loss 1.3873): 21%|ββ | 428/2046 [05:11<18:13, 1.48it/s]
Training 1/3 epoch (loss 1.3873): 21%|ββ | 429/2046 [05:11<17:55, 1.50it/s]
Training 1/3 epoch (loss 1.2358): 21%|ββ | 429/2046 [05:12<17:55, 1.50it/s]
Training 1/3 epoch (loss 1.2358): 21%|ββ | 430/2046 [05:12<18:05, 1.49it/s]
Training 1/3 epoch (loss 1.3500): 21%|ββ | 430/2046 [05:13<18:05, 1.49it/s]
Training 1/3 epoch (loss 1.3500): 21%|ββ | 431/2046 [05:13<17:54, 1.50it/s]
Training 1/3 epoch (loss 1.4313): 21%|ββ | 431/2046 [05:14<17:54, 1.50it/s]
Training 1/3 epoch (loss 1.4313): 21%|ββ | 432/2046 [05:14<19:28, 1.38it/s]
Training 1/3 epoch (loss 1.1852): 21%|ββ | 432/2046 [05:14<19:28, 1.38it/s]
Training 1/3 epoch (loss 1.1852): 21%|ββ | 433/2046 [05:14<18:56, 1.42it/s]
Training 1/3 epoch (loss 1.4267): 21%|ββ | 433/2046 [05:15<18:56, 1.42it/s]
Training 1/3 epoch (loss 1.4267): 21%|ββ | 434/2046 [05:15<19:26, 1.38it/s]
Training 1/3 epoch (loss 1.3870): 21%|ββ | 434/2046 [05:16<19:26, 1.38it/s]
Training 1/3 epoch (loss 1.3870): 21%|βββ | 435/2046 [05:16<19:00, 1.41it/s]
Training 1/3 epoch (loss 1.0864): 21%|βββ | 435/2046 [05:16<19:00, 1.41it/s]
Training 1/3 epoch (loss 1.0864): 21%|βββ | 436/2046 [05:16<19:05, 1.41it/s]
Training 1/3 epoch (loss 1.2829): 21%|βββ | 436/2046 [05:17<19:05, 1.41it/s]
Training 1/3 epoch (loss 1.2829): 21%|βββ | 437/2046 [05:17<19:01, 1.41it/s]
Training 1/3 epoch (loss 1.2556): 21%|βββ | 437/2046 [05:18<19:01, 1.41it/s]
Training 1/3 epoch (loss 1.2556): 21%|βββ | 438/2046 [05:18<20:40, 1.30it/s]
Training 1/3 epoch (loss 1.3941): 21%|βββ | 438/2046 [05:19<20:40, 1.30it/s]
Training 1/3 epoch (loss 1.3941): 21%|βββ | 439/2046 [05:19<21:32, 1.24it/s]
Training 1/3 epoch (loss 1.1629): 21%|βββ | 439/2046 [05:20<21:32, 1.24it/s]
Training 1/3 epoch (loss 1.1629): 22%|βββ | 440/2046 [05:20<22:24, 1.19it/s]
Training 1/3 epoch (loss 1.0424): 22%|βββ | 440/2046 [05:21<22:24, 1.19it/s]
Training 1/3 epoch (loss 1.0424): 22%|βββ | 441/2046 [05:21<21:16, 1.26it/s]
Training 1/3 epoch (loss 1.5141): 22%|βββ | 441/2046 [05:21<21:16, 1.26it/s]
Training 1/3 epoch (loss 1.5141): 22%|βββ | 442/2046 [05:21<20:31, 1.30it/s]
Training 1/3 epoch (loss 1.3952): 22%|βββ | 442/2046 [05:22<20:31, 1.30it/s]
Training 1/3 epoch (loss 1.3952): 22%|βββ | 443/2046 [05:22<19:18, 1.38it/s]
Training 1/3 epoch (loss 1.3514): 22%|βββ | 443/2046 [05:23<19:18, 1.38it/s]
Training 1/3 epoch (loss 1.3514): 22%|βββ | 444/2046 [05:23<20:24, 1.31it/s]
Training 1/3 epoch (loss 1.4326): 22%|βββ | 444/2046 [05:23<20:24, 1.31it/s]
Training 1/3 epoch (loss 1.4326): 22%|βββ | 445/2046 [05:23<19:31, 1.37it/s]
Training 1/3 epoch (loss 1.2183): 22%|βββ | 445/2046 [05:24<19:31, 1.37it/s]
Training 1/3 epoch (loss 1.2183): 22%|βββ | 446/2046 [05:24<20:14, 1.32it/s]
Training 1/3 epoch (loss 1.3292): 22%|βββ | 446/2046 [05:25<20:14, 1.32it/s]
Training 1/3 epoch (loss 1.3292): 22%|βββ | 447/2046 [05:25<19:28, 1.37it/s]
Training 1/3 epoch (loss 1.3348): 22%|βββ | 447/2046 [05:26<19:28, 1.37it/s]
Training 1/3 epoch (loss 1.3348): 22%|βββ | 448/2046 [05:26<20:39, 1.29it/s]
Training 1/3 epoch (loss 1.3190): 22%|βββ | 448/2046 [05:27<20:39, 1.29it/s]
Training 1/3 epoch (loss 1.3190): 22%|βββ | 449/2046 [05:27<20:28, 1.30it/s]
Training 1/3 epoch (loss 1.3578): 22%|βββ | 449/2046 [05:27<20:28, 1.30it/s]
Training 1/3 epoch (loss 1.3578): 22%|βββ | 450/2046 [05:27<19:30, 1.36it/s]
Training 1/3 epoch (loss 1.0801): 22%|βββ | 450/2046 [05:28<19:30, 1.36it/s]
Training 1/3 epoch (loss 1.0801): 22%|βββ | 451/2046 [05:28<18:35, 1.43it/s]
Training 1/3 epoch (loss 1.3678): 22%|βββ | 451/2046 [05:29<18:35, 1.43it/s]
Training 1/3 epoch (loss 1.3678): 22%|βββ | 452/2046 [05:29<18:19, 1.45it/s]
Training 1/3 epoch (loss 1.2943): 22%|βββ | 452/2046 [05:29<18:19, 1.45it/s]
Training 1/3 epoch (loss 1.2943): 22%|βββ | 453/2046 [05:29<17:47, 1.49it/s]
Training 1/3 epoch (loss 1.4114): 22%|βββ | 453/2046 [05:30<17:47, 1.49it/s]
Training 1/3 epoch (loss 1.4114): 22%|βββ | 454/2046 [05:30<17:43, 1.50it/s]
Training 1/3 epoch (loss 1.2101): 22%|βββ | 454/2046 [05:31<17:43, 1.50it/s]
Training 1/3 epoch (loss 1.2101): 22%|βββ | 455/2046 [05:31<18:06, 1.46it/s]
Training 1/3 epoch (loss 1.2407): 22%|βββ | 455/2046 [05:31<18:06, 1.46it/s]
Training 1/3 epoch (loss 1.2407): 22%|βββ | 456/2046 [05:31<19:36, 1.35it/s]
Training 1/3 epoch (loss 1.3679): 22%|βββ | 456/2046 [05:32<19:36, 1.35it/s]
Training 1/3 epoch (loss 1.3679): 22%|βββ | 457/2046 [05:32<19:16, 1.37it/s]
Training 1/3 epoch (loss 1.1873): 22%|βββ | 457/2046 [05:33<19:16, 1.37it/s]
Training 1/3 epoch (loss 1.1873): 22%|βββ | 458/2046 [05:33<18:38, 1.42it/s]
Training 1/3 epoch (loss 1.4092): 22%|βββ | 458/2046 [05:33<18:38, 1.42it/s]
Training 1/3 epoch (loss 1.4092): 22%|βββ | 459/2046 [05:33<18:13, 1.45it/s]
Training 1/3 epoch (loss 1.0546): 22%|βββ | 459/2046 [05:34<18:13, 1.45it/s]
Training 1/3 epoch (loss 1.0546): 22%|βββ | 460/2046 [05:34<18:25, 1.43it/s]
Training 1/3 epoch (loss 1.1777): 22%|βββ | 460/2046 [05:35<18:25, 1.43it/s]
Training 1/3 epoch (loss 1.1777): 23%|βββ | 461/2046 [05:35<18:21, 1.44it/s]
Training 1/3 epoch (loss 1.3624): 23%|βββ | 461/2046 [05:35<18:21, 1.44it/s]
Training 1/3 epoch (loss 1.3624): 23%|βββ | 462/2046 [05:35<17:57, 1.47it/s]
Training 1/3 epoch (loss 1.2730): 23%|βββ | 462/2046 [05:36<17:57, 1.47it/s]
Training 1/3 epoch (loss 1.2730): 23%|βββ | 463/2046 [05:36<18:00, 1.47it/s]
Training 1/3 epoch (loss 1.3526): 23%|βββ | 463/2046 [05:37<18:00, 1.47it/s]
Training 1/3 epoch (loss 1.3526): 23%|βββ | 464/2046 [05:37<18:58, 1.39it/s]
Training 1/3 epoch (loss 1.1497): 23%|βββ | 464/2046 [05:38<18:58, 1.39it/s]
Training 1/3 epoch (loss 1.1497): 23%|βββ | 465/2046 [05:38<19:06, 1.38it/s]
Training 1/3 epoch (loss 1.1466): 23%|βββ | 465/2046 [05:39<19:06, 1.38it/s]
Training 1/3 epoch (loss 1.1466): 23%|βββ | 466/2046 [05:39<19:55, 1.32it/s]
Training 1/3 epoch (loss 1.0724): 23%|βββ | 466/2046 [05:39<19:55, 1.32it/s]
Training 1/3 epoch (loss 1.0724): 23%|βββ | 467/2046 [05:39<19:21, 1.36it/s]
Training 1/3 epoch (loss 1.2064): 23%|βββ | 467/2046 [05:40<19:21, 1.36it/s]
Training 1/3 epoch (loss 1.2064): 23%|βββ | 468/2046 [05:40<18:27, 1.43it/s]
Training 1/3 epoch (loss 1.2445): 23%|βββ | 468/2046 [05:40<18:27, 1.43it/s]
Training 1/3 epoch (loss 1.2445): 23%|βββ | 469/2046 [05:40<17:45, 1.48it/s]
Training 1/3 epoch (loss 1.3480): 23%|βββ | 469/2046 [05:41<17:45, 1.48it/s]
Training 1/3 epoch (loss 1.3480): 23%|βββ | 470/2046 [05:41<17:47, 1.48it/s]
Training 1/3 epoch (loss 1.3362): 23%|βββ | 470/2046 [05:42<17:47, 1.48it/s]
Training 1/3 epoch (loss 1.3362): 23%|βββ | 471/2046 [05:42<17:24, 1.51it/s]
Training 1/3 epoch (loss 1.4062): 23%|βββ | 471/2046 [05:43<17:24, 1.51it/s]
Training 1/3 epoch (loss 1.4062): 23%|βββ | 472/2046 [05:43<18:52, 1.39it/s]
Training 1/3 epoch (loss 1.4487): 23%|βββ | 472/2046 [05:43<18:52, 1.39it/s]
Training 1/3 epoch (loss 1.4487): 23%|βββ | 473/2046 [05:43<19:29, 1.34it/s]
Training 1/3 epoch (loss 1.1495): 23%|βββ | 473/2046 [05:44<19:29, 1.34it/s]
Training 1/3 epoch (loss 1.1495): 23%|βββ | 474/2046 [05:44<18:36, 1.41it/s]
Training 1/3 epoch (loss 1.1843): 23%|βββ | 474/2046 [05:45<18:36, 1.41it/s]
Training 1/3 epoch (loss 1.1843): 23%|βββ | 475/2046 [05:45<18:29, 1.42it/s]
Training 1/3 epoch (loss 1.1804): 23%|βββ | 475/2046 [05:45<18:29, 1.42it/s]
Training 1/3 epoch (loss 1.1804): 23%|βββ | 476/2046 [05:45<18:26, 1.42it/s]
Training 1/3 epoch (loss 1.2060): 23%|βββ | 476/2046 [05:46<18:26, 1.42it/s]
Training 1/3 epoch (loss 1.2060): 23%|βββ | 477/2046 [05:46<17:40, 1.48it/s]
Training 1/3 epoch (loss 1.3657): 23%|βββ | 477/2046 [05:47<17:40, 1.48it/s]
Training 1/3 epoch (loss 1.3657): 23%|βββ | 478/2046 [05:47<17:10, 1.52it/s]
Training 1/3 epoch (loss 1.2497): 23%|βββ | 478/2046 [05:47<17:10, 1.52it/s]
Training 1/3 epoch (loss 1.2497): 23%|βββ | 479/2046 [05:47<17:56, 1.46it/s]
Training 1/3 epoch (loss 1.2974): 23%|βββ | 479/2046 [05:48<17:56, 1.46it/s]
Training 1/3 epoch (loss 1.2974): 23%|βββ | 480/2046 [05:48<19:52, 1.31it/s]
Training 1/3 epoch (loss 1.3981): 23%|βββ | 480/2046 [05:49<19:52, 1.31it/s]
Training 1/3 epoch (loss 1.3981): 24%|βββ | 481/2046 [05:49<20:42, 1.26it/s]
Training 1/3 epoch (loss 1.2812): 24%|βββ | 481/2046 [05:50<20:42, 1.26it/s]
Training 1/3 epoch (loss 1.2812): 24%|βββ | 482/2046 [05:50<19:36, 1.33it/s]
Training 1/3 epoch (loss 1.3807): 24%|βββ | 482/2046 [05:51<19:36, 1.33it/s]
Training 1/3 epoch (loss 1.3807): 24%|βββ | 483/2046 [05:51<18:51, 1.38it/s]
Training 1/3 epoch (loss 1.2499): 24%|βββ | 483/2046 [05:51<18:51, 1.38it/s]
Training 1/3 epoch (loss 1.2499): 24%|βββ | 484/2046 [05:51<18:29, 1.41it/s]
Training 1/3 epoch (loss 1.3009): 24%|βββ | 484/2046 [05:52<18:29, 1.41it/s]
Training 1/3 epoch (loss 1.3009): 24%|βββ | 485/2046 [05:52<19:39, 1.32it/s]
Training 1/3 epoch (loss 1.1838): 24%|βββ | 485/2046 [05:53<19:39, 1.32it/s]
Training 1/3 epoch (loss 1.1838): 24%|βββ | 486/2046 [05:53<20:27, 1.27it/s]
Training 1/3 epoch (loss 1.4409): 24%|βββ | 486/2046 [05:54<20:27, 1.27it/s]
Training 1/3 epoch (loss 1.4409): 24%|βββ | 487/2046 [05:54<21:09, 1.23it/s]
Training 1/3 epoch (loss 1.2824): 24%|βββ | 487/2046 [05:55<21:09, 1.23it/s]
Training 1/3 epoch (loss 1.2824): 24%|βββ | 488/2046 [05:55<20:29, 1.27it/s]
Training 1/3 epoch (loss 1.4339): 24%|βββ | 488/2046 [05:55<20:29, 1.27it/s]
Training 1/3 epoch (loss 1.4339): 24%|βββ | 489/2046 [05:55<21:09, 1.23it/s]
Training 1/3 epoch (loss 1.2771): 24%|βββ | 489/2046 [05:56<21:09, 1.23it/s]
Training 1/3 epoch (loss 1.2771): 24%|βββ | 490/2046 [05:56<20:18, 1.28it/s]
Training 1/3 epoch (loss 1.4986): 24%|βββ | 490/2046 [05:57<20:18, 1.28it/s]
Training 1/3 epoch (loss 1.4986): 24%|βββ | 491/2046 [05:57<19:44, 1.31it/s]
Training 1/3 epoch (loss 1.3263): 24%|βββ | 491/2046 [05:57<19:44, 1.31it/s]
Training 1/3 epoch (loss 1.3263): 24%|βββ | 492/2046 [05:57<18:39, 1.39it/s]
Training 1/3 epoch (loss 1.3442): 24%|βββ | 492/2046 [05:58<18:39, 1.39it/s]
Training 1/3 epoch (loss 1.3442): 24%|βββ | 493/2046 [05:58<18:53, 1.37it/s]
Training 1/3 epoch (loss 1.4311): 24%|βββ | 493/2046 [05:59<18:53, 1.37it/s]
Training 1/3 epoch (loss 1.4311): 24%|βββ | 494/2046 [05:59<18:49, 1.37it/s]
Training 1/3 epoch (loss 1.3650): 24%|βββ | 494/2046 [06:00<18:49, 1.37it/s]
Training 1/3 epoch (loss 1.3650): 24%|βββ | 495/2046 [06:00<18:24, 1.40it/s]
Training 1/3 epoch (loss 1.4868): 24%|βββ | 495/2046 [06:00<18:24, 1.40it/s]
Training 1/3 epoch (loss 1.4868): 24%|βββ | 496/2046 [06:00<19:32, 1.32it/s]
Training 1/3 epoch (loss 1.4568): 24%|βββ | 496/2046 [06:01<19:32, 1.32it/s]
Training 1/3 epoch (loss 1.4568): 24%|βββ | 497/2046 [06:01<18:52, 1.37it/s]
Training 1/3 epoch (loss 1.3591): 24%|βββ | 497/2046 [06:02<18:52, 1.37it/s]
Training 1/3 epoch (loss 1.3591): 24%|βββ | 498/2046 [06:02<18:14, 1.41it/s]
Training 1/3 epoch (loss 1.4990): 24%|βββ | 498/2046 [06:02<18:14, 1.41it/s]
Training 1/3 epoch (loss 1.4990): 24%|βββ | 499/2046 [06:02<18:18, 1.41it/s]
Training 1/3 epoch (loss 1.1904): 24%|βββ | 499/2046 [06:03<18:18, 1.41it/s]
Training 1/3 epoch (loss 1.1904): 24%|βββ | 500/2046 [06:03<18:51, 1.37it/s]
Training 1/3 epoch (loss 1.2680): 24%|βββ | 500/2046 [06:04<18:51, 1.37it/s]
Training 1/3 epoch (loss 1.2680): 24%|βββ | 501/2046 [06:04<18:18, 1.41it/s]
Training 1/3 epoch (loss 1.4400): 24%|βββ | 501/2046 [06:05<18:18, 1.41it/s]
Training 1/3 epoch (loss 1.4400): 25%|βββ | 502/2046 [06:05<18:06, 1.42it/s]
Training 1/3 epoch (loss 1.2188): 25%|βββ | 502/2046 [06:05<18:06, 1.42it/s]
Training 1/3 epoch (loss 1.2188): 25%|βββ | 503/2046 [06:05<17:26, 1.47it/s]
Training 1/3 epoch (loss 1.2209): 25%|βββ | 503/2046 [06:06<17:26, 1.47it/s]
Training 1/3 epoch (loss 1.2209): 25%|βββ | 504/2046 [06:06<18:54, 1.36it/s]
Training 1/3 epoch (loss 1.5835): 25%|βββ | 504/2046 [06:07<18:54, 1.36it/s]
Training 1/3 epoch (loss 1.5835): 25%|βββ | 505/2046 [06:07<18:05, 1.42it/s]
Training 1/3 epoch (loss 1.3556): 25%|βββ | 505/2046 [06:07<18:05, 1.42it/s]
Training 1/3 epoch (loss 1.3556): 25%|βββ | 506/2046 [06:07<17:33, 1.46it/s]
Training 1/3 epoch (loss 1.3713): 25%|βββ | 506/2046 [06:08<17:33, 1.46it/s]
Training 1/3 epoch (loss 1.3713): 25%|βββ | 507/2046 [06:08<17:26, 1.47it/s]
Training 1/3 epoch (loss 1.4862): 25%|βββ | 507/2046 [06:09<17:26, 1.47it/s]
Training 1/3 epoch (loss 1.4862): 25%|βββ | 508/2046 [06:09<17:06, 1.50it/s]
Training 1/3 epoch (loss 1.3713): 25%|βββ | 508/2046 [06:09<17:06, 1.50it/s]
Training 1/3 epoch (loss 1.3713): 25%|βββ | 509/2046 [06:09<16:48, 1.52it/s]
Training 1/3 epoch (loss 1.1411): 25%|βββ | 509/2046 [06:10<16:48, 1.52it/s]
Training 1/3 epoch (loss 1.1411): 25%|βββ | 510/2046 [06:10<16:30, 1.55it/s]
Training 1/3 epoch (loss 1.2996): 25%|βββ | 510/2046 [06:11<16:30, 1.55it/s]
Training 1/3 epoch (loss 1.2996): 25%|βββ | 511/2046 [06:11<16:12, 1.58it/s]
Training 1/3 epoch (loss 1.2315): 25%|βββ | 511/2046 [06:11<16:12, 1.58it/s]
Training 1/3 epoch (loss 1.2315): 25%|βββ | 512/2046 [06:11<17:13, 1.48it/s]
Training 1/3 epoch (loss 1.4908): 25%|βββ | 512/2046 [06:12<17:13, 1.48it/s]
Training 1/3 epoch (loss 1.4908): 25%|βββ | 513/2046 [06:12<17:24, 1.47it/s]
Training 1/3 epoch (loss 1.3901): 25%|βββ | 513/2046 [06:13<17:24, 1.47it/s]
Training 1/3 epoch (loss 1.3901): 25%|βββ | 514/2046 [06:13<17:05, 1.49it/s]
Training 1/3 epoch (loss 1.3385): 25%|βββ | 514/2046 [06:13<17:05, 1.49it/s]
Training 1/3 epoch (loss 1.3385): 25%|βββ | 515/2046 [06:13<16:35, 1.54it/s]
Training 1/3 epoch (loss 1.2845): 25%|βββ | 515/2046 [06:14<16:35, 1.54it/s]
Training 1/3 epoch (loss 1.2845): 25%|βββ | 516/2046 [06:14<16:12, 1.57it/s]
Training 1/3 epoch (loss 1.1508): 25%|βββ | 516/2046 [06:15<16:12, 1.57it/s]
Training 1/3 epoch (loss 1.1508): 25%|βββ | 517/2046 [06:15<16:21, 1.56it/s]
Training 1/3 epoch (loss 1.4111): 25%|βββ | 517/2046 [06:15<16:21, 1.56it/s]
Training 1/3 epoch (loss 1.4111): 25%|βββ | 518/2046 [06:15<16:58, 1.50it/s]
Training 1/3 epoch (loss 1.0997): 25%|βββ | 518/2046 [06:16<16:58, 1.50it/s]
Training 1/3 epoch (loss 1.0997): 25%|βββ | 519/2046 [06:16<16:36, 1.53it/s]
Training 1/3 epoch (loss 1.2324): 25%|βββ | 519/2046 [06:17<16:36, 1.53it/s]
Training 1/3 epoch (loss 1.2324): 25%|βββ | 520/2046 [06:17<17:47, 1.43it/s]
Training 1/3 epoch (loss 1.0895): 25%|βββ | 520/2046 [06:17<17:47, 1.43it/s]
Training 1/3 epoch (loss 1.0895): 25%|βββ | 521/2046 [06:17<18:01, 1.41it/s]
Training 1/3 epoch (loss 1.4567): 25%|βββ | 521/2046 [06:18<18:01, 1.41it/s]
Training 1/3 epoch (loss 1.4567): 26%|βββ | 522/2046 [06:18<18:45, 1.35it/s]
Training 1/3 epoch (loss 1.1849): 26%|βββ | 522/2046 [06:19<18:45, 1.35it/s]
Training 1/3 epoch (loss 1.1849): 26%|βββ | 523/2046 [06:19<18:52, 1.35it/s]
Training 1/3 epoch (loss 1.1017): 26%|βββ | 523/2046 [06:20<18:52, 1.35it/s]
Training 1/3 epoch (loss 1.1017): 26%|βββ | 524/2046 [06:20<18:57, 1.34it/s]
Training 1/3 epoch (loss 1.3905): 26%|βββ | 524/2046 [06:20<18:57, 1.34it/s]
Training 1/3 epoch (loss 1.3905): 26%|βββ | 525/2046 [06:20<17:58, 1.41it/s]
Training 1/3 epoch (loss 1.4442): 26%|βββ | 525/2046 [06:21<17:58, 1.41it/s]
Training 1/3 epoch (loss 1.4442): 26%|βββ | 526/2046 [06:21<17:23, 1.46it/s]
Training 1/3 epoch (loss 1.3233): 26%|βββ | 526/2046 [06:22<17:23, 1.46it/s]
Training 1/3 epoch (loss 1.3233): 26%|βββ | 527/2046 [06:22<16:50, 1.50it/s]
Training 1/3 epoch (loss 1.2822): 26%|βββ | 527/2046 [06:23<16:50, 1.50it/s]
Training 1/3 epoch (loss 1.2822): 26%|βββ | 528/2046 [06:23<20:21, 1.24it/s]
Training 1/3 epoch (loss 1.0164): 26%|βββ | 528/2046 [06:24<20:21, 1.24it/s]
Training 1/3 epoch (loss 1.0164): 26%|βββ | 529/2046 [06:24<20:10, 1.25it/s]
Training 1/3 epoch (loss 1.0921): 26%|βββ | 529/2046 [06:24<20:10, 1.25it/s]
Training 1/3 epoch (loss 1.0921): 26%|βββ | 530/2046 [06:24<19:53, 1.27it/s]
Training 1/3 epoch (loss 1.2729): 26%|βββ | 530/2046 [06:25<19:53, 1.27it/s]
Training 1/3 epoch (loss 1.2729): 26%|βββ | 531/2046 [06:25<18:52, 1.34it/s]
Training 1/3 epoch (loss 1.4493): 26%|βββ | 531/2046 [06:26<18:52, 1.34it/s]
Training 1/3 epoch (loss 1.4493): 26%|βββ | 532/2046 [06:26<20:36, 1.22it/s]
Training 1/3 epoch (loss 1.3258): 26%|βββ | 532/2046 [06:27<20:36, 1.22it/s]
Training 1/3 epoch (loss 1.3258): 26%|βββ | 533/2046 [06:27<19:22, 1.30it/s]
Training 1/3 epoch (loss 1.2966): 26%|βββ | 533/2046 [06:27<19:22, 1.30it/s]
Training 1/3 epoch (loss 1.2966): 26%|βββ | 534/2046 [06:27<19:11, 1.31it/s]
Training 1/3 epoch (loss 1.2348): 26%|βββ | 534/2046 [06:28<19:11, 1.31it/s]
Training 1/3 epoch (loss 1.2348): 26%|βββ | 535/2046 [06:28<18:03, 1.40it/s]
Training 1/3 epoch (loss 1.1935): 26%|βββ | 535/2046 [06:29<18:03, 1.40it/s]
Training 1/3 epoch (loss 1.1935): 26%|βββ | 536/2046 [06:29<19:19, 1.30it/s]
Training 1/3 epoch (loss 1.3190): 26%|βββ | 536/2046 [06:30<19:19, 1.30it/s]
Training 1/3 epoch (loss 1.3190): 26%|βββ | 537/2046 [06:30<19:10, 1.31it/s]
Training 1/3 epoch (loss 1.1788): 26%|βββ | 537/2046 [06:30<19:10, 1.31it/s]
Training 1/3 epoch (loss 1.1788): 26%|βββ | 538/2046 [06:30<19:04, 1.32it/s]
Training 1/3 epoch (loss 1.2628): 26%|βββ | 538/2046 [06:31<19:04, 1.32it/s]
Training 1/3 epoch (loss 1.2628): 26%|βββ | 539/2046 [06:31<18:04, 1.39it/s]
Training 1/3 epoch (loss 1.2650): 26%|βββ | 539/2046 [06:32<18:04, 1.39it/s]
Training 1/3 epoch (loss 1.2650): 26%|βββ | 540/2046 [06:32<17:14, 1.46it/s]
Training 1/3 epoch (loss 1.2761): 26%|βββ | 540/2046 [06:32<17:14, 1.46it/s]
Training 1/3 epoch (loss 1.2761): 26%|βββ | 541/2046 [06:32<17:47, 1.41it/s]
Training 1/3 epoch (loss 1.2681): 26%|βββ | 541/2046 [06:33<17:47, 1.41it/s]
Training 1/3 epoch (loss 1.2681): 26%|βββ | 542/2046 [06:33<17:21, 1.44it/s]
Training 1/3 epoch (loss 1.3707): 26%|βββ | 542/2046 [06:34<17:21, 1.44it/s]
Training 1/3 epoch (loss 1.3707): 27%|βββ | 543/2046 [06:34<16:39, 1.50it/s]
Training 1/3 epoch (loss 1.1994): 27%|βββ | 543/2046 [06:34<16:39, 1.50it/s]
Training 1/3 epoch (loss 1.1994): 27%|βββ | 544/2046 [06:34<18:04, 1.38it/s]
Training 1/3 epoch (loss 1.3588): 27%|βββ | 544/2046 [06:35<18:04, 1.38it/s]
Training 1/3 epoch (loss 1.3588): 27%|βββ | 545/2046 [06:35<17:54, 1.40it/s]
Training 1/3 epoch (loss 1.4415): 27%|βββ | 545/2046 [06:36<17:54, 1.40it/s]
Training 1/3 epoch (loss 1.4415): 27%|βββ | 546/2046 [06:36<17:20, 1.44it/s]
Training 1/3 epoch (loss 1.4160): 27%|βββ | 546/2046 [06:36<17:20, 1.44it/s]
Training 1/3 epoch (loss 1.4160): 27%|βββ | 547/2046 [06:36<16:51, 1.48it/s]
Training 1/3 epoch (loss 1.1636): 27%|βββ | 547/2046 [06:37<16:51, 1.48it/s]
Training 1/3 epoch (loss 1.1636): 27%|βββ | 548/2046 [06:37<16:58, 1.47it/s]
Training 1/3 epoch (loss 1.4226): 27%|βββ | 548/2046 [06:38<16:58, 1.47it/s]
Training 1/3 epoch (loss 1.4226): 27%|βββ | 549/2046 [06:38<17:28, 1.43it/s]
Training 1/3 epoch (loss 1.5610): 27%|βββ | 549/2046 [06:38<17:28, 1.43it/s]
Training 1/3 epoch (loss 1.5610): 27%|βββ | 550/2046 [06:38<16:44, 1.49it/s]
Training 1/3 epoch (loss 1.3023): 27%|βββ | 550/2046 [06:39<16:44, 1.49it/s]
Training 1/3 epoch (loss 1.3023): 27%|βββ | 551/2046 [06:39<16:17, 1.53it/s]
Training 1/3 epoch (loss 1.0592): 27%|βββ | 551/2046 [06:40<16:17, 1.53it/s]
Training 1/3 epoch (loss 1.0592): 27%|βββ | 552/2046 [06:40<17:11, 1.45it/s]
Training 1/3 epoch (loss 1.3854): 27%|βββ | 552/2046 [06:40<17:11, 1.45it/s]
Training 1/3 epoch (loss 1.3854): 27%|βββ | 553/2046 [06:40<16:46, 1.48it/s]
Training 1/3 epoch (loss 1.3955): 27%|βββ | 553/2046 [06:41<16:46, 1.48it/s]
Training 1/3 epoch (loss 1.3955): 27%|βββ | 554/2046 [06:41<16:18, 1.52it/s]
Training 1/3 epoch (loss 0.9802): 27%|βββ | 554/2046 [06:42<16:18, 1.52it/s]
Training 1/3 epoch (loss 0.9802): 27%|βββ | 555/2046 [06:42<16:19, 1.52it/s]
Training 1/3 epoch (loss 1.3450): 27%|βββ | 555/2046 [06:42<16:19, 1.52it/s]
Training 1/3 epoch (loss 1.3450): 27%|βββ | 556/2046 [06:42<15:59, 1.55it/s]
Training 1/3 epoch (loss 1.3326): 27%|βββ | 556/2046 [06:43<15:59, 1.55it/s]
Training 1/3 epoch (loss 1.3326): 27%|βββ | 557/2046 [06:43<15:47, 1.57it/s]
Training 1/3 epoch (loss 1.3311): 27%|βββ | 557/2046 [06:44<15:47, 1.57it/s]
Training 1/3 epoch (loss 1.3311): 27%|βββ | 558/2046 [06:44<16:33, 1.50it/s]
Training 1/3 epoch (loss 1.2375): 27%|βββ | 558/2046 [06:44<16:33, 1.50it/s]
Training 1/3 epoch (loss 1.2375): 27%|βββ | 559/2046 [06:44<16:20, 1.52it/s]
Training 1/3 epoch (loss 1.3073): 27%|βββ | 559/2046 [06:45<16:20, 1.52it/s]
Training 1/3 epoch (loss 1.3073): 27%|βββ | 560/2046 [06:45<18:29, 1.34it/s]
Training 1/3 epoch (loss 1.3424): 27%|βββ | 560/2046 [06:46<18:29, 1.34it/s]
Training 1/3 epoch (loss 1.3424): 27%|βββ | 561/2046 [06:46<17:36, 1.41it/s]
Training 1/3 epoch (loss 1.0953): 27%|βββ | 561/2046 [06:47<17:36, 1.41it/s]
Training 1/3 epoch (loss 1.0953): 27%|βββ | 562/2046 [06:47<17:02, 1.45it/s]
Training 1/3 epoch (loss 1.1058): 27%|βββ | 562/2046 [06:47<17:02, 1.45it/s]
Training 1/3 epoch (loss 1.1058): 28%|βββ | 563/2046 [06:47<16:28, 1.50it/s]
Training 1/3 epoch (loss 1.2042): 28%|βββ | 563/2046 [06:48<16:28, 1.50it/s]
Training 1/3 epoch (loss 1.2042): 28%|βββ | 564/2046 [06:48<16:59, 1.45it/s]
Training 1/3 epoch (loss 1.0401): 28%|βββ | 564/2046 [06:49<16:59, 1.45it/s]
Training 1/3 epoch (loss 1.0401): 28%|βββ | 565/2046 [06:49<17:51, 1.38it/s]
Training 1/3 epoch (loss 1.3665): 28%|βββ | 565/2046 [06:50<17:51, 1.38it/s]
Training 1/3 epoch (loss 1.3665): 28%|βββ | 566/2046 [06:50<19:02, 1.29it/s]
Training 1/3 epoch (loss 1.2683): 28%|βββ | 566/2046 [06:50<19:02, 1.29it/s]
Training 1/3 epoch (loss 1.2683): 28%|βββ | 567/2046 [06:50<18:53, 1.30it/s]
Training 1/3 epoch (loss 1.1239): 28%|βββ | 567/2046 [06:51<18:53, 1.30it/s]
Training 1/3 epoch (loss 1.1239): 28%|βββ | 568/2046 [06:51<19:28, 1.27it/s]
Training 1/3 epoch (loss 1.3622): 28%|βββ | 568/2046 [06:52<19:28, 1.27it/s]
Training 1/3 epoch (loss 1.3622): 28%|βββ | 569/2046 [06:52<18:11, 1.35it/s]
Training 1/3 epoch (loss 1.3167): 28%|βββ | 569/2046 [06:53<18:11, 1.35it/s]
Training 1/3 epoch (loss 1.3167): 28%|βββ | 570/2046 [06:53<18:25, 1.34it/s]
Training 1/3 epoch (loss 1.2421): 28%|βββ | 570/2046 [06:53<18:25, 1.34it/s]
Training 1/3 epoch (loss 1.2421): 28%|βββ | 571/2046 [06:53<18:50, 1.30it/s]
Training 1/3 epoch (loss 1.3467): 28%|βββ | 571/2046 [06:54<18:50, 1.30it/s]
Training 1/3 epoch (loss 1.3467): 28%|βββ | 572/2046 [06:54<19:48, 1.24it/s]
Training 1/3 epoch (loss 1.0707): 28%|βββ | 572/2046 [06:55<19:48, 1.24it/s]
Training 1/3 epoch (loss 1.0707): 28%|βββ | 573/2046 [06:55<19:45, 1.24it/s]
Training 1/3 epoch (loss 1.3673): 28%|βββ | 573/2046 [06:56<19:45, 1.24it/s]
Training 1/3 epoch (loss 1.3673): 28%|βββ | 574/2046 [06:56<18:35, 1.32it/s]
Training 1/3 epoch (loss 1.2430): 28%|βββ | 574/2046 [06:57<18:35, 1.32it/s]
Training 1/3 epoch (loss 1.2430): 28%|βββ | 575/2046 [06:57<19:20, 1.27it/s]
Training 1/3 epoch (loss 1.2985): 28%|βββ | 575/2046 [06:57<19:20, 1.27it/s]
Training 1/3 epoch (loss 1.2985): 28%|βββ | 576/2046 [06:57<19:10, 1.28it/s]
Training 1/3 epoch (loss 1.3743): 28%|βββ | 576/2046 [06:58<19:10, 1.28it/s]
Training 1/3 epoch (loss 1.3743): 28%|βββ | 577/2046 [06:58<18:07, 1.35it/s]
Training 1/3 epoch (loss 1.2062): 28%|βββ | 577/2046 [06:59<18:07, 1.35it/s]
Training 1/3 epoch (loss 1.2062): 28%|βββ | 578/2046 [06:59<17:44, 1.38it/s]
Training 1/3 epoch (loss 1.2084): 28%|βββ | 578/2046 [06:59<17:44, 1.38it/s]
Training 1/3 epoch (loss 1.2084): 28%|βββ | 579/2046 [06:59<17:39, 1.39it/s]
Training 1/3 epoch (loss 1.4550): 28%|βββ | 579/2046 [07:00<17:39, 1.39it/s]
Training 1/3 epoch (loss 1.4550): 28%|βββ | 580/2046 [07:00<17:26, 1.40it/s]
Training 1/3 epoch (loss 1.3434): 28%|βββ | 580/2046 [07:01<17:26, 1.40it/s]
Training 1/3 epoch (loss 1.3434): 28%|βββ | 581/2046 [07:01<16:51, 1.45it/s]
Training 1/3 epoch (loss 1.1900): 28%|βββ | 581/2046 [07:01<16:51, 1.45it/s]
Training 1/3 epoch (loss 1.1900): 28%|βββ | 582/2046 [07:01<16:40, 1.46it/s]
Training 1/3 epoch (loss 1.2060): 28%|βββ | 582/2046 [07:02<16:40, 1.46it/s]
Training 1/3 epoch (loss 1.2060): 28%|βββ | 583/2046 [07:02<16:22, 1.49it/s]
Training 1/3 epoch (loss 1.4155): 28%|βββ | 583/2046 [07:03<16:22, 1.49it/s]
Training 1/3 epoch (loss 1.4155): 29%|βββ | 584/2046 [07:03<17:39, 1.38it/s]
Training 1/3 epoch (loss 1.1907): 29%|βββ | 584/2046 [07:04<17:39, 1.38it/s]
Training 1/3 epoch (loss 1.1907): 29%|βββ | 585/2046 [07:04<17:29, 1.39it/s]
Training 1/3 epoch (loss 1.4031): 29%|βββ | 585/2046 [07:04<17:29, 1.39it/s]
Training 1/3 epoch (loss 1.4031): 29%|βββ | 586/2046 [07:04<18:16, 1.33it/s]
Training 1/3 epoch (loss 1.5325): 29%|βββ | 586/2046 [07:05<18:16, 1.33it/s]
Training 1/3 epoch (loss 1.5325): 29%|βββ | 587/2046 [07:05<17:36, 1.38it/s]
Training 1/3 epoch (loss 1.2965): 29%|βββ | 587/2046 [07:06<17:36, 1.38it/s]
Training 1/3 epoch (loss 1.2965): 29%|βββ | 588/2046 [07:06<16:47, 1.45it/s]
Training 1/3 epoch (loss 1.1258): 29%|βββ | 588/2046 [07:06<16:47, 1.45it/s]
Training 1/3 epoch (loss 1.1258): 29%|βββ | 589/2046 [07:06<17:08, 1.42it/s]
Training 1/3 epoch (loss 1.3686): 29%|βββ | 589/2046 [07:07<17:08, 1.42it/s]
Training 1/3 epoch (loss 1.3686): 29%|βββ | 590/2046 [07:07<16:54, 1.44it/s]
Training 1/3 epoch (loss 1.2464): 29%|βββ | 590/2046 [07:08<16:54, 1.44it/s]
Training 1/3 epoch (loss 1.2464): 29%|βββ | 591/2046 [07:08<16:39, 1.46it/s]
Training 1/3 epoch (loss 1.1973): 29%|βββ | 591/2046 [07:09<16:39, 1.46it/s]
Training 1/3 epoch (loss 1.1973): 29%|βββ | 592/2046 [07:09<18:57, 1.28it/s]
Training 1/3 epoch (loss 1.3453): 29%|βββ | 592/2046 [07:09<18:57, 1.28it/s]
Training 1/3 epoch (loss 1.3453): 29%|βββ | 593/2046 [07:09<17:51, 1.36it/s]
Training 1/3 epoch (loss 1.1565): 29%|βββ | 593/2046 [07:10<17:51, 1.36it/s]
Training 1/3 epoch (loss 1.1565): 29%|βββ | 594/2046 [07:10<17:09, 1.41it/s]
Training 1/3 epoch (loss 1.1970): 29%|βββ | 594/2046 [07:11<17:09, 1.41it/s]
Training 1/3 epoch (loss 1.1970): 29%|βββ | 595/2046 [07:11<16:25, 1.47it/s]
Training 1/3 epoch (loss 1.1941): 29%|βββ | 595/2046 [07:11<16:25, 1.47it/s]
Training 1/3 epoch (loss 1.1941): 29%|βββ | 596/2046 [07:11<16:12, 1.49it/s]
Training 1/3 epoch (loss 1.2547): 29%|βββ | 596/2046 [07:12<16:12, 1.49it/s]
Training 1/3 epoch (loss 1.2547): 29%|βββ | 597/2046 [07:12<16:05, 1.50it/s]
Training 1/3 epoch (loss 1.2920): 29%|βββ | 597/2046 [07:13<16:05, 1.50it/s]
Training 1/3 epoch (loss 1.2920): 29%|βββ | 598/2046 [07:13<15:49, 1.52it/s]
Training 1/3 epoch (loss 1.1857): 29%|βββ | 598/2046 [07:13<15:49, 1.52it/s]
Training 1/3 epoch (loss 1.1857): 29%|βββ | 599/2046 [07:13<16:35, 1.45it/s]
Training 1/3 epoch (loss 1.2584): 29%|βββ | 599/2046 [07:14<16:35, 1.45it/s]
Training 1/3 epoch (loss 1.2584): 29%|βββ | 600/2046 [07:14<18:04, 1.33it/s]
Training 1/3 epoch (loss 1.2529): 29%|βββ | 600/2046 [07:15<18:04, 1.33it/s]
Training 1/3 epoch (loss 1.2529): 29%|βββ | 601/2046 [07:15<18:26, 1.31it/s]
Training 1/3 epoch (loss 1.2997): 29%|βββ | 601/2046 [07:16<18:26, 1.31it/s]
Training 1/3 epoch (loss 1.2997): 29%|βββ | 602/2046 [07:16<18:11, 1.32it/s]
Training 1/3 epoch (loss 1.3201): 29%|βββ | 602/2046 [07:16<18:11, 1.32it/s]
Training 1/3 epoch (loss 1.3201): 29%|βββ | 603/2046 [07:16<17:26, 1.38it/s]
Training 1/3 epoch (loss 1.1795): 29%|βββ | 603/2046 [07:17<17:26, 1.38it/s]
Training 1/3 epoch (loss 1.1795): 30%|βββ | 604/2046 [07:17<18:01, 1.33it/s]
Training 1/3 epoch (loss 1.2009): 30%|βββ | 604/2046 [07:18<18:01, 1.33it/s]
Training 1/3 epoch (loss 1.2009): 30%|βββ | 605/2046 [07:18<19:04, 1.26it/s]
Training 1/3 epoch (loss 1.4459): 30%|βββ | 605/2046 [07:19<19:04, 1.26it/s]
Training 1/3 epoch (loss 1.4459): 30%|βββ | 606/2046 [07:19<19:15, 1.25it/s]
Training 1/3 epoch (loss 1.3206): 30%|βββ | 606/2046 [07:20<19:15, 1.25it/s]
Training 1/3 epoch (loss 1.3206): 30%|βββ | 607/2046 [07:20<19:57, 1.20it/s]
Training 1/3 epoch (loss 1.2328): 30%|βββ | 607/2046 [07:21<19:57, 1.20it/s]
Training 1/3 epoch (loss 1.2328): 30%|βββ | 608/2046 [07:21<19:46, 1.21it/s]
Training 1/3 epoch (loss 1.1131): 30%|βββ | 608/2046 [07:21<19:46, 1.21it/s]
Training 1/3 epoch (loss 1.1131): 30%|βββ | 609/2046 [07:21<19:05, 1.25it/s]
Training 1/3 epoch (loss 1.5814): 30%|βββ | 609/2046 [07:22<19:05, 1.25it/s]
Training 1/3 epoch (loss 1.5814): 30%|βββ | 610/2046 [07:22<18:26, 1.30it/s]
Training 1/3 epoch (loss 1.3932): 30%|βββ | 610/2046 [07:23<18:26, 1.30it/s]
Training 1/3 epoch (loss 1.3932): 30%|βββ | 611/2046 [07:23<19:33, 1.22it/s]
Training 1/3 epoch (loss 1.5205): 30%|βββ | 611/2046 [07:24<19:33, 1.22it/s]
Training 1/3 epoch (loss 1.5205): 30%|βββ | 612/2046 [07:24<19:35, 1.22it/s]
Training 1/3 epoch (loss 1.2108): 30%|βββ | 612/2046 [07:25<19:35, 1.22it/s]
Training 1/3 epoch (loss 1.2108): 30%|βββ | 613/2046 [07:25<19:20, 1.23it/s]
Training 1/3 epoch (loss 1.1022): 30%|βββ | 613/2046 [07:25<19:20, 1.23it/s]
Training 1/3 epoch (loss 1.1022): 30%|βββ | 614/2046 [07:25<18:39, 1.28it/s]
Training 1/3 epoch (loss 1.1245): 30%|βββ | 614/2046 [07:26<18:39, 1.28it/s]
Training 1/3 epoch (loss 1.1245): 30%|βββ | 615/2046 [07:26<17:30, 1.36it/s]
Training 1/3 epoch (loss 1.2378): 30%|βββ | 615/2046 [07:27<17:30, 1.36it/s]
Training 1/3 epoch (loss 1.2378): 30%|βββ | 616/2046 [07:27<18:20, 1.30it/s]
Training 1/3 epoch (loss 1.1435): 30%|βββ | 616/2046 [07:28<18:20, 1.30it/s]
Training 1/3 epoch (loss 1.1435): 30%|βββ | 617/2046 [07:28<17:19, 1.37it/s]
Training 1/3 epoch (loss 1.2612): 30%|βββ | 617/2046 [07:28<17:19, 1.37it/s]
Training 1/3 epoch (loss 1.2612): 30%|βββ | 618/2046 [07:28<19:02, 1.25it/s]
Training 1/3 epoch (loss 1.3976): 30%|βββ | 618/2046 [07:29<19:02, 1.25it/s]
Training 1/3 epoch (loss 1.3976): 30%|βββ | 619/2046 [07:29<17:45, 1.34it/s]
Training 1/3 epoch (loss 1.1346): 30%|βββ | 619/2046 [07:30<17:45, 1.34it/s]
Training 1/3 epoch (loss 1.1346): 30%|βββ | 620/2046 [07:30<16:57, 1.40it/s]
Training 1/3 epoch (loss 1.2697): 30%|βββ | 620/2046 [07:30<16:57, 1.40it/s]
Training 1/3 epoch (loss 1.2697): 30%|βββ | 621/2046 [07:30<16:26, 1.44it/s]
Training 1/3 epoch (loss 1.4471): 30%|βββ | 621/2046 [07:31<16:26, 1.44it/s]
Training 1/3 epoch (loss 1.4471): 30%|βββ | 622/2046 [07:31<16:07, 1.47it/s]
Training 1/3 epoch (loss 1.0862): 30%|βββ | 622/2046 [07:32<16:07, 1.47it/s]
Training 1/3 epoch (loss 1.0862): 30%|βββ | 623/2046 [07:32<16:06, 1.47it/s]
Training 1/3 epoch (loss 1.4197): 30%|βββ | 623/2046 [07:32<16:06, 1.47it/s]
Training 1/3 epoch (loss 1.4197): 30%|βββ | 624/2046 [07:32<16:35, 1.43it/s]
Training 1/3 epoch (loss 1.3636): 30%|βββ | 624/2046 [07:33<16:35, 1.43it/s]
Training 1/3 epoch (loss 1.3636): 31%|βββ | 625/2046 [07:33<17:00, 1.39it/s]
Training 1/3 epoch (loss 1.1772): 31%|βββ | 625/2046 [07:34<17:00, 1.39it/s]
Training 1/3 epoch (loss 1.1772): 31%|βββ | 626/2046 [07:34<16:24, 1.44it/s]
Training 1/3 epoch (loss 1.2778): 31%|βββ | 626/2046 [07:35<16:24, 1.44it/s]
Training 1/3 epoch (loss 1.2778): 31%|βββ | 627/2046 [07:35<16:12, 1.46it/s]
Training 1/3 epoch (loss 1.3187): 31%|βββ | 627/2046 [07:35<16:12, 1.46it/s]
Training 1/3 epoch (loss 1.3187): 31%|βββ | 628/2046 [07:35<16:14, 1.45it/s]
Training 1/3 epoch (loss 1.2450): 31%|βββ | 628/2046 [07:36<16:14, 1.45it/s]
Training 1/3 epoch (loss 1.2450): 31%|βββ | 629/2046 [07:36<16:50, 1.40it/s]
Training 1/3 epoch (loss 1.2618): 31%|βββ | 629/2046 [07:37<16:50, 1.40it/s]
Training 1/3 epoch (loss 1.2618): 31%|βββ | 630/2046 [07:37<16:08, 1.46it/s]
Training 1/3 epoch (loss 1.1377): 31%|βββ | 630/2046 [07:37<16:08, 1.46it/s]
Training 1/3 epoch (loss 1.1377): 31%|βββ | 631/2046 [07:37<17:35, 1.34it/s]
Training 1/3 epoch (loss 1.2688): 31%|βββ | 631/2046 [07:39<17:35, 1.34it/s]
Training 1/3 epoch (loss 1.2688): 31%|βββ | 632/2046 [07:39<19:49, 1.19it/s]
Training 1/3 epoch (loss 1.4455): 31%|βββ | 632/2046 [07:39<19:49, 1.19it/s]
Training 1/3 epoch (loss 1.4455): 31%|βββ | 633/2046 [07:39<18:19, 1.29it/s]
Training 1/3 epoch (loss 1.2564): 31%|βββ | 633/2046 [07:40<18:19, 1.29it/s]
Training 1/3 epoch (loss 1.2564): 31%|βββ | 634/2046 [07:40<17:18, 1.36it/s]
Training 1/3 epoch (loss 1.2785): 31%|βββ | 634/2046 [07:40<17:18, 1.36it/s]
Training 1/3 epoch (loss 1.2785): 31%|βββ | 635/2046 [07:40<16:37, 1.41it/s]
Training 1/3 epoch (loss 1.3118): 31%|βββ | 635/2046 [07:41<16:37, 1.41it/s]
Training 1/3 epoch (loss 1.3118): 31%|βββ | 636/2046 [07:41<15:51, 1.48it/s]
Training 1/3 epoch (loss 1.2402): 31%|βββ | 636/2046 [07:42<15:51, 1.48it/s]
Training 1/3 epoch (loss 1.2402): 31%|βββ | 637/2046 [07:42<15:34, 1.51it/s]
Training 1/3 epoch (loss 1.2461): 31%|βββ | 637/2046 [07:42<15:34, 1.51it/s]
Training 1/3 epoch (loss 1.2461): 31%|βββ | 638/2046 [07:42<15:22, 1.53it/s]
Training 1/3 epoch (loss 1.3134): 31%|βββ | 638/2046 [07:43<15:22, 1.53it/s]
Training 1/3 epoch (loss 1.3134): 31%|βββ | 639/2046 [07:43<15:22, 1.53it/s]
Training 1/3 epoch (loss 1.4084): 31%|βββ | 639/2046 [07:44<15:22, 1.53it/s]
Training 1/3 epoch (loss 1.4084): 31%|ββββ | 640/2046 [07:44<16:30, 1.42it/s]
Training 1/3 epoch (loss 1.3926): 31%|ββββ | 640/2046 [07:44<16:30, 1.42it/s]
Training 1/3 epoch (loss 1.3926): 31%|ββββ | 641/2046 [07:44<15:57, 1.47it/s]
Training 1/3 epoch (loss 1.0002): 31%|ββββ | 641/2046 [07:45<15:57, 1.47it/s]
Training 1/3 epoch (loss 1.0002): 31%|ββββ | 642/2046 [07:45<16:25, 1.42it/s]
Training 1/3 epoch (loss 1.2187): 31%|ββββ | 642/2046 [07:46<16:25, 1.42it/s]
Training 1/3 epoch (loss 1.2187): 31%|ββββ | 643/2046 [07:46<15:57, 1.46it/s]
Training 1/3 epoch (loss 1.1209): 31%|ββββ | 643/2046 [07:47<15:57, 1.46it/s]
Training 1/3 epoch (loss 1.1209): 31%|ββββ | 644/2046 [07:47<15:56, 1.47it/s]
Training 1/3 epoch (loss 1.1949): 31%|ββββ | 644/2046 [07:47<15:56, 1.47it/s]
Training 1/3 epoch (loss 1.1949): 32%|ββββ | 645/2046 [07:47<15:22, 1.52it/s]
Training 1/3 epoch (loss 1.2069): 32%|ββββ | 645/2046 [07:48<15:22, 1.52it/s]
Training 1/3 epoch (loss 1.2069): 32%|ββββ | 646/2046 [07:48<16:14, 1.44it/s]
Training 1/3 epoch (loss 1.1411): 32%|ββββ | 646/2046 [07:49<16:14, 1.44it/s]
Training 1/3 epoch (loss 1.1411): 32%|ββββ | 647/2046 [07:49<16:15, 1.43it/s]
Training 1/3 epoch (loss 1.1290): 32%|ββββ | 647/2046 [07:50<16:15, 1.43it/s]
Training 1/3 epoch (loss 1.1290): 32%|ββββ | 648/2046 [07:50<17:43, 1.32it/s]
Training 1/3 epoch (loss 1.2461): 32%|ββββ | 648/2046 [07:50<17:43, 1.32it/s]
Training 1/3 epoch (loss 1.2461): 32%|ββββ | 649/2046 [07:50<17:22, 1.34it/s]
Training 1/3 epoch (loss 1.1837): 32%|ββββ | 649/2046 [07:51<17:22, 1.34it/s]
Training 1/3 epoch (loss 1.1837): 32%|ββββ | 650/2046 [07:51<16:56, 1.37it/s]
Training 1/3 epoch (loss 1.3113): 32%|ββββ | 650/2046 [07:52<16:56, 1.37it/s]
Training 1/3 epoch (loss 1.3113): 32%|ββββ | 651/2046 [07:52<16:07, 1.44it/s]
Training 1/3 epoch (loss 1.3674): 32%|ββββ | 651/2046 [07:52<16:07, 1.44it/s]
Training 1/3 epoch (loss 1.3674): 32%|ββββ | 652/2046 [07:52<16:07, 1.44it/s]
Training 1/3 epoch (loss 1.3325): 32%|ββββ | 652/2046 [07:53<16:07, 1.44it/s]
Training 1/3 epoch (loss 1.3325): 32%|ββββ | 653/2046 [07:53<16:47, 1.38it/s]
Training 1/3 epoch (loss 1.0663): 32%|ββββ | 653/2046 [07:54<16:47, 1.38it/s]
Training 1/3 epoch (loss 1.0663): 32%|ββββ | 654/2046 [07:54<17:19, 1.34it/s]
Training 1/3 epoch (loss 1.1728): 32%|ββββ | 654/2046 [07:55<17:19, 1.34it/s]
Training 1/3 epoch (loss 1.1728): 32%|ββββ | 655/2046 [07:55<16:57, 1.37it/s]
Training 1/3 epoch (loss 1.2392): 32%|ββββ | 655/2046 [07:55<16:57, 1.37it/s]
Training 1/3 epoch (loss 1.2392): 32%|ββββ | 656/2046 [07:55<17:38, 1.31it/s]
Training 1/3 epoch (loss 1.0408): 32%|ββββ | 656/2046 [07:56<17:38, 1.31it/s]
Training 1/3 epoch (loss 1.0408): 32%|ββββ | 657/2046 [07:56<17:01, 1.36it/s]
Training 1/3 epoch (loss 1.2389): 32%|ββββ | 657/2046 [07:57<17:01, 1.36it/s]
Training 1/3 epoch (loss 1.2389): 32%|ββββ | 658/2046 [07:57<17:05, 1.35it/s]
Training 1/3 epoch (loss 1.2235): 32%|ββββ | 658/2046 [07:57<17:05, 1.35it/s]
Training 1/3 epoch (loss 1.2235): 32%|ββββ | 659/2046 [07:57<16:40, 1.39it/s]
Training 1/3 epoch (loss 1.4380): 32%|ββββ | 659/2046 [07:58<16:40, 1.39it/s]
Training 1/3 epoch (loss 1.4380): 32%|ββββ | 660/2046 [07:58<15:59, 1.44it/s]
Training 1/3 epoch (loss 1.0925): 32%|ββββ | 660/2046 [07:59<15:59, 1.44it/s]
Training 1/3 epoch (loss 1.0925): 32%|ββββ | 661/2046 [07:59<15:35, 1.48it/s]
Training 1/3 epoch (loss 1.3385): 32%|ββββ | 661/2046 [07:59<15:35, 1.48it/s]
Training 1/3 epoch (loss 1.3385): 32%|ββββ | 662/2046 [07:59<15:32, 1.48it/s]
Training 1/3 epoch (loss 1.0216): 32%|ββββ | 662/2046 [08:00<15:32, 1.48it/s]
Training 1/3 epoch (loss 1.0216): 32%|ββββ | 663/2046 [08:00<15:17, 1.51it/s]
Training 1/3 epoch (loss 1.3586): 32%|ββββ | 663/2046 [08:01<15:17, 1.51it/s]
Training 1/3 epoch (loss 1.3586): 32%|ββββ | 664/2046 [08:01<16:55, 1.36it/s]
Training 1/3 epoch (loss 1.3428): 32%|ββββ | 664/2046 [08:02<16:55, 1.36it/s]
Training 1/3 epoch (loss 1.3428): 33%|ββββ | 665/2046 [08:02<16:20, 1.41it/s]
Training 1/3 epoch (loss 1.3668): 33%|ββββ | 665/2046 [08:02<16:20, 1.41it/s]
Training 1/3 epoch (loss 1.3668): 33%|ββββ | 666/2046 [08:02<16:27, 1.40it/s]
Training 1/3 epoch (loss 1.1419): 33%|ββββ | 666/2046 [08:03<16:27, 1.40it/s]
Training 1/3 epoch (loss 1.1419): 33%|ββββ | 667/2046 [08:03<16:31, 1.39it/s]
Training 1/3 epoch (loss 1.0005): 33%|ββββ | 667/2046 [08:04<16:31, 1.39it/s]
Training 1/3 epoch (loss 1.0005): 33%|ββββ | 668/2046 [08:04<17:05, 1.34it/s]
Training 1/3 epoch (loss 1.1256): 33%|ββββ | 668/2046 [08:04<17:05, 1.34it/s]
Training 1/3 epoch (loss 1.1256): 33%|ββββ | 669/2046 [08:04<16:25, 1.40it/s]
Training 1/3 epoch (loss 1.3752): 33%|ββββ | 669/2046 [08:05<16:25, 1.40it/s]
Training 1/3 epoch (loss 1.3752): 33%|ββββ | 670/2046 [08:05<17:09, 1.34it/s]
Training 1/3 epoch (loss 1.0817): 33%|ββββ | 670/2046 [08:06<17:09, 1.34it/s]
Training 1/3 epoch (loss 1.0817): 33%|ββββ | 671/2046 [08:06<16:29, 1.39it/s]
Training 1/3 epoch (loss 1.1507): 33%|ββββ | 671/2046 [08:07<16:29, 1.39it/s]
Training 1/3 epoch (loss 1.1507): 33%|ββββ | 672/2046 [08:07<16:48, 1.36it/s]
Training 1/3 epoch (loss 1.3130): 33%|ββββ | 672/2046 [08:08<16:48, 1.36it/s]
Training 1/3 epoch (loss 1.3130): 33%|ββββ | 673/2046 [08:08<17:11, 1.33it/s]
Training 1/3 epoch (loss 1.3993): 33%|ββββ | 673/2046 [08:08<17:11, 1.33it/s]
Training 1/3 epoch (loss 1.3993): 33%|ββββ | 674/2046 [08:08<16:27, 1.39it/s]
Training 1/3 epoch (loss 1.1535): 33%|ββββ | 674/2046 [08:09<16:27, 1.39it/s]
Training 1/3 epoch (loss 1.1535): 33%|ββββ | 675/2046 [08:09<16:03, 1.42it/s]
Training 1/3 epoch (loss 1.4949): 33%|ββββ | 675/2046 [08:09<16:03, 1.42it/s]
Training 1/3 epoch (loss 1.4949): 33%|ββββ | 676/2046 [08:09<15:49, 1.44it/s]
Training 1/3 epoch (loss 1.0747): 33%|ββββ | 676/2046 [08:10<15:49, 1.44it/s]
Training 1/3 epoch (loss 1.0747): 33%|ββββ | 677/2046 [08:10<15:14, 1.50it/s]
Training 1/3 epoch (loss 1.3601): 33%|ββββ | 677/2046 [08:11<15:14, 1.50it/s]
Training 1/3 epoch (loss 1.3601): 33%|ββββ | 678/2046 [08:11<14:52, 1.53it/s]
Training 1/3 epoch (loss 1.1355): 33%|ββββ | 678/2046 [08:11<14:52, 1.53it/s]
Training 1/3 epoch (loss 1.1355): 33%|ββββ | 679/2046 [08:11<15:06, 1.51it/s]
Training 1/3 epoch (loss 1.1183): 33%|ββββ | 679/2046 [08:12<15:06, 1.51it/s]
Training 1/3 epoch (loss 1.1183): 33%|ββββ | 680/2046 [08:12<15:39, 1.45it/s]
Training 1/3 epoch (loss 1.1931): 33%|ββββ | 680/2046 [08:13<15:39, 1.45it/s]
Training 1/3 epoch (loss 1.1931): 33%|ββββ | 681/2046 [08:13<16:06, 1.41it/s]
Training 1/3 epoch (loss 1.1275): 33%|ββββ | 681/2046 [08:14<16:06, 1.41it/s]
Training 1/3 epoch (loss 1.1275): 33%|ββββ | 682/2046 [08:14<15:28, 1.47it/s]
Training 2/3 epoch (loss 1.1908): 33%|ββββ | 682/2046 [08:14<15:28, 1.47it/s]
Training 2/3 epoch (loss 1.1908): 33%|ββββ | 683/2046 [08:14<15:37, 1.45it/s]
Training 2/3 epoch (loss 1.1219): 33%|ββββ | 683/2046 [08:15<15:37, 1.45it/s]
Training 2/3 epoch (loss 1.1219): 33%|ββββ | 684/2046 [08:15<15:50, 1.43it/s]
Training 2/3 epoch (loss 1.2767): 33%|ββββ | 684/2046 [08:16<15:50, 1.43it/s]
Training 2/3 epoch (loss 1.2767): 33%|ββββ | 685/2046 [08:16<16:13, 1.40it/s]
Training 2/3 epoch (loss 1.2107): 33%|ββββ | 685/2046 [08:16<16:13, 1.40it/s]
Training 2/3 epoch (loss 1.2107): 34%|ββββ | 686/2046 [08:16<15:35, 1.45it/s]
Training 2/3 epoch (loss 0.9624): 34%|ββββ | 686/2046 [08:17<15:35, 1.45it/s]
Training 2/3 epoch (loss 0.9624): 34%|ββββ | 687/2046 [08:17<16:02, 1.41it/s]
Training 2/3 epoch (loss 1.2560): 34%|ββββ | 687/2046 [08:18<16:02, 1.41it/s]
Training 2/3 epoch (loss 1.2560): 34%|ββββ | 688/2046 [08:18<17:33, 1.29it/s]
Training 2/3 epoch (loss 1.1469): 34%|ββββ | 688/2046 [08:19<17:33, 1.29it/s]
Training 2/3 epoch (loss 1.1469): 34%|ββββ | 689/2046 [08:19<19:43, 1.15it/s]
Training 2/3 epoch (loss 1.3729): 34%|ββββ | 689/2046 [08:20<19:43, 1.15it/s]
Training 2/3 epoch (loss 1.3729): 34%|ββββ | 690/2046 [08:20<18:33, 1.22it/s]
Training 2/3 epoch (loss 1.1191): 34%|ββββ | 690/2046 [08:21<18:33, 1.22it/s]
Training 2/3 epoch (loss 1.1191): 34%|ββββ | 691/2046 [08:21<19:46, 1.14it/s]
Training 2/3 epoch (loss 1.2251): 34%|ββββ | 691/2046 [08:21<19:46, 1.14it/s]
Training 2/3 epoch (loss 1.2251): 34%|ββββ | 692/2046 [08:21<18:02, 1.25it/s]
Training 2/3 epoch (loss 1.1102): 34%|ββββ | 692/2046 [08:22<18:02, 1.25it/s]
Training 2/3 epoch (loss 1.1102): 34%|ββββ | 693/2046 [08:22<17:20, 1.30it/s]
Training 2/3 epoch (loss 1.1865): 34%|ββββ | 693/2046 [08:23<17:20, 1.30it/s]
Training 2/3 epoch (loss 1.1865): 34%|ββββ | 694/2046 [08:23<16:25, 1.37it/s]
Training 2/3 epoch (loss 1.1875): 34%|ββββ | 694/2046 [08:24<16:25, 1.37it/s]
Training 2/3 epoch (loss 1.1875): 34%|ββββ | 695/2046 [08:24<17:43, 1.27it/s]
Training 2/3 epoch (loss 1.0318): 34%|ββββ | 695/2046 [08:25<17:43, 1.27it/s]
Training 2/3 epoch (loss 1.0318): 34%|ββββ | 696/2046 [08:25<18:36, 1.21it/s]
Training 2/3 epoch (loss 1.1638): 34%|ββββ | 696/2046 [08:25<18:36, 1.21it/s]
Training 2/3 epoch (loss 1.1638): 34%|ββββ | 697/2046 [08:25<17:11, 1.31it/s]
Training 2/3 epoch (loss 1.1147): 34%|ββββ | 697/2046 [08:26<17:11, 1.31it/s]
Training 2/3 epoch (loss 1.1147): 34%|ββββ | 698/2046 [08:26<16:27, 1.37it/s]
Training 2/3 epoch (loss 1.2422): 34%|ββββ | 698/2046 [08:27<16:27, 1.37it/s]
Training 2/3 epoch (loss 1.2422): 34%|ββββ | 699/2046 [08:27<16:17, 1.38it/s]
Training 2/3 epoch (loss 1.2568): 34%|ββββ | 699/2046 [08:27<16:17, 1.38it/s]
Training 2/3 epoch (loss 1.2568): 34%|ββββ | 700/2046 [08:27<16:10, 1.39it/s]
Training 2/3 epoch (loss 1.2597): 34%|ββββ | 700/2046 [08:28<16:10, 1.39it/s]
Training 2/3 epoch (loss 1.2597): 34%|ββββ | 701/2046 [08:28<15:23, 1.46it/s]
Training 2/3 epoch (loss 1.3618): 34%|ββββ | 701/2046 [08:29<15:23, 1.46it/s]
Training 2/3 epoch (loss 1.3618): 34%|ββββ | 702/2046 [08:29<15:00, 1.49it/s]
Training 2/3 epoch (loss 1.0957): 34%|ββββ | 702/2046 [08:29<15:00, 1.49it/s]
Training 2/3 epoch (loss 1.0957): 34%|ββββ | 703/2046 [08:29<15:34, 1.44it/s]
Training 2/3 epoch (loss 1.0195): 34%|ββββ | 703/2046 [08:30<15:34, 1.44it/s]
Training 2/3 epoch (loss 1.0195): 34%|ββββ | 704/2046 [08:30<16:28, 1.36it/s]
Training 2/3 epoch (loss 1.1162): 34%|ββββ | 704/2046 [08:31<16:28, 1.36it/s]
Training 2/3 epoch (loss 1.1162): 34%|ββββ | 705/2046 [08:31<15:47, 1.42it/s]
Training 2/3 epoch (loss 1.2650): 34%|ββββ | 705/2046 [08:31<15:47, 1.42it/s]
Training 2/3 epoch (loss 1.2650): 35%|ββββ | 706/2046 [08:31<15:25, 1.45it/s]
Training 2/3 epoch (loss 0.9815): 35%|ββββ | 706/2046 [08:32<15:25, 1.45it/s]
Training 2/3 epoch (loss 0.9815): 35%|ββββ | 707/2046 [08:32<15:12, 1.47it/s]
Training 2/3 epoch (loss 0.9893): 35%|ββββ | 707/2046 [08:33<15:12, 1.47it/s]
Training 2/3 epoch (loss 0.9893): 35%|ββββ | 708/2046 [08:33<15:19, 1.46it/s]
Training 2/3 epoch (loss 1.3955): 35%|ββββ | 708/2046 [08:33<15:19, 1.46it/s]
Training 2/3 epoch (loss 1.3955): 35%|ββββ | 709/2046 [08:33<15:17, 1.46it/s]
Training 2/3 epoch (loss 1.0205): 35%|ββββ | 709/2046 [08:34<15:17, 1.46it/s]
Training 2/3 epoch (loss 1.0205): 35%|ββββ | 710/2046 [08:34<15:39, 1.42it/s]
Training 2/3 epoch (loss 0.9590): 35%|ββββ | 710/2046 [08:35<15:39, 1.42it/s]
Training 2/3 epoch (loss 0.9590): 35%|ββββ | 711/2046 [08:35<14:59, 1.48it/s]
Training 2/3 epoch (loss 1.2597): 35%|ββββ | 711/2046 [08:36<14:59, 1.48it/s]
Training 2/3 epoch (loss 1.2597): 35%|ββββ | 712/2046 [08:36<17:03, 1.30it/s]
Training 2/3 epoch (loss 1.2264): 35%|ββββ | 712/2046 [08:37<17:03, 1.30it/s]
Training 2/3 epoch (loss 1.2264): 35%|ββββ | 713/2046 [08:37<16:41, 1.33it/s]
Training 2/3 epoch (loss 1.4168): 35%|ββββ | 713/2046 [08:37<16:41, 1.33it/s]
Training 2/3 epoch (loss 1.4168): 35%|ββββ | 714/2046 [08:37<15:46, 1.41it/s]
Training 2/3 epoch (loss 0.9552): 35%|ββββ | 714/2046 [08:38<15:46, 1.41it/s]
Training 2/3 epoch (loss 0.9552): 35%|ββββ | 715/2046 [08:38<15:13, 1.46it/s]
Training 2/3 epoch (loss 1.1274): 35%|ββββ | 715/2046 [08:38<15:13, 1.46it/s]
Training 2/3 epoch (loss 1.1274): 35%|ββββ | 716/2046 [08:38<15:29, 1.43it/s]
Training 2/3 epoch (loss 1.0671): 35%|ββββ | 716/2046 [08:39<15:29, 1.43it/s]
Training 2/3 epoch (loss 1.0671): 35%|ββββ | 717/2046 [08:39<15:43, 1.41it/s]
Training 2/3 epoch (loss 1.0736): 35%|ββββ | 717/2046 [08:40<15:43, 1.41it/s]
Training 2/3 epoch (loss 1.0736): 35%|ββββ | 718/2046 [08:40<15:06, 1.46it/s]
Training 2/3 epoch (loss 1.0020): 35%|ββββ | 718/2046 [08:40<15:06, 1.46it/s]
Training 2/3 epoch (loss 1.0020): 35%|ββββ | 719/2046 [08:40<14:42, 1.50it/s]
Training 2/3 epoch (loss 1.0923): 35%|ββββ | 719/2046 [08:41<14:42, 1.50it/s]
Training 2/3 epoch (loss 1.0923): 35%|ββββ | 720/2046 [08:41<15:46, 1.40it/s]
Training 2/3 epoch (loss 1.1556): 35%|ββββ | 720/2046 [08:42<15:46, 1.40it/s]
Training 2/3 epoch (loss 1.1556): 35%|ββββ | 721/2046 [08:42<16:45, 1.32it/s]
Training 2/3 epoch (loss 1.0550): 35%|ββββ | 721/2046 [08:43<16:45, 1.32it/s]
Training 2/3 epoch (loss 1.0550): 35%|ββββ | 722/2046 [08:43<15:59, 1.38it/s]
Training 2/3 epoch (loss 0.9700): 35%|ββββ | 722/2046 [08:44<15:59, 1.38it/s]
Training 2/3 epoch (loss 0.9700): 35%|ββββ | 723/2046 [08:44<17:08, 1.29it/s]
Training 2/3 epoch (loss 1.1242): 35%|ββββ | 723/2046 [08:44<17:08, 1.29it/s]
Training 2/3 epoch (loss 1.1242): 35%|ββββ | 724/2046 [08:44<16:01, 1.37it/s]
Training 2/3 epoch (loss 0.9795): 35%|ββββ | 724/2046 [08:45<16:01, 1.37it/s]
Training 2/3 epoch (loss 0.9795): 35%|ββββ | 725/2046 [08:45<17:25, 1.26it/s]
Training 2/3 epoch (loss 0.9824): 35%|ββββ | 725/2046 [08:46<17:25, 1.26it/s]
Training 2/3 epoch (loss 0.9824): 35%|ββββ | 726/2046 [08:46<16:06, 1.37it/s]
Training 2/3 epoch (loss 1.1038): 35%|ββββ | 726/2046 [08:46<16:06, 1.37it/s]
Training 2/3 epoch (loss 1.1038): 36%|ββββ | 727/2046 [08:46<15:22, 1.43it/s]
Training 2/3 epoch (loss 1.1675): 36%|ββββ | 727/2046 [08:47<15:22, 1.43it/s]
Training 2/3 epoch (loss 1.1675): 36%|ββββ | 728/2046 [08:47<16:13, 1.35it/s]
Training 2/3 epoch (loss 1.0779): 36%|ββββ | 728/2046 [08:48<16:13, 1.35it/s]
Training 2/3 epoch (loss 1.0779): 36%|ββββ | 729/2046 [08:48<16:32, 1.33it/s]
Training 2/3 epoch (loss 0.8952): 36%|ββββ | 729/2046 [08:49<16:32, 1.33it/s]
Training 2/3 epoch (loss 0.8952): 36%|ββββ | 730/2046 [08:49<16:41, 1.31it/s]
Training 2/3 epoch (loss 0.9137): 36%|ββββ | 730/2046 [08:50<16:41, 1.31it/s]
Training 2/3 epoch (loss 0.9137): 36%|ββββ | 731/2046 [08:50<17:05, 1.28it/s]
Training 2/3 epoch (loss 0.9189): 36%|ββββ | 731/2046 [08:50<17:05, 1.28it/s]
Training 2/3 epoch (loss 0.9189): 36%|ββββ | 732/2046 [08:50<16:46, 1.31it/s]
Training 2/3 epoch (loss 1.1000): 36%|ββββ | 732/2046 [08:51<16:46, 1.31it/s]
Training 2/3 epoch (loss 1.1000): 36%|ββββ | 733/2046 [08:51<15:50, 1.38it/s]
Training 2/3 epoch (loss 0.9938): 36%|ββββ | 733/2046 [08:52<15:50, 1.38it/s]
Training 2/3 epoch (loss 0.9938): 36%|ββββ | 734/2046 [08:52<15:23, 1.42it/s]
Training 2/3 epoch (loss 0.8436): 36%|ββββ | 734/2046 [08:53<15:23, 1.42it/s]
Training 2/3 epoch (loss 0.8436): 36%|ββββ | 735/2046 [08:53<15:59, 1.37it/s]
Training 2/3 epoch (loss 0.9766): 36%|ββββ | 735/2046 [08:54<15:59, 1.37it/s]
Training 2/3 epoch (loss 0.9766): 36%|ββββ | 736/2046 [08:54<18:22, 1.19it/s]
Training 2/3 epoch (loss 0.9342): 36%|ββββ | 736/2046 [08:54<18:22, 1.19it/s]
Training 2/3 epoch (loss 0.9342): 36%|ββββ | 737/2046 [08:54<17:27, 1.25it/s]
Training 2/3 epoch (loss 0.9149): 36%|ββββ | 737/2046 [08:55<17:27, 1.25it/s]
Training 2/3 epoch (loss 0.9149): 36%|ββββ | 738/2046 [08:55<17:31, 1.24it/s]
Training 2/3 epoch (loss 0.9080): 36%|ββββ | 738/2046 [08:56<17:31, 1.24it/s]
Training 2/3 epoch (loss 0.9080): 36%|ββββ | 739/2046 [08:56<16:47, 1.30it/s]
Training 2/3 epoch (loss 0.9781): 36%|ββββ | 739/2046 [08:57<16:47, 1.30it/s]
Training 2/3 epoch (loss 0.9781): 36%|ββββ | 740/2046 [08:57<17:01, 1.28it/s]
Training 2/3 epoch (loss 0.8933): 36%|ββββ | 740/2046 [08:57<17:01, 1.28it/s]
Training 2/3 epoch (loss 0.8933): 36%|ββββ | 741/2046 [08:57<16:03, 1.36it/s]
Training 2/3 epoch (loss 0.9970): 36%|ββββ | 741/2046 [08:58<16:03, 1.36it/s]
Training 2/3 epoch (loss 0.9970): 36%|ββββ | 742/2046 [08:58<15:13, 1.43it/s]
Training 2/3 epoch (loss 0.9428): 36%|ββββ | 742/2046 [08:59<15:13, 1.43it/s]
Training 2/3 epoch (loss 0.9428): 36%|ββββ | 743/2046 [08:59<14:54, 1.46it/s]
Training 2/3 epoch (loss 0.9946): 36%|ββββ | 743/2046 [09:00<14:54, 1.46it/s]
Training 2/3 epoch (loss 0.9946): 36%|ββββ | 744/2046 [09:00<17:37, 1.23it/s]
Training 2/3 epoch (loss 1.0980): 36%|ββββ | 744/2046 [09:00<17:37, 1.23it/s]
Training 2/3 epoch (loss 1.0980): 36%|ββββ | 745/2046 [09:00<16:23, 1.32it/s]
Training 2/3 epoch (loss 1.0874): 36%|ββββ | 745/2046 [09:01<16:23, 1.32it/s]
Training 2/3 epoch (loss 1.0874): 36%|ββββ | 746/2046 [09:01<15:59, 1.35it/s]
Training 2/3 epoch (loss 1.1151): 36%|ββββ | 746/2046 [09:02<15:59, 1.35it/s]
Training 2/3 epoch (loss 1.1151): 37%|ββββ | 747/2046 [09:02<15:50, 1.37it/s]
Training 2/3 epoch (loss 0.8495): 37%|ββββ | 747/2046 [09:02<15:50, 1.37it/s]
Training 2/3 epoch (loss 0.8495): 37%|ββββ | 748/2046 [09:02<15:01, 1.44it/s]
Training 2/3 epoch (loss 1.0254): 37%|ββββ | 748/2046 [09:03<15:01, 1.44it/s]
Training 2/3 epoch (loss 1.0254): 37%|ββββ | 749/2046 [09:03<15:10, 1.42it/s]
Training 2/3 epoch (loss 0.9078): 37%|ββββ | 749/2046 [09:04<15:10, 1.42it/s]
Training 2/3 epoch (loss 0.9078): 37%|ββββ | 750/2046 [09:04<15:30, 1.39it/s]
Training 2/3 epoch (loss 1.0052): 37%|ββββ | 750/2046 [09:04<15:30, 1.39it/s]
Training 2/3 epoch (loss 1.0052): 37%|ββββ | 751/2046 [09:04<14:50, 1.45it/s]
Training 2/3 epoch (loss 0.9514): 37%|ββββ | 751/2046 [09:05<14:50, 1.45it/s]
Training 2/3 epoch (loss 0.9514): 37%|ββββ | 752/2046 [09:05<16:04, 1.34it/s]
Training 2/3 epoch (loss 0.8917): 37%|ββββ | 752/2046 [09:06<16:04, 1.34it/s]
Training 2/3 epoch (loss 0.8917): 37%|ββββ | 753/2046 [09:06<16:10, 1.33it/s]
Training 2/3 epoch (loss 0.8368): 37%|ββββ | 753/2046 [09:07<16:10, 1.33it/s]
Training 2/3 epoch (loss 0.8368): 37%|ββββ | 754/2046 [09:07<15:18, 1.41it/s]
Training 2/3 epoch (loss 0.8282): 37%|ββββ | 754/2046 [09:07<15:18, 1.41it/s]
Training 2/3 epoch (loss 0.8282): 37%|ββββ | 755/2046 [09:07<15:34, 1.38it/s]
Training 2/3 epoch (loss 0.7947): 37%|ββββ | 755/2046 [09:08<15:34, 1.38it/s]
Training 2/3 epoch (loss 0.7947): 37%|ββββ | 756/2046 [09:08<14:57, 1.44it/s]
Training 2/3 epoch (loss 0.7756): 37%|ββββ | 756/2046 [09:09<14:57, 1.44it/s]
Training 2/3 epoch (loss 0.7756): 37%|ββββ | 757/2046 [09:09<14:38, 1.47it/s]
Training 2/3 epoch (loss 0.7673): 37%|ββββ | 757/2046 [09:09<14:38, 1.47it/s]
Training 2/3 epoch (loss 0.7673): 37%|ββββ | 758/2046 [09:09<14:25, 1.49it/s]
Training 2/3 epoch (loss 0.8646): 37%|ββββ | 758/2046 [09:10<14:25, 1.49it/s]
Training 2/3 epoch (loss 0.8646): 37%|ββββ | 759/2046 [09:10<14:09, 1.52it/s]
Training 2/3 epoch (loss 0.7743): 37%|ββββ | 759/2046 [09:11<14:09, 1.52it/s]
Training 2/3 epoch (loss 0.7743): 37%|ββββ | 760/2046 [09:11<14:37, 1.47it/s]
Training 2/3 epoch (loss 0.7486): 37%|ββββ | 760/2046 [09:11<14:37, 1.47it/s]
Training 2/3 epoch (loss 0.7486): 37%|ββββ | 761/2046 [09:11<14:24, 1.49it/s]
Training 2/3 epoch (loss 0.6748): 37%|ββββ | 761/2046 [09:12<14:24, 1.49it/s]
Training 2/3 epoch (loss 0.6748): 37%|ββββ | 762/2046 [09:12<14:00, 1.53it/s]
Training 2/3 epoch (loss 0.6971): 37%|ββββ | 762/2046 [09:13<14:00, 1.53it/s]
Training 2/3 epoch (loss 0.6971): 37%|ββββ | 763/2046 [09:13<13:50, 1.55it/s]
Training 2/3 epoch (loss 0.8242): 37%|ββββ | 763/2046 [09:13<13:50, 1.55it/s]
Training 2/3 epoch (loss 0.8242): 37%|ββββ | 764/2046 [09:13<14:05, 1.52it/s]
Training 2/3 epoch (loss 0.7158): 37%|ββββ | 764/2046 [09:14<14:05, 1.52it/s]
Training 2/3 epoch (loss 0.7158): 37%|ββββ | 765/2046 [09:14<14:02, 1.52it/s]
Training 2/3 epoch (loss 0.6990): 37%|ββββ | 765/2046 [09:15<14:02, 1.52it/s]
Training 2/3 epoch (loss 0.6990): 37%|ββββ | 766/2046 [09:15<13:45, 1.55it/s]
Training 2/3 epoch (loss 0.6362): 37%|ββββ | 766/2046 [09:15<13:45, 1.55it/s]
Training 2/3 epoch (loss 0.6362): 37%|ββββ | 767/2046 [09:15<14:26, 1.48it/s]
Training 2/3 epoch (loss 0.8587): 37%|ββββ | 767/2046 [09:16<14:26, 1.48it/s]
Training 2/3 epoch (loss 0.8587): 38%|ββββ | 768/2046 [09:16<14:53, 1.43it/s]
Training 2/3 epoch (loss 0.7578): 38%|ββββ | 768/2046 [09:17<14:53, 1.43it/s]
Training 2/3 epoch (loss 0.7578): 38%|ββββ | 769/2046 [09:17<14:31, 1.47it/s]
Training 2/3 epoch (loss 0.8241): 38%|ββββ | 769/2046 [09:17<14:31, 1.47it/s]
Training 2/3 epoch (loss 0.8241): 38%|ββββ | 770/2046 [09:17<14:20, 1.48it/s]
Training 2/3 epoch (loss 0.6806): 38%|ββββ | 770/2046 [09:18<14:20, 1.48it/s]
Training 2/3 epoch (loss 0.6806): 38%|ββββ | 771/2046 [09:18<13:52, 1.53it/s]
Training 2/3 epoch (loss 0.7040): 38%|ββββ | 771/2046 [09:19<13:52, 1.53it/s]
Training 2/3 epoch (loss 0.7040): 38%|ββββ | 772/2046 [09:19<17:06, 1.24it/s]
Training 2/3 epoch (loss 0.6611): 38%|ββββ | 772/2046 [09:20<17:06, 1.24it/s]
Training 2/3 epoch (loss 0.6611): 38%|ββββ | 773/2046 [09:20<18:18, 1.16it/s]
Training 2/3 epoch (loss 0.7502): 38%|ββββ | 773/2046 [09:21<18:18, 1.16it/s]
Training 2/3 epoch (loss 0.7502): 38%|ββββ | 774/2046 [09:21<17:03, 1.24it/s]
Training 2/3 epoch (loss 0.7439): 38%|ββββ | 774/2046 [09:21<17:03, 1.24it/s]
Training 2/3 epoch (loss 0.7439): 38%|ββββ | 775/2046 [09:21<15:48, 1.34it/s]
Training 2/3 epoch (loss 0.7105): 38%|ββββ | 775/2046 [09:22<15:48, 1.34it/s]
Training 2/3 epoch (loss 0.7105): 38%|ββββ | 776/2046 [09:22<15:28, 1.37it/s]
Training 2/3 epoch (loss 0.8089): 38%|ββββ | 776/2046 [09:23<15:28, 1.37it/s]
Training 2/3 epoch (loss 0.8089): 38%|ββββ | 777/2046 [09:23<16:09, 1.31it/s]
Training 2/3 epoch (loss 0.6750): 38%|ββββ | 777/2046 [09:24<16:09, 1.31it/s]
Training 2/3 epoch (loss 0.6750): 38%|ββββ | 778/2046 [09:24<15:43, 1.34it/s]
Training 2/3 epoch (loss 0.6298): 38%|ββββ | 778/2046 [09:24<15:43, 1.34it/s]
Training 2/3 epoch (loss 0.6298): 38%|ββββ | 779/2046 [09:24<15:01, 1.41it/s]
Training 2/3 epoch (loss 0.5204): 38%|ββββ | 779/2046 [09:25<15:01, 1.41it/s]
Training 2/3 epoch (loss 0.5204): 38%|ββββ | 780/2046 [09:25<15:00, 1.41it/s]
Training 2/3 epoch (loss 0.6919): 38%|ββββ | 780/2046 [09:26<15:00, 1.41it/s]
Training 2/3 epoch (loss 0.6919): 38%|ββββ | 781/2046 [09:26<14:39, 1.44it/s]
Training 2/3 epoch (loss 0.7593): 38%|ββββ | 781/2046 [09:26<14:39, 1.44it/s]
Training 2/3 epoch (loss 0.7593): 38%|ββββ | 782/2046 [09:26<15:17, 1.38it/s]
Training 2/3 epoch (loss 0.6895): 38%|ββββ | 782/2046 [09:27<15:17, 1.38it/s]
Training 2/3 epoch (loss 0.6895): 38%|ββββ | 783/2046 [09:27<15:02, 1.40it/s]
Training 2/3 epoch (loss 0.6410): 38%|ββββ | 783/2046 [09:28<15:02, 1.40it/s]
Training 2/3 epoch (loss 0.6410): 38%|ββββ | 784/2046 [09:28<15:20, 1.37it/s]
Training 2/3 epoch (loss 0.6427): 38%|ββββ | 784/2046 [09:29<15:20, 1.37it/s]
Training 2/3 epoch (loss 0.6427): 38%|ββββ | 785/2046 [09:29<15:25, 1.36it/s]
Training 2/3 epoch (loss 0.6932): 38%|ββββ | 785/2046 [09:29<15:25, 1.36it/s]
Training 2/3 epoch (loss 0.6932): 38%|ββββ | 786/2046 [09:29<15:07, 1.39it/s]
Training 2/3 epoch (loss 0.7432): 38%|ββββ | 786/2046 [09:30<15:07, 1.39it/s]
Training 2/3 epoch (loss 0.7432): 38%|ββββ | 787/2046 [09:30<14:55, 1.41it/s]
Training 2/3 epoch (loss 0.5925): 38%|ββββ | 787/2046 [09:31<14:55, 1.41it/s]
Training 2/3 epoch (loss 0.5925): 39%|ββββ | 788/2046 [09:31<14:29, 1.45it/s]
Training 2/3 epoch (loss 0.7192): 39%|ββββ | 788/2046 [09:31<14:29, 1.45it/s]
Training 2/3 epoch (loss 0.7192): 39%|ββββ | 789/2046 [09:31<14:26, 1.45it/s]
Training 2/3 epoch (loss 0.7686): 39%|ββββ | 789/2046 [09:32<14:26, 1.45it/s]
Training 2/3 epoch (loss 0.7686): 39%|ββββ | 790/2046 [09:32<14:12, 1.47it/s]
Training 2/3 epoch (loss 0.6545): 39%|ββββ | 790/2046 [09:33<14:12, 1.47it/s]
Training 2/3 epoch (loss 0.6545): 39%|ββββ | 791/2046 [09:33<13:55, 1.50it/s]
Training 2/3 epoch (loss 0.7563): 39%|ββββ | 791/2046 [09:33<13:55, 1.50it/s]
Training 2/3 epoch (loss 0.7563): 39%|ββββ | 792/2046 [09:33<14:51, 1.41it/s]
Training 2/3 epoch (loss 0.6047): 39%|ββββ | 792/2046 [09:34<14:51, 1.41it/s]
Training 2/3 epoch (loss 0.6047): 39%|ββββ | 793/2046 [09:34<14:53, 1.40it/s]
Training 2/3 epoch (loss 0.5868): 39%|ββββ | 793/2046 [09:35<14:53, 1.40it/s]
Training 2/3 epoch (loss 0.5868): 39%|ββββ | 794/2046 [09:35<15:13, 1.37it/s]
Training 2/3 epoch (loss 0.5705): 39%|ββββ | 794/2046 [09:36<15:13, 1.37it/s]
Training 2/3 epoch (loss 0.5705): 39%|ββββ | 795/2046 [09:36<15:50, 1.32it/s]
Training 2/3 epoch (loss 0.4944): 39%|ββββ | 795/2046 [09:36<15:50, 1.32it/s]
Training 2/3 epoch (loss 0.4944): 39%|ββββ | 796/2046 [09:36<14:59, 1.39it/s]
Training 2/3 epoch (loss 0.6969): 39%|ββββ | 796/2046 [09:37<14:59, 1.39it/s]
Training 2/3 epoch (loss 0.6969): 39%|ββββ | 797/2046 [09:37<14:22, 1.45it/s]
Training 2/3 epoch (loss 0.4449): 39%|ββββ | 797/2046 [09:38<14:22, 1.45it/s]
Training 2/3 epoch (loss 0.4449): 39%|ββββ | 798/2046 [09:38<15:16, 1.36it/s]
Training 2/3 epoch (loss 0.6639): 39%|ββββ | 798/2046 [09:39<15:16, 1.36it/s]
Training 2/3 epoch (loss 0.6639): 39%|ββββ | 799/2046 [09:39<15:48, 1.32it/s]
Training 2/3 epoch (loss 0.6183): 39%|ββββ | 799/2046 [09:39<15:48, 1.32it/s]
Training 2/3 epoch (loss 0.6183): 39%|ββββ | 800/2046 [09:39<16:01, 1.30it/s]
Training 2/3 epoch (loss 0.5858): 39%|ββββ | 800/2046 [09:40<16:01, 1.30it/s]
Training 2/3 epoch (loss 0.5858): 39%|ββββ | 801/2046 [09:40<15:04, 1.38it/s]
Training 2/3 epoch (loss 0.6270): 39%|ββββ | 801/2046 [09:41<15:04, 1.38it/s]
Training 2/3 epoch (loss 0.6270): 39%|ββββ | 802/2046 [09:41<14:56, 1.39it/s]
Training 2/3 epoch (loss 0.6397): 39%|ββββ | 802/2046 [09:41<14:56, 1.39it/s]
Training 2/3 epoch (loss 0.6397): 39%|ββββ | 803/2046 [09:41<14:13, 1.46it/s]
Training 2/3 epoch (loss 0.6055): 39%|ββββ | 803/2046 [09:42<14:13, 1.46it/s]
Training 2/3 epoch (loss 0.6055): 39%|ββββ | 804/2046 [09:42<14:06, 1.47it/s]
Training 2/3 epoch (loss 0.4772): 39%|ββββ | 804/2046 [09:43<14:06, 1.47it/s]
Training 2/3 epoch (loss 0.4772): 39%|ββββ | 805/2046 [09:43<13:38, 1.52it/s]
Training 2/3 epoch (loss 0.6412): 39%|ββββ | 805/2046 [09:44<13:38, 1.52it/s]
Training 2/3 epoch (loss 0.6412): 39%|ββββ | 806/2046 [09:44<14:55, 1.39it/s]
Training 2/3 epoch (loss 0.6051): 39%|ββββ | 806/2046 [09:44<14:55, 1.39it/s]
Training 2/3 epoch (loss 0.6051): 39%|ββββ | 807/2046 [09:44<14:54, 1.39it/s]
Training 2/3 epoch (loss 0.5731): 39%|ββββ | 807/2046 [09:45<14:54, 1.39it/s]
Training 2/3 epoch (loss 0.5731): 39%|ββββ | 808/2046 [09:45<16:19, 1.26it/s]
Training 2/3 epoch (loss 0.5330): 39%|ββββ | 808/2046 [09:46<16:19, 1.26it/s]
Training 2/3 epoch (loss 0.5330): 40%|ββββ | 809/2046 [09:46<15:21, 1.34it/s]
Training 2/3 epoch (loss 0.4735): 40%|ββββ | 809/2046 [09:46<15:21, 1.34it/s]
Training 2/3 epoch (loss 0.4735): 40%|ββββ | 810/2046 [09:46<14:31, 1.42it/s]
Training 2/3 epoch (loss 0.5818): 40%|ββββ | 810/2046 [09:47<14:31, 1.42it/s]
Training 2/3 epoch (loss 0.5818): 40%|ββββ | 811/2046 [09:47<13:58, 1.47it/s]
Training 2/3 epoch (loss 0.5148): 40%|ββββ | 811/2046 [09:48<13:58, 1.47it/s]
Training 2/3 epoch (loss 0.5148): 40%|ββββ | 812/2046 [09:48<15:32, 1.32it/s]
Training 2/3 epoch (loss 0.5140): 40%|ββββ | 812/2046 [09:49<15:32, 1.32it/s]
Training 2/3 epoch (loss 0.5140): 40%|ββββ | 813/2046 [09:49<16:32, 1.24it/s]
Training 2/3 epoch (loss 0.4774): 40%|ββββ | 813/2046 [09:50<16:32, 1.24it/s]
Training 2/3 epoch (loss 0.4774): 40%|ββββ | 814/2046 [09:50<16:56, 1.21it/s]
Training 2/3 epoch (loss 0.5082): 40%|ββββ | 814/2046 [09:50<16:56, 1.21it/s]
Training 2/3 epoch (loss 0.5082): 40%|ββββ | 815/2046 [09:50<15:53, 1.29it/s]
Training 2/3 epoch (loss 0.5656): 40%|ββββ | 815/2046 [09:51<15:53, 1.29it/s]
Training 2/3 epoch (loss 0.5656): 40%|ββββ | 816/2046 [09:51<15:52, 1.29it/s]
Training 2/3 epoch (loss 0.6495): 40%|ββββ | 816/2046 [09:52<15:52, 1.29it/s]
Training 2/3 epoch (loss 0.6495): 40%|ββββ | 817/2046 [09:52<15:24, 1.33it/s]
Training 2/3 epoch (loss 0.7169): 40%|ββββ | 817/2046 [09:53<15:24, 1.33it/s]
Training 2/3 epoch (loss 0.7169): 40%|ββββ | 818/2046 [09:53<15:19, 1.34it/s]
Training 2/3 epoch (loss 0.6426): 40%|ββββ | 818/2046 [09:53<15:19, 1.34it/s]
Training 2/3 epoch (loss 0.6426): 40%|ββββ | 819/2046 [09:53<14:44, 1.39it/s]
Training 2/3 epoch (loss 0.5811): 40%|ββββ | 819/2046 [09:54<14:44, 1.39it/s]
Training 2/3 epoch (loss 0.5811): 40%|ββββ | 820/2046 [09:54<15:51, 1.29it/s]
Training 2/3 epoch (loss 0.5417): 40%|ββββ | 820/2046 [09:55<15:51, 1.29it/s]
Training 2/3 epoch (loss 0.5417): 40%|ββββ | 821/2046 [09:55<15:55, 1.28it/s]
Training 2/3 epoch (loss 0.6009): 40%|ββββ | 821/2046 [09:56<15:55, 1.28it/s]
Training 2/3 epoch (loss 0.6009): 40%|ββββ | 822/2046 [09:56<15:25, 1.32it/s]
Training 2/3 epoch (loss 0.6101): 40%|ββββ | 822/2046 [09:57<15:25, 1.32it/s]
Training 2/3 epoch (loss 0.6101): 40%|ββββ | 823/2046 [09:57<15:43, 1.30it/s]
Training 2/3 epoch (loss 0.6941): 40%|ββββ | 823/2046 [09:57<15:43, 1.30it/s]
Training 2/3 epoch (loss 0.6941): 40%|ββββ | 824/2046 [09:57<15:54, 1.28it/s]
Training 2/3 epoch (loss 0.5917): 40%|ββββ | 824/2046 [09:58<15:54, 1.28it/s]
Training 2/3 epoch (loss 0.5917): 40%|ββββ | 825/2046 [09:58<15:49, 1.29it/s]
Training 2/3 epoch (loss 0.6076): 40%|ββββ | 825/2046 [09:59<15:49, 1.29it/s]
Training 2/3 epoch (loss 0.6076): 40%|ββββ | 826/2046 [09:59<16:33, 1.23it/s]
Training 2/3 epoch (loss 0.5516): 40%|ββββ | 826/2046 [10:00<16:33, 1.23it/s]
Training 2/3 epoch (loss 0.5516): 40%|ββββ | 827/2046 [10:00<16:07, 1.26it/s]
Training 2/3 epoch (loss 0.5318): 40%|ββββ | 827/2046 [10:01<16:07, 1.26it/s]
Training 2/3 epoch (loss 0.5318): 40%|ββββ | 828/2046 [10:01<16:21, 1.24it/s]
Training 2/3 epoch (loss 0.4156): 40%|ββββ | 828/2046 [10:01<16:21, 1.24it/s]
Training 2/3 epoch (loss 0.4156): 41%|ββββ | 829/2046 [10:01<15:39, 1.30it/s]
Training 2/3 epoch (loss 0.5436): 41%|ββββ | 829/2046 [10:02<15:39, 1.30it/s]
Training 2/3 epoch (loss 0.5436): 41%|ββββ | 830/2046 [10:02<15:03, 1.35it/s]
Training 2/3 epoch (loss 0.7000): 41%|ββββ | 830/2046 [10:03<15:03, 1.35it/s]
Training 2/3 epoch (loss 0.7000): 41%|ββββ | 831/2046 [10:03<14:17, 1.42it/s]
Training 2/3 epoch (loss 0.4730): 41%|ββββ | 831/2046 [10:04<14:17, 1.42it/s]
Training 2/3 epoch (loss 0.4730): 41%|ββββ | 832/2046 [10:04<16:52, 1.20it/s]
Training 2/3 epoch (loss 0.5172): 41%|ββββ | 832/2046 [10:04<16:52, 1.20it/s]
Training 2/3 epoch (loss 0.5172): 41%|ββββ | 833/2046 [10:04<15:54, 1.27it/s]
Training 2/3 epoch (loss 0.5310): 41%|ββββ | 833/2046 [10:05<15:54, 1.27it/s]
Training 2/3 epoch (loss 0.5310): 41%|ββββ | 834/2046 [10:05<15:17, 1.32it/s]
Training 2/3 epoch (loss 0.5024): 41%|ββββ | 834/2046 [10:06<15:17, 1.32it/s]
Training 2/3 epoch (loss 0.5024): 41%|ββββ | 835/2046 [10:06<15:01, 1.34it/s]
Training 2/3 epoch (loss 0.5635): 41%|ββββ | 835/2046 [10:06<15:01, 1.34it/s]
Training 2/3 epoch (loss 0.5635): 41%|ββββ | 836/2046 [10:06<14:16, 1.41it/s]
Training 2/3 epoch (loss 0.4834): 41%|ββββ | 836/2046 [10:07<14:16, 1.41it/s]
Training 2/3 epoch (loss 0.4834): 41%|ββββ | 837/2046 [10:07<13:50, 1.46it/s]
Training 2/3 epoch (loss 0.4798): 41%|ββββ | 837/2046 [10:08<13:50, 1.46it/s]
Training 2/3 epoch (loss 0.4798): 41%|ββββ | 838/2046 [10:08<14:42, 1.37it/s]
Training 2/3 epoch (loss 0.5194): 41%|ββββ | 838/2046 [10:08<14:42, 1.37it/s]
Training 2/3 epoch (loss 0.5194): 41%|ββββ | 839/2046 [10:08<14:01, 1.43it/s]
Training 2/3 epoch (loss 0.4120): 41%|ββββ | 839/2046 [10:09<14:01, 1.43it/s]
Training 2/3 epoch (loss 0.4120): 41%|ββββ | 840/2046 [10:09<14:26, 1.39it/s]
Training 2/3 epoch (loss 0.6237): 41%|ββββ | 840/2046 [10:10<14:26, 1.39it/s]
Training 2/3 epoch (loss 0.6237): 41%|ββββ | 841/2046 [10:10<13:46, 1.46it/s]
Training 2/3 epoch (loss 0.4434): 41%|ββββ | 841/2046 [10:10<13:46, 1.46it/s]
Training 2/3 epoch (loss 0.4434): 41%|ββββ | 842/2046 [10:10<13:17, 1.51it/s]
Training 2/3 epoch (loss 0.6859): 41%|ββββ | 842/2046 [10:11<13:17, 1.51it/s]
Training 2/3 epoch (loss 0.6859): 41%|ββββ | 843/2046 [10:11<13:06, 1.53it/s]
Training 2/3 epoch (loss 0.5354): 41%|ββββ | 843/2046 [10:12<13:06, 1.53it/s]
Training 2/3 epoch (loss 0.5354): 41%|βββββ | 844/2046 [10:12<12:50, 1.56it/s]
Training 2/3 epoch (loss 0.6126): 41%|βββββ | 844/2046 [10:12<12:50, 1.56it/s]
Training 2/3 epoch (loss 0.6126): 41%|βββββ | 845/2046 [10:12<13:07, 1.53it/s]
Training 2/3 epoch (loss 0.6467): 41%|βββββ | 845/2046 [10:13<13:07, 1.53it/s]
Training 2/3 epoch (loss 0.6467): 41%|βββββ | 846/2046 [10:13<12:53, 1.55it/s]
Training 2/3 epoch (loss 0.5101): 41%|βββββ | 846/2046 [10:14<12:53, 1.55it/s]
Training 2/3 epoch (loss 0.5101): 41%|βββββ | 847/2046 [10:14<14:08, 1.41it/s]
Training 2/3 epoch (loss 0.4469): 41%|βββββ | 847/2046 [10:15<14:08, 1.41it/s]
Training 2/3 epoch (loss 0.4469): 41%|βββββ | 848/2046 [10:15<14:43, 1.36it/s]
Training 2/3 epoch (loss 0.6728): 41%|βββββ | 848/2046 [10:15<14:43, 1.36it/s]
Training 2/3 epoch (loss 0.6728): 41%|βββββ | 849/2046 [10:15<14:42, 1.36it/s]
Training 2/3 epoch (loss 0.5615): 41%|βββββ | 849/2046 [10:16<14:42, 1.36it/s]
Training 2/3 epoch (loss 0.5615): 42%|βββββ | 850/2046 [10:16<13:51, 1.44it/s]
Training 2/3 epoch (loss 0.4745): 42%|βββββ | 850/2046 [10:17<13:51, 1.44it/s]
Training 2/3 epoch (loss 0.4745): 42%|βββββ | 851/2046 [10:17<14:18, 1.39it/s]
Training 2/3 epoch (loss 0.3883): 42%|βββββ | 851/2046 [10:17<14:18, 1.39it/s]
Training 2/3 epoch (loss 0.3883): 42%|βββββ | 852/2046 [10:17<13:45, 1.45it/s]
Training 2/3 epoch (loss 0.5126): 42%|βββββ | 852/2046 [10:18<13:45, 1.45it/s]
Training 2/3 epoch (loss 0.5126): 42%|βββββ | 853/2046 [10:18<14:23, 1.38it/s]
Training 2/3 epoch (loss 0.4640): 42%|βββββ | 853/2046 [10:19<14:23, 1.38it/s]
Training 2/3 epoch (loss 0.4640): 42%|βββββ | 854/2046 [10:19<14:49, 1.34it/s]
Training 2/3 epoch (loss 0.5362): 42%|βββββ | 854/2046 [10:20<14:49, 1.34it/s]
Training 2/3 epoch (loss 0.5362): 42%|βββββ | 855/2046 [10:20<15:43, 1.26it/s]
Training 2/3 epoch (loss 0.5566): 42%|βββββ | 855/2046 [10:21<15:43, 1.26it/s]
Training 2/3 epoch (loss 0.5566): 42%|βββββ | 856/2046 [10:21<17:29, 1.13it/s]
Training 2/3 epoch (loss 0.5006): 42%|βββββ | 856/2046 [10:22<17:29, 1.13it/s]
Training 2/3 epoch (loss 0.5006): 42%|βββββ | 857/2046 [10:22<15:54, 1.25it/s]
Training 2/3 epoch (loss 0.6069): 42%|βββββ | 857/2046 [10:22<15:54, 1.25it/s]
Training 2/3 epoch (loss 0.6069): 42%|βββββ | 858/2046 [10:22<15:11, 1.30it/s]
Training 2/3 epoch (loss 0.5971): 42%|βββββ | 858/2046 [10:23<15:11, 1.30it/s]
Training 2/3 epoch (loss 0.5971): 42%|βββββ | 859/2046 [10:23<14:46, 1.34it/s]
Training 2/3 epoch (loss 0.4796): 42%|βββββ | 859/2046 [10:24<14:46, 1.34it/s]
Training 2/3 epoch (loss 0.4796): 42%|βββββ | 860/2046 [10:24<14:29, 1.36it/s]
Training 2/3 epoch (loss 0.4900): 42%|βββββ | 860/2046 [10:25<14:29, 1.36it/s]
Training 2/3 epoch (loss 0.4900): 42%|βββββ | 861/2046 [10:25<15:10, 1.30it/s]
Training 2/3 epoch (loss 0.4800): 42%|βββββ | 861/2046 [10:25<15:10, 1.30it/s]
Training 2/3 epoch (loss 0.4800): 42%|βββββ | 862/2046 [10:25<15:25, 1.28it/s]
Training 2/3 epoch (loss 0.4743): 42%|βββββ | 862/2046 [10:26<15:25, 1.28it/s]
Training 2/3 epoch (loss 0.4743): 42%|βββββ | 863/2046 [10:26<14:31, 1.36it/s]
Training 2/3 epoch (loss 0.6501): 42%|βββββ | 863/2046 [10:27<14:31, 1.36it/s]
Training 2/3 epoch (loss 0.6501): 42%|βββββ | 864/2046 [10:27<15:16, 1.29it/s]
Training 2/3 epoch (loss 0.5074): 42%|βββββ | 864/2046 [10:28<15:16, 1.29it/s]
Training 2/3 epoch (loss 0.5074): 42%|βββββ | 865/2046 [10:28<15:16, 1.29it/s]
Training 2/3 epoch (loss 0.5120): 42%|βββββ | 865/2046 [10:28<15:16, 1.29it/s]
Training 2/3 epoch (loss 0.5120): 42%|βββββ | 866/2046 [10:28<14:29, 1.36it/s]
Training 2/3 epoch (loss 0.6041): 42%|βββββ | 866/2046 [10:29<14:29, 1.36it/s]
Training 2/3 epoch (loss 0.6041): 42%|βββββ | 867/2046 [10:29<14:26, 1.36it/s]
Training 2/3 epoch (loss 0.5540): 42%|βββββ | 867/2046 [10:30<14:26, 1.36it/s]
Training 2/3 epoch (loss 0.5540): 42%|βββββ | 868/2046 [10:30<14:07, 1.39it/s]
Training 2/3 epoch (loss 0.5013): 42%|βββββ | 868/2046 [10:31<14:07, 1.39it/s]
Training 2/3 epoch (loss 0.5013): 42%|βββββ | 869/2046 [10:31<14:36, 1.34it/s]
Training 2/3 epoch (loss 0.5583): 42%|βββββ | 869/2046 [10:31<14:36, 1.34it/s]
Training 2/3 epoch (loss 0.5583): 43%|βββββ | 870/2046 [10:31<14:51, 1.32it/s]
Training 2/3 epoch (loss 0.5923): 43%|βββββ | 870/2046 [10:32<14:51, 1.32it/s]
Training 2/3 epoch (loss 0.5923): 43%|βββββ | 871/2046 [10:32<15:26, 1.27it/s]
Training 2/3 epoch (loss 0.6046): 43%|βββββ | 871/2046 [10:33<15:26, 1.27it/s]
Training 2/3 epoch (loss 0.6046): 43%|βββββ | 872/2046 [10:33<15:31, 1.26it/s]
Training 2/3 epoch (loss 0.5470): 43%|βββββ | 872/2046 [10:34<15:31, 1.26it/s]
Training 2/3 epoch (loss 0.5470): 43%|βββββ | 873/2046 [10:34<14:38, 1.34it/s]
Training 2/3 epoch (loss 0.4941): 43%|βββββ | 873/2046 [10:34<14:38, 1.34it/s]
Training 2/3 epoch (loss 0.4941): 43%|βββββ | 874/2046 [10:34<14:19, 1.36it/s]
Training 2/3 epoch (loss 0.6465): 43%|βββββ | 874/2046 [10:35<14:19, 1.36it/s]
Training 2/3 epoch (loss 0.6465): 43%|βββββ | 875/2046 [10:35<13:36, 1.43it/s]
Training 2/3 epoch (loss 0.5502): 43%|βββββ | 875/2046 [10:36<13:36, 1.43it/s]
Training 2/3 epoch (loss 0.5502): 43%|βββββ | 876/2046 [10:36<13:28, 1.45it/s]
Training 2/3 epoch (loss 0.6603): 43%|βββββ | 876/2046 [10:36<13:28, 1.45it/s]
Training 2/3 epoch (loss 0.6603): 43%|βββββ | 877/2046 [10:36<14:40, 1.33it/s]
Training 2/3 epoch (loss 0.4759): 43%|βββββ | 877/2046 [10:37<14:40, 1.33it/s]
Training 2/3 epoch (loss 0.4759): 43%|βββββ | 878/2046 [10:37<14:20, 1.36it/s]
Training 2/3 epoch (loss 0.5295): 43%|βββββ | 878/2046 [10:38<14:20, 1.36it/s]
Training 2/3 epoch (loss 0.5295): 43%|βββββ | 879/2046 [10:38<14:10, 1.37it/s]
Training 2/3 epoch (loss 0.5740): 43%|βββββ | 879/2046 [10:39<14:10, 1.37it/s]
Training 2/3 epoch (loss 0.5740): 43%|βββββ | 880/2046 [10:39<15:52, 1.22it/s]
Training 2/3 epoch (loss 0.4899): 43%|βββββ | 880/2046 [10:40<15:52, 1.22it/s]
Training 2/3 epoch (loss 0.4899): 43%|βββββ | 881/2046 [10:40<14:48, 1.31it/s]
Training 2/3 epoch (loss 0.5389): 43%|βββββ | 881/2046 [10:40<14:48, 1.31it/s]
Training 2/3 epoch (loss 0.5389): 43%|βββββ | 882/2046 [10:40<14:03, 1.38it/s]
Training 2/3 epoch (loss 0.5885): 43%|βββββ | 882/2046 [10:41<14:03, 1.38it/s]
Training 2/3 epoch (loss 0.5885): 43%|βββββ | 883/2046 [10:41<13:22, 1.45it/s]
Training 2/3 epoch (loss 0.6473): 43%|βββββ | 883/2046 [10:42<13:22, 1.45it/s]
Training 2/3 epoch (loss 0.6473): 43%|βββββ | 884/2046 [10:42<13:28, 1.44it/s]
Training 2/3 epoch (loss 0.5706): 43%|βββββ | 884/2046 [10:42<13:28, 1.44it/s]
Training 2/3 epoch (loss 0.5706): 43%|βββββ | 885/2046 [10:42<13:08, 1.47it/s]
Training 2/3 epoch (loss 0.6910): 43%|βββββ | 885/2046 [10:43<13:08, 1.47it/s]
Training 2/3 epoch (loss 0.6910): 43%|βββββ | 886/2046 [10:43<12:46, 1.51it/s]
Training 2/3 epoch (loss 0.5677): 43%|βββββ | 886/2046 [10:43<12:46, 1.51it/s]
Training 2/3 epoch (loss 0.5677): 43%|βββββ | 887/2046 [10:43<12:56, 1.49it/s]
Training 2/3 epoch (loss 0.5880): 43%|βββββ | 887/2046 [10:44<12:56, 1.49it/s]
Training 2/3 epoch (loss 0.5880): 43%|βββββ | 888/2046 [10:44<13:17, 1.45it/s]
Training 2/3 epoch (loss 0.5377): 43%|βββββ | 888/2046 [10:45<13:17, 1.45it/s]
Training 2/3 epoch (loss 0.5377): 43%|βββββ | 889/2046 [10:45<13:04, 1.47it/s]
Training 2/3 epoch (loss 0.5274): 43%|βββββ | 889/2046 [10:46<13:04, 1.47it/s]
Training 2/3 epoch (loss 0.5274): 43%|βββββ | 890/2046 [10:46<13:09, 1.46it/s]
Training 2/3 epoch (loss 0.5507): 43%|βββββ | 890/2046 [10:46<13:09, 1.46it/s]
Training 2/3 epoch (loss 0.5507): 44%|βββββ | 891/2046 [10:46<13:22, 1.44it/s]
Training 2/3 epoch (loss 0.4280): 44%|βββββ | 891/2046 [10:47<13:22, 1.44it/s]
Training 2/3 epoch (loss 0.4280): 44%|βββββ | 892/2046 [10:47<12:54, 1.49it/s]
Training 2/3 epoch (loss 0.5713): 44%|βββββ | 892/2046 [10:48<12:54, 1.49it/s]
Training 2/3 epoch (loss 0.5713): 44%|βββββ | 893/2046 [10:48<13:09, 1.46it/s]
Training 2/3 epoch (loss 0.4942): 44%|βββββ | 893/2046 [10:48<13:09, 1.46it/s]
Training 2/3 epoch (loss 0.4942): 44%|βββββ | 894/2046 [10:48<13:49, 1.39it/s]
Training 2/3 epoch (loss 0.4972): 44%|βββββ | 894/2046 [10:49<13:49, 1.39it/s]
Training 2/3 epoch (loss 0.4972): 44%|βββββ | 895/2046 [10:49<13:59, 1.37it/s]
Training 2/3 epoch (loss 0.4387): 44%|βββββ | 895/2046 [10:50<13:59, 1.37it/s]
Training 2/3 epoch (loss 0.4387): 44%|βββββ | 896/2046 [10:50<14:31, 1.32it/s]
Training 2/3 epoch (loss 0.5131): 44%|βββββ | 896/2046 [10:51<14:31, 1.32it/s]
Training 2/3 epoch (loss 0.5131): 44%|βββββ | 897/2046 [10:51<14:39, 1.31it/s]
Training 2/3 epoch (loss 0.5659): 44%|βββββ | 897/2046 [10:51<14:39, 1.31it/s]
Training 2/3 epoch (loss 0.5659): 44%|βββββ | 898/2046 [10:51<13:47, 1.39it/s]
Training 2/3 epoch (loss 0.4891): 44%|βββββ | 898/2046 [10:52<13:47, 1.39it/s]
Training 2/3 epoch (loss 0.4891): 44%|βββββ | 899/2046 [10:52<13:11, 1.45it/s]
Training 2/3 epoch (loss 0.4762): 44%|βββββ | 899/2046 [10:53<13:11, 1.45it/s]
Training 2/3 epoch (loss 0.4762): 44%|βββββ | 900/2046 [10:53<13:20, 1.43it/s]
Training 2/3 epoch (loss 0.5275): 44%|βββββ | 900/2046 [10:53<13:20, 1.43it/s]
Training 2/3 epoch (loss 0.5275): 44%|βββββ | 901/2046 [10:53<13:22, 1.43it/s]
Training 2/3 epoch (loss 0.3991): 44%|βββββ | 901/2046 [10:54<13:22, 1.43it/s]
Training 2/3 epoch (loss 0.3991): 44%|βββββ | 902/2046 [10:54<14:18, 1.33it/s]
Training 2/3 epoch (loss 0.4730): 44%|βββββ | 902/2046 [10:55<14:18, 1.33it/s]
Training 2/3 epoch (loss 0.4730): 44%|βββββ | 903/2046 [10:55<15:06, 1.26it/s]
Training 2/3 epoch (loss 0.5616): 44%|βββββ | 903/2046 [10:56<15:06, 1.26it/s]
Training 2/3 epoch (loss 0.5616): 44%|βββββ | 904/2046 [10:56<14:43, 1.29it/s]
Training 2/3 epoch (loss 0.5785): 44%|βββββ | 904/2046 [10:57<14:43, 1.29it/s]
Training 2/3 epoch (loss 0.5785): 44%|βββββ | 905/2046 [10:57<14:25, 1.32it/s]
Training 2/3 epoch (loss 0.5134): 44%|βββββ | 905/2046 [10:57<14:25, 1.32it/s]
Training 2/3 epoch (loss 0.5134): 44%|βββββ | 906/2046 [10:57<13:40, 1.39it/s]
Training 2/3 epoch (loss 0.5085): 44%|βββββ | 906/2046 [10:58<13:40, 1.39it/s]
Training 2/3 epoch (loss 0.5085): 44%|βββββ | 907/2046 [10:58<13:33, 1.40it/s]
Training 2/3 epoch (loss 0.6347): 44%|βββββ | 907/2046 [10:59<13:33, 1.40it/s]
Training 2/3 epoch (loss 0.6347): 44%|βββββ | 908/2046 [10:59<13:46, 1.38it/s]
Training 2/3 epoch (loss 0.5331): 44%|βββββ | 908/2046 [11:00<13:46, 1.38it/s]
Training 2/3 epoch (loss 0.5331): 44%|βββββ | 909/2046 [11:00<14:50, 1.28it/s]
Training 2/3 epoch (loss 0.5679): 44%|βββββ | 909/2046 [11:00<14:50, 1.28it/s]
Training 2/3 epoch (loss 0.5679): 44%|βββββ | 910/2046 [11:00<14:14, 1.33it/s]
Training 2/3 epoch (loss 0.5344): 44%|βββββ | 910/2046 [11:01<14:14, 1.33it/s]
Training 2/3 epoch (loss 0.5344): 45%|βββββ | 911/2046 [11:01<14:22, 1.32it/s]
Training 2/3 epoch (loss 0.5341): 45%|βββββ | 911/2046 [11:02<14:22, 1.32it/s]
Training 2/3 epoch (loss 0.5341): 45%|βββββ | 912/2046 [11:02<14:41, 1.29it/s]
Training 2/3 epoch (loss 0.4403): 45%|βββββ | 912/2046 [11:03<14:41, 1.29it/s]
Training 2/3 epoch (loss 0.4403): 45%|βββββ | 913/2046 [11:03<13:56, 1.36it/s]
Training 2/3 epoch (loss 0.5378): 45%|βββββ | 913/2046 [11:03<13:56, 1.36it/s]
Training 2/3 epoch (loss 0.5378): 45%|βββββ | 914/2046 [11:03<13:17, 1.42it/s]
Training 2/3 epoch (loss 0.5838): 45%|βββββ | 914/2046 [11:04<13:17, 1.42it/s]
Training 2/3 epoch (loss 0.5838): 45%|βββββ | 915/2046 [11:04<13:23, 1.41it/s]
Training 2/3 epoch (loss 0.5732): 45%|βββββ | 915/2046 [11:05<13:23, 1.41it/s]
Training 2/3 epoch (loss 0.5732): 45%|βββββ | 916/2046 [11:05<13:03, 1.44it/s]
Training 2/3 epoch (loss 0.5791): 45%|βββββ | 916/2046 [11:05<13:03, 1.44it/s]
Training 2/3 epoch (loss 0.5791): 45%|βββββ | 917/2046 [11:05<12:56, 1.45it/s]
Training 2/3 epoch (loss 0.5604): 45%|βββββ | 917/2046 [11:06<12:56, 1.45it/s]
Training 2/3 epoch (loss 0.5604): 45%|βββββ | 918/2046 [11:06<12:53, 1.46it/s]
Training 2/3 epoch (loss 0.5252): 45%|βββββ | 918/2046 [11:07<12:53, 1.46it/s]
Training 2/3 epoch (loss 0.5252): 45%|βββββ | 919/2046 [11:07<12:38, 1.49it/s]
Training 2/3 epoch (loss 0.4161): 45%|βββββ | 919/2046 [11:07<12:38, 1.49it/s]
Training 2/3 epoch (loss 0.4161): 45%|βββββ | 920/2046 [11:07<13:18, 1.41it/s]
Training 2/3 epoch (loss 0.5261): 45%|βββββ | 920/2046 [11:08<13:18, 1.41it/s]
Training 2/3 epoch (loss 0.5261): 45%|βββββ | 921/2046 [11:08<12:59, 1.44it/s]
Training 2/3 epoch (loss 0.4925): 45%|βββββ | 921/2046 [11:09<12:59, 1.44it/s]
Training 2/3 epoch (loss 0.4925): 45%|βββββ | 922/2046 [11:09<12:51, 1.46it/s]
Training 2/3 epoch (loss 0.7374): 45%|βββββ | 922/2046 [11:09<12:51, 1.46it/s]
Training 2/3 epoch (loss 0.7374): 45%|βββββ | 923/2046 [11:09<12:36, 1.48it/s]
Training 2/3 epoch (loss 0.6047): 45%|βββββ | 923/2046 [11:10<12:36, 1.48it/s]
Training 2/3 epoch (loss 0.6047): 45%|βββββ | 924/2046 [11:10<12:18, 1.52it/s]
Training 2/3 epoch (loss 0.5559): 45%|βββββ | 924/2046 [11:11<12:18, 1.52it/s]
Training 2/3 epoch (loss 0.5559): 45%|βββββ | 925/2046 [11:11<12:06, 1.54it/s]
Training 2/3 epoch (loss 0.6001): 45%|βββββ | 925/2046 [11:11<12:06, 1.54it/s]
Training 2/3 epoch (loss 0.6001): 45%|βββββ | 926/2046 [11:11<12:22, 1.51it/s]
Training 2/3 epoch (loss 0.6242): 45%|βββββ | 926/2046 [11:12<12:22, 1.51it/s]
Training 2/3 epoch (loss 0.6242): 45%|βββββ | 927/2046 [11:12<12:02, 1.55it/s]
Training 2/3 epoch (loss 0.4694): 45%|βββββ | 927/2046 [11:13<12:02, 1.55it/s]
Training 2/3 epoch (loss 0.4694): 45%|βββββ | 928/2046 [11:13<13:38, 1.37it/s]
Training 2/3 epoch (loss 0.5582): 45%|βββββ | 928/2046 [11:14<13:38, 1.37it/s]
Training 2/3 epoch (loss 0.5582): 45%|βββββ | 929/2046 [11:14<13:58, 1.33it/s]
Training 2/3 epoch (loss 0.6340): 45%|βββββ | 929/2046 [11:14<13:58, 1.33it/s]
Training 2/3 epoch (loss 0.6340): 45%|βββββ | 930/2046 [11:14<13:14, 1.40it/s]
Training 2/3 epoch (loss 0.6142): 45%|βββββ | 930/2046 [11:15<13:14, 1.40it/s]
Training 2/3 epoch (loss 0.6142): 46%|βββββ | 931/2046 [11:15<13:07, 1.42it/s]
Training 2/3 epoch (loss 0.5710): 46%|βββββ | 931/2046 [11:16<13:07, 1.42it/s]
Training 2/3 epoch (loss 0.5710): 46%|βββββ | 932/2046 [11:16<12:46, 1.45it/s]
Training 2/3 epoch (loss 0.6037): 46%|βββββ | 932/2046 [11:16<12:46, 1.45it/s]
Training 2/3 epoch (loss 0.6037): 46%|βββββ | 933/2046 [11:16<12:51, 1.44it/s]
Training 2/3 epoch (loss 0.5491): 46%|βββββ | 933/2046 [11:17<12:51, 1.44it/s]
Training 2/3 epoch (loss 0.5491): 46%|βββββ | 934/2046 [11:17<12:23, 1.49it/s]
Training 2/3 epoch (loss 0.4793): 46%|βββββ | 934/2046 [11:18<12:23, 1.49it/s]
Training 2/3 epoch (loss 0.4793): 46%|βββββ | 935/2046 [11:18<12:44, 1.45it/s]
Training 2/3 epoch (loss 0.5791): 46%|βββββ | 935/2046 [11:19<12:44, 1.45it/s]
Training 2/3 epoch (loss 0.5791): 46%|βββββ | 936/2046 [11:19<14:54, 1.24it/s]
Training 2/3 epoch (loss 0.5629): 46%|βββββ | 936/2046 [11:19<14:54, 1.24it/s]
Training 2/3 epoch (loss 0.5629): 46%|βββββ | 937/2046 [11:19<14:52, 1.24it/s]
Training 2/3 epoch (loss 0.5389): 46%|βββββ | 937/2046 [11:20<14:52, 1.24it/s]
Training 2/3 epoch (loss 0.5389): 46%|βββββ | 938/2046 [11:20<15:18, 1.21it/s]
Training 2/3 epoch (loss 0.6214): 46%|βββββ | 938/2046 [11:21<15:18, 1.21it/s]
Training 2/3 epoch (loss 0.6214): 46%|βββββ | 939/2046 [11:21<14:03, 1.31it/s]
Training 2/3 epoch (loss 0.5158): 46%|βββββ | 939/2046 [11:22<14:03, 1.31it/s]
Training 2/3 epoch (loss 0.5158): 46%|βββββ | 940/2046 [11:22<13:17, 1.39it/s]
Training 2/3 epoch (loss 0.5926): 46%|βββββ | 940/2046 [11:22<13:17, 1.39it/s]
Training 2/3 epoch (loss 0.5926): 46%|βββββ | 941/2046 [11:22<13:08, 1.40it/s]
Training 2/3 epoch (loss 0.5428): 46%|βββββ | 941/2046 [11:23<13:08, 1.40it/s]
Training 2/3 epoch (loss 0.5428): 46%|βββββ | 942/2046 [11:23<12:52, 1.43it/s]
Training 2/3 epoch (loss 0.4696): 46%|βββββ | 942/2046 [11:24<12:52, 1.43it/s]
Training 2/3 epoch (loss 0.4696): 46%|βββββ | 943/2046 [11:24<13:04, 1.41it/s]
Training 2/3 epoch (loss 0.6471): 46%|βββββ | 943/2046 [11:25<13:04, 1.41it/s]
Training 2/3 epoch (loss 0.6471): 46%|βββββ | 944/2046 [11:25<13:33, 1.35it/s]
Training 2/3 epoch (loss 0.4696): 46%|βββββ | 944/2046 [11:25<13:33, 1.35it/s]
Training 2/3 epoch (loss 0.4696): 46%|βββββ | 945/2046 [11:25<13:15, 1.38it/s]
Training 2/3 epoch (loss 0.5134): 46%|βββββ | 945/2046 [11:26<13:15, 1.38it/s]
Training 2/3 epoch (loss 0.5134): 46%|βββββ | 946/2046 [11:26<13:38, 1.34it/s]
Training 2/3 epoch (loss 0.5745): 46%|βββββ | 946/2046 [11:27<13:38, 1.34it/s]
Training 2/3 epoch (loss 0.5745): 46%|βββββ | 947/2046 [11:27<13:46, 1.33it/s]
Training 2/3 epoch (loss 0.5409): 46%|βββββ | 947/2046 [11:27<13:46, 1.33it/s]
Training 2/3 epoch (loss 0.5409): 46%|βββββ | 948/2046 [11:27<13:03, 1.40it/s]
Training 2/3 epoch (loss 0.4958): 46%|βββββ | 948/2046 [11:28<13:03, 1.40it/s]
Training 2/3 epoch (loss 0.4958): 46%|βββββ | 949/2046 [11:28<12:37, 1.45it/s]
Training 2/3 epoch (loss 0.5587): 46%|βββββ | 949/2046 [11:29<12:37, 1.45it/s]
Training 2/3 epoch (loss 0.5587): 46%|βββββ | 950/2046 [11:29<12:33, 1.46it/s]
Training 2/3 epoch (loss 0.7566): 46%|βββββ | 950/2046 [11:29<12:33, 1.46it/s]
Training 2/3 epoch (loss 0.7566): 46%|βββββ | 951/2046 [11:29<12:25, 1.47it/s]
Training 2/3 epoch (loss 0.5114): 46%|βββββ | 951/2046 [11:30<12:25, 1.47it/s]
Training 2/3 epoch (loss 0.5114): 47%|βββββ | 952/2046 [11:30<12:34, 1.45it/s]
Training 2/3 epoch (loss 0.6094): 47%|βββββ | 952/2046 [11:31<12:34, 1.45it/s]
Training 2/3 epoch (loss 0.6094): 47%|βββββ | 953/2046 [11:31<13:10, 1.38it/s]
Training 2/3 epoch (loss 0.6252): 47%|βββββ | 953/2046 [11:32<13:10, 1.38it/s]
Training 2/3 epoch (loss 0.6252): 47%|βββββ | 954/2046 [11:32<12:54, 1.41it/s]
Training 2/3 epoch (loss 0.8028): 47%|βββββ | 954/2046 [11:32<12:54, 1.41it/s]
Training 2/3 epoch (loss 0.8028): 47%|βββββ | 955/2046 [11:32<12:24, 1.47it/s]
Training 2/3 epoch (loss 0.6777): 47%|βββββ | 955/2046 [11:33<12:24, 1.47it/s]
Training 2/3 epoch (loss 0.6777): 47%|βββββ | 956/2046 [11:33<12:07, 1.50it/s]
Training 2/3 epoch (loss 0.4881): 47%|βββββ | 956/2046 [11:34<12:07, 1.50it/s]
Training 2/3 epoch (loss 0.4881): 47%|βββββ | 957/2046 [11:34<13:17, 1.37it/s]
Training 2/3 epoch (loss 0.5272): 47%|βββββ | 957/2046 [11:34<13:17, 1.37it/s]
Training 2/3 epoch (loss 0.5272): 47%|βββββ | 958/2046 [11:34<12:48, 1.42it/s]
Training 2/3 epoch (loss 0.5899): 47%|βββββ | 958/2046 [11:35<12:48, 1.42it/s]
Training 2/3 epoch (loss 0.5899): 47%|βββββ | 959/2046 [11:35<12:47, 1.42it/s]
Training 2/3 epoch (loss 0.5415): 47%|βββββ | 959/2046 [11:36<12:47, 1.42it/s]
Training 2/3 epoch (loss 0.5415): 47%|βββββ | 960/2046 [11:36<13:18, 1.36it/s]
Training 2/3 epoch (loss 0.5001): 47%|βββββ | 960/2046 [11:37<13:18, 1.36it/s]
Training 2/3 epoch (loss 0.5001): 47%|βββββ | 961/2046 [11:37<12:56, 1.40it/s]
Training 2/3 epoch (loss 0.5989): 47%|βββββ | 961/2046 [11:37<12:56, 1.40it/s]
Training 2/3 epoch (loss 0.5989): 47%|βββββ | 962/2046 [11:37<12:21, 1.46it/s]
Training 2/3 epoch (loss 0.5262): 47%|βββββ | 962/2046 [11:38<12:21, 1.46it/s]
Training 2/3 epoch (loss 0.5262): 47%|βββββ | 963/2046 [11:38<13:24, 1.35it/s]
Training 2/3 epoch (loss 0.6238): 47%|βββββ | 963/2046 [11:39<13:24, 1.35it/s]
Training 2/3 epoch (loss 0.6238): 47%|βββββ | 964/2046 [11:39<12:49, 1.41it/s]
Training 2/3 epoch (loss 0.4858): 47%|βββββ | 964/2046 [11:39<12:49, 1.41it/s]
Training 2/3 epoch (loss 0.4858): 47%|βββββ | 965/2046 [11:39<12:16, 1.47it/s]
Training 2/3 epoch (loss 0.5004): 47%|βββββ | 965/2046 [11:40<12:16, 1.47it/s]
Training 2/3 epoch (loss 0.5004): 47%|βββββ | 966/2046 [11:40<11:51, 1.52it/s]
Training 2/3 epoch (loss 0.5401): 47%|βββββ | 966/2046 [11:40<11:51, 1.52it/s]
Training 2/3 epoch (loss 0.5401): 47%|βββββ | 967/2046 [11:40<11:37, 1.55it/s]
Training 2/3 epoch (loss 0.5119): 47%|βββββ | 967/2046 [11:41<11:37, 1.55it/s]
Training 2/3 epoch (loss 0.5119): 47%|βββββ | 968/2046 [11:41<12:13, 1.47it/s]
Training 2/3 epoch (loss 0.5297): 47%|βββββ | 968/2046 [11:42<12:13, 1.47it/s]
Training 2/3 epoch (loss 0.5297): 47%|βββββ | 969/2046 [11:42<12:00, 1.50it/s]
Training 2/3 epoch (loss 0.5349): 47%|βββββ | 969/2046 [11:43<12:00, 1.50it/s]
Training 2/3 epoch (loss 0.5349): 47%|βββββ | 970/2046 [11:43<11:55, 1.50it/s]
Training 2/3 epoch (loss 0.5161): 47%|βββββ | 970/2046 [11:43<11:55, 1.50it/s]
Training 2/3 epoch (loss 0.5161): 47%|βββββ | 971/2046 [11:43<11:34, 1.55it/s]
Training 2/3 epoch (loss 0.4272): 47%|βββββ | 971/2046 [11:44<11:34, 1.55it/s]
Training 2/3 epoch (loss 0.4272): 48%|βββββ | 972/2046 [11:44<11:42, 1.53it/s]
Training 2/3 epoch (loss 0.5427): 48%|βββββ | 972/2046 [11:44<11:42, 1.53it/s]
Training 2/3 epoch (loss 0.5427): 48%|βββββ | 973/2046 [11:44<11:32, 1.55it/s]
Training 2/3 epoch (loss 0.4189): 48%|βββββ | 973/2046 [11:45<11:32, 1.55it/s]
Training 2/3 epoch (loss 0.4189): 48%|βββββ | 974/2046 [11:45<11:43, 1.52it/s]
Training 2/3 epoch (loss 0.6806): 48%|βββββ | 974/2046 [11:46<11:43, 1.52it/s]
Training 2/3 epoch (loss 0.6806): 48%|βββββ | 975/2046 [11:46<12:01, 1.48it/s]
Training 2/3 epoch (loss 0.4586): 48%|βββββ | 975/2046 [11:47<12:01, 1.48it/s]
Training 2/3 epoch (loss 0.4586): 48%|βββββ | 976/2046 [11:47<12:27, 1.43it/s]
Training 2/3 epoch (loss 0.4642): 48%|βββββ | 976/2046 [11:47<12:27, 1.43it/s]
Training 2/3 epoch (loss 0.4642): 48%|βββββ | 977/2046 [11:47<12:03, 1.48it/s]
Training 2/3 epoch (loss 0.6688): 48%|βββββ | 977/2046 [11:48<12:03, 1.48it/s]
Training 2/3 epoch (loss 0.6688): 48%|βββββ | 978/2046 [11:48<13:06, 1.36it/s]
Training 2/3 epoch (loss 0.5889): 48%|βββββ | 978/2046 [11:49<13:06, 1.36it/s]
Training 2/3 epoch (loss 0.5889): 48%|βββββ | 979/2046 [11:49<13:06, 1.36it/s]
Training 2/3 epoch (loss 0.5485): 48%|βββββ | 979/2046 [11:50<13:06, 1.36it/s]
Training 2/3 epoch (loss 0.5485): 48%|βββββ | 980/2046 [11:50<13:19, 1.33it/s]
Training 2/3 epoch (loss 0.6452): 48%|βββββ | 980/2046 [11:50<13:19, 1.33it/s]
Training 2/3 epoch (loss 0.6452): 48%|βββββ | 981/2046 [11:50<13:02, 1.36it/s]
Training 2/3 epoch (loss 0.5134): 48%|βββββ | 981/2046 [11:51<13:02, 1.36it/s]
Training 2/3 epoch (loss 0.5134): 48%|βββββ | 982/2046 [11:51<12:32, 1.41it/s]
Training 2/3 epoch (loss 0.4690): 48%|βββββ | 982/2046 [11:52<12:32, 1.41it/s]
Training 2/3 epoch (loss 0.4690): 48%|βββββ | 983/2046 [11:52<12:06, 1.46it/s]
Training 2/3 epoch (loss 0.4897): 48%|βββββ | 983/2046 [11:52<12:06, 1.46it/s]
Training 2/3 epoch (loss 0.4897): 48%|βββββ | 984/2046 [11:52<12:53, 1.37it/s]
Training 2/3 epoch (loss 0.4392): 48%|βββββ | 984/2046 [11:53<12:53, 1.37it/s]
Training 2/3 epoch (loss 0.4392): 48%|βββββ | 985/2046 [11:53<13:48, 1.28it/s]
Training 2/3 epoch (loss 0.5682): 48%|βββββ | 985/2046 [11:54<13:48, 1.28it/s]
Training 2/3 epoch (loss 0.5682): 48%|βββββ | 986/2046 [11:54<13:55, 1.27it/s]
Training 2/3 epoch (loss 0.4269): 48%|βββββ | 986/2046 [11:55<13:55, 1.27it/s]
Training 2/3 epoch (loss 0.4269): 48%|βββββ | 987/2046 [11:55<13:30, 1.31it/s]
Training 2/3 epoch (loss 0.4541): 48%|βββββ | 987/2046 [11:55<13:30, 1.31it/s]
Training 2/3 epoch (loss 0.4541): 48%|βββββ | 988/2046 [11:55<12:46, 1.38it/s]
Training 2/3 epoch (loss 0.5489): 48%|βββββ | 988/2046 [11:56<12:46, 1.38it/s]
Training 2/3 epoch (loss 0.5489): 48%|βββββ | 989/2046 [11:56<12:25, 1.42it/s]
Training 2/3 epoch (loss 0.4667): 48%|βββββ | 989/2046 [11:57<12:25, 1.42it/s]
Training 2/3 epoch (loss 0.4667): 48%|βββββ | 990/2046 [11:57<12:43, 1.38it/s]
Training 2/3 epoch (loss 0.5137): 48%|βββββ | 990/2046 [11:58<12:43, 1.38it/s]
Training 2/3 epoch (loss 0.5137): 48%|βββββ | 991/2046 [11:58<12:12, 1.44it/s]
Training 2/3 epoch (loss 0.4405): 48%|βββββ | 991/2046 [11:58<12:12, 1.44it/s]
Training 2/3 epoch (loss 0.4405): 48%|βββββ | 992/2046 [11:58<13:09, 1.33it/s]
Training 2/3 epoch (loss 0.3837): 48%|βββββ | 992/2046 [11:59<13:09, 1.33it/s]
Training 2/3 epoch (loss 0.3837): 49%|βββββ | 993/2046 [11:59<12:58, 1.35it/s]
Training 2/3 epoch (loss 0.5383): 49%|βββββ | 993/2046 [12:00<12:58, 1.35it/s]
Training 2/3 epoch (loss 0.5383): 49%|βββββ | 994/2046 [12:00<12:14, 1.43it/s]
Training 2/3 epoch (loss 0.4911): 49%|βββββ | 994/2046 [12:00<12:14, 1.43it/s]
Training 2/3 epoch (loss 0.4911): 49%|βββββ | 995/2046 [12:00<12:12, 1.43it/s]
Training 2/3 epoch (loss 0.6144): 49%|βββββ | 995/2046 [12:01<12:12, 1.43it/s]
Training 2/3 epoch (loss 0.6144): 49%|βββββ | 996/2046 [12:01<12:43, 1.38it/s]
Training 2/3 epoch (loss 0.4351): 49%|βββββ | 996/2046 [12:02<12:43, 1.38it/s]
Training 2/3 epoch (loss 0.4351): 49%|βββββ | 997/2046 [12:02<12:12, 1.43it/s]
Training 2/3 epoch (loss 0.6050): 49%|βββββ | 997/2046 [12:03<12:12, 1.43it/s]
Training 2/3 epoch (loss 0.6050): 49%|βββββ | 998/2046 [12:03<12:31, 1.39it/s]
Training 2/3 epoch (loss 0.5007): 49%|βββββ | 998/2046 [12:03<12:31, 1.39it/s]
Training 2/3 epoch (loss 0.5007): 49%|βββββ | 999/2046 [12:03<12:35, 1.39it/s]
Training 2/3 epoch (loss 0.5802): 49%|βββββ | 999/2046 [12:04<12:35, 1.39it/s]
Training 2/3 epoch (loss 0.5802): 49%|βββββ | 1000/2046 [12:04<12:45, 1.37it/s]
Training 2/3 epoch (loss 0.5159): 49%|βββββ | 1000/2046 [12:05<12:45, 1.37it/s]
Training 2/3 epoch (loss 0.5159): 49%|βββββ | 1001/2046 [12:05<12:23, 1.41it/s]
Training 2/3 epoch (loss 0.4918): 49%|βββββ | 1001/2046 [12:05<12:23, 1.41it/s]
Training 2/3 epoch (loss 0.4918): 49%|βββββ | 1002/2046 [12:05<12:38, 1.38it/s]
Training 2/3 epoch (loss 0.5936): 49%|βββββ | 1002/2046 [12:06<12:38, 1.38it/s]
Training 2/3 epoch (loss 0.5936): 49%|βββββ | 1003/2046 [12:06<12:31, 1.39it/s]
Training 2/3 epoch (loss 0.5023): 49%|βββββ | 1003/2046 [12:07<12:31, 1.39it/s]
Training 2/3 epoch (loss 0.5023): 49%|βββββ | 1004/2046 [12:07<12:03, 1.44it/s]
Training 2/3 epoch (loss 0.6273): 49%|βββββ | 1004/2046 [12:08<12:03, 1.44it/s]
Training 2/3 epoch (loss 0.6273): 49%|βββββ | 1005/2046 [12:08<12:40, 1.37it/s]
Training 2/3 epoch (loss 0.5260): 49%|βββββ | 1005/2046 [12:08<12:40, 1.37it/s]
Training 2/3 epoch (loss 0.5260): 49%|βββββ | 1006/2046 [12:08<12:04, 1.44it/s]
Training 2/3 epoch (loss 0.6145): 49%|βββββ | 1006/2046 [12:09<12:04, 1.44it/s]
Training 2/3 epoch (loss 0.6145): 49%|βββββ | 1007/2046 [12:09<11:38, 1.49it/s]
Training 2/3 epoch (loss 0.3991): 49%|βββββ | 1007/2046 [12:10<11:38, 1.49it/s]
Training 2/3 epoch (loss 0.3991): 49%|βββββ | 1008/2046 [12:10<11:54, 1.45it/s]
Training 2/3 epoch (loss 0.5379): 49%|βββββ | 1008/2046 [12:10<11:54, 1.45it/s]
Training 2/3 epoch (loss 0.5379): 49%|βββββ | 1009/2046 [12:10<11:45, 1.47it/s]
Training 2/3 epoch (loss 0.5090): 49%|βββββ | 1009/2046 [12:11<11:45, 1.47it/s]
Training 2/3 epoch (loss 0.5090): 49%|βββββ | 1010/2046 [12:11<11:23, 1.52it/s]
Training 2/3 epoch (loss 0.3951): 49%|βββββ | 1010/2046 [12:12<11:23, 1.52it/s]
Training 2/3 epoch (loss 0.3951): 49%|βββββ | 1011/2046 [12:12<11:41, 1.48it/s]
Training 2/3 epoch (loss 0.5071): 49%|βββββ | 1011/2046 [12:12<11:41, 1.48it/s]
Training 2/3 epoch (loss 0.5071): 49%|βββββ | 1012/2046 [12:12<11:25, 1.51it/s]
Training 2/3 epoch (loss 0.4552): 49%|βββββ | 1012/2046 [12:13<11:25, 1.51it/s]
Training 2/3 epoch (loss 0.4552): 50%|βββββ | 1013/2046 [12:13<11:07, 1.55it/s]
Training 2/3 epoch (loss 0.4108): 50%|βββββ | 1013/2046 [12:14<11:07, 1.55it/s]
Training 2/3 epoch (loss 0.4108): 50%|βββββ | 1014/2046 [12:14<11:17, 1.52it/s]
Training 2/3 epoch (loss 0.5032): 50%|βββββ | 1014/2046 [12:14<11:17, 1.52it/s]
Training 2/3 epoch (loss 0.5032): 50%|βββββ | 1015/2046 [12:14<11:51, 1.45it/s]
Training 2/3 epoch (loss 0.4619): 50%|βββββ | 1015/2046 [12:15<11:51, 1.45it/s]
Training 2/3 epoch (loss 0.4619): 50%|βββββ | 1016/2046 [12:15<13:00, 1.32it/s]
Training 2/3 epoch (loss 0.5048): 50%|βββββ | 1016/2046 [12:16<13:00, 1.32it/s]
Training 2/3 epoch (loss 0.5048): 50%|βββββ | 1017/2046 [12:16<13:00, 1.32it/s]
Training 2/3 epoch (loss 0.3930): 50%|βββββ | 1017/2046 [12:17<13:00, 1.32it/s]
Training 2/3 epoch (loss 0.3930): 50%|βββββ | 1018/2046 [12:17<12:16, 1.40it/s]
Training 2/3 epoch (loss 0.6775): 50%|βββββ | 1018/2046 [12:17<12:16, 1.40it/s]
Training 2/3 epoch (loss 0.6775): 50%|βββββ | 1019/2046 [12:17<11:48, 1.45it/s]
Training 2/3 epoch (loss 0.4987): 50%|βββββ | 1019/2046 [12:18<11:48, 1.45it/s]
Training 2/3 epoch (loss 0.4987): 50%|βββββ | 1020/2046 [12:18<12:20, 1.39it/s]
Training 2/3 epoch (loss 0.4900): 50%|βββββ | 1020/2046 [12:19<12:20, 1.39it/s]
Training 2/3 epoch (loss 0.4900): 50%|βββββ | 1021/2046 [12:19<12:35, 1.36it/s]
Training 2/3 epoch (loss 0.5298): 50%|βββββ | 1021/2046 [12:20<12:35, 1.36it/s]
Training 2/3 epoch (loss 0.5298): 50%|βββββ | 1022/2046 [12:20<13:35, 1.26it/s]
Training 2/3 epoch (loss 0.3687): 50%|βββββ | 1022/2046 [12:20<13:35, 1.26it/s]
Training 2/3 epoch (loss 0.3687): 50%|βββββ | 1023/2046 [12:20<12:43, 1.34it/s]
Training 2/3 epoch (loss 0.4755): 50%|βββββ | 1023/2046 [12:21<12:43, 1.34it/s]
Training 2/3 epoch (loss 0.4755): 50%|βββββ | 1024/2046 [12:21<12:53, 1.32it/s]
Training 2/3 epoch (loss 0.4541): 50%|βββββ | 1024/2046 [12:22<12:53, 1.32it/s]
Training 2/3 epoch (loss 0.4541): 50%|βββββ | 1025/2046 [12:22<12:59, 1.31it/s]
Training 2/3 epoch (loss 0.4941): 50%|βββββ | 1025/2046 [12:23<12:59, 1.31it/s]
Training 2/3 epoch (loss 0.4941): 50%|βββββ | 1026/2046 [12:23<13:39, 1.25it/s]
Training 2/3 epoch (loss 0.5604): 50%|βββββ | 1026/2046 [12:24<13:39, 1.25it/s]
Training 2/3 epoch (loss 0.5604): 50%|βββββ | 1027/2046 [12:24<14:02, 1.21it/s]
Training 2/3 epoch (loss 0.4029): 50%|βββββ | 1027/2046 [12:24<14:02, 1.21it/s]
Training 2/3 epoch (loss 0.4029): 50%|βββββ | 1028/2046 [12:24<13:33, 1.25it/s]
Training 2/3 epoch (loss 0.4275): 50%|βββββ | 1028/2046 [12:25<13:33, 1.25it/s]
Training 2/3 epoch (loss 0.4275): 50%|βββββ | 1029/2046 [12:25<12:31, 1.35it/s]
Training 2/3 epoch (loss 0.4325): 50%|βββββ | 1029/2046 [12:26<12:31, 1.35it/s]
Training 2/3 epoch (loss 0.4325): 50%|βββββ | 1030/2046 [12:26<13:13, 1.28it/s]
Training 2/3 epoch (loss 0.6463): 50%|βββββ | 1030/2046 [12:27<13:13, 1.28it/s]
Training 2/3 epoch (loss 0.6463): 50%|βββββ | 1031/2046 [12:27<12:41, 1.33it/s]
Training 2/3 epoch (loss 0.5229): 50%|βββββ | 1031/2046 [12:27<12:41, 1.33it/s]
Training 2/3 epoch (loss 0.5229): 50%|βββββ | 1032/2046 [12:27<13:08, 1.29it/s]
Training 2/3 epoch (loss 0.4925): 50%|βββββ | 1032/2046 [12:28<13:08, 1.29it/s]
Training 2/3 epoch (loss 0.4925): 50%|βββββ | 1033/2046 [12:28<13:10, 1.28it/s]
Training 2/3 epoch (loss 0.4254): 50%|βββββ | 1033/2046 [12:29<13:10, 1.28it/s]
Training 2/3 epoch (loss 0.4254): 51%|βββββ | 1034/2046 [12:29<13:18, 1.27it/s]
Training 2/3 epoch (loss 0.3932): 51%|βββββ | 1034/2046 [12:30<13:18, 1.27it/s]
Training 2/3 epoch (loss 0.3932): 51%|βββββ | 1035/2046 [12:30<12:51, 1.31it/s]
Training 2/3 epoch (loss 0.4541): 51%|βββββ | 1035/2046 [12:30<12:51, 1.31it/s]
Training 2/3 epoch (loss 0.4541): 51%|βββββ | 1036/2046 [12:30<12:33, 1.34it/s]
Training 2/3 epoch (loss 0.4685): 51%|βββββ | 1036/2046 [12:31<12:33, 1.34it/s]
Training 2/3 epoch (loss 0.4685): 51%|βββββ | 1037/2046 [12:31<12:10, 1.38it/s]
Training 2/3 epoch (loss 0.5424): 51%|βββββ | 1037/2046 [12:32<12:10, 1.38it/s]
Training 2/3 epoch (loss 0.5424): 51%|βββββ | 1038/2046 [12:32<11:38, 1.44it/s]
Training 2/3 epoch (loss 0.4343): 51%|βββββ | 1038/2046 [12:32<11:38, 1.44it/s]
Training 2/3 epoch (loss 0.4343): 51%|βββββ | 1039/2046 [12:32<11:18, 1.48it/s]
Training 2/3 epoch (loss 0.4953): 51%|βββββ | 1039/2046 [12:33<11:18, 1.48it/s]
Training 2/3 epoch (loss 0.4953): 51%|βββββ | 1040/2046 [12:33<12:20, 1.36it/s]
Training 2/3 epoch (loss 0.4254): 51%|βββββ | 1040/2046 [12:34<12:20, 1.36it/s]
Training 2/3 epoch (loss 0.4254): 51%|βββββ | 1041/2046 [12:34<12:11, 1.37it/s]
Training 2/3 epoch (loss 0.4897): 51%|βββββ | 1041/2046 [12:35<12:11, 1.37it/s]
Training 2/3 epoch (loss 0.4897): 51%|βββββ | 1042/2046 [12:35<12:24, 1.35it/s]
Training 2/3 epoch (loss 0.4343): 51%|βββββ | 1042/2046 [12:35<12:24, 1.35it/s]
Training 2/3 epoch (loss 0.4343): 51%|βββββ | 1043/2046 [12:35<12:08, 1.38it/s]
Training 2/3 epoch (loss 0.5169): 51%|βββββ | 1043/2046 [12:36<12:08, 1.38it/s]
Training 2/3 epoch (loss 0.5169): 51%|βββββ | 1044/2046 [12:36<12:06, 1.38it/s]
Training 2/3 epoch (loss 0.5957): 51%|βββββ | 1044/2046 [12:37<12:06, 1.38it/s]
Training 2/3 epoch (loss 0.5957): 51%|βββββ | 1045/2046 [12:37<11:57, 1.39it/s]
Training 2/3 epoch (loss 0.5629): 51%|βββββ | 1045/2046 [12:37<11:57, 1.39it/s]
Training 2/3 epoch (loss 0.5629): 51%|βββββ | 1046/2046 [12:37<11:39, 1.43it/s]
Training 2/3 epoch (loss 0.3904): 51%|βββββ | 1046/2046 [12:38<11:39, 1.43it/s]
Training 2/3 epoch (loss 0.3904): 51%|βββββ | 1047/2046 [12:38<12:39, 1.32it/s]
Training 2/3 epoch (loss 0.5002): 51%|βββββ | 1047/2046 [12:39<12:39, 1.32it/s]
Training 2/3 epoch (loss 0.5002): 51%|βββββ | 1048/2046 [12:39<12:48, 1.30it/s]
Training 2/3 epoch (loss 0.5986): 51%|βββββ | 1048/2046 [12:40<12:48, 1.30it/s]
Training 2/3 epoch (loss 0.5986): 51%|ββββββ | 1049/2046 [12:40<12:22, 1.34it/s]
Training 2/3 epoch (loss 0.5093): 51%|ββββββ | 1049/2046 [12:40<12:22, 1.34it/s]
Training 2/3 epoch (loss 0.5093): 51%|ββββββ | 1050/2046 [12:40<11:43, 1.42it/s]
Training 2/3 epoch (loss 0.5234): 51%|ββββββ | 1050/2046 [12:41<11:43, 1.42it/s]
Training 2/3 epoch (loss 0.5234): 51%|ββββββ | 1051/2046 [12:41<11:21, 1.46it/s]
Training 2/3 epoch (loss 0.6037): 51%|ββββββ | 1051/2046 [12:42<11:21, 1.46it/s]
Training 2/3 epoch (loss 0.6037): 51%|ββββββ | 1052/2046 [12:42<11:26, 1.45it/s]
Training 2/3 epoch (loss 0.4114): 51%|ββββββ | 1052/2046 [12:43<11:26, 1.45it/s]
Training 2/3 epoch (loss 0.4114): 51%|ββββββ | 1053/2046 [12:43<11:37, 1.42it/s]
Training 2/3 epoch (loss 0.4905): 51%|ββββββ | 1053/2046 [12:43<11:37, 1.42it/s]
Training 2/3 epoch (loss 0.4905): 52%|ββββββ | 1054/2046 [12:43<11:28, 1.44it/s]
Training 2/3 epoch (loss 0.6621): 52%|ββββββ | 1054/2046 [12:44<11:28, 1.44it/s]
Training 2/3 epoch (loss 0.6621): 52%|ββββββ | 1055/2046 [12:44<11:28, 1.44it/s]
Training 2/3 epoch (loss 0.5025): 52%|ββββββ | 1055/2046 [12:45<11:28, 1.44it/s]
Training 2/3 epoch (loss 0.5025): 52%|ββββββ | 1056/2046 [12:45<12:00, 1.37it/s]
Training 2/3 epoch (loss 0.5132): 52%|ββββββ | 1056/2046 [12:46<12:00, 1.37it/s]
Training 2/3 epoch (loss 0.5132): 52%|ββββββ | 1057/2046 [12:46<12:25, 1.33it/s]
Training 2/3 epoch (loss 0.4179): 52%|ββββββ | 1057/2046 [12:46<12:25, 1.33it/s]
Training 2/3 epoch (loss 0.4179): 52%|ββββββ | 1058/2046 [12:46<12:06, 1.36it/s]
Training 2/3 epoch (loss 0.4710): 52%|ββββββ | 1058/2046 [12:47<12:06, 1.36it/s]
Training 2/3 epoch (loss 0.4710): 52%|ββββββ | 1059/2046 [12:47<11:33, 1.42it/s]
Training 2/3 epoch (loss 0.4253): 52%|ββββββ | 1059/2046 [12:48<11:33, 1.42it/s]
Training 2/3 epoch (loss 0.4253): 52%|ββββββ | 1060/2046 [12:48<12:40, 1.30it/s]
Training 2/3 epoch (loss 0.5618): 52%|ββββββ | 1060/2046 [12:49<12:40, 1.30it/s]
Training 2/3 epoch (loss 0.5618): 52%|ββββββ | 1061/2046 [12:49<12:32, 1.31it/s]
Training 2/3 epoch (loss 0.4443): 52%|ββββββ | 1061/2046 [12:49<12:32, 1.31it/s]
Training 2/3 epoch (loss 0.4443): 52%|ββββββ | 1062/2046 [12:49<11:48, 1.39it/s]
Training 2/3 epoch (loss 0.4850): 52%|ββββββ | 1062/2046 [12:50<11:48, 1.39it/s]
Training 2/3 epoch (loss 0.4850): 52%|ββββββ | 1063/2046 [12:50<11:29, 1.43it/s]
Training 2/3 epoch (loss 0.5171): 52%|ββββββ | 1063/2046 [12:51<11:29, 1.43it/s]
Training 2/3 epoch (loss 0.5171): 52%|ββββββ | 1064/2046 [12:51<12:17, 1.33it/s]
Training 2/3 epoch (loss 0.5450): 52%|ββββββ | 1064/2046 [12:51<12:17, 1.33it/s]
Training 2/3 epoch (loss 0.5450): 52%|ββββββ | 1065/2046 [12:51<12:10, 1.34it/s]
Training 2/3 epoch (loss 0.4366): 52%|ββββββ | 1065/2046 [12:52<12:10, 1.34it/s]
Training 2/3 epoch (loss 0.4366): 52%|ββββββ | 1066/2046 [12:52<11:51, 1.38it/s]
Training 2/3 epoch (loss 0.4980): 52%|ββββββ | 1066/2046 [12:53<11:51, 1.38it/s]
Training 2/3 epoch (loss 0.4980): 52%|ββββββ | 1067/2046 [12:53<12:16, 1.33it/s]
Training 2/3 epoch (loss 0.4901): 52%|ββββββ | 1067/2046 [12:54<12:16, 1.33it/s]
Training 2/3 epoch (loss 0.4901): 52%|ββββββ | 1068/2046 [12:54<12:04, 1.35it/s]
Training 2/3 epoch (loss 0.5580): 52%|ββββββ | 1068/2046 [12:55<12:04, 1.35it/s]
Training 2/3 epoch (loss 0.5580): 52%|ββββββ | 1069/2046 [12:55<13:43, 1.19it/s]
Training 2/3 epoch (loss 0.4769): 52%|ββββββ | 1069/2046 [12:55<13:43, 1.19it/s]
Training 2/3 epoch (loss 0.4769): 52%|ββββββ | 1070/2046 [12:55<12:53, 1.26it/s]
Training 2/3 epoch (loss 0.4413): 52%|ββββββ | 1070/2046 [12:56<12:53, 1.26it/s]
Training 2/3 epoch (loss 0.4413): 52%|ββββββ | 1071/2046 [12:56<12:10, 1.33it/s]
Training 2/3 epoch (loss 0.6696): 52%|ββββββ | 1071/2046 [12:57<12:10, 1.33it/s]
Training 2/3 epoch (loss 0.6696): 52%|ββββββ | 1072/2046 [12:57<12:58, 1.25it/s]
Training 2/3 epoch (loss 0.3677): 52%|ββββββ | 1072/2046 [12:58<12:58, 1.25it/s]
Training 2/3 epoch (loss 0.3677): 52%|ββββββ | 1073/2046 [12:58<12:13, 1.33it/s]
Training 2/3 epoch (loss 0.3735): 52%|ββββββ | 1073/2046 [12:58<12:13, 1.33it/s]
Training 2/3 epoch (loss 0.3735): 52%|ββββββ | 1074/2046 [12:58<11:37, 1.39it/s]
Training 2/3 epoch (loss 0.4132): 52%|ββββββ | 1074/2046 [12:59<11:37, 1.39it/s]
Training 2/3 epoch (loss 0.4132): 53%|ββββββ | 1075/2046 [12:59<11:27, 1.41it/s]
Training 2/3 epoch (loss 0.4893): 53%|ββββββ | 1075/2046 [13:00<11:27, 1.41it/s]
Training 2/3 epoch (loss 0.4893): 53%|ββββββ | 1076/2046 [13:00<11:40, 1.38it/s]
Training 2/3 epoch (loss 0.4479): 53%|ββββββ | 1076/2046 [13:00<11:40, 1.38it/s]
Training 2/3 epoch (loss 0.4479): 53%|ββββββ | 1077/2046 [13:00<11:23, 1.42it/s]
Training 2/3 epoch (loss 0.5829): 53%|ββββββ | 1077/2046 [13:01<11:23, 1.42it/s]
Training 2/3 epoch (loss 0.5829): 53%|ββββββ | 1078/2046 [13:01<10:56, 1.48it/s]
Training 2/3 epoch (loss 0.4731): 53%|ββββββ | 1078/2046 [13:02<10:56, 1.48it/s]
Training 2/3 epoch (loss 0.4731): 53%|ββββββ | 1079/2046 [13:02<11:52, 1.36it/s]
Training 2/3 epoch (loss 0.5016): 53%|ββββββ | 1079/2046 [13:03<11:52, 1.36it/s]
Training 2/3 epoch (loss 0.5016): 53%|ββββββ | 1080/2046 [13:03<11:58, 1.34it/s]
Training 2/3 epoch (loss 0.4903): 53%|ββββββ | 1080/2046 [13:03<11:58, 1.34it/s]
Training 2/3 epoch (loss 0.4903): 53%|ββββββ | 1081/2046 [13:03<11:29, 1.40it/s]
Training 2/3 epoch (loss 0.5460): 53%|ββββββ | 1081/2046 [13:04<11:29, 1.40it/s]
Training 2/3 epoch (loss 0.5460): 53%|ββββββ | 1082/2046 [13:04<11:23, 1.41it/s]
Training 2/3 epoch (loss 0.5347): 53%|ββββββ | 1082/2046 [13:05<11:23, 1.41it/s]
Training 2/3 epoch (loss 0.5347): 53%|ββββββ | 1083/2046 [13:05<11:15, 1.43it/s]
Training 2/3 epoch (loss 0.4884): 53%|ββββββ | 1083/2046 [13:05<11:15, 1.43it/s]
Training 2/3 epoch (loss 0.4884): 53%|ββββββ | 1084/2046 [13:05<11:16, 1.42it/s]
Training 2/3 epoch (loss 0.4813): 53%|ββββββ | 1084/2046 [13:06<11:16, 1.42it/s]
Training 2/3 epoch (loss 0.4813): 53%|ββββββ | 1085/2046 [13:06<12:18, 1.30it/s]
Training 2/3 epoch (loss 0.4519): 53%|ββββββ | 1085/2046 [13:07<12:18, 1.30it/s]
Training 2/3 epoch (loss 0.4519): 53%|ββββββ | 1086/2046 [13:07<11:42, 1.37it/s]
Training 2/3 epoch (loss 0.5164): 53%|ββββββ | 1086/2046 [13:08<11:42, 1.37it/s]
Training 2/3 epoch (loss 0.5164): 53%|ββββββ | 1087/2046 [13:08<11:39, 1.37it/s]
Training 2/3 epoch (loss 0.4137): 53%|ββββββ | 1087/2046 [13:08<11:39, 1.37it/s]
Training 2/3 epoch (loss 0.4137): 53%|ββββββ | 1088/2046 [13:08<12:07, 1.32it/s]
Training 2/3 epoch (loss 0.4984): 53%|ββββββ | 1088/2046 [13:09<12:07, 1.32it/s]
Training 2/3 epoch (loss 0.4984): 53%|ββββββ | 1089/2046 [13:09<11:38, 1.37it/s]
Training 2/3 epoch (loss 0.4784): 53%|ββββββ | 1089/2046 [13:10<11:38, 1.37it/s]
Training 2/3 epoch (loss 0.4784): 53%|ββββββ | 1090/2046 [13:10<11:07, 1.43it/s]
Training 2/3 epoch (loss 0.3484): 53%|ββββββ | 1090/2046 [13:10<11:07, 1.43it/s]
Training 2/3 epoch (loss 0.3484): 53%|ββββββ | 1091/2046 [13:10<10:52, 1.46it/s]
Training 2/3 epoch (loss 0.3939): 53%|ββββββ | 1091/2046 [13:11<10:52, 1.46it/s]
Training 2/3 epoch (loss 0.3939): 53%|ββββββ | 1092/2046 [13:11<11:11, 1.42it/s]
Training 2/3 epoch (loss 0.4007): 53%|ββββββ | 1092/2046 [13:12<11:11, 1.42it/s]
Training 2/3 epoch (loss 0.4007): 53%|ββββββ | 1093/2046 [13:12<11:30, 1.38it/s]
Training 2/3 epoch (loss 0.4957): 53%|ββββββ | 1093/2046 [13:13<11:30, 1.38it/s]
Training 2/3 epoch (loss 0.4957): 53%|ββββββ | 1094/2046 [13:13<11:02, 1.44it/s]
Training 2/3 epoch (loss 0.4287): 53%|ββββββ | 1094/2046 [13:13<11:02, 1.44it/s]
Training 2/3 epoch (loss 0.4287): 54%|ββββββ | 1095/2046 [13:13<10:40, 1.49it/s]
Training 2/3 epoch (loss 0.4176): 54%|ββββββ | 1095/2046 [13:14<10:40, 1.49it/s]
Training 2/3 epoch (loss 0.4176): 54%|ββββββ | 1096/2046 [13:14<12:32, 1.26it/s]
Training 2/3 epoch (loss 0.4496): 54%|ββββββ | 1096/2046 [13:15<12:32, 1.26it/s]
Training 2/3 epoch (loss 0.4496): 54%|ββββββ | 1097/2046 [13:15<11:51, 1.33it/s]
Training 2/3 epoch (loss 0.3820): 54%|ββββββ | 1097/2046 [13:16<11:51, 1.33it/s]
Training 2/3 epoch (loss 0.3820): 54%|ββββββ | 1098/2046 [13:16<11:23, 1.39it/s]
Training 2/3 epoch (loss 0.4617): 54%|ββββββ | 1098/2046 [13:16<11:23, 1.39it/s]
Training 2/3 epoch (loss 0.4617): 54%|ββββββ | 1099/2046 [13:16<10:55, 1.44it/s]
Training 2/3 epoch (loss 0.4137): 54%|ββββββ | 1099/2046 [13:17<10:55, 1.44it/s]
Training 2/3 epoch (loss 0.4137): 54%|ββββββ | 1100/2046 [13:17<10:36, 1.49it/s]
Training 2/3 epoch (loss 0.4973): 54%|ββββββ | 1100/2046 [13:17<10:36, 1.49it/s]
Training 2/3 epoch (loss 0.4973): 54%|ββββββ | 1101/2046 [13:17<10:29, 1.50it/s]
Training 2/3 epoch (loss 0.3800): 54%|ββββββ | 1101/2046 [13:18<10:29, 1.50it/s]
Training 2/3 epoch (loss 0.3800): 54%|ββββββ | 1102/2046 [13:18<12:01, 1.31it/s]
Training 2/3 epoch (loss 0.4549): 54%|ββββββ | 1102/2046 [13:19<12:01, 1.31it/s]
Training 2/3 epoch (loss 0.4549): 54%|ββββββ | 1103/2046 [13:19<12:11, 1.29it/s]
Training 2/3 epoch (loss 0.5332): 54%|ββββββ | 1103/2046 [13:20<12:11, 1.29it/s]
Training 2/3 epoch (loss 0.5332): 54%|ββββββ | 1104/2046 [13:20<14:11, 1.11it/s]
Training 2/3 epoch (loss 0.4017): 54%|ββββββ | 1104/2046 [13:21<14:11, 1.11it/s]
Training 2/3 epoch (loss 0.4017): 54%|ββββββ | 1105/2046 [13:21<13:47, 1.14it/s]
Training 2/3 epoch (loss 0.6703): 54%|ββββββ | 1105/2046 [13:22<13:47, 1.14it/s]
Training 2/3 epoch (loss 0.6703): 54%|ββββββ | 1106/2046 [13:22<12:36, 1.24it/s]
Training 2/3 epoch (loss 0.4833): 54%|ββββββ | 1106/2046 [13:22<12:36, 1.24it/s]
Training 2/3 epoch (loss 0.4833): 54%|ββββββ | 1107/2046 [13:22<11:41, 1.34it/s]
Training 2/3 epoch (loss 0.4186): 54%|ββββββ | 1107/2046 [13:23<11:41, 1.34it/s]
Training 2/3 epoch (loss 0.4186): 54%|ββββββ | 1108/2046 [13:23<11:37, 1.34it/s]
Training 2/3 epoch (loss 0.4702): 54%|ββββββ | 1108/2046 [13:24<11:37, 1.34it/s]
Training 2/3 epoch (loss 0.4702): 54%|ββββββ | 1109/2046 [13:24<11:24, 1.37it/s]
Training 2/3 epoch (loss 0.5274): 54%|ββββββ | 1109/2046 [13:25<11:24, 1.37it/s]
Training 2/3 epoch (loss 0.5274): 54%|ββββββ | 1110/2046 [13:25<11:28, 1.36it/s]
Training 2/3 epoch (loss 0.5391): 54%|ββββββ | 1110/2046 [13:25<11:28, 1.36it/s]
Training 2/3 epoch (loss 0.5391): 54%|ββββββ | 1111/2046 [13:25<11:09, 1.40it/s]
Training 2/3 epoch (loss 0.4954): 54%|ββββββ | 1111/2046 [13:26<11:09, 1.40it/s]
Training 2/3 epoch (loss 0.4954): 54%|ββββββ | 1112/2046 [13:26<11:36, 1.34it/s]
Training 2/3 epoch (loss 0.5179): 54%|ββββββ | 1112/2046 [13:27<11:36, 1.34it/s]
Training 2/3 epoch (loss 0.5179): 54%|ββββββ | 1113/2046 [13:27<11:39, 1.33it/s]
Training 2/3 epoch (loss 0.5089): 54%|ββββββ | 1113/2046 [13:28<11:39, 1.33it/s]
Training 2/3 epoch (loss 0.5089): 54%|ββββββ | 1114/2046 [13:28<11:05, 1.40it/s]
Training 2/3 epoch (loss 0.4607): 54%|ββββββ | 1114/2046 [13:28<11:05, 1.40it/s]
Training 2/3 epoch (loss 0.4607): 54%|ββββββ | 1115/2046 [13:28<10:40, 1.45it/s]
Training 2/3 epoch (loss 0.4746): 54%|ββββββ | 1115/2046 [13:29<10:40, 1.45it/s]
Training 2/3 epoch (loss 0.4746): 55%|ββββββ | 1116/2046 [13:29<10:25, 1.49it/s]
Training 2/3 epoch (loss 0.5624): 55%|ββββββ | 1116/2046 [13:29<10:25, 1.49it/s]
Training 2/3 epoch (loss 0.5624): 55%|ββββββ | 1117/2046 [13:29<10:13, 1.51it/s]
Training 2/3 epoch (loss 0.3632): 55%|ββββββ | 1117/2046 [13:30<10:13, 1.51it/s]
Training 2/3 epoch (loss 0.3632): 55%|ββββββ | 1118/2046 [13:30<10:31, 1.47it/s]
Training 2/3 epoch (loss 0.4651): 55%|ββββββ | 1118/2046 [13:31<10:31, 1.47it/s]
Training 2/3 epoch (loss 0.4651): 55%|ββββββ | 1119/2046 [13:31<10:22, 1.49it/s]
Training 2/3 epoch (loss 0.3601): 55%|ββββββ | 1119/2046 [13:32<10:22, 1.49it/s]
Training 2/3 epoch (loss 0.3601): 55%|ββββββ | 1120/2046 [13:32<10:41, 1.44it/s]
Training 2/3 epoch (loss 0.6212): 55%|ββββββ | 1120/2046 [13:32<10:41, 1.44it/s]
Training 2/3 epoch (loss 0.6212): 55%|ββββββ | 1121/2046 [13:32<10:56, 1.41it/s]
Training 2/3 epoch (loss 0.4260): 55%|ββββββ | 1121/2046 [13:33<10:56, 1.41it/s]
Training 2/3 epoch (loss 0.4260): 55%|ββββββ | 1122/2046 [13:33<10:58, 1.40it/s]
Training 2/3 epoch (loss 0.4043): 55%|ββββββ | 1122/2046 [13:34<10:58, 1.40it/s]
Training 2/3 epoch (loss 0.4043): 55%|ββββββ | 1123/2046 [13:34<11:43, 1.31it/s]
Training 2/3 epoch (loss 0.5522): 55%|ββββββ | 1123/2046 [13:35<11:43, 1.31it/s]
Training 2/3 epoch (loss 0.5522): 55%|ββββββ | 1124/2046 [13:35<11:17, 1.36it/s]
Training 2/3 epoch (loss 0.5538): 55%|ββββββ | 1124/2046 [13:35<11:17, 1.36it/s]
Training 2/3 epoch (loss 0.5538): 55%|ββββββ | 1125/2046 [13:35<11:40, 1.31it/s]
Training 2/3 epoch (loss 0.4550): 55%|ββββββ | 1125/2046 [13:36<11:40, 1.31it/s]
Training 2/3 epoch (loss 0.4550): 55%|ββββββ | 1126/2046 [13:36<11:17, 1.36it/s]
Training 2/3 epoch (loss 0.4088): 55%|ββββββ | 1126/2046 [13:37<11:17, 1.36it/s]
Training 2/3 epoch (loss 0.4088): 55%|ββββββ | 1127/2046 [13:37<10:56, 1.40it/s]
Training 2/3 epoch (loss 0.4521): 55%|ββββββ | 1127/2046 [13:38<10:56, 1.40it/s]
Training 2/3 epoch (loss 0.4521): 55%|ββββββ | 1128/2046 [13:38<11:46, 1.30it/s]
Training 2/3 epoch (loss 0.4245): 55%|ββββββ | 1128/2046 [13:38<11:46, 1.30it/s]
Training 2/3 epoch (loss 0.4245): 55%|ββββββ | 1129/2046 [13:38<11:02, 1.38it/s]
Training 2/3 epoch (loss 0.3722): 55%|ββββββ | 1129/2046 [13:39<11:02, 1.38it/s]
Training 2/3 epoch (loss 0.3722): 55%|ββββββ | 1130/2046 [13:39<10:37, 1.44it/s]
Training 2/3 epoch (loss 0.4145): 55%|ββββββ | 1130/2046 [13:39<10:37, 1.44it/s]
Training 2/3 epoch (loss 0.4145): 55%|ββββββ | 1131/2046 [13:39<10:18, 1.48it/s]
Training 2/3 epoch (loss 0.4789): 55%|ββββββ | 1131/2046 [13:40<10:18, 1.48it/s]
Training 2/3 epoch (loss 0.4789): 55%|ββββββ | 1132/2046 [13:40<10:01, 1.52it/s]
Training 2/3 epoch (loss 0.4283): 55%|ββββββ | 1132/2046 [13:41<10:01, 1.52it/s]
Training 2/3 epoch (loss 0.4283): 55%|ββββββ | 1133/2046 [13:41<10:12, 1.49it/s]
Training 2/3 epoch (loss 0.4734): 55%|ββββββ | 1133/2046 [13:41<10:12, 1.49it/s]
Training 2/3 epoch (loss 0.4734): 55%|ββββββ | 1134/2046 [13:41<09:54, 1.54it/s]
Training 2/3 epoch (loss 0.5716): 55%|ββββββ | 1134/2046 [13:42<09:54, 1.54it/s]
Training 2/3 epoch (loss 0.5716): 55%|ββββββ | 1135/2046 [13:42<09:43, 1.56it/s]
Training 2/3 epoch (loss 0.4554): 55%|ββββββ | 1135/2046 [13:43<09:43, 1.56it/s]
Training 2/3 epoch (loss 0.4554): 56%|ββββββ | 1136/2046 [13:43<10:31, 1.44it/s]
Training 2/3 epoch (loss 0.3081): 56%|ββββββ | 1136/2046 [13:44<10:31, 1.44it/s]
Training 2/3 epoch (loss 0.3081): 56%|ββββββ | 1137/2046 [13:44<10:23, 1.46it/s]
Training 2/3 epoch (loss 0.4563): 56%|ββββββ | 1137/2046 [13:44<10:23, 1.46it/s]
Training 2/3 epoch (loss 0.4563): 56%|ββββββ | 1138/2046 [13:44<10:03, 1.51it/s]
Training 2/3 epoch (loss 0.5435): 56%|ββββββ | 1138/2046 [13:45<10:03, 1.51it/s]
Training 2/3 epoch (loss 0.5435): 56%|ββββββ | 1139/2046 [13:45<09:49, 1.54it/s]
Training 2/3 epoch (loss 0.4330): 56%|ββββββ | 1139/2046 [13:45<09:49, 1.54it/s]
Training 2/3 epoch (loss 0.4330): 56%|ββββββ | 1140/2046 [13:45<09:55, 1.52it/s]
Training 2/3 epoch (loss 0.5430): 56%|ββββββ | 1140/2046 [13:46<09:55, 1.52it/s]
Training 2/3 epoch (loss 0.5430): 56%|ββββββ | 1141/2046 [13:46<09:36, 1.57it/s]
Training 2/3 epoch (loss 0.4078): 56%|ββββββ | 1141/2046 [13:47<09:36, 1.57it/s]
Training 2/3 epoch (loss 0.4078): 56%|ββββββ | 1142/2046 [13:47<09:45, 1.54it/s]
Training 2/3 epoch (loss 0.3394): 56%|ββββββ | 1142/2046 [13:47<09:45, 1.54it/s]
Training 2/3 epoch (loss 0.3394): 56%|ββββββ | 1143/2046 [13:47<10:04, 1.49it/s]
Training 2/3 epoch (loss 0.5203): 56%|ββββββ | 1143/2046 [13:48<10:04, 1.49it/s]
Training 2/3 epoch (loss 0.5203): 56%|ββββββ | 1144/2046 [13:48<11:26, 1.31it/s]
Training 2/3 epoch (loss 0.4311): 56%|ββββββ | 1144/2046 [13:49<11:26, 1.31it/s]
Training 2/3 epoch (loss 0.4311): 56%|ββββββ | 1145/2046 [13:49<11:39, 1.29it/s]
Training 2/3 epoch (loss 0.5201): 56%|ββββββ | 1145/2046 [13:50<11:39, 1.29it/s]
Training 2/3 epoch (loss 0.5201): 56%|ββββββ | 1146/2046 [13:50<11:48, 1.27it/s]
Training 2/3 epoch (loss 0.3413): 56%|ββββββ | 1146/2046 [13:51<11:48, 1.27it/s]
Training 2/3 epoch (loss 0.3413): 56%|ββββββ | 1147/2046 [13:51<10:57, 1.37it/s]
Training 2/3 epoch (loss 0.4818): 56%|ββββββ | 1147/2046 [13:51<10:57, 1.37it/s]
Training 2/3 epoch (loss 0.4818): 56%|ββββββ | 1148/2046 [13:51<10:32, 1.42it/s]
Training 2/3 epoch (loss 0.5130): 56%|ββββββ | 1148/2046 [13:52<10:32, 1.42it/s]
Training 2/3 epoch (loss 0.5130): 56%|ββββββ | 1149/2046 [13:52<10:17, 1.45it/s]
Training 2/3 epoch (loss 0.4204): 56%|ββββββ | 1149/2046 [13:53<10:17, 1.45it/s]
Training 2/3 epoch (loss 0.4204): 56%|ββββββ | 1150/2046 [13:53<11:38, 1.28it/s]
Training 2/3 epoch (loss 0.4330): 56%|ββββββ | 1150/2046 [13:54<11:38, 1.28it/s]
Training 2/3 epoch (loss 0.4330): 56%|ββββββ | 1151/2046 [13:54<10:52, 1.37it/s]
Training 2/3 epoch (loss 0.4585): 56%|ββββββ | 1151/2046 [13:54<10:52, 1.37it/s]
Training 2/3 epoch (loss 0.4585): 56%|ββββββ | 1152/2046 [13:54<11:38, 1.28it/s]
Training 2/3 epoch (loss 0.4127): 56%|ββββββ | 1152/2046 [13:55<11:38, 1.28it/s]
Training 2/3 epoch (loss 0.4127): 56%|ββββββ | 1153/2046 [13:55<11:03, 1.35it/s]
Training 2/3 epoch (loss 0.4489): 56%|ββββββ | 1153/2046 [13:56<11:03, 1.35it/s]
Training 2/3 epoch (loss 0.4489): 56%|ββββββ | 1154/2046 [13:56<10:36, 1.40it/s]
Training 2/3 epoch (loss 0.4081): 56%|ββββββ | 1154/2046 [13:56<10:36, 1.40it/s]
Training 2/3 epoch (loss 0.4081): 56%|ββββββ | 1155/2046 [13:56<10:25, 1.42it/s]
Training 2/3 epoch (loss 0.3303): 56%|ββββββ | 1155/2046 [13:57<10:25, 1.42it/s]
Training 2/3 epoch (loss 0.3303): 57%|ββββββ | 1156/2046 [13:57<10:09, 1.46it/s]
Training 2/3 epoch (loss 0.4446): 57%|ββββββ | 1156/2046 [13:58<10:09, 1.46it/s]
Training 2/3 epoch (loss 0.4446): 57%|ββββββ | 1157/2046 [13:58<09:49, 1.51it/s]
Training 2/3 epoch (loss 0.4243): 57%|ββββββ | 1157/2046 [13:58<09:49, 1.51it/s]
Training 2/3 epoch (loss 0.4243): 57%|ββββββ | 1158/2046 [13:58<10:38, 1.39it/s]
Training 2/3 epoch (loss 0.4599): 57%|ββββββ | 1158/2046 [13:59<10:38, 1.39it/s]
Training 2/3 epoch (loss 0.4599): 57%|ββββββ | 1159/2046 [13:59<10:35, 1.40it/s]
Training 2/3 epoch (loss 0.4164): 57%|ββββββ | 1159/2046 [14:00<10:35, 1.40it/s]
Training 2/3 epoch (loss 0.4164): 57%|ββββββ | 1160/2046 [14:00<11:00, 1.34it/s]
Training 2/3 epoch (loss 0.5129): 57%|ββββββ | 1160/2046 [14:01<11:00, 1.34it/s]
Training 2/3 epoch (loss 0.5129): 57%|ββββββ | 1161/2046 [14:01<11:05, 1.33it/s]
Training 2/3 epoch (loss 0.4489): 57%|ββββββ | 1161/2046 [14:01<11:05, 1.33it/s]
Training 2/3 epoch (loss 0.4489): 57%|ββββββ | 1162/2046 [14:01<10:39, 1.38it/s]
Training 2/3 epoch (loss 0.4659): 57%|ββββββ | 1162/2046 [14:02<10:39, 1.38it/s]
Training 2/3 epoch (loss 0.4659): 57%|ββββββ | 1163/2046 [14:02<10:08, 1.45it/s]
Training 2/3 epoch (loss 0.4947): 57%|ββββββ | 1163/2046 [14:03<10:08, 1.45it/s]
Training 2/3 epoch (loss 0.4947): 57%|ββββββ | 1164/2046 [14:03<10:03, 1.46it/s]
Training 2/3 epoch (loss 0.4988): 57%|ββββββ | 1164/2046 [14:03<10:03, 1.46it/s]
Training 2/3 epoch (loss 0.4988): 57%|ββββββ | 1165/2046 [14:03<10:07, 1.45it/s]
Training 2/3 epoch (loss 0.4392): 57%|ββββββ | 1165/2046 [14:04<10:07, 1.45it/s]
Training 2/3 epoch (loss 0.4392): 57%|ββββββ | 1166/2046 [14:04<10:30, 1.40it/s]
Training 2/3 epoch (loss 0.5435): 57%|ββββββ | 1166/2046 [14:05<10:30, 1.40it/s]
Training 2/3 epoch (loss 0.5435): 57%|ββββββ | 1167/2046 [14:05<10:06, 1.45it/s]
Training 2/3 epoch (loss 0.2968): 57%|ββββββ | 1167/2046 [14:06<10:06, 1.45it/s]
Training 2/3 epoch (loss 0.2968): 57%|ββββββ | 1168/2046 [14:06<10:46, 1.36it/s]
Training 2/3 epoch (loss 0.4337): 57%|ββββββ | 1168/2046 [14:06<10:46, 1.36it/s]
Training 2/3 epoch (loss 0.4337): 57%|ββββββ | 1169/2046 [14:06<10:39, 1.37it/s]
Training 2/3 epoch (loss 0.4884): 57%|ββββββ | 1169/2046 [14:07<10:39, 1.37it/s]
Training 2/3 epoch (loss 0.4884): 57%|ββββββ | 1170/2046 [14:07<10:58, 1.33it/s]
Training 2/3 epoch (loss 0.4896): 57%|ββββββ | 1170/2046 [14:08<10:58, 1.33it/s]
Training 2/3 epoch (loss 0.4896): 57%|ββββββ | 1171/2046 [14:08<10:33, 1.38it/s]
Training 2/3 epoch (loss 0.3581): 57%|ββββββ | 1171/2046 [14:09<10:33, 1.38it/s]
Training 2/3 epoch (loss 0.3581): 57%|ββββββ | 1172/2046 [14:09<10:17, 1.42it/s]
Training 2/3 epoch (loss 0.4623): 57%|ββββββ | 1172/2046 [14:09<10:17, 1.42it/s]
Training 2/3 epoch (loss 0.4623): 57%|ββββββ | 1173/2046 [14:09<09:50, 1.48it/s]
Training 2/3 epoch (loss 0.4834): 57%|ββββββ | 1173/2046 [14:10<09:50, 1.48it/s]
Training 2/3 epoch (loss 0.4834): 57%|ββββββ | 1174/2046 [14:10<09:33, 1.52it/s]
Training 2/3 epoch (loss 0.5473): 57%|ββββββ | 1174/2046 [14:10<09:33, 1.52it/s]
Training 2/3 epoch (loss 0.5473): 57%|ββββββ | 1175/2046 [14:10<09:31, 1.52it/s]
Training 2/3 epoch (loss 0.6258): 57%|ββββββ | 1175/2046 [14:11<09:31, 1.52it/s]
Training 2/3 epoch (loss 0.6258): 57%|ββββββ | 1176/2046 [14:11<09:57, 1.46it/s]
Training 2/3 epoch (loss 0.3886): 57%|ββββββ | 1176/2046 [14:12<09:57, 1.46it/s]
Training 2/3 epoch (loss 0.3886): 58%|ββββββ | 1177/2046 [14:12<09:52, 1.47it/s]
Training 2/3 epoch (loss 0.4678): 58%|ββββββ | 1177/2046 [14:12<09:52, 1.47it/s]
Training 2/3 epoch (loss 0.4678): 58%|ββββββ | 1178/2046 [14:12<09:37, 1.50it/s]
Training 2/3 epoch (loss 0.4883): 58%|ββββββ | 1178/2046 [14:13<09:37, 1.50it/s]
Training 2/3 epoch (loss 0.4883): 58%|ββββββ | 1179/2046 [14:13<09:35, 1.51it/s]
Training 2/3 epoch (loss 0.6024): 58%|ββββββ | 1179/2046 [14:14<09:35, 1.51it/s]
Training 2/3 epoch (loss 0.6024): 58%|ββββββ | 1180/2046 [14:14<09:23, 1.54it/s]
Training 2/3 epoch (loss 0.6183): 58%|ββββββ | 1180/2046 [14:14<09:23, 1.54it/s]
Training 2/3 epoch (loss 0.6183): 58%|ββββββ | 1181/2046 [14:14<09:12, 1.57it/s]
Training 2/3 epoch (loss 0.5347): 58%|ββββββ | 1181/2046 [14:15<09:12, 1.57it/s]
Training 2/3 epoch (loss 0.5347): 58%|ββββββ | 1182/2046 [14:15<09:45, 1.48it/s]
Training 2/3 epoch (loss 0.4644): 58%|ββββββ | 1182/2046 [14:16<09:45, 1.48it/s]
Training 2/3 epoch (loss 0.4644): 58%|ββββββ | 1183/2046 [14:16<09:56, 1.45it/s]
Training 2/3 epoch (loss 0.5356): 58%|ββββββ | 1183/2046 [14:17<09:56, 1.45it/s]
Training 2/3 epoch (loss 0.5356): 58%|ββββββ | 1184/2046 [14:17<10:14, 1.40it/s]
Training 2/3 epoch (loss 0.5085): 58%|ββββββ | 1184/2046 [14:17<10:14, 1.40it/s]
Training 2/3 epoch (loss 0.5085): 58%|ββββββ | 1185/2046 [14:17<09:53, 1.45it/s]
Training 2/3 epoch (loss 0.5077): 58%|ββββββ | 1185/2046 [14:18<09:53, 1.45it/s]
Training 2/3 epoch (loss 0.5077): 58%|ββββββ | 1186/2046 [14:18<10:12, 1.40it/s]
Training 2/3 epoch (loss 0.4801): 58%|ββββββ | 1186/2046 [14:19<10:12, 1.40it/s]
Training 2/3 epoch (loss 0.4801): 58%|ββββββ | 1187/2046 [14:19<11:26, 1.25it/s]
Training 2/3 epoch (loss 0.5083): 58%|ββββββ | 1187/2046 [14:20<11:26, 1.25it/s]
Training 2/3 epoch (loss 0.5083): 58%|ββββββ | 1188/2046 [14:20<12:20, 1.16it/s]
Training 2/3 epoch (loss 0.4131): 58%|ββββββ | 1188/2046 [14:21<12:20, 1.16it/s]
Training 2/3 epoch (loss 0.4131): 58%|ββββββ | 1189/2046 [14:21<11:37, 1.23it/s]
Training 2/3 epoch (loss 0.5447): 58%|ββββββ | 1189/2046 [14:21<11:37, 1.23it/s]
Training 2/3 epoch (loss 0.5447): 58%|ββββββ | 1190/2046 [14:21<10:44, 1.33it/s]
Training 2/3 epoch (loss 0.5856): 58%|ββββββ | 1190/2046 [14:22<10:44, 1.33it/s]
Training 2/3 epoch (loss 0.5856): 58%|ββββββ | 1191/2046 [14:22<10:49, 1.32it/s]
Training 2/3 epoch (loss 0.4807): 58%|ββββββ | 1191/2046 [14:23<10:49, 1.32it/s]
Training 2/3 epoch (loss 0.4807): 58%|ββββββ | 1192/2046 [14:23<12:17, 1.16it/s]
Training 2/3 epoch (loss 0.4595): 58%|ββββββ | 1192/2046 [14:24<12:17, 1.16it/s]
Training 2/3 epoch (loss 0.4595): 58%|ββββββ | 1193/2046 [14:24<12:21, 1.15it/s]
Training 2/3 epoch (loss 0.4782): 58%|ββββββ | 1193/2046 [14:25<12:21, 1.15it/s]
Training 2/3 epoch (loss 0.4782): 58%|ββββββ | 1194/2046 [14:25<11:49, 1.20it/s]
Training 2/3 epoch (loss 0.6113): 58%|ββββββ | 1194/2046 [14:25<11:49, 1.20it/s]
Training 2/3 epoch (loss 0.6113): 58%|ββββββ | 1195/2046 [14:25<10:50, 1.31it/s]
Training 2/3 epoch (loss 0.6627): 58%|ββββββ | 1195/2046 [14:26<10:50, 1.31it/s]
Training 2/3 epoch (loss 0.6627): 58%|ββββββ | 1196/2046 [14:26<11:41, 1.21it/s]
Training 2/3 epoch (loss 0.4793): 58%|ββββββ | 1196/2046 [14:27<11:41, 1.21it/s]
Training 2/3 epoch (loss 0.4793): 59%|ββββββ | 1197/2046 [14:27<10:54, 1.30it/s]
Training 2/3 epoch (loss 0.4480): 59%|ββββββ | 1197/2046 [14:28<10:54, 1.30it/s]
Training 2/3 epoch (loss 0.4480): 59%|ββββββ | 1198/2046 [14:28<10:09, 1.39it/s]
Training 2/3 epoch (loss 0.3905): 59%|ββββββ | 1198/2046 [14:28<10:09, 1.39it/s]
Training 2/3 epoch (loss 0.3905): 59%|ββββββ | 1199/2046 [14:28<09:54, 1.42it/s]
Training 2/3 epoch (loss 0.5001): 59%|ββββββ | 1199/2046 [14:29<09:54, 1.42it/s]
Training 2/3 epoch (loss 0.5001): 59%|ββββββ | 1200/2046 [14:29<10:46, 1.31it/s]
Training 2/3 epoch (loss 0.4742): 59%|ββββββ | 1200/2046 [14:30<10:46, 1.31it/s]
Training 2/3 epoch (loss 0.4742): 59%|ββββββ | 1201/2046 [14:30<10:44, 1.31it/s]
Training 2/3 epoch (loss 0.5982): 59%|ββββββ | 1201/2046 [14:31<10:44, 1.31it/s]
Training 2/3 epoch (loss 0.5982): 59%|ββββββ | 1202/2046 [14:31<10:30, 1.34it/s]
Training 2/3 epoch (loss 0.4343): 59%|ββββββ | 1202/2046 [14:31<10:30, 1.34it/s]
Training 2/3 epoch (loss 0.4343): 59%|ββββββ | 1203/2046 [14:31<10:12, 1.38it/s]
Training 2/3 epoch (loss 0.6795): 59%|ββββββ | 1203/2046 [14:32<10:12, 1.38it/s]
Training 2/3 epoch (loss 0.6795): 59%|ββββββ | 1204/2046 [14:32<09:49, 1.43it/s]
Training 2/3 epoch (loss 0.4447): 59%|ββββββ | 1204/2046 [14:33<09:49, 1.43it/s]
Training 2/3 epoch (loss 0.4447): 59%|ββββββ | 1205/2046 [14:33<09:48, 1.43it/s]
Training 2/3 epoch (loss 0.4876): 59%|ββββββ | 1205/2046 [14:33<09:48, 1.43it/s]
Training 2/3 epoch (loss 0.4876): 59%|ββββββ | 1206/2046 [14:33<10:08, 1.38it/s]
Training 2/3 epoch (loss 0.5647): 59%|ββββββ | 1206/2046 [14:34<10:08, 1.38it/s]
Training 2/3 epoch (loss 0.5647): 59%|ββββββ | 1207/2046 [14:34<09:49, 1.42it/s]
Training 2/3 epoch (loss 0.5293): 59%|ββββββ | 1207/2046 [14:35<09:49, 1.42it/s]
Training 2/3 epoch (loss 0.5293): 59%|ββββββ | 1208/2046 [14:35<11:17, 1.24it/s]
Training 2/3 epoch (loss 0.5423): 59%|ββββββ | 1208/2046 [14:36<11:17, 1.24it/s]
Training 2/3 epoch (loss 0.5423): 59%|ββββββ | 1209/2046 [14:36<10:32, 1.32it/s]
Training 2/3 epoch (loss 0.5288): 59%|ββββββ | 1209/2046 [14:37<10:32, 1.32it/s]
Training 2/3 epoch (loss 0.5288): 59%|ββββββ | 1210/2046 [14:37<10:17, 1.35it/s]
Training 2/3 epoch (loss 0.4307): 59%|ββββββ | 1210/2046 [14:37<10:17, 1.35it/s]
Training 2/3 epoch (loss 0.4307): 59%|ββββββ | 1211/2046 [14:37<09:48, 1.42it/s]
Training 2/3 epoch (loss 0.5037): 59%|ββββββ | 1211/2046 [14:38<09:48, 1.42it/s]
Training 2/3 epoch (loss 0.5037): 59%|ββββββ | 1212/2046 [14:38<09:38, 1.44it/s]
Training 2/3 epoch (loss 0.4582): 59%|ββββββ | 1212/2046 [14:39<09:38, 1.44it/s]
Training 2/3 epoch (loss 0.4582): 59%|ββββββ | 1213/2046 [14:39<10:03, 1.38it/s]
Training 2/3 epoch (loss 0.5535): 59%|ββββββ | 1213/2046 [14:39<10:03, 1.38it/s]
Training 2/3 epoch (loss 0.5535): 59%|ββββββ | 1214/2046 [14:39<09:59, 1.39it/s]
Training 2/3 epoch (loss 0.4434): 59%|ββββββ | 1214/2046 [14:40<09:59, 1.39it/s]
Training 2/3 epoch (loss 0.4434): 59%|ββββββ | 1215/2046 [14:40<09:33, 1.45it/s]
Training 2/3 epoch (loss 0.4110): 59%|ββββββ | 1215/2046 [14:41<09:33, 1.45it/s]
Training 2/3 epoch (loss 0.4110): 59%|ββββββ | 1216/2046 [14:41<09:44, 1.42it/s]
Training 2/3 epoch (loss 0.5174): 59%|ββββββ | 1216/2046 [14:41<09:44, 1.42it/s]
Training 2/3 epoch (loss 0.5174): 59%|ββββββ | 1217/2046 [14:41<09:26, 1.46it/s]
Training 2/3 epoch (loss 0.3013): 59%|ββββββ | 1217/2046 [14:42<09:26, 1.46it/s]
Training 2/3 epoch (loss 0.3013): 60%|ββββββ | 1218/2046 [14:42<09:31, 1.45it/s]
Training 2/3 epoch (loss 0.4510): 60%|ββββββ | 1218/2046 [14:43<09:31, 1.45it/s]
Training 2/3 epoch (loss 0.4510): 60%|ββββββ | 1219/2046 [14:43<09:24, 1.46it/s]
Training 2/3 epoch (loss 0.3405): 60%|ββββββ | 1219/2046 [14:43<09:24, 1.46it/s]
Training 2/3 epoch (loss 0.3405): 60%|ββββββ | 1220/2046 [14:43<09:11, 1.50it/s]
Training 2/3 epoch (loss 0.5022): 60%|ββββββ | 1220/2046 [14:44<09:11, 1.50it/s]
Training 2/3 epoch (loss 0.5022): 60%|ββββββ | 1221/2046 [14:44<09:15, 1.49it/s]
Training 2/3 epoch (loss 0.4882): 60%|ββββββ | 1221/2046 [14:45<09:15, 1.49it/s]
Training 2/3 epoch (loss 0.4882): 60%|ββββββ | 1222/2046 [14:45<08:58, 1.53it/s]
Training 2/3 epoch (loss 0.4836): 60%|ββββββ | 1222/2046 [14:45<08:58, 1.53it/s]
Training 2/3 epoch (loss 0.4836): 60%|ββββββ | 1223/2046 [14:45<09:43, 1.41it/s]
Training 2/3 epoch (loss 0.5751): 60%|ββββββ | 1223/2046 [14:46<09:43, 1.41it/s]
Training 2/3 epoch (loss 0.5751): 60%|ββββββ | 1224/2046 [14:46<09:54, 1.38it/s]
Training 2/3 epoch (loss 0.5634): 60%|ββββββ | 1224/2046 [14:47<09:54, 1.38it/s]
Training 2/3 epoch (loss 0.5634): 60%|ββββββ | 1225/2046 [14:47<09:53, 1.38it/s]
Training 2/3 epoch (loss 0.4896): 60%|ββββββ | 1225/2046 [14:48<09:53, 1.38it/s]
Training 2/3 epoch (loss 0.4896): 60%|ββββββ | 1226/2046 [14:48<09:37, 1.42it/s]
Training 2/3 epoch (loss 0.4798): 60%|ββββββ | 1226/2046 [14:49<09:37, 1.42it/s]
Training 2/3 epoch (loss 0.4798): 60%|ββββββ | 1227/2046 [14:49<10:55, 1.25it/s]
Training 2/3 epoch (loss 0.5577): 60%|ββββββ | 1227/2046 [14:49<10:55, 1.25it/s]
Training 2/3 epoch (loss 0.5577): 60%|ββββββ | 1228/2046 [14:49<10:25, 1.31it/s]
Training 2/3 epoch (loss 0.4834): 60%|ββββββ | 1228/2046 [14:50<10:25, 1.31it/s]
Training 2/3 epoch (loss 0.4834): 60%|ββββββ | 1229/2046 [14:50<11:50, 1.15it/s]
Training 2/3 epoch (loss 0.4534): 60%|ββββββ | 1229/2046 [14:51<11:50, 1.15it/s]
Training 2/3 epoch (loss 0.4534): 60%|ββββββ | 1230/2046 [14:51<11:04, 1.23it/s]
Training 2/3 epoch (loss 0.5182): 60%|ββββββ | 1230/2046 [14:52<11:04, 1.23it/s]
Training 2/3 epoch (loss 0.5182): 60%|ββββββ | 1231/2046 [14:52<10:37, 1.28it/s]
Training 2/3 epoch (loss 0.5737): 60%|ββββββ | 1231/2046 [14:53<10:37, 1.28it/s]
Training 2/3 epoch (loss 0.5737): 60%|ββββββ | 1232/2046 [14:53<11:00, 1.23it/s]
Training 2/3 epoch (loss 0.4392): 60%|ββββββ | 1232/2046 [14:54<11:00, 1.23it/s]
Training 2/3 epoch (loss 0.4392): 60%|ββββββ | 1233/2046 [14:54<11:23, 1.19it/s]
Training 2/3 epoch (loss 0.4750): 60%|ββββββ | 1233/2046 [14:54<11:23, 1.19it/s]
Training 2/3 epoch (loss 0.4750): 60%|ββββββ | 1234/2046 [14:54<11:17, 1.20it/s]
Training 2/3 epoch (loss 0.4766): 60%|ββββββ | 1234/2046 [14:55<11:17, 1.20it/s]
Training 2/3 epoch (loss 0.4766): 60%|ββββββ | 1235/2046 [14:55<11:30, 1.17it/s]
Training 2/3 epoch (loss 0.5145): 60%|ββββββ | 1235/2046 [14:56<11:30, 1.17it/s]
Training 2/3 epoch (loss 0.5145): 60%|ββββββ | 1236/2046 [14:56<10:35, 1.27it/s]
Training 2/3 epoch (loss 0.4084): 60%|ββββββ | 1236/2046 [14:57<10:35, 1.27it/s]
Training 2/3 epoch (loss 0.4084): 60%|ββββββ | 1237/2046 [14:57<10:52, 1.24it/s]
Training 2/3 epoch (loss 0.5864): 60%|ββββββ | 1237/2046 [14:57<10:52, 1.24it/s]
Training 2/3 epoch (loss 0.5864): 61%|ββββββ | 1238/2046 [14:57<10:27, 1.29it/s]
Training 2/3 epoch (loss 0.5555): 61%|ββββββ | 1238/2046 [14:58<10:27, 1.29it/s]
Training 2/3 epoch (loss 0.5555): 61%|ββββββ | 1239/2046 [14:58<09:52, 1.36it/s]
Training 2/3 epoch (loss 0.5714): 61%|ββββββ | 1239/2046 [14:59<09:52, 1.36it/s]
Training 2/3 epoch (loss 0.5714): 61%|ββββββ | 1240/2046 [14:59<10:06, 1.33it/s]
Training 2/3 epoch (loss 0.4451): 61%|ββββββ | 1240/2046 [15:00<10:06, 1.33it/s]
Training 2/3 epoch (loss 0.4451): 61%|ββββββ | 1241/2046 [15:00<10:56, 1.23it/s]
Training 2/3 epoch (loss 0.5561): 61%|ββββββ | 1241/2046 [15:01<10:56, 1.23it/s]
Training 2/3 epoch (loss 0.5561): 61%|ββββββ | 1242/2046 [15:01<10:12, 1.31it/s]
Training 2/3 epoch (loss 0.3914): 61%|ββββββ | 1242/2046 [15:01<10:12, 1.31it/s]
Training 2/3 epoch (loss 0.3914): 61%|ββββββ | 1243/2046 [15:01<09:55, 1.35it/s]
Training 2/3 epoch (loss 0.3378): 61%|ββββββ | 1243/2046 [15:02<09:55, 1.35it/s]
Training 2/3 epoch (loss 0.3378): 61%|ββββββ | 1244/2046 [15:02<09:36, 1.39it/s]
Training 2/3 epoch (loss 0.4133): 61%|ββββββ | 1244/2046 [15:03<09:36, 1.39it/s]
Training 2/3 epoch (loss 0.4133): 61%|ββββββ | 1245/2046 [15:03<09:15, 1.44it/s]
Training 2/3 epoch (loss 0.3820): 61%|ββββββ | 1245/2046 [15:03<09:15, 1.44it/s]
Training 2/3 epoch (loss 0.3820): 61%|ββββββ | 1246/2046 [15:03<08:57, 1.49it/s]
Training 2/3 epoch (loss 0.4168): 61%|ββββββ | 1246/2046 [15:04<08:57, 1.49it/s]
Training 2/3 epoch (loss 0.4168): 61%|ββββββ | 1247/2046 [15:04<08:49, 1.51it/s]
Training 2/3 epoch (loss 0.4314): 61%|ββββββ | 1247/2046 [15:05<08:49, 1.51it/s]
Training 2/3 epoch (loss 0.4314): 61%|ββββββ | 1248/2046 [15:05<09:15, 1.44it/s]
Training 2/3 epoch (loss 0.4768): 61%|ββββββ | 1248/2046 [15:05<09:15, 1.44it/s]
Training 2/3 epoch (loss 0.4768): 61%|ββββββ | 1249/2046 [15:05<08:56, 1.49it/s]
Training 2/3 epoch (loss 0.4696): 61%|ββββββ | 1249/2046 [15:06<08:56, 1.49it/s]
Training 2/3 epoch (loss 0.4696): 61%|ββββββ | 1250/2046 [15:06<08:47, 1.51it/s]
Training 2/3 epoch (loss 0.5172): 61%|ββββββ | 1250/2046 [15:06<08:47, 1.51it/s]
Training 2/3 epoch (loss 0.5172): 61%|ββββββ | 1251/2046 [15:06<08:37, 1.54it/s]
Training 2/3 epoch (loss 0.5829): 61%|ββββββ | 1251/2046 [15:07<08:37, 1.54it/s]
Training 2/3 epoch (loss 0.5829): 61%|ββββββ | 1252/2046 [15:07<09:04, 1.46it/s]
Training 2/3 epoch (loss 0.5182): 61%|ββββββ | 1252/2046 [15:08<09:04, 1.46it/s]
Training 2/3 epoch (loss 0.5182): 61%|ββββββ | 1253/2046 [15:08<08:59, 1.47it/s]
Training 2/3 epoch (loss 0.5731): 61%|ββββββ | 1253/2046 [15:09<08:59, 1.47it/s]
Training 2/3 epoch (loss 0.5731): 61%|βββββββ | 1254/2046 [15:09<09:58, 1.32it/s]
Training 2/3 epoch (loss 0.4139): 61%|βββββββ | 1254/2046 [15:09<09:58, 1.32it/s]
Training 2/3 epoch (loss 0.4139): 61%|βββββββ | 1255/2046 [15:09<09:26, 1.40it/s]
Training 2/3 epoch (loss 0.4518): 61%|βββββββ | 1255/2046 [15:10<09:26, 1.40it/s]
Training 2/3 epoch (loss 0.4518): 61%|βββββββ | 1256/2046 [15:10<10:00, 1.32it/s]
Training 2/3 epoch (loss 0.4520): 61%|βββββββ | 1256/2046 [15:11<10:00, 1.32it/s]
Training 2/3 epoch (loss 0.4520): 61%|βββββββ | 1257/2046 [15:11<09:49, 1.34it/s]
Training 2/3 epoch (loss 0.5138): 61%|βββββββ | 1257/2046 [15:12<09:49, 1.34it/s]
Training 2/3 epoch (loss 0.5138): 61%|βββββββ | 1258/2046 [15:12<09:35, 1.37it/s]
Training 2/3 epoch (loss 0.5692): 61%|βββββββ | 1258/2046 [15:12<09:35, 1.37it/s]
Training 2/3 epoch (loss 0.5692): 62%|βββββββ | 1259/2046 [15:12<09:47, 1.34it/s]
Training 2/3 epoch (loss 0.5335): 62%|βββββββ | 1259/2046 [15:13<09:47, 1.34it/s]
Training 2/3 epoch (loss 0.5335): 62%|βββββββ | 1260/2046 [15:13<09:20, 1.40it/s]
Training 2/3 epoch (loss 0.4423): 62%|βββββββ | 1260/2046 [15:14<09:20, 1.40it/s]
Training 2/3 epoch (loss 0.4423): 62%|βββββββ | 1261/2046 [15:14<08:56, 1.46it/s]
Training 2/3 epoch (loss 0.5425): 62%|βββββββ | 1261/2046 [15:14<08:56, 1.46it/s]
Training 2/3 epoch (loss 0.5425): 62%|βββββββ | 1262/2046 [15:14<09:12, 1.42it/s]
Training 2/3 epoch (loss 0.4840): 62%|βββββββ | 1262/2046 [15:15<09:12, 1.42it/s]
Training 2/3 epoch (loss 0.4840): 62%|βββββββ | 1263/2046 [15:15<09:10, 1.42it/s]
Training 2/3 epoch (loss 0.4076): 62%|βββββββ | 1263/2046 [15:16<09:10, 1.42it/s]
Training 2/3 epoch (loss 0.4076): 62%|βββββββ | 1264/2046 [15:16<10:03, 1.30it/s]
Training 2/3 epoch (loss 0.4653): 62%|βββββββ | 1264/2046 [15:17<10:03, 1.30it/s]
Training 2/3 epoch (loss 0.4653): 62%|βββββββ | 1265/2046 [15:17<09:25, 1.38it/s]
Training 2/3 epoch (loss 0.4780): 62%|βββββββ | 1265/2046 [15:17<09:25, 1.38it/s]
Training 2/3 epoch (loss 0.4780): 62%|βββββββ | 1266/2046 [15:17<09:00, 1.44it/s]
Training 2/3 epoch (loss 0.3801): 62%|βββββββ | 1266/2046 [15:18<09:00, 1.44it/s]
Training 2/3 epoch (loss 0.3801): 62%|βββββββ | 1267/2046 [15:18<10:01, 1.29it/s]
Training 2/3 epoch (loss 0.5112): 62%|βββββββ | 1267/2046 [15:19<10:01, 1.29it/s]
Training 2/3 epoch (loss 0.5112): 62%|βββββββ | 1268/2046 [15:19<10:49, 1.20it/s]
Training 2/3 epoch (loss 0.6312): 62%|βββββββ | 1268/2046 [15:20<10:49, 1.20it/s]
Training 2/3 epoch (loss 0.6312): 62%|βββββββ | 1269/2046 [15:20<10:22, 1.25it/s]
Training 2/3 epoch (loss 0.5197): 62%|βββββββ | 1269/2046 [15:21<10:22, 1.25it/s]
Training 2/3 epoch (loss 0.5197): 62%|βββββββ | 1270/2046 [15:21<10:08, 1.28it/s]
Training 2/3 epoch (loss 0.4278): 62%|βββββββ | 1270/2046 [15:21<10:08, 1.28it/s]
Training 2/3 epoch (loss 0.4278): 62%|βββββββ | 1271/2046 [15:21<09:35, 1.35it/s]
Training 2/3 epoch (loss 0.6150): 62%|βββββββ | 1271/2046 [15:22<09:35, 1.35it/s]
Training 2/3 epoch (loss 0.6150): 62%|βββββββ | 1272/2046 [15:22<09:33, 1.35it/s]
Training 2/3 epoch (loss 0.5012): 62%|βββββββ | 1272/2046 [15:23<09:33, 1.35it/s]
Training 2/3 epoch (loss 0.5012): 62%|βββββββ | 1273/2046 [15:23<09:19, 1.38it/s]
Training 2/3 epoch (loss 0.4887): 62%|βββββββ | 1273/2046 [15:23<09:19, 1.38it/s]
Training 2/3 epoch (loss 0.4887): 62%|βββββββ | 1274/2046 [15:23<08:50, 1.45it/s]
Training 2/3 epoch (loss 0.4422): 62%|βββββββ | 1274/2046 [15:24<08:50, 1.45it/s]
Training 2/3 epoch (loss 0.4422): 62%|βββββββ | 1275/2046 [15:24<09:14, 1.39it/s]
Training 2/3 epoch (loss 0.3503): 62%|βββββββ | 1275/2046 [15:25<09:14, 1.39it/s]
Training 2/3 epoch (loss 0.3503): 62%|βββββββ | 1276/2046 [15:25<09:09, 1.40it/s]
Training 2/3 epoch (loss 0.3741): 62%|βββββββ | 1276/2046 [15:26<09:09, 1.40it/s]
Training 2/3 epoch (loss 0.3741): 62%|βββββββ | 1277/2046 [15:26<09:02, 1.42it/s]
Training 2/3 epoch (loss 0.4139): 62%|βββββββ | 1277/2046 [15:26<09:02, 1.42it/s]
Training 2/3 epoch (loss 0.4139): 62%|βββββββ | 1278/2046 [15:26<08:58, 1.43it/s]
Training 2/3 epoch (loss 0.4356): 62%|βββββββ | 1278/2046 [15:27<08:58, 1.43it/s]
Training 2/3 epoch (loss 0.4356): 63%|βββββββ | 1279/2046 [15:27<09:01, 1.42it/s]
Training 2/3 epoch (loss 0.3832): 63%|βββββββ | 1279/2046 [15:28<09:01, 1.42it/s]
Training 2/3 epoch (loss 0.3832): 63%|βββββββ | 1280/2046 [15:28<09:38, 1.32it/s]
Training 2/3 epoch (loss 0.4994): 63%|βββββββ | 1280/2046 [15:29<09:38, 1.32it/s]
Training 2/3 epoch (loss 0.4994): 63%|βββββββ | 1281/2046 [15:29<09:48, 1.30it/s]
Training 2/3 epoch (loss 0.5061): 63%|βββββββ | 1281/2046 [15:29<09:48, 1.30it/s]
Training 2/3 epoch (loss 0.5061): 63%|βββββββ | 1282/2046 [15:29<09:19, 1.37it/s]
Training 2/3 epoch (loss 0.4229): 63%|βββββββ | 1282/2046 [15:30<09:19, 1.37it/s]
Training 2/3 epoch (loss 0.4229): 63%|βββββββ | 1283/2046 [15:30<09:04, 1.40it/s]
Training 2/3 epoch (loss 0.5792): 63%|βββββββ | 1283/2046 [15:31<09:04, 1.40it/s]
Training 2/3 epoch (loss 0.5792): 63%|βββββββ | 1284/2046 [15:31<08:51, 1.43it/s]
Training 2/3 epoch (loss 0.4296): 63%|βββββββ | 1284/2046 [15:32<08:51, 1.43it/s]
Training 2/3 epoch (loss 0.4296): 63%|βββββββ | 1285/2046 [15:32<09:42, 1.31it/s]
Training 2/3 epoch (loss 0.4783): 63%|βββββββ | 1285/2046 [15:32<09:42, 1.31it/s]
Training 2/3 epoch (loss 0.4783): 63%|βββββββ | 1286/2046 [15:32<09:29, 1.33it/s]
Training 2/3 epoch (loss 0.3948): 63%|βββββββ | 1286/2046 [15:33<09:29, 1.33it/s]
Training 2/3 epoch (loss 0.3948): 63%|βββββββ | 1287/2046 [15:33<09:00, 1.40it/s]
Training 2/3 epoch (loss 0.4326): 63%|βββββββ | 1287/2046 [15:34<09:00, 1.40it/s]
Training 2/3 epoch (loss 0.4326): 63%|βββββββ | 1288/2046 [15:34<09:39, 1.31it/s]
Training 2/3 epoch (loss 0.4893): 63%|βββββββ | 1288/2046 [15:34<09:39, 1.31it/s]
Training 2/3 epoch (loss 0.4893): 63%|βββββββ | 1289/2046 [15:34<09:10, 1.37it/s]
Training 2/3 epoch (loss 0.4783): 63%|βββββββ | 1289/2046 [15:35<09:10, 1.37it/s]
Training 2/3 epoch (loss 0.4783): 63%|βββββββ | 1290/2046 [15:35<08:53, 1.42it/s]
Training 2/3 epoch (loss 0.3582): 63%|βββββββ | 1290/2046 [15:36<08:53, 1.42it/s]
Training 2/3 epoch (loss 0.3582): 63%|βββββββ | 1291/2046 [15:36<09:09, 1.37it/s]
Training 2/3 epoch (loss 0.5672): 63%|βββββββ | 1291/2046 [15:37<09:09, 1.37it/s]
Training 2/3 epoch (loss 0.5672): 63%|βββββββ | 1292/2046 [15:37<09:05, 1.38it/s]
Training 2/3 epoch (loss 0.5045): 63%|βββββββ | 1292/2046 [15:37<09:05, 1.38it/s]
Training 2/3 epoch (loss 0.5045): 63%|βββββββ | 1293/2046 [15:37<08:39, 1.45it/s]
Training 2/3 epoch (loss 0.4254): 63%|βββββββ | 1293/2046 [15:38<08:39, 1.45it/s]
Training 2/3 epoch (loss 0.4254): 63%|βββββββ | 1294/2046 [15:38<08:21, 1.50it/s]
Training 2/3 epoch (loss 0.3811): 63%|βββββββ | 1294/2046 [15:38<08:21, 1.50it/s]
Training 2/3 epoch (loss 0.3811): 63%|βββββββ | 1295/2046 [15:39<08:22, 1.50it/s]
Training 2/3 epoch (loss 0.3936): 63%|βββββββ | 1295/2046 [15:39<08:22, 1.50it/s]
Training 2/3 epoch (loss 0.3936): 63%|βββββββ | 1296/2046 [15:39<08:39, 1.44it/s]
Training 2/3 epoch (loss 0.4476): 63%|βββββββ | 1296/2046 [15:40<08:39, 1.44it/s]
Training 2/3 epoch (loss 0.4476): 63%|βββββββ | 1297/2046 [15:40<08:26, 1.48it/s]
Training 2/3 epoch (loss 0.4124): 63%|βββββββ | 1297/2046 [15:41<08:26, 1.48it/s]
Training 2/3 epoch (loss 0.4124): 63%|βββββββ | 1298/2046 [15:41<08:35, 1.45it/s]
Training 2/3 epoch (loss 0.4562): 63%|βββββββ | 1298/2046 [15:41<08:35, 1.45it/s]
Training 2/3 epoch (loss 0.4562): 63%|βββββββ | 1299/2046 [15:41<08:12, 1.52it/s]
Training 2/3 epoch (loss 0.4403): 63%|βββββββ | 1299/2046 [15:42<08:12, 1.52it/s]
Training 2/3 epoch (loss 0.4403): 64%|βββββββ | 1300/2046 [15:42<08:09, 1.52it/s]
Training 2/3 epoch (loss 0.3817): 64%|βββββββ | 1300/2046 [15:42<08:09, 1.52it/s]
Training 2/3 epoch (loss 0.3817): 64%|βββββββ | 1301/2046 [15:42<07:58, 1.56it/s]
Training 2/3 epoch (loss 0.3765): 64%|βββββββ | 1301/2046 [15:43<07:58, 1.56it/s]
Training 2/3 epoch (loss 0.3765): 64%|βββββββ | 1302/2046 [15:43<08:04, 1.54it/s]
Training 2/3 epoch (loss 0.5555): 64%|βββββββ | 1302/2046 [15:44<08:04, 1.54it/s]
Training 2/3 epoch (loss 0.5555): 64%|βββββββ | 1303/2046 [15:44<07:54, 1.57it/s]
Training 2/3 epoch (loss 0.4047): 64%|βββββββ | 1303/2046 [15:45<07:54, 1.57it/s]
Training 2/3 epoch (loss 0.4047): 64%|βββββββ | 1304/2046 [15:45<08:24, 1.47it/s]
Training 2/3 epoch (loss 0.3407): 64%|βββββββ | 1304/2046 [15:45<08:24, 1.47it/s]
Training 2/3 epoch (loss 0.3407): 64%|βββββββ | 1305/2046 [15:45<08:34, 1.44it/s]
Training 2/3 epoch (loss 0.4801): 64%|βββββββ | 1305/2046 [15:46<08:34, 1.44it/s]
Training 2/3 epoch (loss 0.4801): 64%|βββββββ | 1306/2046 [15:46<08:33, 1.44it/s]
Training 2/3 epoch (loss 0.4592): 64%|βββββββ | 1306/2046 [15:47<08:33, 1.44it/s]
Training 2/3 epoch (loss 0.4592): 64%|βββββββ | 1307/2046 [15:47<08:14, 1.50it/s]
Training 2/3 epoch (loss 0.4148): 64%|βββββββ | 1307/2046 [15:47<08:14, 1.50it/s]
Training 2/3 epoch (loss 0.4148): 64%|βββββββ | 1308/2046 [15:47<09:05, 1.35it/s]
Training 2/3 epoch (loss 0.5147): 64%|βββββββ | 1308/2046 [15:48<09:05, 1.35it/s]
Training 2/3 epoch (loss 0.5147): 64%|βββββββ | 1309/2046 [15:48<09:25, 1.30it/s]
Training 2/3 epoch (loss 0.4547): 64%|βββββββ | 1309/2046 [15:49<09:25, 1.30it/s]
Training 2/3 epoch (loss 0.4547): 64%|βββββββ | 1310/2046 [15:49<09:59, 1.23it/s]
Training 2/3 epoch (loss 0.4193): 64%|βββββββ | 1310/2046 [15:50<09:59, 1.23it/s]
Training 2/3 epoch (loss 0.4193): 64%|βββββββ | 1311/2046 [15:50<09:53, 1.24it/s]
Training 2/3 epoch (loss 0.4167): 64%|βββββββ | 1311/2046 [15:51<09:53, 1.24it/s]
Training 2/3 epoch (loss 0.4167): 64%|βββββββ | 1312/2046 [15:51<09:51, 1.24it/s]
Training 2/3 epoch (loss 0.4178): 64%|βββββββ | 1312/2046 [15:51<09:51, 1.24it/s]
Training 2/3 epoch (loss 0.4178): 64%|βββββββ | 1313/2046 [15:51<09:14, 1.32it/s]
Training 2/3 epoch (loss 0.4029): 64%|βββββββ | 1313/2046 [15:52<09:14, 1.32it/s]
Training 2/3 epoch (loss 0.4029): 64%|βββββββ | 1314/2046 [15:52<08:44, 1.40it/s]
Training 2/3 epoch (loss 0.5749): 64%|βββββββ | 1314/2046 [15:53<08:44, 1.40it/s]
Training 2/3 epoch (loss 0.5749): 64%|βββββββ | 1315/2046 [15:53<08:29, 1.43it/s]
Training 2/3 epoch (loss 0.4231): 64%|βββββββ | 1315/2046 [15:53<08:29, 1.43it/s]
Training 2/3 epoch (loss 0.4231): 64%|βββββββ | 1316/2046 [15:53<08:26, 1.44it/s]
Training 2/3 epoch (loss 0.5115): 64%|βββββββ | 1316/2046 [15:54<08:26, 1.44it/s]
Training 2/3 epoch (loss 0.5115): 64%|βββββββ | 1317/2046 [15:54<08:51, 1.37it/s]
Training 2/3 epoch (loss 0.4759): 64%|βββββββ | 1317/2046 [15:55<08:51, 1.37it/s]
Training 2/3 epoch (loss 0.4759): 64%|βββββββ | 1318/2046 [15:55<08:42, 1.39it/s]
Training 2/3 epoch (loss 0.3609): 64%|βββββββ | 1318/2046 [15:56<08:42, 1.39it/s]
Training 2/3 epoch (loss 0.3609): 64%|βββββββ | 1319/2046 [15:56<08:29, 1.43it/s]
Training 2/3 epoch (loss 0.4957): 64%|βββββββ | 1319/2046 [15:56<08:29, 1.43it/s]
Training 2/3 epoch (loss 0.4957): 65%|βββββββ | 1320/2046 [15:56<08:55, 1.35it/s]
Training 2/3 epoch (loss 0.5276): 65%|βββββββ | 1320/2046 [15:57<08:55, 1.35it/s]
Training 2/3 epoch (loss 0.5276): 65%|βββββββ | 1321/2046 [15:57<08:47, 1.37it/s]
Training 2/3 epoch (loss 0.5439): 65%|βββββββ | 1321/2046 [15:58<08:47, 1.37it/s]
Training 2/3 epoch (loss 0.5439): 65%|βββββββ | 1322/2046 [15:58<08:50, 1.37it/s]
Training 2/3 epoch (loss 0.4319): 65%|βββββββ | 1322/2046 [15:59<08:50, 1.37it/s]
Training 2/3 epoch (loss 0.4319): 65%|βββββββ | 1323/2046 [15:59<08:34, 1.40it/s]
Training 2/3 epoch (loss 0.4436): 65%|βββββββ | 1323/2046 [15:59<08:34, 1.40it/s]
Training 2/3 epoch (loss 0.4436): 65%|βββββββ | 1324/2046 [15:59<08:26, 1.42it/s]
Training 2/3 epoch (loss 0.5063): 65%|βββββββ | 1324/2046 [16:00<08:26, 1.42it/s]
Training 2/3 epoch (loss 0.5063): 65%|βββββββ | 1325/2046 [16:00<08:31, 1.41it/s]
Training 2/3 epoch (loss 0.4217): 65%|βββββββ | 1325/2046 [16:01<08:31, 1.41it/s]
Training 2/3 epoch (loss 0.4217): 65%|βββββββ | 1326/2046 [16:01<08:18, 1.44it/s]
Training 2/3 epoch (loss 0.5651): 65%|βββββββ | 1326/2046 [16:01<08:18, 1.44it/s]
Training 2/3 epoch (loss 0.5651): 65%|βββββββ | 1327/2046 [16:01<08:05, 1.48it/s]
Training 2/3 epoch (loss 0.3914): 65%|βββββββ | 1327/2046 [16:02<08:05, 1.48it/s]
Training 2/3 epoch (loss 0.3914): 65%|βββββββ | 1328/2046 [16:02<08:54, 1.34it/s]
Training 2/3 epoch (loss 0.3916): 65%|βββββββ | 1328/2046 [16:03<08:54, 1.34it/s]
Training 2/3 epoch (loss 0.3916): 65%|βββββββ | 1329/2046 [16:03<08:30, 1.41it/s]
Training 2/3 epoch (loss 0.4645): 65%|βββββββ | 1329/2046 [16:04<08:30, 1.41it/s]
Training 2/3 epoch (loss 0.4645): 65%|βββββββ | 1330/2046 [16:04<08:44, 1.36it/s]
Training 2/3 epoch (loss 0.5017): 65%|βββββββ | 1330/2046 [16:04<08:44, 1.36it/s]
Training 2/3 epoch (loss 0.5017): 65%|βββββββ | 1331/2046 [16:04<08:47, 1.36it/s]
Training 2/3 epoch (loss 0.5828): 65%|βββββββ | 1331/2046 [16:05<08:47, 1.36it/s]
Training 2/3 epoch (loss 0.5828): 65%|βββββββ | 1332/2046 [16:05<08:22, 1.42it/s]
Training 2/3 epoch (loss 0.5591): 65%|βββββββ | 1332/2046 [16:06<08:22, 1.42it/s]
Training 2/3 epoch (loss 0.5591): 65%|βββββββ | 1333/2046 [16:06<08:05, 1.47it/s]
Training 2/3 epoch (loss 0.4966): 65%|βββββββ | 1333/2046 [16:06<08:05, 1.47it/s]
Training 2/3 epoch (loss 0.4966): 65%|βββββββ | 1334/2046 [16:06<08:27, 1.40it/s]
Training 2/3 epoch (loss 0.5149): 65%|βββββββ | 1334/2046 [16:07<08:27, 1.40it/s]
Training 2/3 epoch (loss 0.5149): 65%|βββββββ | 1335/2046 [16:07<08:07, 1.46it/s]
Training 2/3 epoch (loss 0.4821): 65%|βββββββ | 1335/2046 [16:08<08:07, 1.46it/s]
Training 2/3 epoch (loss 0.4821): 65%|βββββββ | 1336/2046 [16:08<08:32, 1.38it/s]
Training 2/3 epoch (loss 0.3862): 65%|βββββββ | 1336/2046 [16:08<08:32, 1.38it/s]
Training 2/3 epoch (loss 0.3862): 65%|βββββββ | 1337/2046 [16:08<08:10, 1.44it/s]
Training 2/3 epoch (loss 0.4043): 65%|βββββββ | 1337/2046 [16:09<08:10, 1.44it/s]
Training 2/3 epoch (loss 0.4043): 65%|βββββββ | 1338/2046 [16:09<08:22, 1.41it/s]
Training 2/3 epoch (loss 0.4436): 65%|βββββββ | 1338/2046 [16:10<08:22, 1.41it/s]
Training 2/3 epoch (loss 0.4436): 65%|βββββββ | 1339/2046 [16:10<08:05, 1.46it/s]
Training 2/3 epoch (loss 0.4659): 65%|βββββββ | 1339/2046 [16:10<08:05, 1.46it/s]
Training 2/3 epoch (loss 0.4659): 65%|βββββββ | 1340/2046 [16:10<07:58, 1.47it/s]
Training 2/3 epoch (loss 0.3898): 65%|βββββββ | 1340/2046 [16:11<07:58, 1.47it/s]
Training 2/3 epoch (loss 0.3898): 66%|βββββββ | 1341/2046 [16:11<08:03, 1.46it/s]
Training 2/3 epoch (loss 0.5694): 66%|βββββββ | 1341/2046 [16:12<08:03, 1.46it/s]
Training 2/3 epoch (loss 0.5694): 66%|βββββββ | 1342/2046 [16:12<08:07, 1.44it/s]
Training 2/3 epoch (loss 0.4969): 66%|βββββββ | 1342/2046 [16:12<08:07, 1.44it/s]
Training 2/3 epoch (loss 0.4969): 66%|βββββββ | 1343/2046 [16:12<07:55, 1.48it/s]
Training 2/3 epoch (loss 0.5458): 66%|βββββββ | 1343/2046 [16:13<07:55, 1.48it/s]
Training 2/3 epoch (loss 0.5458): 66%|βββββββ | 1344/2046 [16:13<08:31, 1.37it/s]
Training 2/3 epoch (loss 0.4812): 66%|βββββββ | 1344/2046 [16:14<08:31, 1.37it/s]
Training 2/3 epoch (loss 0.4812): 66%|βββββββ | 1345/2046 [16:14<08:13, 1.42it/s]
Training 2/3 epoch (loss 0.4751): 66%|βββββββ | 1345/2046 [16:15<08:13, 1.42it/s]
Training 2/3 epoch (loss 0.4751): 66%|βββββββ | 1346/2046 [16:15<07:59, 1.46it/s]
Training 2/3 epoch (loss 0.5500): 66%|βββββββ | 1346/2046 [16:15<07:59, 1.46it/s]
Training 2/3 epoch (loss 0.5500): 66%|βββββββ | 1347/2046 [16:15<07:45, 1.50it/s]
Training 2/3 epoch (loss 0.4929): 66%|βββββββ | 1347/2046 [16:16<07:45, 1.50it/s]
Training 2/3 epoch (loss 0.4929): 66%|βββββββ | 1348/2046 [16:16<07:49, 1.49it/s]
Training 2/3 epoch (loss 0.3785): 66%|βββββββ | 1348/2046 [16:17<07:49, 1.49it/s]
Training 2/3 epoch (loss 0.3785): 66%|βββββββ | 1349/2046 [16:17<07:41, 1.51it/s]
Training 2/3 epoch (loss 0.4895): 66%|βββββββ | 1349/2046 [16:17<07:41, 1.51it/s]
Training 2/3 epoch (loss 0.4895): 66%|βββββββ | 1350/2046 [16:17<07:29, 1.55it/s]
Training 2/3 epoch (loss 0.4689): 66%|βββββββ | 1350/2046 [16:18<07:29, 1.55it/s]
Training 2/3 epoch (loss 0.4689): 66%|βββββββ | 1351/2046 [16:18<07:26, 1.56it/s]
Training 2/3 epoch (loss 0.4665): 66%|βββββββ | 1351/2046 [16:19<07:26, 1.56it/s]
Training 2/3 epoch (loss 0.4665): 66%|βββββββ | 1352/2046 [16:19<09:24, 1.23it/s]
Training 2/3 epoch (loss 0.4111): 66%|βββββββ | 1352/2046 [16:20<09:24, 1.23it/s]
Training 2/3 epoch (loss 0.4111): 66%|βββββββ | 1353/2046 [16:20<09:20, 1.24it/s]
Training 2/3 epoch (loss 0.4696): 66%|βββββββ | 1353/2046 [16:21<09:20, 1.24it/s]
Training 2/3 epoch (loss 0.4696): 66%|βββββββ | 1354/2046 [16:21<09:37, 1.20it/s]
Training 2/3 epoch (loss 0.4946): 66%|βββββββ | 1354/2046 [16:21<09:37, 1.20it/s]
Training 2/3 epoch (loss 0.4946): 66%|βββββββ | 1355/2046 [16:21<09:21, 1.23it/s]
Training 2/3 epoch (loss 0.4383): 66%|βββββββ | 1355/2046 [16:22<09:21, 1.23it/s]
Training 2/3 epoch (loss 0.4383): 66%|βββββββ | 1356/2046 [16:22<08:41, 1.32it/s]
Training 2/3 epoch (loss 0.4573): 66%|βββββββ | 1356/2046 [16:23<08:41, 1.32it/s]
Training 2/3 epoch (loss 0.4573): 66%|βββββββ | 1357/2046 [16:23<08:52, 1.29it/s]
Training 2/3 epoch (loss 0.4991): 66%|βββββββ | 1357/2046 [16:24<08:52, 1.29it/s]
Training 2/3 epoch (loss 0.4991): 66%|βββββββ | 1358/2046 [16:24<09:11, 1.25it/s]
Training 2/3 epoch (loss 0.4117): 66%|βββββββ | 1358/2046 [16:25<09:11, 1.25it/s]
Training 2/3 epoch (loss 0.4117): 66%|βββββββ | 1359/2046 [16:25<10:00, 1.14it/s]
Training 2/3 epoch (loss 0.4321): 66%|βββββββ | 1359/2046 [16:26<10:00, 1.14it/s]
Training 2/3 epoch (loss 0.4321): 66%|βββββββ | 1360/2046 [16:26<10:17, 1.11it/s]
Training 2/3 epoch (loss 0.4330): 66%|βββββββ | 1360/2046 [16:26<10:17, 1.11it/s]
Training 2/3 epoch (loss 0.4330): 67%|βββββββ | 1361/2046 [16:26<09:20, 1.22it/s]
Training 2/3 epoch (loss 0.4206): 67%|βββββββ | 1361/2046 [16:27<09:20, 1.22it/s]
Training 2/3 epoch (loss 0.4206): 67%|βββββββ | 1362/2046 [16:27<09:19, 1.22it/s]
Training 2/3 epoch (loss 0.4784): 67%|βββββββ | 1362/2046 [16:28<09:19, 1.22it/s]
Training 2/3 epoch (loss 0.4784): 67%|βββββββ | 1363/2046 [16:28<09:18, 1.22it/s]
Training 2/3 epoch (loss 0.2373): 67%|βββββββ | 1363/2046 [16:29<09:18, 1.22it/s]
Training 2/3 epoch (loss 0.2373): 67%|βββββββ | 1364/2046 [16:29<08:49, 1.29it/s]
Training 3/3 epoch (loss 0.4011): 67%|βββββββ | 1364/2046 [16:29<08:49, 1.29it/s]
Training 3/3 epoch (loss 0.4011): 67%|βββββββ | 1365/2046 [16:29<08:35, 1.32it/s]
Training 3/3 epoch (loss 0.5021): 67%|βββββββ | 1365/2046 [16:30<08:35, 1.32it/s]
Training 3/3 epoch (loss 0.5021): 67%|βββββββ | 1366/2046 [16:30<08:11, 1.38it/s]
Training 3/3 epoch (loss 0.6053): 67%|βββββββ | 1366/2046 [16:31<08:11, 1.38it/s]
Training 3/3 epoch (loss 0.6053): 67%|βββββββ | 1367/2046 [16:31<07:56, 1.42it/s]
Training 3/3 epoch (loss 0.4862): 67%|βββββββ | 1367/2046 [16:32<07:56, 1.42it/s]
Training 3/3 epoch (loss 0.4862): 67%|βββββββ | 1368/2046 [16:32<08:31, 1.33it/s]
Training 3/3 epoch (loss 0.5182): 67%|βββββββ | 1368/2046 [16:32<08:31, 1.33it/s]
Training 3/3 epoch (loss 0.5182): 67%|βββββββ | 1369/2046 [16:32<08:08, 1.39it/s]
Training 3/3 epoch (loss 0.5395): 67%|βββββββ | 1369/2046 [16:33<08:08, 1.39it/s]
Training 3/3 epoch (loss 0.5395): 67%|βββββββ | 1370/2046 [16:33<08:18, 1.36it/s]
Training 3/3 epoch (loss 0.5621): 67%|βββββββ | 1370/2046 [16:34<08:18, 1.36it/s]
Training 3/3 epoch (loss 0.5621): 67%|βββββββ | 1371/2046 [16:34<07:52, 1.43it/s]
Training 3/3 epoch (loss 0.4837): 67%|βββββββ | 1371/2046 [16:34<07:52, 1.43it/s]
Training 3/3 epoch (loss 0.4837): 67%|βββββββ | 1372/2046 [16:34<08:08, 1.38it/s]
Training 3/3 epoch (loss 0.4744): 67%|βββββββ | 1372/2046 [16:35<08:08, 1.38it/s]
Training 3/3 epoch (loss 0.4744): 67%|βββββββ | 1373/2046 [16:35<07:46, 1.44it/s]
Training 3/3 epoch (loss 0.5708): 67%|βββββββ | 1373/2046 [16:36<07:46, 1.44it/s]
Training 3/3 epoch (loss 0.5708): 67%|βββββββ | 1374/2046 [16:36<07:47, 1.44it/s]
Training 3/3 epoch (loss 0.3933): 67%|βββββββ | 1374/2046 [16:36<07:47, 1.44it/s]
Training 3/3 epoch (loss 0.3933): 67%|βββββββ | 1375/2046 [16:36<07:48, 1.43it/s]
Training 3/3 epoch (loss 0.4305): 67%|βββββββ | 1375/2046 [16:37<07:48, 1.43it/s]
Training 3/3 epoch (loss 0.4305): 67%|βββββββ | 1376/2046 [16:37<08:15, 1.35it/s]
Training 3/3 epoch (loss 0.4016): 67%|βββββββ | 1376/2046 [16:38<08:15, 1.35it/s]
Training 3/3 epoch (loss 0.4016): 67%|βββββββ | 1377/2046 [16:38<08:16, 1.35it/s]
Training 3/3 epoch (loss 0.3653): 67%|βββββββ | 1377/2046 [16:39<08:16, 1.35it/s]
Training 3/3 epoch (loss 0.3653): 67%|βββββββ | 1378/2046 [16:39<08:03, 1.38it/s]
Training 3/3 epoch (loss 0.4199): 67%|βββββββ | 1378/2046 [16:39<08:03, 1.38it/s]
Training 3/3 epoch (loss 0.4199): 67%|βββββββ | 1379/2046 [16:39<07:59, 1.39it/s]
Training 3/3 epoch (loss 0.5127): 67%|βββββββ | 1379/2046 [16:40<07:59, 1.39it/s]
Training 3/3 epoch (loss 0.5127): 67%|βββββββ | 1380/2046 [16:40<07:39, 1.45it/s]
Training 3/3 epoch (loss 0.4770): 67%|βββββββ | 1380/2046 [16:41<07:39, 1.45it/s]
Training 3/3 epoch (loss 0.4770): 67%|βββββββ | 1381/2046 [16:41<07:22, 1.50it/s]
Training 3/3 epoch (loss 0.4857): 67%|βββββββ | 1381/2046 [16:41<07:22, 1.50it/s]
Training 3/3 epoch (loss 0.4857): 68%|βββββββ | 1382/2046 [16:41<07:12, 1.53it/s]
Training 3/3 epoch (loss 0.4934): 68%|βββββββ | 1382/2046 [16:42<07:12, 1.53it/s]
Training 3/3 epoch (loss 0.4934): 68%|βββββββ | 1383/2046 [16:42<07:03, 1.56it/s]
Training 3/3 epoch (loss 0.5630): 68%|βββββββ | 1383/2046 [16:43<07:03, 1.56it/s]
Training 3/3 epoch (loss 0.5630): 68%|βββββββ | 1384/2046 [16:43<07:41, 1.43it/s]
Training 3/3 epoch (loss 0.4376): 68%|βββββββ | 1384/2046 [16:43<07:41, 1.43it/s]
Training 3/3 epoch (loss 0.4376): 68%|βββββββ | 1385/2046 [16:43<07:30, 1.47it/s]
Training 3/3 epoch (loss 0.4017): 68%|βββββββ | 1385/2046 [16:44<07:30, 1.47it/s]
Training 3/3 epoch (loss 0.4017): 68%|βββββββ | 1386/2046 [16:44<07:28, 1.47it/s]
Training 3/3 epoch (loss 0.4662): 68%|βββββββ | 1386/2046 [16:45<07:28, 1.47it/s]
Training 3/3 epoch (loss 0.4662): 68%|βββββββ | 1387/2046 [16:45<07:34, 1.45it/s]
Training 3/3 epoch (loss 0.5080): 68%|βββββββ | 1387/2046 [16:45<07:34, 1.45it/s]
Training 3/3 epoch (loss 0.5080): 68%|βββββββ | 1388/2046 [16:45<07:31, 1.46it/s]
Training 3/3 epoch (loss 0.4631): 68%|βββββββ | 1388/2046 [16:46<07:31, 1.46it/s]
Training 3/3 epoch (loss 0.4631): 68%|βββββββ | 1389/2046 [16:46<07:30, 1.46it/s]
Training 3/3 epoch (loss 0.4048): 68%|βββββββ | 1389/2046 [16:47<07:30, 1.46it/s]
Training 3/3 epoch (loss 0.4048): 68%|βββββββ | 1390/2046 [16:47<07:15, 1.51it/s]
Training 3/3 epoch (loss 0.4907): 68%|βββββββ | 1390/2046 [16:47<07:15, 1.51it/s]
Training 3/3 epoch (loss 0.4907): 68%|βββββββ | 1391/2046 [16:47<07:24, 1.47it/s]
Training 3/3 epoch (loss 0.3615): 68%|βββββββ | 1391/2046 [16:48<07:24, 1.47it/s]
Training 3/3 epoch (loss 0.3615): 68%|βββββββ | 1392/2046 [16:48<08:25, 1.29it/s]
Training 3/3 epoch (loss 0.3846): 68%|βββββββ | 1392/2046 [16:49<08:25, 1.29it/s]
Training 3/3 epoch (loss 0.3846): 68%|βββββββ | 1393/2046 [16:49<08:30, 1.28it/s]
Training 3/3 epoch (loss 0.4989): 68%|βββββββ | 1393/2046 [16:50<08:30, 1.28it/s]
Training 3/3 epoch (loss 0.4989): 68%|βββββββ | 1394/2046 [16:50<08:30, 1.28it/s]
Training 3/3 epoch (loss 0.4615): 68%|βββββββ | 1394/2046 [16:51<08:30, 1.28it/s]
Training 3/3 epoch (loss 0.4615): 68%|βββββββ | 1395/2046 [16:51<07:55, 1.37it/s]
Training 3/3 epoch (loss 0.5170): 68%|βββββββ | 1395/2046 [16:51<07:55, 1.37it/s]
Training 3/3 epoch (loss 0.5170): 68%|βββββββ | 1396/2046 [16:51<07:29, 1.44it/s]
Training 3/3 epoch (loss 0.3995): 68%|βββββββ | 1396/2046 [16:52<07:29, 1.44it/s]
Training 3/3 epoch (loss 0.3995): 68%|βββββββ | 1397/2046 [16:52<07:32, 1.43it/s]
Training 3/3 epoch (loss 0.3686): 68%|βββββββ | 1397/2046 [16:53<07:32, 1.43it/s]
Training 3/3 epoch (loss 0.3686): 68%|βββββββ | 1398/2046 [16:53<07:50, 1.38it/s]
Training 3/3 epoch (loss 0.3060): 68%|βββββββ | 1398/2046 [16:53<07:50, 1.38it/s]
Training 3/3 epoch (loss 0.3060): 68%|βββββββ | 1399/2046 [16:53<07:59, 1.35it/s]
Training 3/3 epoch (loss 0.4934): 68%|βββββββ | 1399/2046 [16:55<07:59, 1.35it/s]
Training 3/3 epoch (loss 0.4934): 68%|βββββββ | 1400/2046 [16:55<09:00, 1.19it/s]
Training 3/3 epoch (loss 0.3781): 68%|βββββββ | 1400/2046 [16:55<09:00, 1.19it/s]
Training 3/3 epoch (loss 0.3781): 68%|βββββββ | 1401/2046 [16:55<08:28, 1.27it/s]
Training 3/3 epoch (loss 0.3965): 68%|βββββββ | 1401/2046 [16:56<08:28, 1.27it/s]
Training 3/3 epoch (loss 0.3965): 69%|βββββββ | 1402/2046 [16:56<08:02, 1.33it/s]
Training 3/3 epoch (loss 0.3639): 69%|βββββββ | 1402/2046 [16:57<08:02, 1.33it/s]
Training 3/3 epoch (loss 0.3639): 69%|βββββββ | 1403/2046 [16:57<07:57, 1.35it/s]
Training 3/3 epoch (loss 0.4513): 69%|βββββββ | 1403/2046 [16:57<07:57, 1.35it/s]
Training 3/3 epoch (loss 0.4513): 69%|βββββββ | 1404/2046 [16:57<07:35, 1.41it/s]
Training 3/3 epoch (loss 0.4136): 69%|βββββββ | 1404/2046 [16:58<07:35, 1.41it/s]
Training 3/3 epoch (loss 0.4136): 69%|βββββββ | 1405/2046 [16:58<07:43, 1.38it/s]
Training 3/3 epoch (loss 0.4890): 69%|βββββββ | 1405/2046 [16:59<07:43, 1.38it/s]
Training 3/3 epoch (loss 0.4890): 69%|βββββββ | 1406/2046 [16:59<07:39, 1.39it/s]
Training 3/3 epoch (loss 0.4689): 69%|βββββββ | 1406/2046 [16:59<07:39, 1.39it/s]
Training 3/3 epoch (loss 0.4689): 69%|βββββββ | 1407/2046 [16:59<07:41, 1.39it/s]
Training 3/3 epoch (loss 0.4317): 69%|βββββββ | 1407/2046 [17:00<07:41, 1.39it/s]
Training 3/3 epoch (loss 0.4317): 69%|βββββββ | 1408/2046 [17:00<07:39, 1.39it/s]
Training 3/3 epoch (loss 0.3916): 69%|βββββββ | 1408/2046 [17:01<07:39, 1.39it/s]
Training 3/3 epoch (loss 0.3916): 69%|βββββββ | 1409/2046 [17:01<07:46, 1.37it/s]
Training 3/3 epoch (loss 0.4729): 69%|βββββββ | 1409/2046 [17:02<07:46, 1.37it/s]
Training 3/3 epoch (loss 0.4729): 69%|βββββββ | 1410/2046 [17:02<08:00, 1.32it/s]
Training 3/3 epoch (loss 0.3629): 69%|βββββββ | 1410/2046 [17:02<08:00, 1.32it/s]
Training 3/3 epoch (loss 0.3629): 69%|βββββββ | 1411/2046 [17:02<07:35, 1.39it/s]
Training 3/3 epoch (loss 0.3542): 69%|βββββββ | 1411/2046 [17:03<07:35, 1.39it/s]
Training 3/3 epoch (loss 0.3542): 69%|βββββββ | 1412/2046 [17:03<07:24, 1.43it/s]
Training 3/3 epoch (loss 0.3577): 69%|βββββββ | 1412/2046 [17:04<07:24, 1.43it/s]
Training 3/3 epoch (loss 0.3577): 69%|βββββββ | 1413/2046 [17:04<07:42, 1.37it/s]
Training 3/3 epoch (loss 0.3922): 69%|βββββββ | 1413/2046 [17:05<07:42, 1.37it/s]
Training 3/3 epoch (loss 0.3922): 69%|βββββββ | 1414/2046 [17:05<07:36, 1.38it/s]
Training 3/3 epoch (loss 0.3640): 69%|βββββββ | 1414/2046 [17:05<07:36, 1.38it/s]
Training 3/3 epoch (loss 0.3640): 69%|βββββββ | 1415/2046 [17:05<07:51, 1.34it/s]
Training 3/3 epoch (loss 0.3682): 69%|βββββββ | 1415/2046 [17:06<07:51, 1.34it/s]
Training 3/3 epoch (loss 0.3682): 69%|βββββββ | 1416/2046 [17:06<08:36, 1.22it/s]
Training 3/3 epoch (loss 0.3400): 69%|βββββββ | 1416/2046 [17:07<08:36, 1.22it/s]
Training 3/3 epoch (loss 0.3400): 69%|βββββββ | 1417/2046 [17:07<07:59, 1.31it/s]
Training 3/3 epoch (loss 0.2947): 69%|βββββββ | 1417/2046 [17:08<07:59, 1.31it/s]
Training 3/3 epoch (loss 0.2947): 69%|βββββββ | 1418/2046 [17:08<07:29, 1.40it/s]
Training 3/3 epoch (loss 0.3872): 69%|βββββββ | 1418/2046 [17:08<07:29, 1.40it/s]
Training 3/3 epoch (loss 0.3872): 69%|βββββββ | 1419/2046 [17:08<07:21, 1.42it/s]
Training 3/3 epoch (loss 0.3966): 69%|βββββββ | 1419/2046 [17:09<07:21, 1.42it/s]
Training 3/3 epoch (loss 0.3966): 69%|βββββββ | 1420/2046 [17:09<07:22, 1.42it/s]
Training 3/3 epoch (loss 0.4686): 69%|βββββββ | 1420/2046 [17:10<07:22, 1.42it/s]
Training 3/3 epoch (loss 0.4686): 69%|βββββββ | 1421/2046 [17:10<07:26, 1.40it/s]
Training 3/3 epoch (loss 0.5001): 69%|βββββββ | 1421/2046 [17:10<07:26, 1.40it/s]
Training 3/3 epoch (loss 0.5001): 70%|βββββββ | 1422/2046 [17:10<07:14, 1.44it/s]
Training 3/3 epoch (loss 0.3203): 70%|βββββββ | 1422/2046 [17:11<07:14, 1.44it/s]
Training 3/3 epoch (loss 0.3203): 70%|βββββββ | 1423/2046 [17:11<07:17, 1.42it/s]
Training 3/3 epoch (loss 0.2880): 70%|βββββββ | 1423/2046 [17:12<07:17, 1.42it/s]
Training 3/3 epoch (loss 0.2880): 70%|βββββββ | 1424/2046 [17:12<07:36, 1.36it/s]
Training 3/3 epoch (loss 0.3790): 70%|βββββββ | 1424/2046 [17:13<07:36, 1.36it/s]
Training 3/3 epoch (loss 0.3790): 70%|βββββββ | 1425/2046 [17:13<08:01, 1.29it/s]
Training 3/3 epoch (loss 0.3556): 70%|βββββββ | 1425/2046 [17:13<08:01, 1.29it/s]
Training 3/3 epoch (loss 0.3556): 70%|βββββββ | 1426/2046 [17:13<07:47, 1.33it/s]
Training 3/3 epoch (loss 0.4477): 70%|βββββββ | 1426/2046 [17:14<07:47, 1.33it/s]
Training 3/3 epoch (loss 0.4477): 70%|βββββββ | 1427/2046 [17:14<07:33, 1.36it/s]
Training 3/3 epoch (loss 0.4377): 70%|βββββββ | 1427/2046 [17:15<07:33, 1.36it/s]
Training 3/3 epoch (loss 0.4377): 70%|βββββββ | 1428/2046 [17:15<07:12, 1.43it/s]
Training 3/3 epoch (loss 0.4749): 70%|βββββββ | 1428/2046 [17:16<07:12, 1.43it/s]
Training 3/3 epoch (loss 0.4749): 70%|βββββββ | 1429/2046 [17:16<07:43, 1.33it/s]
Training 3/3 epoch (loss 0.4235): 70%|βββββββ | 1429/2046 [17:16<07:43, 1.33it/s]
Training 3/3 epoch (loss 0.4235): 70%|βββββββ | 1430/2046 [17:16<07:17, 1.41it/s]
Training 3/3 epoch (loss 0.3571): 70%|βββββββ | 1430/2046 [17:17<07:17, 1.41it/s]
Training 3/3 epoch (loss 0.3571): 70%|βββββββ | 1431/2046 [17:17<06:56, 1.47it/s]
Training 3/3 epoch (loss 0.3540): 70%|βββββββ | 1431/2046 [17:18<06:56, 1.47it/s]
Training 3/3 epoch (loss 0.3540): 70%|βββββββ | 1432/2046 [17:18<08:15, 1.24it/s]
Training 3/3 epoch (loss 0.3474): 70%|βββββββ | 1432/2046 [17:19<08:15, 1.24it/s]
Training 3/3 epoch (loss 0.3474): 70%|βββββββ | 1433/2046 [17:19<08:27, 1.21it/s]
Training 3/3 epoch (loss 0.4699): 70%|βββββββ | 1433/2046 [17:20<08:27, 1.21it/s]
Training 3/3 epoch (loss 0.4699): 70%|βββββββ | 1434/2046 [17:20<09:16, 1.10it/s]
Training 3/3 epoch (loss 0.2834): 70%|βββββββ | 1434/2046 [17:21<09:16, 1.10it/s]
Training 3/3 epoch (loss 0.2834): 70%|βββββββ | 1435/2046 [17:21<08:44, 1.16it/s]
Training 3/3 epoch (loss 0.2640): 70%|βββββββ | 1435/2046 [17:21<08:44, 1.16it/s]
Training 3/3 epoch (loss 0.2640): 70%|βββββββ | 1436/2046 [17:21<07:58, 1.27it/s]
Training 3/3 epoch (loss 0.2953): 70%|βββββββ | 1436/2046 [17:22<07:58, 1.27it/s]
Training 3/3 epoch (loss 0.2953): 70%|βββββββ | 1437/2046 [17:22<07:27, 1.36it/s]
Training 3/3 epoch (loss 0.2465): 70%|βββββββ | 1437/2046 [17:23<07:27, 1.36it/s]
Training 3/3 epoch (loss 0.2465): 70%|βββββββ | 1438/2046 [17:23<07:33, 1.34it/s]
Training 3/3 epoch (loss 0.2619): 70%|βββββββ | 1438/2046 [17:23<07:33, 1.34it/s]
Training 3/3 epoch (loss 0.2619): 70%|βββββββ | 1439/2046 [17:23<07:38, 1.32it/s]
Training 3/3 epoch (loss 0.2273): 70%|βββββββ | 1439/2046 [17:25<07:38, 1.32it/s]
Training 3/3 epoch (loss 0.2273): 70%|βββββββ | 1440/2046 [17:25<08:42, 1.16it/s]
Training 3/3 epoch (loss 0.2692): 70%|βββββββ | 1440/2046 [17:25<08:42, 1.16it/s]
Training 3/3 epoch (loss 0.2692): 70%|βββββββ | 1441/2046 [17:25<08:02, 1.25it/s]
Training 3/3 epoch (loss 0.3559): 70%|βββββββ | 1441/2046 [17:26<08:02, 1.25it/s]
Training 3/3 epoch (loss 0.3559): 70%|βββββββ | 1442/2046 [17:26<07:36, 1.32it/s]
Training 3/3 epoch (loss 0.1887): 70%|βββββββ | 1442/2046 [17:27<07:36, 1.32it/s]
Training 3/3 epoch (loss 0.1887): 71%|βββββββ | 1443/2046 [17:27<07:22, 1.36it/s]
Training 3/3 epoch (loss 0.2888): 71%|βββββββ | 1443/2046 [17:27<07:22, 1.36it/s]
Training 3/3 epoch (loss 0.2888): 71%|βββββββ | 1444/2046 [17:27<07:15, 1.38it/s]
Training 3/3 epoch (loss 0.2317): 71%|βββββββ | 1444/2046 [17:28<07:15, 1.38it/s]
Training 3/3 epoch (loss 0.2317): 71%|βββββββ | 1445/2046 [17:28<07:10, 1.40it/s]
Training 3/3 epoch (loss 0.3297): 71%|βββββββ | 1445/2046 [17:29<07:10, 1.40it/s]
Training 3/3 epoch (loss 0.3297): 71%|βββββββ | 1446/2046 [17:29<07:38, 1.31it/s]
Training 3/3 epoch (loss 0.1675): 71%|βββββββ | 1446/2046 [17:29<07:38, 1.31it/s]
Training 3/3 epoch (loss 0.1675): 71%|βββββββ | 1447/2046 [17:29<07:27, 1.34it/s]
Training 3/3 epoch (loss 0.2404): 71%|βββββββ | 1447/2046 [17:30<07:27, 1.34it/s]
Training 3/3 epoch (loss 0.2404): 71%|βββββββ | 1448/2046 [17:30<07:44, 1.29it/s]
Training 3/3 epoch (loss 0.2440): 71%|βββββββ | 1448/2046 [17:31<07:44, 1.29it/s]
Training 3/3 epoch (loss 0.2440): 71%|βββββββ | 1449/2046 [17:31<07:25, 1.34it/s]
Training 3/3 epoch (loss 0.2246): 71%|βββββββ | 1449/2046 [17:32<07:25, 1.34it/s]
Training 3/3 epoch (loss 0.2246): 71%|βββββββ | 1450/2046 [17:32<07:06, 1.40it/s]
Training 3/3 epoch (loss 0.3313): 71%|βββββββ | 1450/2046 [17:32<07:06, 1.40it/s]
Training 3/3 epoch (loss 0.3313): 71%|βββββββ | 1451/2046 [17:32<06:55, 1.43it/s]
Training 3/3 epoch (loss 0.3439): 71%|βββββββ | 1451/2046 [17:33<06:55, 1.43it/s]
Training 3/3 epoch (loss 0.3439): 71%|βββββββ | 1452/2046 [17:33<07:11, 1.38it/s]
Training 3/3 epoch (loss 0.2057): 71%|βββββββ | 1452/2046 [17:34<07:11, 1.38it/s]
Training 3/3 epoch (loss 0.2057): 71%|βββββββ | 1453/2046 [17:34<06:54, 1.43it/s]
Training 3/3 epoch (loss 0.2680): 71%|βββββββ | 1453/2046 [17:34<06:54, 1.43it/s]
Training 3/3 epoch (loss 0.2680): 71%|βββββββ | 1454/2046 [17:34<06:42, 1.47it/s]
Training 3/3 epoch (loss 0.2690): 71%|βββββββ | 1454/2046 [17:35<06:42, 1.47it/s]
Training 3/3 epoch (loss 0.2690): 71%|βββββββ | 1455/2046 [17:35<06:48, 1.45it/s]
Training 3/3 epoch (loss 0.2315): 71%|βββββββ | 1455/2046 [17:36<06:48, 1.45it/s]
Training 3/3 epoch (loss 0.2315): 71%|βββββββ | 1456/2046 [17:36<07:13, 1.36it/s]
Training 3/3 epoch (loss 0.2372): 71%|βββββββ | 1456/2046 [17:37<07:13, 1.36it/s]
Training 3/3 epoch (loss 0.2372): 71%|βββββββ | 1457/2046 [17:37<06:56, 1.41it/s]
Training 3/3 epoch (loss 0.2384): 71%|βββββββ | 1457/2046 [17:37<06:56, 1.41it/s]
Training 3/3 epoch (loss 0.2384): 71%|ββββββββ | 1458/2046 [17:37<06:51, 1.43it/s]
Training 3/3 epoch (loss 0.2032): 71%|ββββββββ | 1458/2046 [17:38<06:51, 1.43it/s]
Training 3/3 epoch (loss 0.2032): 71%|ββββββββ | 1459/2046 [17:38<06:45, 1.45it/s]
Training 3/3 epoch (loss 0.1839): 71%|ββββββββ | 1459/2046 [17:39<06:45, 1.45it/s]
Training 3/3 epoch (loss 0.1839): 71%|ββββββββ | 1460/2046 [17:39<06:35, 1.48it/s]
Training 3/3 epoch (loss 0.2046): 71%|ββββββββ | 1460/2046 [17:39<06:35, 1.48it/s]
Training 3/3 epoch (loss 0.2046): 71%|ββββββββ | 1461/2046 [17:39<06:26, 1.51it/s]
Training 3/3 epoch (loss 0.2108): 71%|ββββββββ | 1461/2046 [17:40<06:26, 1.51it/s]
Training 3/3 epoch (loss 0.2108): 71%|ββββββββ | 1462/2046 [17:40<06:20, 1.54it/s]
Training 3/3 epoch (loss 0.1667): 71%|ββββββββ | 1462/2046 [17:40<06:20, 1.54it/s]
Training 3/3 epoch (loss 0.1667): 72%|ββββββββ | 1463/2046 [17:40<06:12, 1.56it/s]
Training 3/3 epoch (loss 0.2375): 72%|ββββββββ | 1463/2046 [17:41<06:12, 1.56it/s]
Training 3/3 epoch (loss 0.2375): 72%|ββββββββ | 1464/2046 [17:41<06:37, 1.47it/s]
Training 3/3 epoch (loss 0.2132): 72%|ββββββββ | 1464/2046 [17:42<06:37, 1.47it/s]
Training 3/3 epoch (loss 0.2132): 72%|ββββββββ | 1465/2046 [17:42<06:25, 1.51it/s]
Training 3/3 epoch (loss 0.1951): 72%|ββββββββ | 1465/2046 [17:42<06:25, 1.51it/s]
Training 3/3 epoch (loss 0.1951): 72%|ββββββββ | 1466/2046 [17:42<06:16, 1.54it/s]
Training 3/3 epoch (loss 0.2072): 72%|ββββββββ | 1466/2046 [17:43<06:16, 1.54it/s]
Training 3/3 epoch (loss 0.2072): 72%|ββββββββ | 1467/2046 [17:43<06:15, 1.54it/s]
Training 3/3 epoch (loss 0.1835): 72%|ββββββββ | 1467/2046 [17:44<06:15, 1.54it/s]
Training 3/3 epoch (loss 0.1835): 72%|ββββββββ | 1468/2046 [17:44<06:12, 1.55it/s]
Training 3/3 epoch (loss 0.1834): 72%|ββββββββ | 1468/2046 [17:44<06:12, 1.55it/s]
Training 3/3 epoch (loss 0.1834): 72%|ββββββββ | 1469/2046 [17:44<06:25, 1.50it/s]
Training 3/3 epoch (loss 0.1944): 72%|ββββββββ | 1469/2046 [17:45<06:25, 1.50it/s]
Training 3/3 epoch (loss 0.1944): 72%|ββββββββ | 1470/2046 [17:45<06:31, 1.47it/s]
Training 3/3 epoch (loss 0.2069): 72%|ββββββββ | 1470/2046 [17:46<06:31, 1.47it/s]
Training 3/3 epoch (loss 0.2069): 72%|ββββββββ | 1471/2046 [17:46<06:28, 1.48it/s]
Training 3/3 epoch (loss 0.1976): 72%|ββββββββ | 1471/2046 [17:47<06:28, 1.48it/s]
Training 3/3 epoch (loss 0.1976): 72%|ββββββββ | 1472/2046 [17:47<07:00, 1.36it/s]
Training 3/3 epoch (loss 0.2183): 72%|ββββββββ | 1472/2046 [17:47<07:00, 1.36it/s]
Training 3/3 epoch (loss 0.2183): 72%|ββββββββ | 1473/2046 [17:47<07:11, 1.33it/s]
Training 3/3 epoch (loss 0.2283): 72%|ββββββββ | 1473/2046 [17:48<07:11, 1.33it/s]
Training 3/3 epoch (loss 0.2283): 72%|ββββββββ | 1474/2046 [17:48<07:20, 1.30it/s]
Training 3/3 epoch (loss 0.2009): 72%|ββββββββ | 1474/2046 [17:49<07:20, 1.30it/s]
Training 3/3 epoch (loss 0.2009): 72%|ββββββββ | 1475/2046 [17:49<07:13, 1.32it/s]
Training 3/3 epoch (loss 0.2234): 72%|ββββββββ | 1475/2046 [17:50<07:13, 1.32it/s]
Training 3/3 epoch (loss 0.2234): 72%|ββββββββ | 1476/2046 [17:50<07:12, 1.32it/s]
Training 3/3 epoch (loss 0.2615): 72%|ββββββββ | 1476/2046 [17:50<07:12, 1.32it/s]
Training 3/3 epoch (loss 0.2615): 72%|ββββββββ | 1477/2046 [17:50<06:50, 1.38it/s]
Training 3/3 epoch (loss 0.1780): 72%|ββββββββ | 1477/2046 [17:51<06:50, 1.38it/s]
Training 3/3 epoch (loss 0.1780): 72%|ββββββββ | 1478/2046 [17:51<06:53, 1.37it/s]
Training 3/3 epoch (loss 0.1987): 72%|ββββββββ | 1478/2046 [17:52<06:53, 1.37it/s]
Training 3/3 epoch (loss 0.1987): 72%|ββββββββ | 1479/2046 [17:52<06:54, 1.37it/s]
Training 3/3 epoch (loss 0.1590): 72%|ββββββββ | 1479/2046 [17:53<06:54, 1.37it/s]
Training 3/3 epoch (loss 0.1590): 72%|ββββββββ | 1480/2046 [17:53<07:22, 1.28it/s]
Training 3/3 epoch (loss 0.2336): 72%|ββββββββ | 1480/2046 [17:54<07:22, 1.28it/s]
Training 3/3 epoch (loss 0.2336): 72%|ββββββββ | 1481/2046 [17:54<07:43, 1.22it/s]
Training 3/3 epoch (loss 0.1977): 72%|ββββββββ | 1481/2046 [17:54<07:43, 1.22it/s]
Training 3/3 epoch (loss 0.1977): 72%|ββββββββ | 1482/2046 [17:54<07:20, 1.28it/s]
Training 3/3 epoch (loss 0.1852): 72%|ββββββββ | 1482/2046 [17:55<07:20, 1.28it/s]
Training 3/3 epoch (loss 0.1852): 72%|ββββββββ | 1483/2046 [17:55<07:07, 1.32it/s]
Training 3/3 epoch (loss 0.1767): 72%|ββββββββ | 1483/2046 [17:56<07:07, 1.32it/s]
Training 3/3 epoch (loss 0.1767): 73%|ββββββββ | 1484/2046 [17:56<06:41, 1.40it/s]
Training 3/3 epoch (loss 0.2443): 73%|ββββββββ | 1484/2046 [17:57<06:41, 1.40it/s]
Training 3/3 epoch (loss 0.2443): 73%|ββββββββ | 1485/2046 [17:57<07:21, 1.27it/s]
Training 3/3 epoch (loss 0.1538): 73%|ββββββββ | 1485/2046 [17:57<07:21, 1.27it/s]
Training 3/3 epoch (loss 0.1538): 73%|ββββββββ | 1486/2046 [17:57<06:55, 1.35it/s]
Training 3/3 epoch (loss 0.1237): 73%|ββββββββ | 1486/2046 [17:58<06:55, 1.35it/s]
Training 3/3 epoch (loss 0.1237): 73%|ββββββββ | 1487/2046 [17:58<06:34, 1.42it/s]
Training 3/3 epoch (loss 0.2185): 73%|ββββββββ | 1487/2046 [17:59<06:34, 1.42it/s]
Training 3/3 epoch (loss 0.2185): 73%|ββββββββ | 1488/2046 [17:59<07:06, 1.31it/s]
Training 3/3 epoch (loss 0.1427): 73%|ββββββββ | 1488/2046 [18:00<07:06, 1.31it/s]
Training 3/3 epoch (loss 0.1427): 73%|ββββββββ | 1489/2046 [18:00<07:05, 1.31it/s]
Training 3/3 epoch (loss 0.1539): 73%|ββββββββ | 1489/2046 [18:00<07:05, 1.31it/s]
Training 3/3 epoch (loss 0.1539): 73%|ββββββββ | 1490/2046 [18:00<07:10, 1.29it/s]
Training 3/3 epoch (loss 0.1736): 73%|ββββββββ | 1490/2046 [18:01<07:10, 1.29it/s]
Training 3/3 epoch (loss 0.1736): 73%|ββββββββ | 1491/2046 [18:01<06:57, 1.33it/s]
Training 3/3 epoch (loss 0.0981): 73%|ββββββββ | 1491/2046 [18:02<06:57, 1.33it/s]
Training 3/3 epoch (loss 0.0981): 73%|ββββββββ | 1492/2046 [18:02<06:56, 1.33it/s]
Training 3/3 epoch (loss 0.2035): 73%|ββββββββ | 1492/2046 [18:02<06:56, 1.33it/s]
Training 3/3 epoch (loss 0.2035): 73%|ββββββββ | 1493/2046 [18:02<06:33, 1.41it/s]
Training 3/3 epoch (loss 0.1652): 73%|ββββββββ | 1493/2046 [18:03<06:33, 1.41it/s]
Training 3/3 epoch (loss 0.1652): 73%|ββββββββ | 1494/2046 [18:03<06:18, 1.46it/s]
Training 3/3 epoch (loss 0.2062): 73%|ββββββββ | 1494/2046 [18:04<06:18, 1.46it/s]
Training 3/3 epoch (loss 0.2062): 73%|ββββββββ | 1495/2046 [18:04<06:22, 1.44it/s]
Training 3/3 epoch (loss 0.1603): 73%|ββββββββ | 1495/2046 [18:05<06:22, 1.44it/s]
Training 3/3 epoch (loss 0.1603): 73%|ββββββββ | 1496/2046 [18:05<06:56, 1.32it/s]
Training 3/3 epoch (loss 0.1169): 73%|ββββββββ | 1496/2046 [18:05<06:56, 1.32it/s]
Training 3/3 epoch (loss 0.1169): 73%|ββββββββ | 1497/2046 [18:05<06:36, 1.38it/s]
Training 3/3 epoch (loss 0.1892): 73%|ββββββββ | 1497/2046 [18:06<06:36, 1.38it/s]
Training 3/3 epoch (loss 0.1892): 73%|ββββββββ | 1498/2046 [18:06<07:11, 1.27it/s]
Training 3/3 epoch (loss 0.2542): 73%|ββββββββ | 1498/2046 [18:07<07:11, 1.27it/s]
Training 3/3 epoch (loss 0.2542): 73%|ββββββββ | 1499/2046 [18:07<06:46, 1.35it/s]
Training 3/3 epoch (loss 0.2433): 73%|ββββββββ | 1499/2046 [18:08<06:46, 1.35it/s]
Training 3/3 epoch (loss 0.2433): 73%|ββββββββ | 1500/2046 [18:08<06:28, 1.41it/s]
Training 3/3 epoch (loss 0.2323): 73%|ββββββββ | 1500/2046 [18:08<06:28, 1.41it/s]
Training 3/3 epoch (loss 0.2323): 73%|ββββββββ | 1501/2046 [18:08<06:14, 1.46it/s]
Training 3/3 epoch (loss 0.2023): 73%|ββββββββ | 1501/2046 [18:09<06:14, 1.46it/s]
Training 3/3 epoch (loss 0.2023): 73%|ββββββββ | 1502/2046 [18:09<06:27, 1.40it/s]
Training 3/3 epoch (loss 0.1881): 73%|ββββββββ | 1502/2046 [18:10<06:27, 1.40it/s]
Training 3/3 epoch (loss 0.1881): 73%|ββββββββ | 1503/2046 [18:10<06:12, 1.46it/s]
Training 3/3 epoch (loss 0.1842): 73%|ββββββββ | 1503/2046 [18:10<06:12, 1.46it/s]
Training 3/3 epoch (loss 0.1842): 74%|ββββββββ | 1504/2046 [18:10<06:24, 1.41it/s]
Training 3/3 epoch (loss 0.1868): 74%|ββββββββ | 1504/2046 [18:11<06:24, 1.41it/s]
Training 3/3 epoch (loss 0.1868): 74%|ββββββββ | 1505/2046 [18:11<06:06, 1.48it/s]
Training 3/3 epoch (loss 0.1479): 74%|ββββββββ | 1505/2046 [18:12<06:06, 1.48it/s]
Training 3/3 epoch (loss 0.1479): 74%|ββββββββ | 1506/2046 [18:12<06:00, 1.50it/s]
Training 3/3 epoch (loss 0.1586): 74%|ββββββββ | 1506/2046 [18:12<06:00, 1.50it/s]
Training 3/3 epoch (loss 0.1586): 74%|ββββββββ | 1507/2046 [18:12<06:05, 1.47it/s]
Training 3/3 epoch (loss 0.1990): 74%|ββββββββ | 1507/2046 [18:13<06:05, 1.47it/s]
Training 3/3 epoch (loss 0.1990): 74%|ββββββββ | 1508/2046 [18:13<06:18, 1.42it/s]
Training 3/3 epoch (loss 0.1523): 74%|ββββββββ | 1508/2046 [18:14<06:18, 1.42it/s]
Training 3/3 epoch (loss 0.1523): 74%|ββββββββ | 1509/2046 [18:14<06:38, 1.35it/s]
Training 3/3 epoch (loss 0.1717): 74%|ββββββββ | 1509/2046 [18:15<06:38, 1.35it/s]
Training 3/3 epoch (loss 0.1717): 74%|ββββββββ | 1510/2046 [18:15<06:17, 1.42it/s]
Training 3/3 epoch (loss 0.1263): 74%|ββββββββ | 1510/2046 [18:15<06:17, 1.42it/s]
Training 3/3 epoch (loss 0.1263): 74%|ββββββββ | 1511/2046 [18:15<06:15, 1.43it/s]
Training 3/3 epoch (loss 0.1094): 74%|ββββββββ | 1511/2046 [18:16<06:15, 1.43it/s]
Training 3/3 epoch (loss 0.1094): 74%|ββββββββ | 1512/2046 [18:16<06:44, 1.32it/s]
Training 3/3 epoch (loss 0.1457): 74%|ββββββββ | 1512/2046 [18:17<06:44, 1.32it/s]
Training 3/3 epoch (loss 0.1457): 74%|ββββββββ | 1513/2046 [18:17<06:46, 1.31it/s]
Training 3/3 epoch (loss 0.1461): 74%|ββββββββ | 1513/2046 [18:18<06:46, 1.31it/s]
Training 3/3 epoch (loss 0.1461): 74%|ββββββββ | 1514/2046 [18:18<06:40, 1.33it/s]
Training 3/3 epoch (loss 0.1920): 74%|ββββββββ | 1514/2046 [18:18<06:40, 1.33it/s]
Training 3/3 epoch (loss 0.1920): 74%|ββββββββ | 1515/2046 [18:18<06:37, 1.34it/s]
Training 3/3 epoch (loss 0.1775): 74%|ββββββββ | 1515/2046 [18:19<06:37, 1.34it/s]
Training 3/3 epoch (loss 0.1775): 74%|ββββββββ | 1516/2046 [18:19<06:40, 1.32it/s]
Training 3/3 epoch (loss 0.1574): 74%|ββββββββ | 1516/2046 [18:20<06:40, 1.32it/s]
Training 3/3 epoch (loss 0.1574): 74%|ββββββββ | 1517/2046 [18:20<06:30, 1.36it/s]
Training 3/3 epoch (loss 0.1647): 74%|ββββββββ | 1517/2046 [18:21<06:30, 1.36it/s]
Training 3/3 epoch (loss 0.1647): 74%|ββββββββ | 1518/2046 [18:21<06:45, 1.30it/s]
Training 3/3 epoch (loss 0.1333): 74%|ββββββββ | 1518/2046 [18:21<06:45, 1.30it/s]
Training 3/3 epoch (loss 0.1333): 74%|ββββββββ | 1519/2046 [18:21<06:24, 1.37it/s]
Training 3/3 epoch (loss 0.1406): 74%|ββββββββ | 1519/2046 [18:22<06:24, 1.37it/s]
Training 3/3 epoch (loss 0.1406): 74%|ββββββββ | 1520/2046 [18:22<06:28, 1.35it/s]
Training 3/3 epoch (loss 0.1173): 74%|ββββββββ | 1520/2046 [18:23<06:28, 1.35it/s]
Training 3/3 epoch (loss 0.1173): 74%|ββββββββ | 1521/2046 [18:23<06:43, 1.30it/s]
Training 3/3 epoch (loss 0.1267): 74%|ββββββββ | 1521/2046 [18:24<06:43, 1.30it/s]
Training 3/3 epoch (loss 0.1267): 74%|ββββββββ | 1522/2046 [18:24<06:34, 1.33it/s]
Training 3/3 epoch (loss 0.1737): 74%|ββββββββ | 1522/2046 [18:24<06:34, 1.33it/s]
Training 3/3 epoch (loss 0.1737): 74%|ββββββββ | 1523/2046 [18:24<06:26, 1.35it/s]
Training 3/3 epoch (loss 0.1336): 74%|ββββββββ | 1523/2046 [18:25<06:26, 1.35it/s]
Training 3/3 epoch (loss 0.1336): 74%|ββββββββ | 1524/2046 [18:25<06:42, 1.30it/s]
Training 3/3 epoch (loss 0.1798): 74%|ββββββββ | 1524/2046 [18:26<06:42, 1.30it/s]
Training 3/3 epoch (loss 0.1798): 75%|ββββββββ | 1525/2046 [18:26<07:05, 1.23it/s]
Training 3/3 epoch (loss 0.1321): 75%|ββββββββ | 1525/2046 [18:27<07:05, 1.23it/s]
Training 3/3 epoch (loss 0.1321): 75%|ββββββββ | 1526/2046 [18:27<06:40, 1.30it/s]
Training 3/3 epoch (loss 0.1697): 75%|ββββββββ | 1526/2046 [18:27<06:40, 1.30it/s]
Training 3/3 epoch (loss 0.1697): 75%|ββββββββ | 1527/2046 [18:27<06:19, 1.37it/s]
Training 3/3 epoch (loss 0.1633): 75%|ββββββββ | 1527/2046 [18:28<06:19, 1.37it/s]
Training 3/3 epoch (loss 0.1633): 75%|ββββββββ | 1528/2046 [18:28<06:21, 1.36it/s]
Training 3/3 epoch (loss 0.2924): 75%|ββββββββ | 1528/2046 [18:29<06:21, 1.36it/s]
Training 3/3 epoch (loss 0.2924): 75%|ββββββββ | 1529/2046 [18:29<06:42, 1.28it/s]
Training 3/3 epoch (loss 0.1513): 75%|ββββββββ | 1529/2046 [18:30<06:42, 1.28it/s]
Training 3/3 epoch (loss 0.1513): 75%|ββββββββ | 1530/2046 [18:30<06:27, 1.33it/s]
Training 3/3 epoch (loss 0.2043): 75%|ββββββββ | 1530/2046 [18:30<06:27, 1.33it/s]
Training 3/3 epoch (loss 0.2043): 75%|ββββββββ | 1531/2046 [18:30<06:21, 1.35it/s]
Training 3/3 epoch (loss 0.1937): 75%|ββββββββ | 1531/2046 [18:31<06:21, 1.35it/s]
Training 3/3 epoch (loss 0.1937): 75%|ββββββββ | 1532/2046 [18:31<05:59, 1.43it/s]
Training 3/3 epoch (loss 0.1744): 75%|ββββββββ | 1532/2046 [18:32<05:59, 1.43it/s]
Training 3/3 epoch (loss 0.1744): 75%|ββββββββ | 1533/2046 [18:32<06:00, 1.42it/s]
Training 3/3 epoch (loss 0.0848): 75%|ββββββββ | 1533/2046 [18:32<06:00, 1.42it/s]
Training 3/3 epoch (loss 0.0848): 75%|ββββββββ | 1534/2046 [18:32<05:50, 1.46it/s]
Training 3/3 epoch (loss 0.1493): 75%|ββββββββ | 1534/2046 [18:33<05:50, 1.46it/s]
Training 3/3 epoch (loss 0.1493): 75%|ββββββββ | 1535/2046 [18:33<06:13, 1.37it/s]
Training 3/3 epoch (loss 0.1360): 75%|ββββββββ | 1535/2046 [18:34<06:13, 1.37it/s]
Training 3/3 epoch (loss 0.1360): 75%|ββββββββ | 1536/2046 [18:34<06:23, 1.33it/s]
Training 3/3 epoch (loss 0.0859): 75%|ββββββββ | 1536/2046 [18:35<06:23, 1.33it/s]
Training 3/3 epoch (loss 0.0859): 75%|ββββββββ | 1537/2046 [18:35<06:13, 1.36it/s]
Training 3/3 epoch (loss 0.1496): 75%|ββββββββ | 1537/2046 [18:35<06:13, 1.36it/s]
Training 3/3 epoch (loss 0.1496): 75%|ββββββββ | 1538/2046 [18:35<06:01, 1.41it/s]
Training 3/3 epoch (loss 0.1409): 75%|ββββββββ | 1538/2046 [18:36<06:01, 1.41it/s]
Training 3/3 epoch (loss 0.1409): 75%|ββββββββ | 1539/2046 [18:36<05:43, 1.47it/s]
Training 3/3 epoch (loss 0.1278): 75%|ββββββββ | 1539/2046 [18:37<05:43, 1.47it/s]
Training 3/3 epoch (loss 0.1278): 75%|ββββββββ | 1540/2046 [18:37<05:53, 1.43it/s]
Training 3/3 epoch (loss 0.1184): 75%|ββββββββ | 1540/2046 [18:37<05:53, 1.43it/s]
Training 3/3 epoch (loss 0.1184): 75%|ββββββββ | 1541/2046 [18:37<06:07, 1.38it/s]
Training 3/3 epoch (loss 0.1123): 75%|ββββββββ | 1541/2046 [18:38<06:07, 1.38it/s]
Training 3/3 epoch (loss 0.1123): 75%|ββββββββ | 1542/2046 [18:38<05:48, 1.45it/s]
Training 3/3 epoch (loss 0.1351): 75%|ββββββββ | 1542/2046 [18:39<05:48, 1.45it/s]
Training 3/3 epoch (loss 0.1351): 75%|ββββββββ | 1543/2046 [18:39<05:36, 1.49it/s]
Training 3/3 epoch (loss 0.1048): 75%|ββββββββ | 1543/2046 [18:39<05:36, 1.49it/s]
Training 3/3 epoch (loss 0.1048): 75%|ββββββββ | 1544/2046 [18:39<05:50, 1.43it/s]
Training 3/3 epoch (loss 0.0974): 75%|ββββββββ | 1544/2046 [18:40<05:50, 1.43it/s]
Training 3/3 epoch (loss 0.0974): 76%|ββββββββ | 1545/2046 [18:40<05:39, 1.48it/s]
Training 3/3 epoch (loss 0.0873): 76%|ββββββββ | 1545/2046 [18:41<05:39, 1.48it/s]
Training 3/3 epoch (loss 0.0873): 76%|ββββββββ | 1546/2046 [18:41<05:31, 1.51it/s]
Training 3/3 epoch (loss 0.1161): 76%|ββββββββ | 1546/2046 [18:41<05:31, 1.51it/s]
Training 3/3 epoch (loss 0.1161): 76%|ββββββββ | 1547/2046 [18:41<05:30, 1.51it/s]
Training 3/3 epoch (loss 0.1244): 76%|ββββββββ | 1547/2046 [18:42<05:30, 1.51it/s]
Training 3/3 epoch (loss 0.1244): 76%|ββββββββ | 1548/2046 [18:42<05:22, 1.54it/s]
Training 3/3 epoch (loss 0.1177): 76%|ββββββββ | 1548/2046 [18:43<05:22, 1.54it/s]
Training 3/3 epoch (loss 0.1177): 76%|ββββββββ | 1549/2046 [18:43<05:27, 1.52it/s]
Training 3/3 epoch (loss 0.1215): 76%|ββββββββ | 1549/2046 [18:43<05:27, 1.52it/s]
Training 3/3 epoch (loss 0.1215): 76%|ββββββββ | 1550/2046 [18:43<05:22, 1.54it/s]
Training 3/3 epoch (loss 0.1289): 76%|ββββββββ | 1550/2046 [18:44<05:22, 1.54it/s]
Training 3/3 epoch (loss 0.1289): 76%|ββββββββ | 1551/2046 [18:44<05:30, 1.50it/s]
Training 3/3 epoch (loss 0.1116): 76%|ββββββββ | 1551/2046 [18:45<05:30, 1.50it/s]
Training 3/3 epoch (loss 0.1116): 76%|ββββββββ | 1552/2046 [18:45<05:43, 1.44it/s]
Training 3/3 epoch (loss 0.1263): 76%|ββββββββ | 1552/2046 [18:46<05:43, 1.44it/s]
Training 3/3 epoch (loss 0.1263): 76%|ββββββββ | 1553/2046 [18:46<06:01, 1.36it/s]
Training 3/3 epoch (loss 0.1288): 76%|ββββββββ | 1553/2046 [18:46<06:01, 1.36it/s]
Training 3/3 epoch (loss 0.1288): 76%|ββββββββ | 1554/2046 [18:46<05:57, 1.38it/s]
Training 3/3 epoch (loss 0.1220): 76%|ββββββββ | 1554/2046 [18:47<05:57, 1.38it/s]
Training 3/3 epoch (loss 0.1220): 76%|ββββββββ | 1555/2046 [18:47<05:43, 1.43it/s]
Training 3/3 epoch (loss 0.1472): 76%|ββββββββ | 1555/2046 [18:48<05:43, 1.43it/s]
Training 3/3 epoch (loss 0.1472): 76%|ββββββββ | 1556/2046 [18:48<05:46, 1.41it/s]
Training 3/3 epoch (loss 0.1547): 76%|ββββββββ | 1556/2046 [18:48<05:46, 1.41it/s]
Training 3/3 epoch (loss 0.1547): 76%|ββββββββ | 1557/2046 [18:48<05:59, 1.36it/s]
Training 3/3 epoch (loss 0.1048): 76%|ββββββββ | 1557/2046 [18:49<05:59, 1.36it/s]
Training 3/3 epoch (loss 0.1048): 76%|ββββββββ | 1558/2046 [18:49<05:53, 1.38it/s]
Training 3/3 epoch (loss 0.1368): 76%|ββββββββ | 1558/2046 [18:50<05:53, 1.38it/s]
Training 3/3 epoch (loss 0.1368): 76%|ββββββββ | 1559/2046 [18:50<05:48, 1.40it/s]
Training 3/3 epoch (loss 0.0768): 76%|ββββββββ | 1559/2046 [18:51<05:48, 1.40it/s]
Training 3/3 epoch (loss 0.0768): 76%|ββββββββ | 1560/2046 [18:51<06:11, 1.31it/s]
Training 3/3 epoch (loss 0.1646): 76%|ββββββββ | 1560/2046 [18:51<06:11, 1.31it/s]
Training 3/3 epoch (loss 0.1646): 76%|ββββββββ | 1561/2046 [18:51<05:52, 1.37it/s]
Training 3/3 epoch (loss 0.1837): 76%|ββββββββ | 1561/2046 [18:52<05:52, 1.37it/s]
Training 3/3 epoch (loss 0.1837): 76%|ββββββββ | 1562/2046 [18:52<05:33, 1.45it/s]
Training 3/3 epoch (loss 0.0702): 76%|ββββββββ | 1562/2046 [18:53<05:33, 1.45it/s]
Training 3/3 epoch (loss 0.0702): 76%|ββββββββ | 1563/2046 [18:53<05:19, 1.51it/s]
Training 3/3 epoch (loss 0.1646): 76%|ββββββββ | 1563/2046 [18:53<05:19, 1.51it/s]
Training 3/3 epoch (loss 0.1646): 76%|ββββββββ | 1564/2046 [18:53<05:22, 1.49it/s]
Training 3/3 epoch (loss 0.1314): 76%|ββββββββ | 1564/2046 [18:54<05:22, 1.49it/s]
Training 3/3 epoch (loss 0.1314): 76%|ββββββββ | 1565/2046 [18:54<05:26, 1.47it/s]
Training 3/3 epoch (loss 0.1411): 76%|ββββββββ | 1565/2046 [18:55<05:26, 1.47it/s]
Training 3/3 epoch (loss 0.1411): 77%|ββββββββ | 1566/2046 [18:55<05:39, 1.41it/s]
Training 3/3 epoch (loss 0.1797): 77%|ββββββββ | 1566/2046 [18:55<05:39, 1.41it/s]
Training 3/3 epoch (loss 0.1797): 77%|ββββββββ | 1567/2046 [18:55<05:26, 1.47it/s]
Training 3/3 epoch (loss 0.1304): 77%|ββββββββ | 1567/2046 [18:56<05:26, 1.47it/s]
Training 3/3 epoch (loss 0.1304): 77%|ββββββββ | 1568/2046 [18:56<05:34, 1.43it/s]
Training 3/3 epoch (loss 0.0781): 77%|ββββββββ | 1568/2046 [18:57<05:34, 1.43it/s]
Training 3/3 epoch (loss 0.0781): 77%|ββββββββ | 1569/2046 [18:57<06:17, 1.26it/s]
Training 3/3 epoch (loss 0.1666): 77%|ββββββββ | 1569/2046 [18:58<06:17, 1.26it/s]
Training 3/3 epoch (loss 0.1666): 77%|ββββββββ | 1570/2046 [18:58<05:51, 1.35it/s]
Training 3/3 epoch (loss 0.1055): 77%|ββββββββ | 1570/2046 [18:58<05:51, 1.35it/s]
Training 3/3 epoch (loss 0.1055): 77%|ββββββββ | 1571/2046 [18:58<05:39, 1.40it/s]
Training 3/3 epoch (loss 0.1563): 77%|ββββββββ | 1571/2046 [18:59<05:39, 1.40it/s]
Training 3/3 epoch (loss 0.1563): 77%|ββββββββ | 1572/2046 [18:59<05:59, 1.32it/s]
Training 3/3 epoch (loss 0.1370): 77%|ββββββββ | 1572/2046 [19:00<05:59, 1.32it/s]
Training 3/3 epoch (loss 0.1370): 77%|ββββββββ | 1573/2046 [19:00<05:45, 1.37it/s]
Training 3/3 epoch (loss 0.1178): 77%|ββββββββ | 1573/2046 [19:01<05:45, 1.37it/s]
Training 3/3 epoch (loss 0.1178): 77%|ββββββββ | 1574/2046 [19:01<05:33, 1.42it/s]
Training 3/3 epoch (loss 0.1660): 77%|ββββββββ | 1574/2046 [19:01<05:33, 1.42it/s]
Training 3/3 epoch (loss 0.1660): 77%|ββββββββ | 1575/2046 [19:01<05:29, 1.43it/s]
Training 3/3 epoch (loss 0.1600): 77%|ββββββββ | 1575/2046 [19:02<05:29, 1.43it/s]
Training 3/3 epoch (loss 0.1600): 77%|ββββββββ | 1576/2046 [19:02<05:40, 1.38it/s]
Training 3/3 epoch (loss 0.0971): 77%|ββββββββ | 1576/2046 [19:03<05:40, 1.38it/s]
Training 3/3 epoch (loss 0.0971): 77%|ββββββββ | 1577/2046 [19:03<05:41, 1.37it/s]
Training 3/3 epoch (loss 0.0984): 77%|ββββββββ | 1577/2046 [19:03<05:41, 1.37it/s]
Training 3/3 epoch (loss 0.0984): 77%|ββββββββ | 1578/2046 [19:03<05:25, 1.44it/s]
Training 3/3 epoch (loss 0.2018): 77%|ββββββββ | 1578/2046 [19:04<05:25, 1.44it/s]
Training 3/3 epoch (loss 0.2018): 77%|ββββββββ | 1579/2046 [19:04<05:15, 1.48it/s]
Training 3/3 epoch (loss 0.2256): 77%|ββββββββ | 1579/2046 [19:05<05:15, 1.48it/s]
Training 3/3 epoch (loss 0.2256): 77%|ββββββββ | 1580/2046 [19:05<05:10, 1.50it/s]
Training 3/3 epoch (loss 0.1368): 77%|ββββββββ | 1580/2046 [19:06<05:10, 1.50it/s]
Training 3/3 epoch (loss 0.1368): 77%|ββββββββ | 1581/2046 [19:06<05:40, 1.36it/s]
Training 3/3 epoch (loss 0.0777): 77%|ββββββββ | 1581/2046 [19:06<05:40, 1.36it/s]
Training 3/3 epoch (loss 0.0777): 77%|ββββββββ | 1582/2046 [19:06<05:26, 1.42it/s]
Training 3/3 epoch (loss 0.1266): 77%|ββββββββ | 1582/2046 [19:07<05:26, 1.42it/s]
Training 3/3 epoch (loss 0.1266): 77%|ββββββββ | 1583/2046 [19:07<05:18, 1.46it/s]
Training 3/3 epoch (loss 0.1285): 77%|ββββββββ | 1583/2046 [19:08<05:18, 1.46it/s]
Training 3/3 epoch (loss 0.1285): 77%|ββββββββ | 1584/2046 [19:08<05:34, 1.38it/s]
Training 3/3 epoch (loss 0.1548): 77%|ββββββββ | 1584/2046 [19:09<05:34, 1.38it/s]
Training 3/3 epoch (loss 0.1548): 77%|ββββββββ | 1585/2046 [19:09<05:49, 1.32it/s]
Training 3/3 epoch (loss 0.1291): 77%|ββββββββ | 1585/2046 [19:09<05:49, 1.32it/s]
Training 3/3 epoch (loss 0.1291): 78%|ββββββββ | 1586/2046 [19:09<05:32, 1.38it/s]
Training 3/3 epoch (loss 0.1393): 78%|ββββββββ | 1586/2046 [19:10<05:32, 1.38it/s]
Training 3/3 epoch (loss 0.1393): 78%|ββββββββ | 1587/2046 [19:10<05:20, 1.43it/s]
Training 3/3 epoch (loss 0.1142): 78%|ββββββββ | 1587/2046 [19:10<05:20, 1.43it/s]
Training 3/3 epoch (loss 0.1142): 78%|ββββββββ | 1588/2046 [19:10<05:06, 1.50it/s]
Training 3/3 epoch (loss 0.1239): 78%|ββββββββ | 1588/2046 [19:11<05:06, 1.50it/s]
Training 3/3 epoch (loss 0.1239): 78%|ββββββββ | 1589/2046 [19:11<05:00, 1.52it/s]
Training 3/3 epoch (loss 0.1506): 78%|ββββββββ | 1589/2046 [19:12<05:00, 1.52it/s]
Training 3/3 epoch (loss 0.1506): 78%|ββββββββ | 1590/2046 [19:12<05:02, 1.51it/s]
Training 3/3 epoch (loss 0.1474): 78%|ββββββββ | 1590/2046 [19:12<05:02, 1.51it/s]
Training 3/3 epoch (loss 0.1474): 78%|ββββββββ | 1591/2046 [19:12<05:08, 1.47it/s]
Training 3/3 epoch (loss 0.1517): 78%|ββββββββ | 1591/2046 [19:13<05:08, 1.47it/s]
Training 3/3 epoch (loss 0.1517): 78%|ββββββββ | 1592/2046 [19:13<05:16, 1.43it/s]
Training 3/3 epoch (loss 0.1467): 78%|ββββββββ | 1592/2046 [19:14<05:16, 1.43it/s]
Training 3/3 epoch (loss 0.1467): 78%|ββββββββ | 1593/2046 [19:14<05:12, 1.45it/s]
Training 3/3 epoch (loss 0.1687): 78%|ββββββββ | 1593/2046 [19:15<05:12, 1.45it/s]
Training 3/3 epoch (loss 0.1687): 78%|ββββββββ | 1594/2046 [19:15<05:36, 1.34it/s]
Training 3/3 epoch (loss 0.0941): 78%|ββββββββ | 1594/2046 [19:16<05:36, 1.34it/s]
Training 3/3 epoch (loss 0.0941): 78%|ββββββββ | 1595/2046 [19:16<05:44, 1.31it/s]
Training 3/3 epoch (loss 0.1023): 78%|ββββββββ | 1595/2046 [19:16<05:44, 1.31it/s]
Training 3/3 epoch (loss 0.1023): 78%|ββββββββ | 1596/2046 [19:16<05:46, 1.30it/s]
Training 3/3 epoch (loss 0.1448): 78%|ββββββββ | 1596/2046 [19:17<05:46, 1.30it/s]
Training 3/3 epoch (loss 0.1448): 78%|ββββββββ | 1597/2046 [19:17<05:26, 1.38it/s]
Training 3/3 epoch (loss 0.1602): 78%|ββββββββ | 1597/2046 [19:18<05:26, 1.38it/s]
Training 3/3 epoch (loss 0.1602): 78%|ββββββββ | 1598/2046 [19:18<05:18, 1.41it/s]
Training 3/3 epoch (loss 0.1228): 78%|ββββββββ | 1598/2046 [19:18<05:18, 1.41it/s]
Training 3/3 epoch (loss 0.1228): 78%|ββββββββ | 1599/2046 [19:18<05:09, 1.44it/s]
Training 3/3 epoch (loss 0.1963): 78%|ββββββββ | 1599/2046 [19:19<05:09, 1.44it/s]
Training 3/3 epoch (loss 0.1963): 78%|ββββββββ | 1600/2046 [19:19<06:11, 1.20it/s]
Training 3/3 epoch (loss 0.1201): 78%|ββββββββ | 1600/2046 [19:20<06:11, 1.20it/s]
Training 3/3 epoch (loss 0.1201): 78%|ββββββββ | 1601/2046 [19:20<05:52, 1.26it/s]
Training 3/3 epoch (loss 0.1457): 78%|ββββββββ | 1601/2046 [19:21<05:52, 1.26it/s]
Training 3/3 epoch (loss 0.1457): 78%|ββββββββ | 1602/2046 [19:21<05:49, 1.27it/s]
Training 3/3 epoch (loss 0.0801): 78%|ββββββββ | 1602/2046 [19:22<05:49, 1.27it/s]
Training 3/3 epoch (loss 0.0801): 78%|ββββββββ | 1603/2046 [19:22<05:52, 1.26it/s]
Training 3/3 epoch (loss 0.1280): 78%|ββββββββ | 1603/2046 [19:22<05:52, 1.26it/s]
Training 3/3 epoch (loss 0.1280): 78%|ββββββββ | 1604/2046 [19:22<05:41, 1.30it/s]
Training 3/3 epoch (loss 0.2308): 78%|ββββββββ | 1604/2046 [19:23<05:41, 1.30it/s]
Training 3/3 epoch (loss 0.2308): 78%|ββββββββ | 1605/2046 [19:23<05:19, 1.38it/s]
Training 3/3 epoch (loss 0.1397): 78%|ββββββββ | 1605/2046 [19:24<05:19, 1.38it/s]
Training 3/3 epoch (loss 0.1397): 78%|ββββββββ | 1606/2046 [19:24<05:28, 1.34it/s]
Training 3/3 epoch (loss 0.1263): 78%|ββββββββ | 1606/2046 [19:25<05:28, 1.34it/s]
Training 3/3 epoch (loss 0.1263): 79%|ββββββββ | 1607/2046 [19:25<05:41, 1.28it/s]
Training 3/3 epoch (loss 0.0997): 79%|ββββββββ | 1607/2046 [19:26<05:41, 1.28it/s]
Training 3/3 epoch (loss 0.0997): 79%|ββββββββ | 1608/2046 [19:26<05:59, 1.22it/s]
Training 3/3 epoch (loss 0.2058): 79%|ββββββββ | 1608/2046 [19:26<05:59, 1.22it/s]
Training 3/3 epoch (loss 0.2058): 79%|ββββββββ | 1609/2046 [19:26<05:43, 1.27it/s]
Training 3/3 epoch (loss 0.0832): 79%|ββββββββ | 1609/2046 [19:27<05:43, 1.27it/s]
Training 3/3 epoch (loss 0.0832): 79%|ββββββββ | 1610/2046 [19:27<05:20, 1.36it/s]
Training 3/3 epoch (loss 0.1806): 79%|ββββββββ | 1610/2046 [19:28<05:20, 1.36it/s]
Training 3/3 epoch (loss 0.1806): 79%|ββββββββ | 1611/2046 [19:28<05:04, 1.43it/s]
Training 3/3 epoch (loss 0.1234): 79%|ββββββββ | 1611/2046 [19:28<05:04, 1.43it/s]
Training 3/3 epoch (loss 0.1234): 79%|ββββββββ | 1612/2046 [19:28<04:59, 1.45it/s]
Training 3/3 epoch (loss 0.2625): 79%|ββββββββ | 1612/2046 [19:29<04:59, 1.45it/s]
Training 3/3 epoch (loss 0.2625): 79%|ββββββββ | 1613/2046 [19:29<05:25, 1.33it/s]
Training 3/3 epoch (loss 0.2663): 79%|ββββββββ | 1613/2046 [19:30<05:25, 1.33it/s]
Training 3/3 epoch (loss 0.2663): 79%|ββββββββ | 1614/2046 [19:30<05:12, 1.38it/s]
Training 3/3 epoch (loss 0.1600): 79%|ββββββββ | 1614/2046 [19:30<05:12, 1.38it/s]
Training 3/3 epoch (loss 0.1600): 79%|ββββββββ | 1615/2046 [19:30<05:14, 1.37it/s]
Training 3/3 epoch (loss 0.1615): 79%|ββββββββ | 1615/2046 [19:31<05:14, 1.37it/s]
Training 3/3 epoch (loss 0.1615): 79%|ββββββββ | 1616/2046 [19:31<05:21, 1.34it/s]
Training 3/3 epoch (loss 0.1291): 79%|ββββββββ | 1616/2046 [19:32<05:21, 1.34it/s]
Training 3/3 epoch (loss 0.1291): 79%|ββββββββ | 1617/2046 [19:32<05:22, 1.33it/s]
Training 3/3 epoch (loss 0.1726): 79%|ββββββββ | 1617/2046 [19:33<05:22, 1.33it/s]
Training 3/3 epoch (loss 0.1726): 79%|ββββββββ | 1618/2046 [19:33<05:06, 1.40it/s]
Training 3/3 epoch (loss 0.1344): 79%|ββββββββ | 1618/2046 [19:33<05:06, 1.40it/s]
Training 3/3 epoch (loss 0.1344): 79%|ββββββββ | 1619/2046 [19:33<04:54, 1.45it/s]
Training 3/3 epoch (loss 0.1210): 79%|ββββββββ | 1619/2046 [19:34<04:54, 1.45it/s]
Training 3/3 epoch (loss 0.1210): 79%|ββββββββ | 1620/2046 [19:34<04:44, 1.50it/s]
Training 3/3 epoch (loss 0.2250): 79%|ββββββββ | 1620/2046 [19:35<04:44, 1.50it/s]
Training 3/3 epoch (loss 0.2250): 79%|ββββββββ | 1621/2046 [19:35<04:37, 1.53it/s]
Training 3/3 epoch (loss 0.1701): 79%|ββββββββ | 1621/2046 [19:35<04:37, 1.53it/s]
Training 3/3 epoch (loss 0.1701): 79%|ββββββββ | 1622/2046 [19:35<04:33, 1.55it/s]
Training 3/3 epoch (loss 0.1709): 79%|ββββββββ | 1622/2046 [19:36<04:33, 1.55it/s]
Training 3/3 epoch (loss 0.1709): 79%|ββββββββ | 1623/2046 [19:36<04:43, 1.49it/s]
Training 3/3 epoch (loss 0.0953): 79%|ββββββββ | 1623/2046 [19:37<04:43, 1.49it/s]
Training 3/3 epoch (loss 0.0953): 79%|ββββββββ | 1624/2046 [19:37<04:58, 1.41it/s]
Training 3/3 epoch (loss 0.1504): 79%|ββββββββ | 1624/2046 [19:37<04:58, 1.41it/s]
Training 3/3 epoch (loss 0.1504): 79%|ββββββββ | 1625/2046 [19:37<05:08, 1.37it/s]
Training 3/3 epoch (loss 0.1451): 79%|ββββββββ | 1625/2046 [19:38<05:08, 1.37it/s]
Training 3/3 epoch (loss 0.1451): 79%|ββββββββ | 1626/2046 [19:38<05:05, 1.38it/s]
Training 3/3 epoch (loss 0.2128): 79%|ββββββββ | 1626/2046 [19:39<05:05, 1.38it/s]
Training 3/3 epoch (loss 0.2128): 80%|ββββββββ | 1627/2046 [19:39<04:53, 1.43it/s]
Training 3/3 epoch (loss 0.1378): 80%|ββββββββ | 1627/2046 [19:39<04:53, 1.43it/s]
Training 3/3 epoch (loss 0.1378): 80%|ββββββββ | 1628/2046 [19:39<04:41, 1.49it/s]
Training 3/3 epoch (loss 0.1480): 80%|ββββββββ | 1628/2046 [19:40<04:41, 1.49it/s]
Training 3/3 epoch (loss 0.1480): 80%|ββββββββ | 1629/2046 [19:40<04:36, 1.51it/s]
Training 3/3 epoch (loss 0.0937): 80%|ββββββββ | 1629/2046 [19:41<04:36, 1.51it/s]
Training 3/3 epoch (loss 0.0937): 80%|ββββββββ | 1630/2046 [19:41<04:30, 1.54it/s]
Training 3/3 epoch (loss 0.1443): 80%|ββββββββ | 1630/2046 [19:41<04:30, 1.54it/s]
Training 3/3 epoch (loss 0.1443): 80%|ββββββββ | 1631/2046 [19:41<04:47, 1.44it/s]
Training 3/3 epoch (loss 0.1523): 80%|ββββββββ | 1631/2046 [19:42<04:47, 1.44it/s]
Training 3/3 epoch (loss 0.1523): 80%|ββββββββ | 1632/2046 [19:42<04:50, 1.43it/s]
Training 3/3 epoch (loss 0.1802): 80%|ββββββββ | 1632/2046 [19:43<04:50, 1.43it/s]
Training 3/3 epoch (loss 0.1802): 80%|ββββββββ | 1633/2046 [19:43<05:23, 1.28it/s]
Training 3/3 epoch (loss 0.1645): 80%|ββββββββ | 1633/2046 [19:44<05:23, 1.28it/s]
Training 3/3 epoch (loss 0.1645): 80%|ββββββββ | 1634/2046 [19:44<05:02, 1.36it/s]
Training 3/3 epoch (loss 0.1977): 80%|ββββββββ | 1634/2046 [19:45<05:02, 1.36it/s]
Training 3/3 epoch (loss 0.1977): 80%|ββββββββ | 1635/2046 [19:45<05:20, 1.28it/s]
Training 3/3 epoch (loss 0.1894): 80%|ββββββββ | 1635/2046 [19:45<05:20, 1.28it/s]
Training 3/3 epoch (loss 0.1894): 80%|ββββββββ | 1636/2046 [19:45<05:03, 1.35it/s]
Training 3/3 epoch (loss 0.2656): 80%|ββββββββ | 1636/2046 [19:46<05:03, 1.35it/s]
Training 3/3 epoch (loss 0.2656): 80%|ββββββββ | 1637/2046 [19:46<05:08, 1.33it/s]
Training 3/3 epoch (loss 0.1978): 80%|ββββββββ | 1637/2046 [19:47<05:08, 1.33it/s]
Training 3/3 epoch (loss 0.1978): 80%|ββββββββ | 1638/2046 [19:47<04:52, 1.39it/s]
Training 3/3 epoch (loss 0.1042): 80%|ββββββββ | 1638/2046 [19:47<04:52, 1.39it/s]
Training 3/3 epoch (loss 0.1042): 80%|ββββββββ | 1639/2046 [19:47<04:52, 1.39it/s]
Training 3/3 epoch (loss 0.1818): 80%|ββββββββ | 1639/2046 [19:48<04:52, 1.39it/s]
Training 3/3 epoch (loss 0.1818): 80%|ββββββββ | 1640/2046 [19:48<05:09, 1.31it/s]
Training 3/3 epoch (loss 0.0830): 80%|ββββββββ | 1640/2046 [19:49<05:09, 1.31it/s]
Training 3/3 epoch (loss 0.0830): 80%|ββββββββ | 1641/2046 [19:49<05:31, 1.22it/s]
Training 3/3 epoch (loss 0.1419): 80%|ββββββββ | 1641/2046 [19:50<05:31, 1.22it/s]
Training 3/3 epoch (loss 0.1419): 80%|ββββββββ | 1642/2046 [19:50<05:37, 1.20it/s]
Training 3/3 epoch (loss 0.1327): 80%|ββββββββ | 1642/2046 [19:51<05:37, 1.20it/s]
Training 3/3 epoch (loss 0.1327): 80%|ββββββββ | 1643/2046 [19:51<05:29, 1.22it/s]
Training 3/3 epoch (loss 0.1905): 80%|ββββββββ | 1643/2046 [19:52<05:29, 1.22it/s]
Training 3/3 epoch (loss 0.1905): 80%|ββββββββ | 1644/2046 [19:52<05:05, 1.32it/s]
Training 3/3 epoch (loss 0.1051): 80%|ββββββββ | 1644/2046 [19:52<05:05, 1.32it/s]
Training 3/3 epoch (loss 0.1051): 80%|ββββββββ | 1645/2046 [19:52<04:58, 1.34it/s]
Training 3/3 epoch (loss 0.1453): 80%|ββββββββ | 1645/2046 [19:53<04:58, 1.34it/s]
Training 3/3 epoch (loss 0.1453): 80%|ββββββββ | 1646/2046 [19:53<05:16, 1.26it/s]
Training 3/3 epoch (loss 0.1497): 80%|ββββββββ | 1646/2046 [19:54<05:16, 1.26it/s]
Training 3/3 epoch (loss 0.1497): 80%|ββββββββ | 1647/2046 [19:54<05:30, 1.21it/s]
Training 3/3 epoch (loss 0.1019): 80%|ββββββββ | 1647/2046 [19:55<05:30, 1.21it/s]
Training 3/3 epoch (loss 0.1019): 81%|ββββββββ | 1648/2046 [19:55<05:49, 1.14it/s]
Training 3/3 epoch (loss 0.1589): 81%|ββββββββ | 1648/2046 [19:56<05:49, 1.14it/s]
Training 3/3 epoch (loss 0.1589): 81%|ββββββββ | 1649/2046 [19:56<05:35, 1.18it/s]
Training 3/3 epoch (loss 0.1475): 81%|ββββββββ | 1649/2046 [19:57<05:35, 1.18it/s]
Training 3/3 epoch (loss 0.1475): 81%|ββββββββ | 1650/2046 [19:57<05:37, 1.17it/s]
Training 3/3 epoch (loss 0.1907): 81%|ββββββββ | 1650/2046 [19:57<05:37, 1.17it/s]
Training 3/3 epoch (loss 0.1907): 81%|ββββββββ | 1651/2046 [19:57<05:17, 1.25it/s]
Training 3/3 epoch (loss 0.1678): 81%|ββββββββ | 1651/2046 [19:58<05:17, 1.25it/s]
Training 3/3 epoch (loss 0.1678): 81%|ββββββββ | 1652/2046 [19:58<05:05, 1.29it/s]
Training 3/3 epoch (loss 0.1626): 81%|ββββββββ | 1652/2046 [19:59<05:05, 1.29it/s]
Training 3/3 epoch (loss 0.1626): 81%|ββββββββ | 1653/2046 [19:59<04:52, 1.34it/s]
Training 3/3 epoch (loss 0.0840): 81%|ββββββββ | 1653/2046 [20:00<04:52, 1.34it/s]
Training 3/3 epoch (loss 0.0840): 81%|ββββββββ | 1654/2046 [20:00<04:47, 1.37it/s]
Training 3/3 epoch (loss 0.1027): 81%|ββββββββ | 1654/2046 [20:00<04:47, 1.37it/s]
Training 3/3 epoch (loss 0.1027): 81%|ββββββββ | 1655/2046 [20:00<04:41, 1.39it/s]
Training 3/3 epoch (loss 0.1014): 81%|ββββββββ | 1655/2046 [20:01<04:41, 1.39it/s]
Training 3/3 epoch (loss 0.1014): 81%|ββββββββ | 1656/2046 [20:01<05:06, 1.27it/s]
Training 3/3 epoch (loss 0.1439): 81%|ββββββββ | 1656/2046 [20:02<05:06, 1.27it/s]
Training 3/3 epoch (loss 0.1439): 81%|ββββββββ | 1657/2046 [20:02<04:56, 1.31it/s]
Training 3/3 epoch (loss 0.0994): 81%|ββββββββ | 1657/2046 [20:02<04:56, 1.31it/s]
Training 3/3 epoch (loss 0.0994): 81%|ββββββββ | 1658/2046 [20:02<04:39, 1.39it/s]
Training 3/3 epoch (loss 0.1329): 81%|ββββββββ | 1658/2046 [20:03<04:39, 1.39it/s]
Training 3/3 epoch (loss 0.1329): 81%|ββββββββ | 1659/2046 [20:03<04:24, 1.46it/s]
Training 3/3 epoch (loss 0.1972): 81%|ββββββββ | 1659/2046 [20:04<04:24, 1.46it/s]
Training 3/3 epoch (loss 0.1972): 81%|ββββββββ | 1660/2046 [20:04<04:18, 1.49it/s]
Training 3/3 epoch (loss 0.1394): 81%|ββββββββ | 1660/2046 [20:04<04:18, 1.49it/s]
Training 3/3 epoch (loss 0.1394): 81%|ββββββββ | 1661/2046 [20:04<04:12, 1.52it/s]
Training 3/3 epoch (loss 0.1606): 81%|ββββββββ | 1661/2046 [20:05<04:12, 1.52it/s]
Training 3/3 epoch (loss 0.1606): 81%|ββββββββ | 1662/2046 [20:05<04:14, 1.51it/s]
Training 3/3 epoch (loss 0.2088): 81%|ββββββββ | 1662/2046 [20:06<04:14, 1.51it/s]
Training 3/3 epoch (loss 0.2088): 81%|βββββββββ | 1663/2046 [20:06<04:09, 1.53it/s]
Training 3/3 epoch (loss 0.1892): 81%|βββββββββ | 1663/2046 [20:06<04:09, 1.53it/s]
Training 3/3 epoch (loss 0.1892): 81%|βββββββββ | 1664/2046 [20:06<04:18, 1.48it/s]
Training 3/3 epoch (loss 0.1085): 81%|βββββββββ | 1664/2046 [20:07<04:18, 1.48it/s]
Training 3/3 epoch (loss 0.1085): 81%|βββββββββ | 1665/2046 [20:07<04:22, 1.45it/s]
Training 3/3 epoch (loss 0.1292): 81%|βββββββββ | 1665/2046 [20:08<04:22, 1.45it/s]
Training 3/3 epoch (loss 0.1292): 81%|βββββββββ | 1666/2046 [20:08<04:26, 1.43it/s]
Training 3/3 epoch (loss 0.1191): 81%|βββββββββ | 1666/2046 [20:08<04:26, 1.43it/s]
Training 3/3 epoch (loss 0.1191): 81%|βββββββββ | 1667/2046 [20:08<04:17, 1.47it/s]
Training 3/3 epoch (loss 0.1371): 81%|βββββββββ | 1667/2046 [20:09<04:17, 1.47it/s]
Training 3/3 epoch (loss 0.1371): 82%|βββββββββ | 1668/2046 [20:09<04:12, 1.50it/s]
Training 3/3 epoch (loss 0.0775): 82%|βββββββββ | 1668/2046 [20:10<04:12, 1.50it/s]
Training 3/3 epoch (loss 0.0775): 82%|βββββββββ | 1669/2046 [20:10<04:07, 1.52it/s]
Training 3/3 epoch (loss 0.1500): 82%|βββββββββ | 1669/2046 [20:10<04:07, 1.52it/s]
Training 3/3 epoch (loss 0.1500): 82%|βββββββββ | 1670/2046 [20:10<03:59, 1.57it/s]
Training 3/3 epoch (loss 0.1760): 82%|βββββββββ | 1670/2046 [20:11<03:59, 1.57it/s]
Training 3/3 epoch (loss 0.1760): 82%|βββββββββ | 1671/2046 [20:11<04:03, 1.54it/s]
Training 3/3 epoch (loss 0.1364): 82%|βββββββββ | 1671/2046 [20:12<04:03, 1.54it/s]
Training 3/3 epoch (loss 0.1364): 82%|βββββββββ | 1672/2046 [20:12<04:22, 1.42it/s]
Training 3/3 epoch (loss 0.1068): 82%|βββββββββ | 1672/2046 [20:13<04:22, 1.42it/s]
Training 3/3 epoch (loss 0.1068): 82%|βββββββββ | 1673/2046 [20:13<04:32, 1.37it/s]
Training 3/3 epoch (loss 0.1500): 82%|βββββββββ | 1673/2046 [20:13<04:32, 1.37it/s]
Training 3/3 epoch (loss 0.1500): 82%|βββββββββ | 1674/2046 [20:13<04:40, 1.33it/s]
Training 3/3 epoch (loss 0.1031): 82%|βββββββββ | 1674/2046 [20:14<04:40, 1.33it/s]
Training 3/3 epoch (loss 0.1031): 82%|βββββββββ | 1675/2046 [20:14<04:32, 1.36it/s]
Training 3/3 epoch (loss 0.1178): 82%|βββββββββ | 1675/2046 [20:15<04:32, 1.36it/s]
Training 3/3 epoch (loss 0.1178): 82%|βββββββββ | 1676/2046 [20:15<04:20, 1.42it/s]
Training 3/3 epoch (loss 0.1809): 82%|βββββββββ | 1676/2046 [20:15<04:20, 1.42it/s]
Training 3/3 epoch (loss 0.1809): 82%|βββββββββ | 1677/2046 [20:15<04:20, 1.42it/s]
Training 3/3 epoch (loss 0.1179): 82%|βββββββββ | 1677/2046 [20:16<04:20, 1.42it/s]
Training 3/3 epoch (loss 0.1179): 82%|βββββββββ | 1678/2046 [20:16<04:21, 1.41it/s]
Training 3/3 epoch (loss 0.1849): 82%|βββββββββ | 1678/2046 [20:17<04:21, 1.41it/s]
Training 3/3 epoch (loss 0.1849): 82%|βββββββββ | 1679/2046 [20:17<04:12, 1.45it/s]
Training 3/3 epoch (loss 0.2255): 82%|βββββββββ | 1679/2046 [20:18<04:12, 1.45it/s]
Training 3/3 epoch (loss 0.2255): 82%|βββββββββ | 1680/2046 [20:18<04:34, 1.33it/s]
Training 3/3 epoch (loss 0.1143): 82%|βββββββββ | 1680/2046 [20:18<04:34, 1.33it/s]
Training 3/3 epoch (loss 0.1143): 82%|βββββββββ | 1681/2046 [20:18<04:30, 1.35it/s]
Training 3/3 epoch (loss 0.1113): 82%|βββββββββ | 1681/2046 [20:19<04:30, 1.35it/s]
Training 3/3 epoch (loss 0.1113): 82%|βββββββββ | 1682/2046 [20:19<04:28, 1.36it/s]
Training 3/3 epoch (loss 0.1360): 82%|βββββββββ | 1682/2046 [20:20<04:28, 1.36it/s]
Training 3/3 epoch (loss 0.1360): 82%|βββββββββ | 1683/2046 [20:20<04:50, 1.25it/s]
Training 3/3 epoch (loss 0.1609): 82%|βββββββββ | 1683/2046 [20:21<04:50, 1.25it/s]
Training 3/3 epoch (loss 0.1609): 82%|βββββββββ | 1684/2046 [20:21<04:31, 1.33it/s]
Training 3/3 epoch (loss 0.1430): 82%|βββββββββ | 1684/2046 [20:21<04:31, 1.33it/s]
Training 3/3 epoch (loss 0.1430): 82%|βββββββββ | 1685/2046 [20:21<04:16, 1.41it/s]
Training 3/3 epoch (loss 0.2039): 82%|βββββββββ | 1685/2046 [20:22<04:16, 1.41it/s]
Training 3/3 epoch (loss 0.2039): 82%|βββββββββ | 1686/2046 [20:22<04:16, 1.40it/s]
Training 3/3 epoch (loss 0.1740): 82%|βββββββββ | 1686/2046 [20:23<04:16, 1.40it/s]
Training 3/3 epoch (loss 0.1740): 82%|βββββββββ | 1687/2046 [20:23<04:03, 1.47it/s]
Training 3/3 epoch (loss 0.1415): 82%|βββββββββ | 1687/2046 [20:24<04:03, 1.47it/s]
Training 3/3 epoch (loss 0.1415): 83%|βββββββββ | 1688/2046 [20:24<04:32, 1.31it/s]
Training 3/3 epoch (loss 0.1970): 83%|βββββββββ | 1688/2046 [20:24<04:32, 1.31it/s]
Training 3/3 epoch (loss 0.1970): 83%|βββββββββ | 1689/2046 [20:24<04:16, 1.39it/s]
Training 3/3 epoch (loss 0.0899): 83%|βββββββββ | 1689/2046 [20:25<04:16, 1.39it/s]
Training 3/3 epoch (loss 0.0899): 83%|βββββββββ | 1690/2046 [20:25<04:06, 1.44it/s]
Training 3/3 epoch (loss 0.1320): 83%|βββββββββ | 1690/2046 [20:26<04:06, 1.44it/s]
Training 3/3 epoch (loss 0.1320): 83%|βββββββββ | 1691/2046 [20:26<04:08, 1.43it/s]
Training 3/3 epoch (loss 0.1498): 83%|βββββββββ | 1691/2046 [20:26<04:08, 1.43it/s]
Training 3/3 epoch (loss 0.1498): 83%|βββββββββ | 1692/2046 [20:26<04:01, 1.46it/s]
Training 3/3 epoch (loss 0.1187): 83%|βββββββββ | 1692/2046 [20:27<04:01, 1.46it/s]
Training 3/3 epoch (loss 0.1187): 83%|βββββββββ | 1693/2046 [20:27<03:52, 1.52it/s]
Training 3/3 epoch (loss 0.1492): 83%|βββββββββ | 1693/2046 [20:28<03:52, 1.52it/s]
Training 3/3 epoch (loss 0.1492): 83%|βββββββββ | 1694/2046 [20:28<04:01, 1.46it/s]
Training 3/3 epoch (loss 0.1020): 83%|βββββββββ | 1694/2046 [20:28<04:01, 1.46it/s]
Training 3/3 epoch (loss 0.1020): 83%|βββββββββ | 1695/2046 [20:28<03:55, 1.49it/s]
Training 3/3 epoch (loss 0.1401): 83%|βββββββββ | 1695/2046 [20:29<03:55, 1.49it/s]
Training 3/3 epoch (loss 0.1401): 83%|βββββββββ | 1696/2046 [20:29<04:07, 1.41it/s]
Training 3/3 epoch (loss 0.1202): 83%|βββββββββ | 1696/2046 [20:30<04:07, 1.41it/s]
Training 3/3 epoch (loss 0.1202): 83%|βββββββββ | 1697/2046 [20:30<03:58, 1.46it/s]
Training 3/3 epoch (loss 0.0810): 83%|βββββββββ | 1697/2046 [20:30<03:58, 1.46it/s]
Training 3/3 epoch (loss 0.0810): 83%|βββββββββ | 1698/2046 [20:30<03:55, 1.48it/s]
Training 3/3 epoch (loss 0.1091): 83%|βββββββββ | 1698/2046 [20:31<03:55, 1.48it/s]
Training 3/3 epoch (loss 0.1091): 83%|βββββββββ | 1699/2046 [20:31<04:00, 1.44it/s]
Training 3/3 epoch (loss 0.1316): 83%|βββββββββ | 1699/2046 [20:32<04:00, 1.44it/s]
Training 3/3 epoch (loss 0.1316): 83%|βββββββββ | 1700/2046 [20:32<03:49, 1.50it/s]
Training 3/3 epoch (loss 0.1207): 83%|βββββββββ | 1700/2046 [20:32<03:49, 1.50it/s]
Training 3/3 epoch (loss 0.1207): 83%|βββββββββ | 1701/2046 [20:32<04:03, 1.42it/s]
Training 3/3 epoch (loss 0.1986): 83%|βββββββββ | 1701/2046 [20:33<04:03, 1.42it/s]
Training 3/3 epoch (loss 0.1986): 83%|βββββββββ | 1702/2046 [20:33<03:53, 1.47it/s]
Training 3/3 epoch (loss 0.1240): 83%|βββββββββ | 1702/2046 [20:34<03:53, 1.47it/s]
Training 3/3 epoch (loss 0.1240): 83%|βββββββββ | 1703/2046 [20:34<03:51, 1.48it/s]
Training 3/3 epoch (loss 0.1523): 83%|βββββββββ | 1703/2046 [20:35<03:51, 1.48it/s]
Training 3/3 epoch (loss 0.1523): 83%|βββββββββ | 1704/2046 [20:35<04:22, 1.30it/s]
Training 3/3 epoch (loss 0.1329): 83%|βββββββββ | 1704/2046 [20:36<04:22, 1.30it/s]
Training 3/3 epoch (loss 0.1329): 83%|βββββββββ | 1705/2046 [20:36<04:28, 1.27it/s]
Training 3/3 epoch (loss 0.1403): 83%|βββββββββ | 1705/2046 [20:36<04:28, 1.27it/s]
Training 3/3 epoch (loss 0.1403): 83%|βββββββββ | 1706/2046 [20:36<04:13, 1.34it/s]
Training 3/3 epoch (loss 0.1303): 83%|βββββββββ | 1706/2046 [20:37<04:13, 1.34it/s]
Training 3/3 epoch (loss 0.1303): 83%|βββββββββ | 1707/2046 [20:37<04:10, 1.35it/s]
Training 3/3 epoch (loss 0.1631): 83%|βββββββββ | 1707/2046 [20:38<04:10, 1.35it/s]
Training 3/3 epoch (loss 0.1631): 83%|βββββββββ | 1708/2046 [20:38<03:58, 1.42it/s]
Training 3/3 epoch (loss 0.1685): 83%|βββββββββ | 1708/2046 [20:38<03:58, 1.42it/s]
Training 3/3 epoch (loss 0.1685): 84%|βββββββββ | 1709/2046 [20:38<03:49, 1.47it/s]
Training 3/3 epoch (loss 0.0517): 84%|βββββββββ | 1709/2046 [20:39<03:49, 1.47it/s]
Training 3/3 epoch (loss 0.0517): 84%|βββββββββ | 1710/2046 [20:39<03:43, 1.50it/s]
Training 3/3 epoch (loss 0.0857): 84%|βββββββββ | 1710/2046 [20:39<03:43, 1.50it/s]
Training 3/3 epoch (loss 0.0857): 84%|βββββββββ | 1711/2046 [20:39<03:37, 1.54it/s]
Training 3/3 epoch (loss 0.1559): 84%|βββββββββ | 1711/2046 [20:40<03:37, 1.54it/s]
Training 3/3 epoch (loss 0.1559): 84%|βββββββββ | 1712/2046 [20:40<03:45, 1.48it/s]
Training 3/3 epoch (loss 0.2280): 84%|βββββββββ | 1712/2046 [20:41<03:45, 1.48it/s]
Training 3/3 epoch (loss 0.2280): 84%|βββββββββ | 1713/2046 [20:41<03:41, 1.50it/s]
Training 3/3 epoch (loss 0.1527): 84%|βββββββββ | 1713/2046 [20:41<03:41, 1.50it/s]
Training 3/3 epoch (loss 0.1527): 84%|βββββββββ | 1714/2046 [20:41<03:38, 1.52it/s]
Training 3/3 epoch (loss 0.1070): 84%|βββββββββ | 1714/2046 [20:42<03:38, 1.52it/s]
Training 3/3 epoch (loss 0.1070): 84%|βββββββββ | 1715/2046 [20:42<03:47, 1.45it/s]
Training 3/3 epoch (loss 0.0998): 84%|βββββββββ | 1715/2046 [20:43<03:47, 1.45it/s]
Training 3/3 epoch (loss 0.0998): 84%|βββββββββ | 1716/2046 [20:43<03:42, 1.48it/s]
Training 3/3 epoch (loss 0.0784): 84%|βββββββββ | 1716/2046 [20:44<03:42, 1.48it/s]
Training 3/3 epoch (loss 0.0784): 84%|βββββββββ | 1717/2046 [20:44<03:52, 1.42it/s]
Training 3/3 epoch (loss 0.1171): 84%|βββββββββ | 1717/2046 [20:44<03:52, 1.42it/s]
Training 3/3 epoch (loss 0.1171): 84%|βββββββββ | 1718/2046 [20:44<03:43, 1.47it/s]
Training 3/3 epoch (loss 0.1061): 84%|βββββββββ | 1718/2046 [20:45<03:43, 1.47it/s]
Training 3/3 epoch (loss 0.1061): 84%|βββββββββ | 1719/2046 [20:45<03:46, 1.44it/s]
Training 3/3 epoch (loss 0.1206): 84%|βββββββββ | 1719/2046 [20:46<03:46, 1.44it/s]
Training 3/3 epoch (loss 0.1206): 84%|βββββββββ | 1720/2046 [20:46<03:55, 1.38it/s]
Training 3/3 epoch (loss 0.1393): 84%|βββββββββ | 1720/2046 [20:46<03:55, 1.38it/s]
Training 3/3 epoch (loss 0.1393): 84%|βββββββββ | 1721/2046 [20:46<04:00, 1.35it/s]
Training 3/3 epoch (loss 0.1410): 84%|βββββββββ | 1721/2046 [20:47<04:00, 1.35it/s]
Training 3/3 epoch (loss 0.1410): 84%|βββββββββ | 1722/2046 [20:47<03:54, 1.38it/s]
Training 3/3 epoch (loss 0.1709): 84%|βββββββββ | 1722/2046 [20:48<03:54, 1.38it/s]
Training 3/3 epoch (loss 0.1709): 84%|βββββββββ | 1723/2046 [20:48<03:47, 1.42it/s]
Training 3/3 epoch (loss 0.1031): 84%|βββββββββ | 1723/2046 [20:49<03:47, 1.42it/s]
Training 3/3 epoch (loss 0.1031): 84%|βββββββββ | 1724/2046 [20:49<03:59, 1.34it/s]
Training 3/3 epoch (loss 0.1305): 84%|βββββββββ | 1724/2046 [20:50<03:59, 1.34it/s]
Training 3/3 epoch (loss 0.1305): 84%|βββββββββ | 1725/2046 [20:50<04:06, 1.30it/s]
Training 3/3 epoch (loss 0.1339): 84%|βββββββββ | 1725/2046 [20:50<04:06, 1.30it/s]
Training 3/3 epoch (loss 0.1339): 84%|βββββββββ | 1726/2046 [20:50<03:55, 1.36it/s]
Training 3/3 epoch (loss 0.1540): 84%|βββββββββ | 1726/2046 [20:51<03:55, 1.36it/s]
Training 3/3 epoch (loss 0.1540): 84%|βββββββββ | 1727/2046 [20:51<04:08, 1.28it/s]
Training 3/3 epoch (loss 0.1729): 84%|βββββββββ | 1727/2046 [20:52<04:08, 1.28it/s]
Training 3/3 epoch (loss 0.1729): 84%|βββββββββ | 1728/2046 [20:52<04:13, 1.25it/s]
Training 3/3 epoch (loss 0.0767): 84%|βββββββββ | 1728/2046 [20:53<04:13, 1.25it/s]
Training 3/3 epoch (loss 0.0767): 85%|βββββββββ | 1729/2046 [20:53<03:59, 1.33it/s]
Training 3/3 epoch (loss 0.1239): 85%|βββββββββ | 1729/2046 [20:53<03:59, 1.33it/s]
Training 3/3 epoch (loss 0.1239): 85%|βββββββββ | 1730/2046 [20:53<03:51, 1.36it/s]
Training 3/3 epoch (loss 0.0962): 85%|βββββββββ | 1730/2046 [20:54<03:51, 1.36it/s]
Training 3/3 epoch (loss 0.0962): 85%|βββββββββ | 1731/2046 [20:54<04:05, 1.29it/s]
Training 3/3 epoch (loss 0.1462): 85%|βββββββββ | 1731/2046 [20:55<04:05, 1.29it/s]
Training 3/3 epoch (loss 0.1462): 85%|βββββββββ | 1732/2046 [20:55<03:59, 1.31it/s]
Training 3/3 epoch (loss 0.1097): 85%|βββββββββ | 1732/2046 [20:55<03:59, 1.31it/s]
Training 3/3 epoch (loss 0.1097): 85%|βββββββββ | 1733/2046 [20:55<03:48, 1.37it/s]
Training 3/3 epoch (loss 0.1129): 85%|βββββββββ | 1733/2046 [20:56<03:48, 1.37it/s]
Training 3/3 epoch (loss 0.1129): 85%|βββββββββ | 1734/2046 [20:56<03:42, 1.40it/s]
Training 3/3 epoch (loss 0.1207): 85%|βββββββββ | 1734/2046 [20:57<03:42, 1.40it/s]
Training 3/3 epoch (loss 0.1207): 85%|βββββββββ | 1735/2046 [20:57<03:50, 1.35it/s]
Training 3/3 epoch (loss 0.1949): 85%|βββββββββ | 1735/2046 [20:58<03:50, 1.35it/s]
Training 3/3 epoch (loss 0.1949): 85%|βββββββββ | 1736/2046 [20:58<03:58, 1.30it/s]
Training 3/3 epoch (loss 0.1725): 85%|βββββββββ | 1736/2046 [20:59<03:58, 1.30it/s]
Training 3/3 epoch (loss 0.1725): 85%|βββββββββ | 1737/2046 [20:59<03:51, 1.34it/s]
Training 3/3 epoch (loss 0.1024): 85%|βββββββββ | 1737/2046 [20:59<03:51, 1.34it/s]
Training 3/3 epoch (loss 0.1024): 85%|βββββββββ | 1738/2046 [20:59<04:02, 1.27it/s]
Training 3/3 epoch (loss 0.1703): 85%|βββββββββ | 1738/2046 [21:00<04:02, 1.27it/s]
Training 3/3 epoch (loss 0.1703): 85%|βββββββββ | 1739/2046 [21:00<03:44, 1.37it/s]
Training 3/3 epoch (loss 0.1103): 85%|βββββββββ | 1739/2046 [21:01<03:44, 1.37it/s]
Training 3/3 epoch (loss 0.1103): 85%|βββββββββ | 1740/2046 [21:01<03:33, 1.43it/s]
Training 3/3 epoch (loss 0.0920): 85%|βββββββββ | 1740/2046 [21:02<03:33, 1.43it/s]
Training 3/3 epoch (loss 0.0920): 85%|βββββββββ | 1741/2046 [21:02<04:16, 1.19it/s]
Training 3/3 epoch (loss 0.1255): 85%|βββββββββ | 1741/2046 [21:02<04:16, 1.19it/s]
Training 3/3 epoch (loss 0.1255): 85%|βββββββββ | 1742/2046 [21:02<03:56, 1.28it/s]
Training 3/3 epoch (loss 0.1087): 85%|βββββββββ | 1742/2046 [21:03<03:56, 1.28it/s]
Training 3/3 epoch (loss 0.1087): 85%|βββββββββ | 1743/2046 [21:03<03:42, 1.36it/s]
Training 3/3 epoch (loss 0.1517): 85%|βββββββββ | 1743/2046 [21:04<03:42, 1.36it/s]
Training 3/3 epoch (loss 0.1517): 85%|βββββββββ | 1744/2046 [21:04<04:10, 1.21it/s]
Training 3/3 epoch (loss 0.1168): 85%|βββββββββ | 1744/2046 [21:05<04:10, 1.21it/s]
Training 3/3 epoch (loss 0.1168): 85%|βββββββββ | 1745/2046 [21:05<03:51, 1.30it/s]
Training 3/3 epoch (loss 0.1076): 85%|βββββββββ | 1745/2046 [21:05<03:51, 1.30it/s]
Training 3/3 epoch (loss 0.1076): 85%|βββββββββ | 1746/2046 [21:05<03:37, 1.38it/s]
Training 3/3 epoch (loss 0.2306): 85%|βββββββββ | 1746/2046 [21:06<03:37, 1.38it/s]
Training 3/3 epoch (loss 0.2306): 85%|βββββββββ | 1747/2046 [21:06<03:32, 1.41it/s]
Training 3/3 epoch (loss 0.1342): 85%|βββββββββ | 1747/2046 [21:07<03:32, 1.41it/s]
Training 3/3 epoch (loss 0.1342): 85%|βββββββββ | 1748/2046 [21:07<03:27, 1.43it/s]
Training 3/3 epoch (loss 0.1327): 85%|βββββββββ | 1748/2046 [21:07<03:27, 1.43it/s]
Training 3/3 epoch (loss 0.1327): 85%|βββββββββ | 1749/2046 [21:07<03:21, 1.48it/s]
Training 3/3 epoch (loss 0.1277): 85%|βββββββββ | 1749/2046 [21:08<03:21, 1.48it/s]
Training 3/3 epoch (loss 0.1277): 86%|βββββββββ | 1750/2046 [21:08<03:14, 1.52it/s]
Training 3/3 epoch (loss 0.1491): 86%|βββββββββ | 1750/2046 [21:09<03:14, 1.52it/s]
Training 3/3 epoch (loss 0.1491): 86%|βββββββββ | 1751/2046 [21:09<03:26, 1.43it/s]
Training 3/3 epoch (loss 0.1379): 86%|βββββββββ | 1751/2046 [21:09<03:26, 1.43it/s]
Training 3/3 epoch (loss 0.1379): 86%|βββββββββ | 1752/2046 [21:09<03:30, 1.39it/s]
Training 3/3 epoch (loss 0.1546): 86%|βββββββββ | 1752/2046 [21:10<03:30, 1.39it/s]
Training 3/3 epoch (loss 0.1546): 86%|βββββββββ | 1753/2046 [21:10<03:20, 1.46it/s]
Training 3/3 epoch (loss 0.1405): 86%|βββββββββ | 1753/2046 [21:11<03:20, 1.46it/s]
Training 3/3 epoch (loss 0.1405): 86%|βββββββββ | 1754/2046 [21:11<03:12, 1.52it/s]
Training 3/3 epoch (loss 0.0775): 86%|βββββββββ | 1754/2046 [21:11<03:12, 1.52it/s]
Training 3/3 epoch (loss 0.0775): 86%|βββββββββ | 1755/2046 [21:11<03:08, 1.55it/s]
Training 3/3 epoch (loss 0.0971): 86%|βββββββββ | 1755/2046 [21:12<03:08, 1.55it/s]
Training 3/3 epoch (loss 0.0971): 86%|βββββββββ | 1756/2046 [21:12<03:03, 1.58it/s]
Training 3/3 epoch (loss 0.1580): 86%|βββββββββ | 1756/2046 [21:13<03:03, 1.58it/s]
Training 3/3 epoch (loss 0.1580): 86%|βββββββββ | 1757/2046 [21:13<03:07, 1.54it/s]
Training 3/3 epoch (loss 0.0996): 86%|βββββββββ | 1757/2046 [21:13<03:07, 1.54it/s]
Training 3/3 epoch (loss 0.0996): 86%|βββββββββ | 1758/2046 [21:13<03:03, 1.57it/s]
Training 3/3 epoch (loss 0.0855): 86%|βββββββββ | 1758/2046 [21:14<03:03, 1.57it/s]
Training 3/3 epoch (loss 0.0855): 86%|βββββββββ | 1759/2046 [21:14<03:00, 1.59it/s]
Training 3/3 epoch (loss 0.1197): 86%|βββββββββ | 1759/2046 [21:15<03:00, 1.59it/s]
Training 3/3 epoch (loss 0.1197): 86%|βββββββββ | 1760/2046 [21:15<03:21, 1.42it/s]
Training 3/3 epoch (loss 0.1339): 86%|βββββββββ | 1760/2046 [21:16<03:21, 1.42it/s]
Training 3/3 epoch (loss 0.1339): 86%|βββββββββ | 1761/2046 [21:16<03:29, 1.36it/s]
Training 3/3 epoch (loss 0.1602): 86%|βββββββββ | 1761/2046 [21:16<03:29, 1.36it/s]
Training 3/3 epoch (loss 0.1602): 86%|βββββββββ | 1762/2046 [21:16<03:25, 1.38it/s]
Training 3/3 epoch (loss 0.1974): 86%|βββββββββ | 1762/2046 [21:17<03:25, 1.38it/s]
Training 3/3 epoch (loss 0.1974): 86%|βββββββββ | 1763/2046 [21:17<03:23, 1.39it/s]
Training 3/3 epoch (loss 0.1689): 86%|βββββββββ | 1763/2046 [21:18<03:23, 1.39it/s]
Training 3/3 epoch (loss 0.1689): 86%|βββββββββ | 1764/2046 [21:18<03:20, 1.41it/s]
Training 3/3 epoch (loss 0.1353): 86%|βββββββββ | 1764/2046 [21:18<03:20, 1.41it/s]
Training 3/3 epoch (loss 0.1353): 86%|βββββββββ | 1765/2046 [21:18<03:18, 1.41it/s]
Training 3/3 epoch (loss 0.0986): 86%|βββββββββ | 1765/2046 [21:19<03:18, 1.41it/s]
Training 3/3 epoch (loss 0.0986): 86%|βββββββββ | 1766/2046 [21:19<03:24, 1.37it/s]
Training 3/3 epoch (loss 0.0743): 86%|βββββββββ | 1766/2046 [21:20<03:24, 1.37it/s]
Training 3/3 epoch (loss 0.0743): 86%|βββββββββ | 1767/2046 [21:20<03:15, 1.43it/s]
Training 3/3 epoch (loss 0.1362): 86%|βββββββββ | 1767/2046 [21:21<03:15, 1.43it/s]
Training 3/3 epoch (loss 0.1362): 86%|βββββββββ | 1768/2046 [21:21<03:23, 1.37it/s]
Training 3/3 epoch (loss 0.1373): 86%|βββββββββ | 1768/2046 [21:21<03:23, 1.37it/s]
Training 3/3 epoch (loss 0.1373): 86%|βββββββββ | 1769/2046 [21:21<03:26, 1.34it/s]
Training 3/3 epoch (loss 0.1049): 86%|βββββββββ | 1769/2046 [21:22<03:26, 1.34it/s]
Training 3/3 epoch (loss 0.1049): 87%|βββββββββ | 1770/2046 [21:22<03:17, 1.40it/s]
Training 3/3 epoch (loss 0.1531): 87%|βββββββββ | 1770/2046 [21:23<03:17, 1.40it/s]
Training 3/3 epoch (loss 0.1531): 87%|βββββββββ | 1771/2046 [21:23<03:07, 1.47it/s]
Training 3/3 epoch (loss 0.1283): 87%|βββββββββ | 1771/2046 [21:23<03:07, 1.47it/s]
Training 3/3 epoch (loss 0.1283): 87%|βββββββββ | 1772/2046 [21:23<03:12, 1.42it/s]
Training 3/3 epoch (loss 0.0622): 87%|βββββββββ | 1772/2046 [21:24<03:12, 1.42it/s]
Training 3/3 epoch (loss 0.0622): 87%|βββββββββ | 1773/2046 [21:24<03:12, 1.42it/s]
Training 3/3 epoch (loss 0.0735): 87%|βββββββββ | 1773/2046 [21:25<03:12, 1.42it/s]
Training 3/3 epoch (loss 0.0735): 87%|βββββββββ | 1774/2046 [21:25<03:44, 1.21it/s]
Training 3/3 epoch (loss 0.0925): 87%|βββββββββ | 1774/2046 [21:26<03:44, 1.21it/s]
Training 3/3 epoch (loss 0.0925): 87%|βββββββββ | 1775/2046 [21:26<03:27, 1.31it/s]
Training 3/3 epoch (loss 0.1417): 87%|βββββββββ | 1775/2046 [21:27<03:27, 1.31it/s]
Training 3/3 epoch (loss 0.1417): 87%|βββββββββ | 1776/2046 [21:27<03:26, 1.31it/s]
Training 3/3 epoch (loss 0.0922): 87%|βββββββββ | 1776/2046 [21:27<03:26, 1.31it/s]
Training 3/3 epoch (loss 0.0922): 87%|βββββββββ | 1777/2046 [21:27<03:18, 1.35it/s]
Training 3/3 epoch (loss 0.1769): 87%|βββββββββ | 1777/2046 [21:28<03:18, 1.35it/s]
Training 3/3 epoch (loss 0.1769): 87%|βββββββββ | 1778/2046 [21:28<03:24, 1.31it/s]
Training 3/3 epoch (loss 0.1256): 87%|βββββββββ | 1778/2046 [21:29<03:24, 1.31it/s]
Training 3/3 epoch (loss 0.1256): 87%|βββββββββ | 1779/2046 [21:29<03:12, 1.39it/s]
Training 3/3 epoch (loss 0.1120): 87%|βββββββββ | 1779/2046 [21:29<03:12, 1.39it/s]
Training 3/3 epoch (loss 0.1120): 87%|βββββββββ | 1780/2046 [21:29<03:08, 1.41it/s]
Training 3/3 epoch (loss 0.0899): 87%|βββββββββ | 1780/2046 [21:30<03:08, 1.41it/s]
Training 3/3 epoch (loss 0.0899): 87%|βββββββββ | 1781/2046 [21:30<03:05, 1.43it/s]
Training 3/3 epoch (loss 0.1032): 87%|βββββββββ | 1781/2046 [21:31<03:05, 1.43it/s]
Training 3/3 epoch (loss 0.1032): 87%|βββββββββ | 1782/2046 [21:31<03:00, 1.46it/s]
Training 3/3 epoch (loss 0.0735): 87%|βββββββββ | 1782/2046 [21:31<03:00, 1.46it/s]
Training 3/3 epoch (loss 0.0735): 87%|βββββββββ | 1783/2046 [21:31<02:59, 1.47it/s]
Training 3/3 epoch (loss 0.0732): 87%|βββββββββ | 1783/2046 [21:32<02:59, 1.47it/s]
Training 3/3 epoch (loss 0.0732): 87%|βββββββββ | 1784/2046 [21:32<03:23, 1.29it/s]
Training 3/3 epoch (loss 0.1530): 87%|βββββββββ | 1784/2046 [21:33<03:23, 1.29it/s]
Training 3/3 epoch (loss 0.1530): 87%|βββββββββ | 1785/2046 [21:33<03:20, 1.30it/s]
Training 3/3 epoch (loss 0.1216): 87%|βββββββββ | 1785/2046 [21:34<03:20, 1.30it/s]
Training 3/3 epoch (loss 0.1216): 87%|βββββββββ | 1786/2046 [21:34<03:27, 1.25it/s]
Training 3/3 epoch (loss 0.1037): 87%|βββββββββ | 1786/2046 [21:35<03:27, 1.25it/s]
Training 3/3 epoch (loss 0.1037): 87%|βββββββββ | 1787/2046 [21:35<03:15, 1.33it/s]
Training 3/3 epoch (loss 0.2362): 87%|βββββββββ | 1787/2046 [21:35<03:15, 1.33it/s]
Training 3/3 epoch (loss 0.2362): 87%|βββββββββ | 1788/2046 [21:35<03:11, 1.34it/s]
Training 3/3 epoch (loss 0.1490): 87%|βββββββββ | 1788/2046 [21:36<03:11, 1.34it/s]
Training 3/3 epoch (loss 0.1490): 87%|βββββββββ | 1789/2046 [21:36<03:23, 1.26it/s]
Training 3/3 epoch (loss 0.1078): 87%|βββββββββ | 1789/2046 [21:37<03:23, 1.26it/s]
Training 3/3 epoch (loss 0.1078): 87%|βββββββββ | 1790/2046 [21:37<03:10, 1.35it/s]
Training 3/3 epoch (loss 0.1444): 87%|βββββββββ | 1790/2046 [21:37<03:10, 1.35it/s]
Training 3/3 epoch (loss 0.1444): 88%|βββββββββ | 1791/2046 [21:37<03:01, 1.40it/s]
Training 3/3 epoch (loss 0.1651): 88%|βββββββββ | 1791/2046 [21:38<03:01, 1.40it/s]
Training 3/3 epoch (loss 0.1651): 88%|βββββββββ | 1792/2046 [21:38<03:17, 1.28it/s]
Training 3/3 epoch (loss 0.1568): 88%|βββββββββ | 1792/2046 [21:39<03:17, 1.28it/s]
Training 3/3 epoch (loss 0.1568): 88%|βββββββββ | 1793/2046 [21:39<03:20, 1.26it/s]
Training 3/3 epoch (loss 0.1422): 88%|βββββββββ | 1793/2046 [21:40<03:20, 1.26it/s]
Training 3/3 epoch (loss 0.1422): 88%|βββββββββ | 1794/2046 [21:40<03:19, 1.26it/s]
Training 3/3 epoch (loss 0.1376): 88%|βββββββββ | 1794/2046 [21:41<03:19, 1.26it/s]
Training 3/3 epoch (loss 0.1376): 88%|βββββββββ | 1795/2046 [21:41<03:14, 1.29it/s]
Training 3/3 epoch (loss 0.1330): 88%|βββββββββ | 1795/2046 [21:42<03:14, 1.29it/s]
Training 3/3 epoch (loss 0.1330): 88%|βββββββββ | 1796/2046 [21:42<03:12, 1.30it/s]
Training 3/3 epoch (loss 0.1632): 88%|βββββββββ | 1796/2046 [21:42<03:12, 1.30it/s]
Training 3/3 epoch (loss 0.1632): 88%|βββββββββ | 1797/2046 [21:42<03:11, 1.30it/s]
Training 3/3 epoch (loss 0.1718): 88%|βββββββββ | 1797/2046 [21:43<03:11, 1.30it/s]
Training 3/3 epoch (loss 0.1718): 88%|βββββββββ | 1798/2046 [21:43<02:58, 1.39it/s]
Training 3/3 epoch (loss 0.1183): 88%|βββββββββ | 1798/2046 [21:43<02:58, 1.39it/s]
Training 3/3 epoch (loss 0.1183): 88%|βββββββββ | 1799/2046 [21:43<02:49, 1.45it/s]
Training 3/3 epoch (loss 0.0887): 88%|βββββββββ | 1799/2046 [21:44<02:49, 1.45it/s]
Training 3/3 epoch (loss 0.0887): 88%|βββββββββ | 1800/2046 [21:44<03:05, 1.32it/s]
Training 3/3 epoch (loss 0.1134): 88%|βββββββββ | 1800/2046 [21:45<03:05, 1.32it/s]
Training 3/3 epoch (loss 0.1134): 88%|βββββββββ | 1801/2046 [21:45<03:06, 1.31it/s]
Training 3/3 epoch (loss 0.0641): 88%|βββββββββ | 1801/2046 [21:46<03:06, 1.31it/s]
Training 3/3 epoch (loss 0.0641): 88%|βββββββββ | 1802/2046 [21:46<03:02, 1.34it/s]
Training 3/3 epoch (loss 0.1664): 88%|βββββββββ | 1802/2046 [21:47<03:02, 1.34it/s]
Training 3/3 epoch (loss 0.1664): 88%|βββββββββ | 1803/2046 [21:47<02:54, 1.39it/s]
Training 3/3 epoch (loss 0.0645): 88%|βββββββββ | 1803/2046 [21:47<02:54, 1.39it/s]
Training 3/3 epoch (loss 0.0645): 88%|βββββββββ | 1804/2046 [21:47<02:49, 1.43it/s]
Training 3/3 epoch (loss 0.1669): 88%|βββββββββ | 1804/2046 [21:48<02:49, 1.43it/s]
Training 3/3 epoch (loss 0.1669): 88%|βββββββββ | 1805/2046 [21:48<02:43, 1.47it/s]
Training 3/3 epoch (loss 0.1961): 88%|βββββββββ | 1805/2046 [21:48<02:43, 1.47it/s]
Training 3/3 epoch (loss 0.1961): 88%|βββββββββ | 1806/2046 [21:48<02:38, 1.51it/s]
Training 3/3 epoch (loss 0.1205): 88%|βββββββββ | 1806/2046 [21:49<02:38, 1.51it/s]
Training 3/3 epoch (loss 0.1205): 88%|βββββββββ | 1807/2046 [21:49<02:36, 1.53it/s]
Training 3/3 epoch (loss 0.1012): 88%|βββββββββ | 1807/2046 [21:50<02:36, 1.53it/s]
Training 3/3 epoch (loss 0.1012): 88%|βββββββββ | 1808/2046 [21:50<02:41, 1.47it/s]
Training 3/3 epoch (loss 0.1376): 88%|βββββββββ | 1808/2046 [21:51<02:41, 1.47it/s]
Training 3/3 epoch (loss 0.1376): 88%|βββββββββ | 1809/2046 [21:51<02:39, 1.48it/s]
Training 3/3 epoch (loss 0.1180): 88%|βββββββββ | 1809/2046 [21:51<02:39, 1.48it/s]
Training 3/3 epoch (loss 0.1180): 88%|βββββββββ | 1810/2046 [21:51<02:36, 1.51it/s]
Training 3/3 epoch (loss 0.1400): 88%|βββββββββ | 1810/2046 [21:52<02:36, 1.51it/s]
Training 3/3 epoch (loss 0.1400): 89%|βββββββββ | 1811/2046 [21:52<02:37, 1.49it/s]
Training 3/3 epoch (loss 0.0751): 89%|βββββββββ | 1811/2046 [21:53<02:37, 1.49it/s]
Training 3/3 epoch (loss 0.0751): 89%|βββββββββ | 1812/2046 [21:53<02:38, 1.47it/s]
Training 3/3 epoch (loss 0.1078): 89%|βββββββββ | 1812/2046 [21:53<02:38, 1.47it/s]
Training 3/3 epoch (loss 0.1078): 89%|βββββββββ | 1813/2046 [21:53<02:48, 1.38it/s]
Training 3/3 epoch (loss 0.1187): 89%|βββββββββ | 1813/2046 [21:54<02:48, 1.38it/s]
Training 3/3 epoch (loss 0.1187): 89%|βββββββββ | 1814/2046 [21:54<02:50, 1.36it/s]
Training 3/3 epoch (loss 0.0990): 89%|βββββββββ | 1814/2046 [21:55<02:50, 1.36it/s]
Training 3/3 epoch (loss 0.0990): 89%|βββββββββ | 1815/2046 [21:55<02:47, 1.38it/s]
Training 3/3 epoch (loss 0.1141): 89%|βββββββββ | 1815/2046 [21:56<02:47, 1.38it/s]
Training 3/3 epoch (loss 0.1141): 89%|βββββββββ | 1816/2046 [21:56<02:48, 1.36it/s]
Training 3/3 epoch (loss 0.1613): 89%|βββββββββ | 1816/2046 [21:56<02:48, 1.36it/s]
Training 3/3 epoch (loss 0.1613): 89%|βββββββββ | 1817/2046 [21:56<02:41, 1.42it/s]
Training 3/3 epoch (loss 0.1420): 89%|βββββββββ | 1817/2046 [21:57<02:41, 1.42it/s]
Training 3/3 epoch (loss 0.1420): 89%|βββββββββ | 1818/2046 [21:57<02:48, 1.36it/s]
Training 3/3 epoch (loss 0.0966): 89%|βββββββββ | 1818/2046 [21:58<02:48, 1.36it/s]
Training 3/3 epoch (loss 0.0966): 89%|βββββββββ | 1819/2046 [21:58<02:40, 1.42it/s]
Training 3/3 epoch (loss 0.0996): 89%|βββββββββ | 1819/2046 [21:58<02:40, 1.42it/s]
Training 3/3 epoch (loss 0.0996): 89%|βββββββββ | 1820/2046 [21:58<02:35, 1.45it/s]
Training 3/3 epoch (loss 0.1758): 89%|βββββββββ | 1820/2046 [21:59<02:35, 1.45it/s]
Training 3/3 epoch (loss 0.1758): 89%|βββββββββ | 1821/2046 [21:59<02:42, 1.38it/s]
Training 3/3 epoch (loss 0.1103): 89%|βββββββββ | 1821/2046 [22:00<02:42, 1.38it/s]
Training 3/3 epoch (loss 0.1103): 89%|βββββββββ | 1822/2046 [22:00<02:37, 1.42it/s]
Training 3/3 epoch (loss 0.1374): 89%|βββββββββ | 1822/2046 [22:00<02:37, 1.42it/s]
Training 3/3 epoch (loss 0.1374): 89%|βββββββββ | 1823/2046 [22:00<02:29, 1.49it/s]
Training 3/3 epoch (loss 0.0831): 89%|βββββββββ | 1823/2046 [22:01<02:29, 1.49it/s]
Training 3/3 epoch (loss 0.0831): 89%|βββββββββ | 1824/2046 [22:01<02:43, 1.36it/s]
Training 3/3 epoch (loss 0.0685): 89%|βββββββββ | 1824/2046 [22:02<02:43, 1.36it/s]
Training 3/3 epoch (loss 0.0685): 89%|βββββββββ | 1825/2046 [22:02<02:40, 1.38it/s]
Training 3/3 epoch (loss 0.1151): 89%|βββββββββ | 1825/2046 [22:03<02:40, 1.38it/s]
Training 3/3 epoch (loss 0.1151): 89%|βββββββββ | 1826/2046 [22:03<02:34, 1.43it/s]
Training 3/3 epoch (loss 0.1012): 89%|βββββββββ | 1826/2046 [22:03<02:34, 1.43it/s]
Training 3/3 epoch (loss 0.1012): 89%|βββββββββ | 1827/2046 [22:03<02:30, 1.46it/s]
Training 3/3 epoch (loss 0.1426): 89%|βββββββββ | 1827/2046 [22:04<02:30, 1.46it/s]
Training 3/3 epoch (loss 0.1426): 89%|βββββββββ | 1828/2046 [22:04<02:30, 1.45it/s]
Training 3/3 epoch (loss 0.0780): 89%|βββββββββ | 1828/2046 [22:05<02:30, 1.45it/s]
Training 3/3 epoch (loss 0.0780): 89%|βββββββββ | 1829/2046 [22:05<02:26, 1.48it/s]
Training 3/3 epoch (loss 0.1489): 89%|βββββββββ | 1829/2046 [22:05<02:26, 1.48it/s]
Training 3/3 epoch (loss 0.1489): 89%|βββββββββ | 1830/2046 [22:05<02:22, 1.51it/s]
Training 3/3 epoch (loss 0.1270): 89%|βββββββββ | 1830/2046 [22:06<02:22, 1.51it/s]
Training 3/3 epoch (loss 0.1270): 89%|βββββββββ | 1831/2046 [22:06<02:21, 1.51it/s]
Training 3/3 epoch (loss 0.1029): 89%|βββββββββ | 1831/2046 [22:07<02:21, 1.51it/s]
Training 3/3 epoch (loss 0.1029): 90%|βββββββββ | 1832/2046 [22:07<02:28, 1.44it/s]
Training 3/3 epoch (loss 0.0959): 90%|βββββββββ | 1832/2046 [22:07<02:28, 1.44it/s]
Training 3/3 epoch (loss 0.0959): 90%|βββββββββ | 1833/2046 [22:07<02:38, 1.34it/s]
Training 3/3 epoch (loss 0.1358): 90%|βββββββββ | 1833/2046 [22:08<02:38, 1.34it/s]
Training 3/3 epoch (loss 0.1358): 90%|βββββββββ | 1834/2046 [22:08<02:33, 1.38it/s]
Training 3/3 epoch (loss 0.0613): 90%|βββββββββ | 1834/2046 [22:09<02:33, 1.38it/s]
Training 3/3 epoch (loss 0.0613): 90%|βββββββββ | 1835/2046 [22:09<02:28, 1.42it/s]
Training 3/3 epoch (loss 0.1112): 90%|βββββββββ | 1835/2046 [22:09<02:28, 1.42it/s]
Training 3/3 epoch (loss 0.1112): 90%|βββββββββ | 1836/2046 [22:09<02:22, 1.48it/s]
Training 3/3 epoch (loss 0.0948): 90%|βββββββββ | 1836/2046 [22:10<02:22, 1.48it/s]
Training 3/3 epoch (loss 0.0948): 90%|βββββββββ | 1837/2046 [22:10<02:20, 1.49it/s]
Training 3/3 epoch (loss 0.0739): 90%|βββββββββ | 1837/2046 [22:11<02:20, 1.49it/s]
Training 3/3 epoch (loss 0.0739): 90%|βββββββββ | 1838/2046 [22:11<02:34, 1.35it/s]
Training 3/3 epoch (loss 0.0802): 90%|βββββββββ | 1838/2046 [22:12<02:34, 1.35it/s]
Training 3/3 epoch (loss 0.0802): 90%|βββββββββ | 1839/2046 [22:12<02:33, 1.35it/s]
Training 3/3 epoch (loss 0.1088): 90%|βββββββββ | 1839/2046 [22:13<02:33, 1.35it/s]
Training 3/3 epoch (loss 0.1088): 90%|βββββββββ | 1840/2046 [22:13<02:36, 1.31it/s]
Training 3/3 epoch (loss 0.1669): 90%|βββββββββ | 1840/2046 [22:13<02:36, 1.31it/s]
Training 3/3 epoch (loss 0.1669): 90%|βββββββββ | 1841/2046 [22:13<02:26, 1.40it/s]
Training 3/3 epoch (loss 0.0550): 90%|βββββββββ | 1841/2046 [22:14<02:26, 1.40it/s]
Training 3/3 epoch (loss 0.0550): 90%|βββββββββ | 1842/2046 [22:14<02:20, 1.45it/s]
Training 3/3 epoch (loss 0.0963): 90%|βββββββββ | 1842/2046 [22:14<02:20, 1.45it/s]
Training 3/3 epoch (loss 0.0963): 90%|βββββββββ | 1843/2046 [22:14<02:15, 1.50it/s]
Training 3/3 epoch (loss 0.0795): 90%|βββββββββ | 1843/2046 [22:15<02:15, 1.50it/s]
Training 3/3 epoch (loss 0.0795): 90%|βββββββββ | 1844/2046 [22:15<02:12, 1.53it/s]
Training 3/3 epoch (loss 0.1452): 90%|βββββββββ | 1844/2046 [22:16<02:12, 1.53it/s]
Training 3/3 epoch (loss 0.1452): 90%|βββββββββ | 1845/2046 [22:16<02:09, 1.55it/s]
Training 3/3 epoch (loss 0.1653): 90%|βββββββββ | 1845/2046 [22:16<02:09, 1.55it/s]
Training 3/3 epoch (loss 0.1653): 90%|βββββββββ | 1846/2046 [22:16<02:08, 1.56it/s]
Training 3/3 epoch (loss 0.1150): 90%|βββββββββ | 1846/2046 [22:17<02:08, 1.56it/s]
Training 3/3 epoch (loss 0.1150): 90%|βββββββββ | 1847/2046 [22:17<02:06, 1.58it/s]
Training 3/3 epoch (loss 0.1140): 90%|βββββββββ | 1847/2046 [22:18<02:06, 1.58it/s]
Training 3/3 epoch (loss 0.1140): 90%|βββββββββ | 1848/2046 [22:18<02:14, 1.48it/s]
Training 3/3 epoch (loss 0.1918): 90%|βββββββββ | 1848/2046 [22:18<02:14, 1.48it/s]
Training 3/3 epoch (loss 0.1918): 90%|βββββββββ | 1849/2046 [22:18<02:14, 1.47it/s]
Training 3/3 epoch (loss 0.1190): 90%|βββββββββ | 1849/2046 [22:19<02:14, 1.47it/s]
Training 3/3 epoch (loss 0.1190): 90%|βββββββββ | 1850/2046 [22:19<02:11, 1.49it/s]
Training 3/3 epoch (loss 0.0927): 90%|βββββββββ | 1850/2046 [22:20<02:11, 1.49it/s]
Training 3/3 epoch (loss 0.0927): 90%|βββββββββ | 1851/2046 [22:20<02:11, 1.48it/s]
Training 3/3 epoch (loss 0.1167): 90%|βββββββββ | 1851/2046 [22:20<02:11, 1.48it/s]
Training 3/3 epoch (loss 0.1167): 91%|βββββββββ | 1852/2046 [22:20<02:07, 1.52it/s]
Training 3/3 epoch (loss 0.1501): 91%|βββββββββ | 1852/2046 [22:21<02:07, 1.52it/s]
Training 3/3 epoch (loss 0.1501): 91%|βββββββββ | 1853/2046 [22:21<02:18, 1.40it/s]
Training 3/3 epoch (loss 0.0609): 91%|βββββββββ | 1853/2046 [22:22<02:18, 1.40it/s]
Training 3/3 epoch (loss 0.0609): 91%|βββββββββ | 1854/2046 [22:22<02:35, 1.23it/s]
Training 3/3 epoch (loss 0.0945): 91%|βββββββββ | 1854/2046 [22:23<02:35, 1.23it/s]
Training 3/3 epoch (loss 0.0945): 91%|βββββββββ | 1855/2046 [22:23<02:23, 1.33it/s]
Training 3/3 epoch (loss 0.1108): 91%|βββββββββ | 1855/2046 [22:24<02:23, 1.33it/s]
Training 3/3 epoch (loss 0.1108): 91%|βββββββββ | 1856/2046 [22:24<02:27, 1.29it/s]
Training 3/3 epoch (loss 0.1819): 91%|βββββββββ | 1856/2046 [22:24<02:27, 1.29it/s]
Training 3/3 epoch (loss 0.1819): 91%|βββββββββ | 1857/2046 [22:24<02:24, 1.31it/s]
Training 3/3 epoch (loss 0.1644): 91%|βββββββββ | 1857/2046 [22:25<02:24, 1.31it/s]
Training 3/3 epoch (loss 0.1644): 91%|βββββββββ | 1858/2046 [22:25<02:20, 1.33it/s]
Training 3/3 epoch (loss 0.0950): 91%|βββββββββ | 1858/2046 [22:26<02:20, 1.33it/s]
Training 3/3 epoch (loss 0.0950): 91%|βββββββββ | 1859/2046 [22:26<02:14, 1.39it/s]
Training 3/3 epoch (loss 0.1509): 91%|βββββββββ | 1859/2046 [22:26<02:14, 1.39it/s]
Training 3/3 epoch (loss 0.1509): 91%|βββββββββ | 1860/2046 [22:26<02:13, 1.40it/s]
Training 3/3 epoch (loss 0.1196): 91%|βββββββββ | 1860/2046 [22:27<02:13, 1.40it/s]
Training 3/3 epoch (loss 0.1196): 91%|βββββββββ | 1861/2046 [22:27<02:07, 1.45it/s]
Training 3/3 epoch (loss 0.1891): 91%|βββββββββ | 1861/2046 [22:28<02:07, 1.45it/s]
Training 3/3 epoch (loss 0.1891): 91%|βββββββββ | 1862/2046 [22:28<02:12, 1.39it/s]
Training 3/3 epoch (loss 0.1553): 91%|βββββββββ | 1862/2046 [22:29<02:12, 1.39it/s]
Training 3/3 epoch (loss 0.1553): 91%|βββββββββ | 1863/2046 [22:29<02:08, 1.42it/s]
Training 3/3 epoch (loss 0.1509): 91%|βββββββββ | 1863/2046 [22:29<02:08, 1.42it/s]
Training 3/3 epoch (loss 0.1509): 91%|βββββββββ | 1864/2046 [22:29<02:09, 1.41it/s]
Training 3/3 epoch (loss 0.1040): 91%|βββββββββ | 1864/2046 [22:30<02:09, 1.41it/s]
Training 3/3 epoch (loss 0.1040): 91%|βββββββββ | 1865/2046 [22:30<02:06, 1.43it/s]
Training 3/3 epoch (loss 0.1525): 91%|βββββββββ | 1865/2046 [22:31<02:06, 1.43it/s]
Training 3/3 epoch (loss 0.1525): 91%|βββββββββ | 1866/2046 [22:31<02:09, 1.39it/s]
Training 3/3 epoch (loss 0.1014): 91%|βββββββββ | 1866/2046 [22:31<02:09, 1.39it/s]
Training 3/3 epoch (loss 0.1014): 91%|ββββββββββ| 1867/2046 [22:31<02:04, 1.44it/s]
Training 3/3 epoch (loss 0.1784): 91%|ββββββββββ| 1867/2046 [22:32<02:04, 1.44it/s]
Training 3/3 epoch (loss 0.1784): 91%|ββββββββββ| 1868/2046 [22:32<01:59, 1.49it/s]
Training 3/3 epoch (loss 0.1207): 91%|ββββββββββ| 1868/2046 [22:33<01:59, 1.49it/s]
Training 3/3 epoch (loss 0.1207): 91%|ββββββββββ| 1869/2046 [22:33<01:56, 1.52it/s]
Training 3/3 epoch (loss 0.1286): 91%|ββββββββββ| 1869/2046 [22:33<01:56, 1.52it/s]
Training 3/3 epoch (loss 0.1286): 91%|ββββββββββ| 1870/2046 [22:33<01:53, 1.54it/s]
Training 3/3 epoch (loss 0.1161): 91%|ββββββββββ| 1870/2046 [22:34<01:53, 1.54it/s]
Training 3/3 epoch (loss 0.1161): 91%|ββββββββββ| 1871/2046 [22:34<01:51, 1.57it/s]
Training 3/3 epoch (loss 0.1328): 91%|ββββββββββ| 1871/2046 [22:35<01:51, 1.57it/s]
Training 3/3 epoch (loss 0.1328): 91%|ββββββββββ| 1872/2046 [22:35<01:57, 1.48it/s]
Training 3/3 epoch (loss 0.1108): 91%|ββββββββββ| 1872/2046 [22:36<01:57, 1.48it/s]
Training 3/3 epoch (loss 0.1108): 92%|ββββββββββ| 1873/2046 [22:36<02:13, 1.29it/s]
Training 3/3 epoch (loss 0.1291): 92%|ββββββββββ| 1873/2046 [22:36<02:13, 1.29it/s]
Training 3/3 epoch (loss 0.1291): 92%|ββββββββββ| 1874/2046 [22:36<02:05, 1.38it/s]
Training 3/3 epoch (loss 0.0928): 92%|ββββββββββ| 1874/2046 [22:37<02:05, 1.38it/s]
Training 3/3 epoch (loss 0.0928): 92%|ββββββββββ| 1875/2046 [22:37<02:01, 1.41it/s]
Training 3/3 epoch (loss 0.1260): 92%|ββββββββββ| 1875/2046 [22:38<02:01, 1.41it/s]
Training 3/3 epoch (loss 0.1260): 92%|ββββββββββ| 1876/2046 [22:38<01:56, 1.46it/s]
Training 3/3 epoch (loss 0.1246): 92%|ββββββββββ| 1876/2046 [22:38<01:56, 1.46it/s]
Training 3/3 epoch (loss 0.1246): 92%|ββββββββββ| 1877/2046 [22:38<01:52, 1.50it/s]
Training 3/3 epoch (loss 0.1856): 92%|ββββββββββ| 1877/2046 [22:39<01:52, 1.50it/s]
Training 3/3 epoch (loss 0.1856): 92%|ββββββββββ| 1878/2046 [22:39<01:49, 1.53it/s]
Training 3/3 epoch (loss 0.1429): 92%|ββββββββββ| 1878/2046 [22:39<01:49, 1.53it/s]
Training 3/3 epoch (loss 0.1429): 92%|ββββββββββ| 1879/2046 [22:39<01:46, 1.56it/s]
Training 3/3 epoch (loss 0.1489): 92%|ββββββββββ| 1879/2046 [22:40<01:46, 1.56it/s]
Training 3/3 epoch (loss 0.1489): 92%|ββββββββββ| 1880/2046 [22:40<02:00, 1.38it/s]
Training 3/3 epoch (loss 0.1100): 92%|ββββββββββ| 1880/2046 [22:41<02:00, 1.38it/s]
Training 3/3 epoch (loss 0.1100): 92%|ββββββββββ| 1881/2046 [22:41<02:09, 1.28it/s]
Training 3/3 epoch (loss 0.1572): 92%|ββββββββββ| 1881/2046 [22:42<02:09, 1.28it/s]
Training 3/3 epoch (loss 0.1572): 92%|ββββββββββ| 1882/2046 [22:42<02:04, 1.32it/s]
Training 3/3 epoch (loss 0.0941): 92%|ββββββββββ| 1882/2046 [22:43<02:04, 1.32it/s]
Training 3/3 epoch (loss 0.0941): 92%|ββββββββββ| 1883/2046 [22:43<01:56, 1.40it/s]
Training 3/3 epoch (loss 0.1855): 92%|ββββββββββ| 1883/2046 [22:43<01:56, 1.40it/s]
Training 3/3 epoch (loss 0.1855): 92%|ββββββββββ| 1884/2046 [22:43<01:50, 1.47it/s]
Training 3/3 epoch (loss 0.1244): 92%|ββββββββββ| 1884/2046 [22:44<01:50, 1.47it/s]
Training 3/3 epoch (loss 0.1244): 92%|ββββββββββ| 1885/2046 [22:44<01:45, 1.52it/s]
Training 3/3 epoch (loss 0.1904): 92%|ββββββββββ| 1885/2046 [22:44<01:45, 1.52it/s]
Training 3/3 epoch (loss 0.1904): 92%|ββββββββββ| 1886/2046 [22:44<01:42, 1.56it/s]
Training 3/3 epoch (loss 0.1279): 92%|ββββββββββ| 1886/2046 [22:45<01:42, 1.56it/s]
Training 3/3 epoch (loss 0.1279): 92%|ββββββββββ| 1887/2046 [22:45<01:42, 1.56it/s]
Training 3/3 epoch (loss 0.1954): 92%|ββββββββββ| 1887/2046 [22:46<01:42, 1.56it/s]
Training 3/3 epoch (loss 0.1954): 92%|ββββββββββ| 1888/2046 [22:46<01:50, 1.43it/s]
Training 3/3 epoch (loss 0.1163): 92%|ββββββββββ| 1888/2046 [22:46<01:50, 1.43it/s]
Training 3/3 epoch (loss 0.1163): 92%|ββββββββββ| 1889/2046 [22:46<01:46, 1.47it/s]
Training 3/3 epoch (loss 0.1127): 92%|ββββββββββ| 1889/2046 [22:47<01:46, 1.47it/s]
Training 3/3 epoch (loss 0.1127): 92%|ββββββββββ| 1890/2046 [22:47<01:43, 1.51it/s]
Training 3/3 epoch (loss 0.1194): 92%|ββββββββββ| 1890/2046 [22:48<01:43, 1.51it/s]
Training 3/3 epoch (loss 0.1194): 92%|ββββββββββ| 1891/2046 [22:48<01:41, 1.53it/s]
Training 3/3 epoch (loss 0.1223): 92%|ββββββββββ| 1891/2046 [22:48<01:41, 1.53it/s]
Training 3/3 epoch (loss 0.1223): 92%|ββββββββββ| 1892/2046 [22:48<01:38, 1.57it/s]
Training 3/3 epoch (loss 0.1356): 92%|ββββββββββ| 1892/2046 [22:49<01:38, 1.57it/s]
Training 3/3 epoch (loss 0.1356): 93%|ββββββββββ| 1893/2046 [22:49<01:36, 1.59it/s]
Training 3/3 epoch (loss 0.1527): 93%|ββββββββββ| 1893/2046 [22:50<01:36, 1.59it/s]
Training 3/3 epoch (loss 0.1527): 93%|ββββββββββ| 1894/2046 [22:50<01:38, 1.54it/s]
Training 3/3 epoch (loss 0.1084): 93%|ββββββββββ| 1894/2046 [22:50<01:38, 1.54it/s]
Training 3/3 epoch (loss 0.1084): 93%|ββββββββββ| 1895/2046 [22:50<01:44, 1.44it/s]
Training 3/3 epoch (loss 0.1397): 93%|ββββββββββ| 1895/2046 [22:51<01:44, 1.44it/s]
Training 3/3 epoch (loss 0.1397): 93%|ββββββββββ| 1896/2046 [22:51<01:47, 1.40it/s]
Training 3/3 epoch (loss 0.0853): 93%|ββββββββββ| 1896/2046 [22:52<01:47, 1.40it/s]
Training 3/3 epoch (loss 0.0853): 93%|ββββββββββ| 1897/2046 [22:52<01:56, 1.28it/s]
Training 3/3 epoch (loss 0.0725): 93%|ββββββββββ| 1897/2046 [22:53<01:56, 1.28it/s]
Training 3/3 epoch (loss 0.0725): 93%|ββββββββββ| 1898/2046 [22:53<01:56, 1.27it/s]
Training 3/3 epoch (loss 0.1820): 93%|ββββββββββ| 1898/2046 [22:54<01:56, 1.27it/s]
Training 3/3 epoch (loss 0.1820): 93%|ββββββββββ| 1899/2046 [22:54<01:51, 1.32it/s]
Training 3/3 epoch (loss 0.0454): 93%|ββββββββββ| 1899/2046 [22:54<01:51, 1.32it/s]
Training 3/3 epoch (loss 0.0454): 93%|ββββββββββ| 1900/2046 [22:54<01:43, 1.40it/s]
Training 3/3 epoch (loss 0.1083): 93%|ββββββββββ| 1900/2046 [22:55<01:43, 1.40it/s]
Training 3/3 epoch (loss 0.1083): 93%|ββββββββββ| 1901/2046 [22:55<01:41, 1.42it/s]
Training 3/3 epoch (loss 0.0717): 93%|ββββββββββ| 1901/2046 [22:56<01:41, 1.42it/s]
Training 3/3 epoch (loss 0.0717): 93%|ββββββββββ| 1902/2046 [22:56<01:41, 1.42it/s]
Training 3/3 epoch (loss 0.1367): 93%|ββββββββββ| 1902/2046 [22:56<01:41, 1.42it/s]
Training 3/3 epoch (loss 0.1367): 93%|ββββββββββ| 1903/2046 [22:56<01:38, 1.46it/s]
Training 3/3 epoch (loss 0.1113): 93%|ββββββββββ| 1903/2046 [22:57<01:38, 1.46it/s]
Training 3/3 epoch (loss 0.1113): 93%|ββββββββββ| 1904/2046 [22:57<01:41, 1.40it/s]
Training 3/3 epoch (loss 0.1572): 93%|ββββββββββ| 1904/2046 [22:58<01:41, 1.40it/s]
Training 3/3 epoch (loss 0.1572): 93%|ββββββββββ| 1905/2046 [22:58<01:39, 1.42it/s]
Training 3/3 epoch (loss 0.2013): 93%|ββββββββββ| 1905/2046 [22:58<01:39, 1.42it/s]
Training 3/3 epoch (loss 0.2013): 93%|ββββββββββ| 1906/2046 [22:58<01:38, 1.42it/s]
Training 3/3 epoch (loss 0.1554): 93%|ββββββββββ| 1906/2046 [22:59<01:38, 1.42it/s]
Training 3/3 epoch (loss 0.1554): 93%|ββββββββββ| 1907/2046 [22:59<01:34, 1.47it/s]
Training 3/3 epoch (loss 0.1470): 93%|ββββββββββ| 1907/2046 [23:00<01:34, 1.47it/s]
Training 3/3 epoch (loss 0.1470): 93%|ββββββββββ| 1908/2046 [23:00<01:34, 1.46it/s]
Training 3/3 epoch (loss 0.1455): 93%|ββββββββββ| 1908/2046 [23:00<01:34, 1.46it/s]
Training 3/3 epoch (loss 0.1455): 93%|ββββββββββ| 1909/2046 [23:00<01:30, 1.51it/s]
Training 3/3 epoch (loss 0.1272): 93%|ββββββββββ| 1909/2046 [23:01<01:30, 1.51it/s]
Training 3/3 epoch (loss 0.1272): 93%|ββββββββββ| 1910/2046 [23:01<01:34, 1.44it/s]
Training 3/3 epoch (loss 0.0783): 93%|ββββββββββ| 1910/2046 [23:02<01:34, 1.44it/s]
Training 3/3 epoch (loss 0.0783): 93%|ββββββββββ| 1911/2046 [23:02<01:30, 1.49it/s]
Training 3/3 epoch (loss 0.1747): 93%|ββββββββββ| 1911/2046 [23:03<01:30, 1.49it/s]
Training 3/3 epoch (loss 0.1747): 93%|ββββββββββ| 1912/2046 [23:03<01:39, 1.35it/s]
Training 3/3 epoch (loss 0.1643): 93%|ββββββββββ| 1912/2046 [23:03<01:39, 1.35it/s]
Training 3/3 epoch (loss 0.1643): 93%|ββββββββββ| 1913/2046 [23:03<01:37, 1.36it/s]
Training 3/3 epoch (loss 0.1727): 93%|ββββββββββ| 1913/2046 [23:04<01:37, 1.36it/s]
Training 3/3 epoch (loss 0.1727): 94%|ββββββββββ| 1914/2046 [23:04<01:34, 1.39it/s]
Training 3/3 epoch (loss 0.1410): 94%|ββββββββββ| 1914/2046 [23:05<01:34, 1.39it/s]
Training 3/3 epoch (loss 0.1410): 94%|ββββββββββ| 1915/2046 [23:05<01:33, 1.40it/s]
Training 3/3 epoch (loss 0.1419): 94%|ββββββββββ| 1915/2046 [23:05<01:33, 1.40it/s]
Training 3/3 epoch (loss 0.1419): 94%|ββββββββββ| 1916/2046 [23:05<01:34, 1.37it/s]
Training 3/3 epoch (loss 0.0967): 94%|ββββββββββ| 1916/2046 [23:06<01:34, 1.37it/s]
Training 3/3 epoch (loss 0.0967): 94%|ββββββββββ| 1917/2046 [23:06<01:29, 1.44it/s]
Training 3/3 epoch (loss 0.1346): 94%|ββββββββββ| 1917/2046 [23:07<01:29, 1.44it/s]
Training 3/3 epoch (loss 0.1346): 94%|ββββββββββ| 1918/2046 [23:07<01:32, 1.39it/s]
Training 3/3 epoch (loss 0.0967): 94%|ββββββββββ| 1918/2046 [23:08<01:32, 1.39it/s]
Training 3/3 epoch (loss 0.0967): 94%|ββββββββββ| 1919/2046 [23:08<01:32, 1.38it/s]
Training 3/3 epoch (loss 0.1287): 94%|ββββββββββ| 1919/2046 [23:08<01:32, 1.38it/s]
Training 3/3 epoch (loss 0.1287): 94%|ββββββββββ| 1920/2046 [23:08<01:33, 1.35it/s]
Training 3/3 epoch (loss 0.1785): 94%|ββββββββββ| 1920/2046 [23:09<01:33, 1.35it/s]
Training 3/3 epoch (loss 0.1785): 94%|ββββββββββ| 1921/2046 [23:09<01:32, 1.36it/s]
Training 3/3 epoch (loss 0.1290): 94%|ββββββββββ| 1921/2046 [23:10<01:32, 1.36it/s]
Training 3/3 epoch (loss 0.1290): 94%|ββββββββββ| 1922/2046 [23:10<01:29, 1.39it/s]
Training 3/3 epoch (loss 0.1067): 94%|ββββββββββ| 1922/2046 [23:10<01:29, 1.39it/s]
Training 3/3 epoch (loss 0.1067): 94%|ββββββββββ| 1923/2046 [23:10<01:25, 1.44it/s]
Training 3/3 epoch (loss 0.1651): 94%|ββββββββββ| 1923/2046 [23:11<01:25, 1.44it/s]
Training 3/3 epoch (loss 0.1651): 94%|ββββββββββ| 1924/2046 [23:11<01:25, 1.43it/s]
Training 3/3 epoch (loss 0.0990): 94%|ββββββββββ| 1924/2046 [23:12<01:25, 1.43it/s]
Training 3/3 epoch (loss 0.0990): 94%|ββββββββββ| 1925/2046 [23:12<01:21, 1.48it/s]
Training 3/3 epoch (loss 0.0665): 94%|ββββββββββ| 1925/2046 [23:12<01:21, 1.48it/s]
Training 3/3 epoch (loss 0.0665): 94%|ββββββββββ| 1926/2046 [23:12<01:18, 1.53it/s]
Training 3/3 epoch (loss 0.1160): 94%|ββββββββββ| 1926/2046 [23:13<01:18, 1.53it/s]
Training 3/3 epoch (loss 0.1160): 94%|ββββββββββ| 1927/2046 [23:13<01:16, 1.56it/s]
Training 3/3 epoch (loss 0.0776): 94%|ββββββββββ| 1927/2046 [23:14<01:16, 1.56it/s]
Training 3/3 epoch (loss 0.0776): 94%|ββββββββββ| 1928/2046 [23:14<01:25, 1.39it/s]
Training 3/3 epoch (loss 0.1016): 94%|ββββββββββ| 1928/2046 [23:15<01:25, 1.39it/s]
Training 3/3 epoch (loss 0.1016): 94%|ββββββββββ| 1929/2046 [23:15<01:20, 1.45it/s]
Training 3/3 epoch (loss 0.1402): 94%|ββββββββββ| 1929/2046 [23:15<01:20, 1.45it/s]
Training 3/3 epoch (loss 0.1402): 94%|ββββββββββ| 1930/2046 [23:15<01:17, 1.50it/s]
Training 3/3 epoch (loss 0.1342): 94%|ββββββββββ| 1930/2046 [23:16<01:17, 1.50it/s]
Training 3/3 epoch (loss 0.1342): 94%|ββββββββββ| 1931/2046 [23:16<01:14, 1.55it/s]
Training 3/3 epoch (loss 0.1759): 94%|ββββββββββ| 1931/2046 [23:16<01:14, 1.55it/s]
Training 3/3 epoch (loss 0.1759): 94%|ββββββββββ| 1932/2046 [23:16<01:18, 1.45it/s]
Training 3/3 epoch (loss 0.1356): 94%|ββββββββββ| 1932/2046 [23:17<01:18, 1.45it/s]
Training 3/3 epoch (loss 0.1356): 94%|ββββββββββ| 1933/2046 [23:17<01:16, 1.48it/s]
Training 3/3 epoch (loss 0.1903): 94%|ββββββββββ| 1933/2046 [23:18<01:16, 1.48it/s]
Training 3/3 epoch (loss 0.1903): 95%|ββββββββββ| 1934/2046 [23:18<01:13, 1.53it/s]
Training 3/3 epoch (loss 0.1540): 95%|ββββββββββ| 1934/2046 [23:18<01:13, 1.53it/s]
Training 3/3 epoch (loss 0.1540): 95%|ββββββββββ| 1935/2046 [23:18<01:11, 1.55it/s]
Training 3/3 epoch (loss 0.2511): 95%|ββββββββββ| 1935/2046 [23:19<01:11, 1.55it/s]
Training 3/3 epoch (loss 0.2511): 95%|ββββββββββ| 1936/2046 [23:19<01:18, 1.40it/s]
Training 3/3 epoch (loss 0.0963): 95%|ββββββββββ| 1936/2046 [23:20<01:18, 1.40it/s]
Training 3/3 epoch (loss 0.0963): 95%|ββββββββββ| 1937/2046 [23:20<01:32, 1.18it/s]
Training 3/3 epoch (loss 0.1681): 95%|ββββββββββ| 1937/2046 [23:21<01:32, 1.18it/s]
Training 3/3 epoch (loss 0.1681): 95%|ββββββββββ| 1938/2046 [23:21<01:35, 1.13it/s]
Training 3/3 epoch (loss 0.1294): 95%|ββββββββββ| 1938/2046 [23:22<01:35, 1.13it/s]
Training 3/3 epoch (loss 0.1294): 95%|ββββββββββ| 1939/2046 [23:22<01:29, 1.19it/s]
Training 3/3 epoch (loss 0.1171): 95%|ββββββββββ| 1939/2046 [23:23<01:29, 1.19it/s]
Training 3/3 epoch (loss 0.1171): 95%|ββββββββββ| 1940/2046 [23:23<01:22, 1.29it/s]
Training 3/3 epoch (loss 0.1868): 95%|ββββββββββ| 1940/2046 [23:23<01:22, 1.29it/s]
Training 3/3 epoch (loss 0.1868): 95%|ββββββββββ| 1941/2046 [23:23<01:17, 1.35it/s]
Training 3/3 epoch (loss 0.2004): 95%|ββββββββββ| 1941/2046 [23:24<01:17, 1.35it/s]
Training 3/3 epoch (loss 0.2004): 95%|ββββββββββ| 1942/2046 [23:24<01:17, 1.34it/s]
Training 3/3 epoch (loss 0.1240): 95%|ββββββββββ| 1942/2046 [23:25<01:17, 1.34it/s]
Training 3/3 epoch (loss 0.1240): 95%|ββββββββββ| 1943/2046 [23:25<01:13, 1.39it/s]
Training 3/3 epoch (loss 0.1583): 95%|ββββββββββ| 1943/2046 [23:26<01:13, 1.39it/s]
Training 3/3 epoch (loss 0.1583): 95%|ββββββββββ| 1944/2046 [23:26<01:16, 1.34it/s]
Training 3/3 epoch (loss 0.0939): 95%|ββββββββββ| 1944/2046 [23:26<01:16, 1.34it/s]
Training 3/3 epoch (loss 0.0939): 95%|ββββββββββ| 1945/2046 [23:26<01:13, 1.38it/s]
Training 3/3 epoch (loss 0.1174): 95%|ββββββββββ| 1945/2046 [23:27<01:13, 1.38it/s]
Training 3/3 epoch (loss 0.1174): 95%|ββββββββββ| 1946/2046 [23:27<01:13, 1.36it/s]
Training 3/3 epoch (loss 0.1433): 95%|ββββββββββ| 1946/2046 [23:28<01:13, 1.36it/s]
Training 3/3 epoch (loss 0.1433): 95%|ββββββββββ| 1947/2046 [23:28<01:08, 1.44it/s]
Training 3/3 epoch (loss 0.0848): 95%|ββββββββββ| 1947/2046 [23:28<01:08, 1.44it/s]
Training 3/3 epoch (loss 0.0848): 95%|ββββββββββ| 1948/2046 [23:28<01:06, 1.46it/s]
Training 3/3 epoch (loss 0.1062): 95%|ββββββββββ| 1948/2046 [23:29<01:06, 1.46it/s]
Training 3/3 epoch (loss 0.1062): 95%|ββββββββββ| 1949/2046 [23:29<01:05, 1.48it/s]
Training 3/3 epoch (loss 0.1556): 95%|ββββββββββ| 1949/2046 [23:30<01:05, 1.48it/s]
Training 3/3 epoch (loss 0.1556): 95%|ββββββββββ| 1950/2046 [23:30<01:06, 1.45it/s]
Training 3/3 epoch (loss 0.1221): 95%|ββββββββββ| 1950/2046 [23:30<01:06, 1.45it/s]
Training 3/3 epoch (loss 0.1221): 95%|ββββββββββ| 1951/2046 [23:30<01:06, 1.44it/s]
Training 3/3 epoch (loss 0.1264): 95%|ββββββββββ| 1951/2046 [23:31<01:06, 1.44it/s]
Training 3/3 epoch (loss 0.1264): 95%|ββββββββββ| 1952/2046 [23:31<01:08, 1.38it/s]
Training 3/3 epoch (loss 0.1642): 95%|ββββββββββ| 1952/2046 [23:32<01:08, 1.38it/s]
Training 3/3 epoch (loss 0.1642): 95%|ββββββββββ| 1953/2046 [23:32<01:05, 1.43it/s]
Training 3/3 epoch (loss 0.2066): 95%|ββββββββββ| 1953/2046 [23:33<01:05, 1.43it/s]
Training 3/3 epoch (loss 0.2066): 96%|ββββββββββ| 1954/2046 [23:33<01:03, 1.45it/s]
Training 3/3 epoch (loss 0.1400): 96%|ββββββββββ| 1954/2046 [23:33<01:03, 1.45it/s]
Training 3/3 epoch (loss 0.1400): 96%|ββββββββββ| 1955/2046 [23:33<01:01, 1.49it/s]
Training 3/3 epoch (loss 0.1690): 96%|ββββββββββ| 1955/2046 [23:34<01:01, 1.49it/s]
Training 3/3 epoch (loss 0.1690): 96%|ββββββββββ| 1956/2046 [23:34<00:58, 1.53it/s]
Training 3/3 epoch (loss 0.0921): 96%|ββββββββββ| 1956/2046 [23:35<00:58, 1.53it/s]
Training 3/3 epoch (loss 0.0921): 96%|ββββββββββ| 1957/2046 [23:35<01:03, 1.40it/s]
Training 3/3 epoch (loss 0.0681): 96%|ββββββββββ| 1957/2046 [23:35<01:03, 1.40it/s]
Training 3/3 epoch (loss 0.0681): 96%|ββββββββββ| 1958/2046 [23:35<01:00, 1.45it/s]
Training 3/3 epoch (loss 0.0859): 96%|ββββββββββ| 1958/2046 [23:36<01:00, 1.45it/s]
Training 3/3 epoch (loss 0.0859): 96%|ββββββββββ| 1959/2046 [23:36<00:57, 1.50it/s]
Training 3/3 epoch (loss 0.1123): 96%|ββββββββββ| 1959/2046 [23:37<00:57, 1.50it/s]
Training 3/3 epoch (loss 0.1123): 96%|ββββββββββ| 1960/2046 [23:37<01:00, 1.43it/s]
Training 3/3 epoch (loss 0.1141): 96%|ββββββββββ| 1960/2046 [23:37<01:00, 1.43it/s]
Training 3/3 epoch (loss 0.1141): 96%|ββββββββββ| 1961/2046 [23:37<01:02, 1.35it/s]
Training 3/3 epoch (loss 0.0755): 96%|ββββββββββ| 1961/2046 [23:38<01:02, 1.35it/s]
Training 3/3 epoch (loss 0.0755): 96%|ββββββββββ| 1962/2046 [23:38<00:58, 1.43it/s]
Training 3/3 epoch (loss 0.1442): 96%|ββββββββββ| 1962/2046 [23:39<00:58, 1.43it/s]
Training 3/3 epoch (loss 0.1442): 96%|ββββββββββ| 1963/2046 [23:39<00:59, 1.40it/s]
Training 3/3 epoch (loss 0.1380): 96%|ββββββββββ| 1963/2046 [23:40<00:59, 1.40it/s]
Training 3/3 epoch (loss 0.1380): 96%|ββββββββββ| 1964/2046 [23:40<00:58, 1.40it/s]
Training 3/3 epoch (loss 0.1087): 96%|ββββββββββ| 1964/2046 [23:40<00:58, 1.40it/s]
Training 3/3 epoch (loss 0.1087): 96%|ββββββββββ| 1965/2046 [23:40<00:55, 1.45it/s]
Training 3/3 epoch (loss 0.1825): 96%|ββββββββββ| 1965/2046 [23:41<00:55, 1.45it/s]
Training 3/3 epoch (loss 0.1825): 96%|ββββββββββ| 1966/2046 [23:41<00:53, 1.50it/s]
Training 3/3 epoch (loss 0.1204): 96%|ββββββββββ| 1966/2046 [23:41<00:53, 1.50it/s]
Training 3/3 epoch (loss 0.1204): 96%|ββββββββββ| 1967/2046 [23:41<00:51, 1.53it/s]
Training 3/3 epoch (loss 0.1560): 96%|ββββββββββ| 1967/2046 [23:42<00:51, 1.53it/s]
Training 3/3 epoch (loss 0.1560): 96%|ββββββββββ| 1968/2046 [23:42<00:54, 1.43it/s]
Training 3/3 epoch (loss 0.1767): 96%|ββββββββββ| 1968/2046 [23:43<00:54, 1.43it/s]
Training 3/3 epoch (loss 0.1767): 96%|ββββββββββ| 1969/2046 [23:43<00:56, 1.37it/s]
Training 3/3 epoch (loss 0.1390): 96%|ββββββββββ| 1969/2046 [23:44<00:56, 1.37it/s]
Training 3/3 epoch (loss 0.1390): 96%|ββββββββββ| 1970/2046 [23:44<00:52, 1.44it/s]
Training 3/3 epoch (loss 0.0933): 96%|ββββββββββ| 1970/2046 [23:44<00:52, 1.44it/s]
Training 3/3 epoch (loss 0.0933): 96%|ββββββββββ| 1971/2046 [23:44<00:50, 1.48it/s]
Training 3/3 epoch (loss 0.1733): 96%|ββββββββββ| 1971/2046 [23:45<00:50, 1.48it/s]
Training 3/3 epoch (loss 0.1733): 96%|ββββββββββ| 1972/2046 [23:45<00:48, 1.53it/s]
Training 3/3 epoch (loss 0.1028): 96%|ββββββββββ| 1972/2046 [23:45<00:48, 1.53it/s]
Training 3/3 epoch (loss 0.1028): 96%|ββββββββββ| 1973/2046 [23:45<00:47, 1.55it/s]
Training 3/3 epoch (loss 0.1456): 96%|ββββββββββ| 1973/2046 [23:46<00:47, 1.55it/s]
Training 3/3 epoch (loss 0.1456): 96%|ββββββββββ| 1974/2046 [23:46<00:45, 1.58it/s]
Training 3/3 epoch (loss 0.1120): 96%|ββββββββββ| 1974/2046 [23:47<00:45, 1.58it/s]
Training 3/3 epoch (loss 0.1120): 97%|ββββββββββ| 1975/2046 [23:47<00:44, 1.60it/s]
Training 3/3 epoch (loss 0.1300): 97%|ββββββββββ| 1975/2046 [23:47<00:44, 1.60it/s]
Training 3/3 epoch (loss 0.1300): 97%|ββββββββββ| 1976/2046 [23:47<00:47, 1.48it/s]
Training 3/3 epoch (loss 0.1358): 97%|ββββββββββ| 1976/2046 [23:48<00:47, 1.48it/s]
Training 3/3 epoch (loss 0.1358): 97%|ββββββββββ| 1977/2046 [23:48<00:45, 1.52it/s]
Training 3/3 epoch (loss 0.1189): 97%|ββββββββββ| 1977/2046 [23:49<00:45, 1.52it/s]
Training 3/3 epoch (loss 0.1189): 97%|ββββββββββ| 1978/2046 [23:49<00:45, 1.49it/s]
Training 3/3 epoch (loss 0.1126): 97%|ββββββββββ| 1978/2046 [23:50<00:45, 1.49it/s]
Training 3/3 epoch (loss 0.1126): 97%|ββββββββββ| 1979/2046 [23:50<00:47, 1.42it/s]
Training 3/3 epoch (loss 0.1847): 97%|ββββββββββ| 1979/2046 [23:51<00:47, 1.42it/s]
Training 3/3 epoch (loss 0.1847): 97%|ββββββββββ| 1980/2046 [23:51<00:50, 1.30it/s]
Training 3/3 epoch (loss 0.1556): 97%|ββββββββββ| 1980/2046 [23:51<00:50, 1.30it/s]
Training 3/3 epoch (loss 0.1556): 97%|ββββββββββ| 1981/2046 [23:51<00:47, 1.38it/s]
Training 3/3 epoch (loss 0.1669): 97%|ββββββββββ| 1981/2046 [23:52<00:47, 1.38it/s]
Training 3/3 epoch (loss 0.1669): 97%|ββββββββββ| 1982/2046 [23:52<00:46, 1.38it/s]
Training 3/3 epoch (loss 0.0932): 97%|ββββββββββ| 1982/2046 [23:53<00:46, 1.38it/s]
Training 3/3 epoch (loss 0.0932): 97%|ββββββββββ| 1983/2046 [23:53<00:45, 1.37it/s]
Training 3/3 epoch (loss 0.1008): 97%|ββββββββββ| 1983/2046 [23:53<00:45, 1.37it/s]
Training 3/3 epoch (loss 0.1008): 97%|ββββββββββ| 1984/2046 [23:53<00:46, 1.33it/s]
Training 3/3 epoch (loss 0.2073): 97%|ββββββββββ| 1984/2046 [23:54<00:46, 1.33it/s]
Training 3/3 epoch (loss 0.2073): 97%|ββββββββββ| 1985/2046 [23:54<00:46, 1.30it/s]
Training 3/3 epoch (loss 0.0803): 97%|ββββββββββ| 1985/2046 [23:55<00:46, 1.30it/s]
Training 3/3 epoch (loss 0.0803): 97%|ββββββββββ| 1986/2046 [23:55<00:46, 1.28it/s]
Training 3/3 epoch (loss 0.0794): 97%|ββββββββββ| 1986/2046 [23:56<00:46, 1.28it/s]
Training 3/3 epoch (loss 0.0794): 97%|ββββββββββ| 1987/2046 [23:56<00:44, 1.34it/s]
Training 3/3 epoch (loss 0.1248): 97%|ββββββββββ| 1987/2046 [23:56<00:44, 1.34it/s]
Training 3/3 epoch (loss 0.1248): 97%|ββββββββββ| 1988/2046 [23:56<00:44, 1.30it/s]
Training 3/3 epoch (loss 0.2038): 97%|ββββββββββ| 1988/2046 [23:57<00:44, 1.30it/s]
Training 3/3 epoch (loss 0.2038): 97%|ββββββββββ| 1989/2046 [23:57<00:41, 1.37it/s]
Training 3/3 epoch (loss 0.1156): 97%|ββββββββββ| 1989/2046 [23:58<00:41, 1.37it/s]
Training 3/3 epoch (loss 0.1156): 97%|ββββββββββ| 1990/2046 [23:58<00:39, 1.41it/s]
Training 3/3 epoch (loss 0.2099): 97%|ββββββββββ| 1990/2046 [23:58<00:39, 1.41it/s]
Training 3/3 epoch (loss 0.2099): 97%|ββββββββββ| 1991/2046 [23:58<00:38, 1.41it/s]
Training 3/3 epoch (loss 0.1796): 97%|ββββββββββ| 1991/2046 [24:00<00:38, 1.41it/s]
Training 3/3 epoch (loss 0.1796): 97%|ββββββββββ| 1992/2046 [24:00<00:43, 1.23it/s]
Training 3/3 epoch (loss 0.1035): 97%|ββββββββββ| 1992/2046 [24:00<00:43, 1.23it/s]
Training 3/3 epoch (loss 0.1035): 97%|ββββββββββ| 1993/2046 [24:00<00:40, 1.30it/s]
Training 3/3 epoch (loss 0.1309): 97%|ββββββββββ| 1993/2046 [24:01<00:40, 1.30it/s]
Training 3/3 epoch (loss 0.1309): 97%|ββββββββββ| 1994/2046 [24:01<00:37, 1.38it/s]
Training 3/3 epoch (loss 0.1709): 97%|ββββββββββ| 1994/2046 [24:02<00:37, 1.38it/s]
Training 3/3 epoch (loss 0.1709): 98%|ββββββββββ| 1995/2046 [24:02<00:37, 1.36it/s]
Training 3/3 epoch (loss 0.1176): 98%|ββββββββββ| 1995/2046 [24:02<00:37, 1.36it/s]
Training 3/3 epoch (loss 0.1176): 98%|ββββββββββ| 1996/2046 [24:02<00:34, 1.43it/s]
Training 3/3 epoch (loss 0.1848): 98%|ββββββββββ| 1996/2046 [24:03<00:34, 1.43it/s]
Training 3/3 epoch (loss 0.1848): 98%|ββββββββββ| 1997/2046 [24:03<00:33, 1.45it/s]
Training 3/3 epoch (loss 0.1705): 98%|ββββββββββ| 1997/2046 [24:04<00:33, 1.45it/s]
Training 3/3 epoch (loss 0.1705): 98%|ββββββββββ| 1998/2046 [24:04<00:32, 1.49it/s]
Training 3/3 epoch (loss 0.1410): 98%|ββββββββββ| 1998/2046 [24:04<00:32, 1.49it/s]
Training 3/3 epoch (loss 0.1410): 98%|ββββββββββ| 1999/2046 [24:04<00:32, 1.44it/s]
Training 3/3 epoch (loss 0.1549): 98%|ββββββββββ| 1999/2046 [24:05<00:32, 1.44it/s]
Training 3/3 epoch (loss 0.1549): 98%|ββββββββββ| 2000/2046 [24:05<00:32, 1.41it/s]
Training 3/3 epoch (loss 0.1227): 98%|ββββββββββ| 2000/2046 [24:06<00:32, 1.41it/s]
Training 3/3 epoch (loss 0.1227): 98%|ββββββββββ| 2001/2046 [24:06<00:31, 1.45it/s]
Training 3/3 epoch (loss 0.1572): 98%|█████████▊| 2001/2046 [24:06<00:31, 1.45it/s]
Training 3/3 epoch (loss 0.1572): 98%|█████████▊| 2002/2046 [24:06<00:29, 1.51it/s]
Training 3/3 epoch (loss 0.1694): 98%|█████████▊| 2002/2046 [24:07<00:29, 1.51it/s]
Training 3/3 epoch (loss 0.1694): 98%|█████████▊| 2003/2046 [24:07<00:28, 1.52it/s]
Training 3/3 epoch (loss 0.1318): 98%|█████████▊| 2003/2046 [24:08<00:28, 1.52it/s]
Training 3/3 epoch (loss 0.1318): 98%|█████████▊| 2004/2046 [24:08<00:27, 1.54it/s]
Training 3/3 epoch (loss 0.1818): 98%|█████████▊| 2004/2046 [24:08<00:27, 1.54it/s]
Training 3/3 epoch (loss 0.1818): 98%|█████████▊| 2005/2046 [24:08<00:29, 1.41it/s]
Training 3/3 epoch (loss 0.1294): 98%|█████████▊| 2005/2046 [24:09<00:29, 1.41it/s]
Training 3/3 epoch (loss 0.1294): 98%|█████████▊| 2006/2046 [24:09<00:28, 1.42it/s]
Training 3/3 epoch (loss 0.1632): 98%|█████████▊| 2006/2046 [24:10<00:28, 1.42it/s]
Training 3/3 epoch (loss 0.1632): 98%|█████████▊| 2007/2046 [24:10<00:26, 1.45it/s]
Training 3/3 epoch (loss 0.1088): 98%|█████████▊| 2007/2046 [24:10<00:26, 1.45it/s]
Training 3/3 epoch (loss 0.1088): 98%|█████████▊| 2008/2046 [24:10<00:26, 1.42it/s]
Training 3/3 epoch (loss 0.1384): 98%|█████████▊| 2008/2046 [24:11<00:26, 1.42it/s]
Training 3/3 epoch (loss 0.1384): 98%|█████████▊| 2009/2046 [24:11<00:24, 1.49it/s]
Training 3/3 epoch (loss 0.1437): 98%|█████████▊| 2009/2046 [24:12<00:24, 1.49it/s]
Training 3/3 epoch (loss 0.1437): 98%|█████████▊| 2010/2046 [24:12<00:23, 1.52it/s]
Training 3/3 epoch (loss 0.1307): 98%|█████████▊| 2010/2046 [24:12<00:23, 1.52it/s]
Training 3/3 epoch (loss 0.1307): 98%|█████████▊| 2011/2046 [24:12<00:23, 1.50it/s]
Training 3/3 epoch (loss 0.1515): 98%|█████████▊| 2011/2046 [24:13<00:23, 1.50it/s]
Training 3/3 epoch (loss 0.1515): 98%|█████████▊| 2012/2046 [24:13<00:22, 1.51it/s]
Training 3/3 epoch (loss 0.1077): 98%|█████████▊| 2012/2046 [24:14<00:22, 1.51it/s]
Training 3/3 epoch (loss 0.1077): 98%|█████████▊| 2013/2046 [24:14<00:21, 1.55it/s]
Training 3/3 epoch (loss 0.2124): 98%|█████████▊| 2013/2046 [24:14<00:21, 1.55it/s]
Training 3/3 epoch (loss 0.2124): 98%|█████████▊| 2014/2046 [24:14<00:20, 1.56it/s]
Training 3/3 epoch (loss 0.2518): 98%|█████████▊| 2014/2046 [24:15<00:20, 1.56it/s]
Training 3/3 epoch (loss 0.2518): 98%|█████████▊| 2015/2046 [24:15<00:19, 1.58it/s]
Training 3/3 epoch (loss 0.1577): 98%|█████████▊| 2015/2046 [24:16<00:19, 1.58it/s]
Training 3/3 epoch (loss 0.1577): 99%|█████████▊| 2016/2046 [24:16<00:19, 1.50it/s]
Training 3/3 epoch (loss 0.1218): 99%|█████████▊| 2016/2046 [24:16<00:19, 1.50it/s]
Training 3/3 epoch (loss 0.1218): 99%|█████████▊| 2017/2046 [24:16<00:20, 1.42it/s]
Training 3/3 epoch (loss 0.2170): 99%|█████████▊| 2017/2046 [24:17<00:20, 1.42it/s]
Training 3/3 epoch (loss 0.2170): 99%|█████████▊| 2018/2046 [24:17<00:19, 1.47it/s]
Training 3/3 epoch (loss 0.0927): 99%|█████████▊| 2018/2046 [24:18<00:19, 1.47it/s]
Training 3/3 epoch (loss 0.0927): 99%|█████████▊| 2019/2046 [24:18<00:18, 1.49it/s]
Training 3/3 epoch (loss 0.1493): 99%|█████████▊| 2019/2046 [24:19<00:18, 1.49it/s]
Training 3/3 epoch (loss 0.1493): 99%|█████████▊| 2020/2046 [24:19<00:18, 1.40it/s]
Training 3/3 epoch (loss 0.1918): 99%|█████████▊| 2020/2046 [24:19<00:18, 1.40it/s]
Training 3/3 epoch (loss 0.1918): 99%|█████████▉| 2021/2046 [24:19<00:19, 1.31it/s]
Training 3/3 epoch (loss 0.1332): 99%|█████████▉| 2021/2046 [24:20<00:19, 1.31it/s]
Training 3/3 epoch (loss 0.1332): 99%|█████████▉| 2022/2046 [24:20<00:17, 1.34it/s]
Training 3/3 epoch (loss 0.1161): 99%|█████████▉| 2022/2046 [24:21<00:17, 1.34it/s]
Training 3/3 epoch (loss 0.1161): 99%|█████████▉| 2023/2046 [24:21<00:16, 1.38it/s]
Training 3/3 epoch (loss 0.2173): 99%|█████████▉| 2023/2046 [24:22<00:16, 1.38it/s]
Training 3/3 epoch (loss 0.2173): 99%|█████████▉| 2024/2046 [24:22<00:16, 1.32it/s]
Training 3/3 epoch (loss 0.1460): 99%|█████████▉| 2024/2046 [24:22<00:16, 1.32it/s]
Training 3/3 epoch (loss 0.1460): 99%|█████████▉| 2025/2046 [24:22<00:15, 1.40it/s]
Training 3/3 epoch (loss 0.2155): 99%|█████████▉| 2025/2046 [24:23<00:15, 1.40it/s]
Training 3/3 epoch (loss 0.2155): 99%|█████████▉| 2026/2046 [24:23<00:14, 1.36it/s]
Training 3/3 epoch (loss 0.1940): 99%|█████████▉| 2026/2046 [24:24<00:14, 1.36it/s]
Training 3/3 epoch (loss 0.1940): 99%|█████████▉| 2027/2046 [24:24<00:14, 1.35it/s]
Training 3/3 epoch (loss 0.1294): 99%|█████████▉| 2027/2046 [24:25<00:14, 1.35it/s]
Training 3/3 epoch (loss 0.1294): 99%|█████████▉| 2028/2046 [24:25<00:13, 1.35it/s]
Training 3/3 epoch (loss 0.2305): 99%|█████████▉| 2028/2046 [24:25<00:13, 1.35it/s]
Training 3/3 epoch (loss 0.2305): 99%|█████████▉| 2029/2046 [24:25<00:12, 1.32it/s]
Training 3/3 epoch (loss 0.1631): 99%|█████████▉| 2029/2046 [24:26<00:12, 1.32it/s]
Training 3/3 epoch (loss 0.1631): 99%|█████████▉| 2030/2046 [24:26<00:11, 1.35it/s]
Training 3/3 epoch (loss 0.1495): 99%|█████████▉| 2030/2046 [24:27<00:11, 1.35it/s]
Training 3/3 epoch (loss 0.1495): 99%|█████████▉| 2031/2046 [24:27<00:11, 1.34it/s]
Training 3/3 epoch (loss 0.1640): 99%|█████████▉| 2031/2046 [24:28<00:11, 1.34it/s]
Training 3/3 epoch (loss 0.1640): 99%|█████████▉| 2032/2046 [24:28<00:10, 1.33it/s]
Training 3/3 epoch (loss 0.1699): 99%|█████████▉| 2032/2046 [24:28<00:10, 1.33it/s]
Training 3/3 epoch (loss 0.1699): 99%|█████████▉| 2033/2046 [24:28<00:09, 1.38it/s]
Training 3/3 epoch (loss 0.1398): 99%|█████████▉| 2033/2046 [24:29<00:09, 1.38it/s]
Training 3/3 epoch (loss 0.1398): 99%|█████████▉| 2034/2046 [24:29<00:09, 1.33it/s]
Training 3/3 epoch (loss 0.1427): 99%|█████████▉| 2034/2046 [24:30<00:09, 1.33it/s]
Training 3/3 epoch (loss 0.1427): 99%|█████████▉| 2035/2046 [24:30<00:08, 1.36it/s]
Training 3/3 epoch (loss 0.1495): 99%|█████████▉| 2035/2046 [24:30<00:08, 1.36it/s]
Training 3/3 epoch (loss 0.1495): 100%|█████████▉| 2036/2046 [24:30<00:07, 1.41it/s]
Training 3/3 epoch (loss 0.1936): 100%|█████████▉| 2036/2046 [24:31<00:07, 1.41it/s]
Training 3/3 epoch (loss 0.1936): 100%|█████████▉| 2037/2046 [24:31<00:06, 1.43it/s]
Training 3/3 epoch (loss 0.0809): 100%|█████████▉| 2037/2046 [24:32<00:06, 1.43it/s]
Training 3/3 epoch (loss 0.0809): 100%|█████████▉| 2038/2046 [24:32<00:05, 1.49it/s]
Training 3/3 epoch (loss 0.1983): 100%|█████████▉| 2038/2046 [24:32<00:05, 1.49it/s]
Training 3/3 epoch (loss 0.1983): 100%|█████████▉| 2039/2046 [24:32<00:04, 1.53it/s]
Training 3/3 epoch (loss 0.1729): 100%|█████████▉| 2039/2046 [24:33<00:04, 1.53it/s]
Training 3/3 epoch (loss 0.1729): 100%|█████████▉| 2040/2046 [24:33<00:04, 1.45it/s]
Training 3/3 epoch (loss 0.1767): 100%|█████████▉| 2040/2046 [24:34<00:04, 1.45it/s]
Training 3/3 epoch (loss 0.1767): 100%|█████████▉| 2041/2046 [24:34<00:03, 1.39it/s]
Training 3/3 epoch (loss 0.1125): 100%|█████████▉| 2041/2046 [24:34<00:03, 1.39it/s]
Training 3/3 epoch (loss 0.1125): 100%|█████████▉| 2042/2046 [24:34<00:02, 1.43it/s]
Training 3/3 epoch (loss 0.1544): 100%|█████████▉| 2042/2046 [24:35<00:02, 1.43it/s]
Training 3/3 epoch (loss 0.1544): 100%|█████████▉| 2043/2046 [24:35<00:02, 1.49it/s]
Training 3/3 epoch (loss 0.1749): 100%|█████████▉| 2043/2046 [24:36<00:02, 1.49it/s]
Training 3/3 epoch (loss 0.1749): 100%|█████████▉| 2044/2046 [24:36<00:01, 1.46it/s]
Training 3/3 epoch (loss 0.2140): 100%|█████████▉| 2044/2046 [24:37<00:01, 1.46it/s]
Training 3/3 epoch (loss 0.2140): 100%|█████████▉| 2045/2046 [24:37<00:00, 1.38it/s]
Training 3/3 epoch (loss 0.0347): 100%|█████████▉| 2045/2046 [24:37<00:00, 1.38it/s]
Training 3/3 epoch (loss 0.0347): 100%|██████████| 2046/2046 [24:37<00:00, 1.46it/s]
Training 3/3 epoch (loss 0.0347): 100%|██████████| 2046/2046 [24:37<00:00, 1.38it/s] |