|
Evaluating perplexity: 0%| | 0/4620 [00:00<?, ?batch/s]
Evaluating perplexity: 0%| | 1/4620 [00:02<3:00:51, 2.35s/batch]
Evaluating perplexity: 0%| | 2/4620 [00:04<2:50:04, 2.21s/batch]
Evaluating perplexity: 0%| | 3/4620 [00:06<2:45:28, 2.15s/batch]
Evaluating perplexity: 0%| | 4/4620 [00:08<2:43:16, 2.12s/batch]
Evaluating perplexity: 0%| | 5/4620 [00:10<2:42:40, 2.12s/batch]
Evaluating perplexity: 0%| | 6/4620 [00:12<2:42:48, 2.12s/batch]
Evaluating perplexity: 0%| | 7/4620 [00:14<2:43:15, 2.12s/batch]
Evaluating perplexity: 0%| | 8/4620 [00:17<2:43:41, 2.13s/batch]
Evaluating perplexity: 0%| | 9/4620 [00:19<2:44:03, 2.13s/batch]
Evaluating perplexity: 0%| | 10/4620 [00:21<2:44:39, 2.14s/batch]
Evaluating perplexity: 0%| | 11/4620 [00:23<2:44:45, 2.14s/batch]
Evaluating perplexity: 0%| | 12/4620 [00:25<2:44:44, 2.15s/batch]
Evaluating perplexity: 0%| | 13/4620 [00:27<2:44:12, 2.14s/batch]
Evaluating perplexity: 0%| | 14/4620 [00:30<2:44:50, 2.15s/batch]
Evaluating perplexity: 0%| | 15/4620 [00:32<2:45:25, 2.16s/batch]
Evaluating perplexity: 0%| | 16/4620 [00:34<2:45:19, 2.15s/batch]
Evaluating perplexity: 0%| | 17/4620 [00:36<2:45:29, 2.16s/batch]
Evaluating perplexity: 0%| | 18/4620 [00:39<3:05:57, 2.42s/batch]
Evaluating perplexity: 0%| | 19/4620 [00:41<2:58:42, 2.33s/batch]
Evaluating perplexity: 0%| | 20/4620 [00:43<2:54:44, 2.28s/batch]
Evaluating perplexity: 0%| | 21/4620 [00:45<2:51:13, 2.23s/batch]
Evaluating perplexity: 0%| | 22/4620 [00:48<2:48:49, 2.20s/batch]
Evaluating perplexity: 0%| | 23/4620 [00:50<2:47:13, 2.18s/batch]
Evaluating perplexity: 1%| | 24/4620 [00:52<2:45:42, 2.16s/batch]
Evaluating perplexity: 1%| | 25/4620 [00:54<2:44:41, 2.15s/batch]
Evaluating perplexity: 1%| | 26/4620 [00:56<2:44:52, 2.15s/batch]
Evaluating perplexity: 1%| | 27/4620 [00:58<2:44:54, 2.15s/batch]
Evaluating perplexity: 1%| | 28/4620 [01:00<2:44:47, 2.15s/batch]
Evaluating perplexity: 1%| | 29/4620 [01:03<2:44:45, 2.15s/batch]
Evaluating perplexity: 1%| | 30/4620 [01:05<2:43:59, 2.14s/batch]
Evaluating perplexity: 1%| | 31/4620 [01:07<2:44:17, 2.15s/batch]
Evaluating perplexity: 1%| | 32/4620 [01:09<2:44:21, 2.15s/batch]
Evaluating perplexity: 1%| | 33/4620 [01:11<2:44:31, 2.15s/batch]
Evaluating perplexity: 1%| | 34/4620 [01:13<2:45:05, 2.16s/batch]
Evaluating perplexity: 1%| | 35/4620 [01:15<2:44:20, 2.15s/batch]
Evaluating perplexity: 1%| | 36/4620 [01:18<2:44:27, 2.15s/batch]
Evaluating perplexity: 1%| | 37/4620 [01:20<2:44:08, 2.15s/batch]
Evaluating perplexity: 1%| | 38/4620 [01:22<2:44:06, 2.15s/batch]
Evaluating perplexity: 1%| | 39/4620 [01:24<2:44:14, 2.15s/batch]
Evaluating perplexity: 1%| | 40/4620 [01:26<2:44:11, 2.15s/batch]
Evaluating perplexity: 1%| | 41/4620 [01:28<2:43:26, 2.14s/batch]
Evaluating perplexity: 1%| | 42/4620 [01:30<2:43:19, 2.14s/batch]
Evaluating perplexity: 1%| | 43/4620 [01:33<2:43:15, 2.14s/batch]
Evaluating perplexity: 1%| | 44/4620 [01:35<2:42:47, 2.13s/batch]
Evaluating perplexity: 1%| | 45/4620 [01:37<2:42:45, 2.13s/batch]
Evaluating perplexity: 1%| | 46/4620 [01:39<2:42:32, 2.13s/batch]
Evaluating perplexity: 1%| | 47/4620 [01:41<2:43:10, 2.14s/batch]
Evaluating perplexity: 1%| | 48/4620 [01:43<2:42:50, 2.14s/batch]
Evaluating perplexity: 1%| | 49/4620 [01:45<2:42:40, 2.14s/batch]
Evaluating perplexity: 1%| | 50/4620 [01:48<2:42:54, 2.14s/batch]
Evaluating perplexity: 1%| | 51/4620 [01:50<2:42:33, 2.13s/batch]
Evaluating perplexity: 1%| | 52/4620 [01:52<2:41:34, 2.12s/batch]
Evaluating perplexity: 1%| | 53/4620 [01:54<2:41:40, 2.12s/batch]
Evaluating perplexity: 1%| | 54/4620 [01:56<2:41:53, 2.13s/batch]
Evaluating perplexity: 1%| | 55/4620 [01:58<2:42:05, 2.13s/batch]
Evaluating perplexity: 1%| | 56/4620 [02:00<2:42:19, 2.13s/batch]
Evaluating perplexity: 1%| | 57/4620 [02:02<2:42:15, 2.13s/batch]
Evaluating perplexity: 1%|▏ | 58/4620 [02:05<2:41:30, 2.12s/batch]
Evaluating perplexity: 1%|▏ | 59/4620 [02:07<2:41:47, 2.13s/batch]
Evaluating perplexity: 1%|▏ | 60/4620 [02:09<2:42:01, 2.13s/batch]
Evaluating perplexity: 1%|▏ | 61/4620 [02:11<2:42:04, 2.13s/batch]
Evaluating perplexity: 1%|▏ | 62/4620 [02:13<2:42:06, 2.13s/batch]
Evaluating perplexity: 1%|▏ | 63/4620 [02:15<2:41:54, 2.13s/batch]
Evaluating perplexity: 1%|▏ | 64/4620 [02:17<2:42:07, 2.14s/batch]
Evaluating perplexity: 1%|▏ | 65/4620 [02:20<2:42:08, 2.14s/batch]
Evaluating perplexity: 1%|▏ | 66/4620 [02:22<2:41:07, 2.12s/batch]
Evaluating perplexity: 1%|▏ | 67/4620 [02:24<2:41:39, 2.13s/batch]
Evaluating perplexity: 1%|▏ | 68/4620 [02:26<2:41:43, 2.13s/batch]
Evaluating perplexity: 1%|▏ | 69/4620 [02:28<2:42:07, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 70/4620 [02:30<2:42:28, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 71/4620 [02:32<2:42:54, 2.15s/batch]
Evaluating perplexity: 2%|▏ | 72/4620 [02:35<2:43:18, 2.15s/batch]
Evaluating perplexity: 2%|▏ | 73/4620 [02:37<2:43:16, 2.15s/batch]
Evaluating perplexity: 2%|▏ | 74/4620 [02:39<2:42:45, 2.15s/batch]
Evaluating perplexity: 2%|▏ | 75/4620 [02:41<2:42:31, 2.15s/batch]
Evaluating perplexity: 2%|▏ | 76/4620 [02:43<2:42:14, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 77/4620 [02:45<2:41:29, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 78/4620 [02:47<2:40:26, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 79/4620 [02:49<2:39:35, 2.11s/batch]
Evaluating perplexity: 2%|▏ | 80/4620 [02:52<2:40:22, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 81/4620 [02:54<2:40:42, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 82/4620 [02:56<2:40:58, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 83/4620 [02:58<2:40:51, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 84/4620 [03:00<2:40:57, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 85/4620 [03:02<2:40:04, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 86/4620 [03:04<2:40:03, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 87/4620 [03:06<2:40:19, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 88/4620 [03:09<2:40:02, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 89/4620 [03:11<2:40:02, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 90/4620 [03:13<2:40:19, 2.12s/batch]
Evaluating perplexity: 2%|▏ | 91/4620 [03:15<2:40:48, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 92/4620 [03:17<2:41:37, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 93/4620 [03:19<2:41:53, 2.15s/batch]
Evaluating perplexity: 2%|▏ | 94/4620 [03:21<2:42:12, 2.15s/batch]
Evaluating perplexity: 2%|▏ | 95/4620 [03:24<2:42:20, 2.15s/batch]
Evaluating perplexity: 2%|▏ | 96/4620 [03:26<2:42:29, 2.16s/batch]
Evaluating perplexity: 2%|▏ | 97/4620 [03:28<2:42:28, 2.16s/batch]
Evaluating perplexity: 2%|▏ | 98/4620 [03:30<2:40:50, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 99/4620 [03:32<2:41:07, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 100/4620 [03:34<2:41:00, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 101/4620 [03:36<2:40:54, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 102/4620 [03:39<2:40:46, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 103/4620 [03:41<2:40:17, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 104/4620 [03:43<2:40:28, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 105/4620 [03:45<2:40:24, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 106/4620 [03:47<2:40:14, 2.13s/batch]
Evaluating perplexity: 2%|▏ | 107/4620 [03:49<2:41:04, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 108/4620 [03:51<2:40:45, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 109/4620 [03:53<2:40:43, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 110/4620 [03:56<2:40:54, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 111/4620 [03:58<2:40:32, 2.14s/batch]
Evaluating perplexity: 2%|▏ | 112/4620 [04:00<2:40:24, 2.13s/batch] |