elliotthwang committed on
Commit ae230db · verified · 1 Parent(s): 5e1a132

Upload tokenizer

added_tokens.json ADDED
@@ -0,0 +1,984 @@
+ {
+   "\t\t": 50294,
+   "\t\t\t": 50293,
+   "\t\t\t\t": 50292,
+   "\t\t\t\t\t": 50291,
+   "\t\t\t\t\t\t": 50290,
+   "\t\t\t\t\t\t\t": 50289,
+   "\t\t\t\t\t\t\t\t": 50288,
+   "\t\t\t\t\t\t\t\t\t": 50287,
+   " ": 50286,
+   " ": 50285,
+   " ": 50284,
+   " ": 50283,
+   " ": 50282,
+   " ": 50281,
+   " ": 50280,
+   " ": 50279,
+   " ": 50278,
+   " ": 50277,
+   " ": 50276,
+   " ": 50275,
+   " ": 50274,
+   " ": 50273,
+   " ": 50272,
+   " ": 50271,
+   " ": 50270,
+   " ": 50269,
+   " ": 50268,
+   " ": 50267,
+   " ": 50266,
+   " ": 50265,
+   " ": 50264,
+   " ": 50263,
+   " ": 50262,
+   " ": 50261,
+   " ": 50260,
+   " ": 50259,
+   " ": 50258,
+   " ": 50257,
+   "<SPL_0": 50295,
+   "<SPL_1": 50296,
+   "<SPL_10": 50305,
+   "<SPL_100": 50395,
+   "<SPL_101": 50396,
+   "<SPL_102": 50397,
+   "<SPL_103": 50398,
+   "<SPL_104": 50399,
+   "<SPL_105": 50400,
+   "<SPL_106": 50401,
+   "<SPL_107": 50402,
+   "<SPL_108": 50403,
+   "<SPL_109": 50404,
+   "<SPL_11": 50306,
+   "<SPL_110": 50405,
+   "<SPL_111": 50406,
+   "<SPL_112": 50407,
+   "<SPL_113": 50408,
+   "<SPL_114": 50409,
+   "<SPL_115": 50410,
+   "<SPL_116": 50411,
+   "<SPL_117": 50412,
+   "<SPL_118": 50413,
+   "<SPL_119": 50414,
+   "<SPL_12": 50307,
+   "<SPL_120": 50415,
+   "<SPL_121": 50416,
+   "<SPL_122": 50417,
+   "<SPL_123": 50418,
+   "<SPL_124": 50419,
+   "<SPL_125": 50420,
+   "<SPL_126": 50421,
+   "<SPL_127": 50422,
+   "<SPL_128": 50423,
+   "<SPL_129": 50424,
+   "<SPL_13": 50308,
+   "<SPL_130": 50425,
+   "<SPL_131": 50426,
+   "<SPL_132": 50427,
+   "<SPL_133": 50428,
+   "<SPL_134": 50429,
+   "<SPL_135": 50430,
+   "<SPL_136": 50431,
+   "<SPL_137": 50432,
+   "<SPL_138": 50433,
+   "<SPL_139": 50434,
+   "<SPL_14": 50309,
+   "<SPL_140": 50435,
+   "<SPL_141": 50436,
+   "<SPL_142": 50437,
+   "<SPL_143": 50438,
+   "<SPL_144": 50439,
+   "<SPL_145": 50440,
+   "<SPL_146": 50441,
+   "<SPL_147": 50442,
+   "<SPL_148": 50443,
+   "<SPL_149": 50444,
+   "<SPL_15": 50310,
+   "<SPL_150": 50445,
+   "<SPL_151": 50446,
+   "<SPL_152": 50447,
+   "<SPL_153": 50448,
+   "<SPL_154": 50449,
+   "<SPL_155": 50450,
+   "<SPL_156": 50451,
+   "<SPL_157": 50452,
+   "<SPL_158": 50453,
+   "<SPL_159": 50454,
+   "<SPL_16": 50311,
+   "<SPL_160": 50455,
+   "<SPL_161": 50456,
+   "<SPL_162": 50457,
+   "<SPL_163": 50458,
+   "<SPL_164": 50459,
+   "<SPL_165": 50460,
+   "<SPL_166": 50461,
+   "<SPL_167": 50462,
+   "<SPL_168": 50463,
+   "<SPL_169": 50464,
+   "<SPL_17": 50312,
+   "<SPL_170": 50465,
+   "<SPL_171": 50466,
+   "<SPL_172": 50467,
+   "<SPL_173": 50468,
+   "<SPL_174": 50469,
+   "<SPL_175": 50470,
+   "<SPL_176": 50471,
+   "<SPL_177": 50472,
+   "<SPL_178": 50473,
+   "<SPL_179": 50474,
+   "<SPL_18": 50313,
+   "<SPL_180": 50475,
+   "<SPL_181": 50476,
+   "<SPL_182": 50477,
+   "<SPL_183": 50478,
+   "<SPL_184": 50479,
+   "<SPL_185": 50480,
+   "<SPL_186": 50481,
+   "<SPL_187": 50482,
+   "<SPL_188": 50483,
+   "<SPL_189": 50484,
+   "<SPL_19": 50314,
+   "<SPL_190": 50485,
+   "<SPL_191": 50486,
+   "<SPL_192": 50487,
+   "<SPL_193": 50488,
+   "<SPL_194": 50489,
+   "<SPL_195": 50490,
+   "<SPL_196": 50491,
+   "<SPL_197": 50492,
+   "<SPL_198": 50493,
+   "<SPL_199": 50494,
+   "<SPL_2": 50297,
+   "<SPL_20": 50315,
+   "<SPL_200": 50495,
+   "<SPL_201": 50496,
+   "<SPL_202": 50497,
+   "<SPL_203": 50498,
+   "<SPL_204": 50499,
+   "<SPL_205": 50500,
+   "<SPL_206": 50501,
+   "<SPL_207": 50502,
+   "<SPL_208": 50503,
+   "<SPL_209": 50504,
+   "<SPL_21": 50316,
+   "<SPL_210": 50505,
+   "<SPL_211": 50506,
+   "<SPL_212": 50507,
+   "<SPL_213": 50508,
+   "<SPL_214": 50509,
+   "<SPL_215": 50510,
+   "<SPL_216": 50511,
+   "<SPL_217": 50512,
+   "<SPL_218": 50513,
+   "<SPL_219": 50514,
+   "<SPL_22": 50317,
+   "<SPL_220": 50515,
+   "<SPL_221": 50516,
+   "<SPL_222": 50517,
+   "<SPL_223": 50518,
+   "<SPL_224": 50519,
+   "<SPL_225": 50520,
+   "<SPL_226": 50521,
+   "<SPL_227": 50522,
+   "<SPL_228": 50523,
+   "<SPL_229": 50524,
+   "<SPL_23": 50318,
+   "<SPL_230": 50525,
+   "<SPL_231": 50526,
+   "<SPL_232": 50527,
+   "<SPL_233": 50528,
+   "<SPL_234": 50529,
+   "<SPL_235": 50530,
+   "<SPL_236": 50531,
+   "<SPL_237": 50532,
+   "<SPL_238": 50533,
+   "<SPL_239": 50534,
+   "<SPL_24": 50319,
+   "<SPL_240": 50535,
+   "<SPL_241": 50536,
+   "<SPL_242": 50537,
+   "<SPL_243": 50538,
+   "<SPL_244": 50539,
+   "<SPL_245": 50540,
+   "<SPL_246": 50541,
+   "<SPL_247": 50542,
+   "<SPL_248": 50543,
+   "<SPL_249": 50544,
+   "<SPL_25": 50320,
+   "<SPL_250": 50545,
+   "<SPL_251": 50546,
+   "<SPL_252": 50547,
+   "<SPL_253": 50548,
+   "<SPL_254": 50549,
+   "<SPL_255": 50550,
+   "<SPL_256": 50551,
+   "<SPL_257": 50552,
+   "<SPL_258": 50553,
+   "<SPL_259": 50554,
+   "<SPL_26": 50321,
+   "<SPL_260": 50555,
+   "<SPL_261": 50556,
+   "<SPL_262": 50557,
+   "<SPL_263": 50558,
+   "<SPL_264": 50559,
+   "<SPL_265": 50560,
+   "<SPL_266": 50561,
+   "<SPL_267": 50562,
+   "<SPL_268": 50563,
+   "<SPL_269": 50564,
+   "<SPL_27": 50322,
+   "<SPL_270": 50565,
+   "<SPL_271": 50566,
+   "<SPL_272": 50567,
+   "<SPL_273": 50568,
+   "<SPL_274": 50569,
+   "<SPL_275": 50570,
+   "<SPL_276": 50571,
+   "<SPL_277": 50572,
+   "<SPL_278": 50573,
+   "<SPL_279": 50574,
+   "<SPL_28": 50323,
+   "<SPL_280": 50575,
+   "<SPL_281": 50576,
+   "<SPL_282": 50577,
+   "<SPL_283": 50578,
+   "<SPL_284": 50579,
+   "<SPL_285": 50580,
+   "<SPL_286": 50581,
+   "<SPL_287": 50582,
+   "<SPL_288": 50583,
+   "<SPL_289": 50584,
+   "<SPL_29": 50324,
+   "<SPL_290": 50585,
+   "<SPL_291": 50586,
+   "<SPL_292": 50587,
+   "<SPL_293": 50588,
+   "<SPL_294": 50589,
+   "<SPL_295": 50590,
+   "<SPL_296": 50591,
+   "<SPL_297": 50592,
+   "<SPL_298": 50593,
+   "<SPL_299": 50594,
+   "<SPL_3": 50298,
+   "<SPL_30": 50325,
+   "<SPL_300": 50595,
+   "<SPL_301": 50596,
+   "<SPL_302": 50597,
+   "<SPL_303": 50598,
+   "<SPL_304": 50599,
+   "<SPL_305": 50600,
+   "<SPL_306": 50601,
+   "<SPL_307": 50602,
+   "<SPL_308": 50603,
+   "<SPL_309": 50604,
+   "<SPL_31": 50326,
+   "<SPL_310": 50605,
+   "<SPL_311": 50606,
+   "<SPL_312": 50607,
+   "<SPL_313": 50608,
+   "<SPL_314": 50609,
+   "<SPL_315": 50610,
+   "<SPL_316": 50611,
+   "<SPL_317": 50612,
+   "<SPL_318": 50613,
+   "<SPL_319": 50614,
+   "<SPL_32": 50327,
+   "<SPL_320": 50615,
+   "<SPL_321": 50616,
+   "<SPL_322": 50617,
+   "<SPL_323": 50618,
+   "<SPL_324": 50619,
+   "<SPL_325": 50620,
+   "<SPL_326": 50621,
+   "<SPL_327": 50622,
+   "<SPL_328": 50623,
+   "<SPL_329": 50624,
+   "<SPL_33": 50328,
+   "<SPL_330": 50625,
+   "<SPL_331": 50626,
+   "<SPL_332": 50627,
+   "<SPL_333": 50628,
+   "<SPL_334": 50629,
+   "<SPL_335": 50630,
+   "<SPL_336": 50631,
+   "<SPL_337": 50632,
+   "<SPL_338": 50633,
+   "<SPL_339": 50634,
+   "<SPL_34": 50329,
+   "<SPL_340": 50635,
+   "<SPL_341": 50636,
+   "<SPL_342": 50637,
+   "<SPL_343": 50638,
+   "<SPL_344": 50639,
+   "<SPL_345": 50640,
+   "<SPL_346": 50641,
+   "<SPL_347": 50642,
+   "<SPL_348": 50643,
+   "<SPL_349": 50644,
+   "<SPL_35": 50330,
+   "<SPL_350": 50645,
+   "<SPL_351": 50646,
+   "<SPL_352": 50647,
+   "<SPL_353": 50648,
+   "<SPL_354": 50649,
+   "<SPL_355": 50650,
+   "<SPL_356": 50651,
+   "<SPL_357": 50652,
+   "<SPL_358": 50653,
+   "<SPL_359": 50654,
+   "<SPL_36": 50331,
+   "<SPL_360": 50655,
+   "<SPL_361": 50656,
+   "<SPL_362": 50657,
+   "<SPL_363": 50658,
+   "<SPL_364": 50659,
+   "<SPL_365": 50660,
+   "<SPL_366": 50661,
+   "<SPL_367": 50662,
+   "<SPL_368": 50663,
+   "<SPL_369": 50664,
+   "<SPL_37": 50332,
+   "<SPL_370": 50665,
+   "<SPL_371": 50666,
+   "<SPL_372": 50667,
+   "<SPL_373": 50668,
+   "<SPL_374": 50669,
+   "<SPL_375": 50670,
+   "<SPL_376": 50671,
+   "<SPL_377": 50672,
+   "<SPL_378": 50673,
+   "<SPL_379": 50674,
+   "<SPL_38": 50333,
+   "<SPL_380": 50675,
+   "<SPL_381": 50676,
+   "<SPL_382": 50677,
+   "<SPL_383": 50678,
+   "<SPL_384": 50679,
+   "<SPL_385": 50680,
+   "<SPL_386": 50681,
+   "<SPL_387": 50682,
+   "<SPL_388": 50683,
+   "<SPL_389": 50684,
+   "<SPL_39": 50334,
+   "<SPL_390": 50685,
+   "<SPL_391": 50686,
+   "<SPL_392": 50687,
+   "<SPL_393": 50688,
+   "<SPL_394": 50689,
+   "<SPL_395": 50690,
+   "<SPL_396": 50691,
+   "<SPL_397": 50692,
+   "<SPL_398": 50693,
+   "<SPL_399": 50694,
+   "<SPL_4": 50299,
+   "<SPL_40": 50335,
+   "<SPL_400": 50695,
+   "<SPL_401": 50696,
+   "<SPL_402": 50697,
+   "<SPL_403": 50698,
+   "<SPL_404": 50699,
+   "<SPL_405": 50700,
+   "<SPL_406": 50701,
+   "<SPL_407": 50702,
+   "<SPL_408": 50703,
+   "<SPL_409": 50704,
+   "<SPL_41": 50336,
+   "<SPL_410": 50705,
+   "<SPL_411": 50706,
+   "<SPL_412": 50707,
+   "<SPL_413": 50708,
+   "<SPL_414": 50709,
+   "<SPL_415": 50710,
+   "<SPL_416": 50711,
+   "<SPL_417": 50712,
+   "<SPL_418": 50713,
+   "<SPL_419": 50714,
+   "<SPL_42": 50337,
+   "<SPL_420": 50715,
+   "<SPL_421": 50716,
+   "<SPL_422": 50717,
+   "<SPL_423": 50718,
+   "<SPL_424": 50719,
+   "<SPL_425": 50720,
+   "<SPL_426": 50721,
+   "<SPL_427": 50722,
+   "<SPL_428": 50723,
+   "<SPL_429": 50724,
+   "<SPL_43": 50338,
+   "<SPL_430": 50725,
+   "<SPL_431": 50726,
+   "<SPL_432": 50727,
+   "<SPL_433": 50728,
+   "<SPL_434": 50729,
+   "<SPL_435": 50730,
+   "<SPL_436": 50731,
+   "<SPL_437": 50732,
+   "<SPL_438": 50733,
+   "<SPL_439": 50734,
+   "<SPL_44": 50339,
+   "<SPL_440": 50735,
+   "<SPL_441": 50736,
+   "<SPL_442": 50737,
+   "<SPL_443": 50738,
+   "<SPL_444": 50739,
+   "<SPL_445": 50740,
+   "<SPL_446": 50741,
+   "<SPL_447": 50742,
+   "<SPL_448": 50743,
+   "<SPL_449": 50744,
+   "<SPL_45": 50340,
+   "<SPL_450": 50745,
+   "<SPL_451": 50746,
+   "<SPL_452": 50747,
+   "<SPL_453": 50748,
+   "<SPL_454": 50749,
+   "<SPL_455": 50750,
+   "<SPL_456": 50751,
+   "<SPL_457": 50752,
+   "<SPL_458": 50753,
+   "<SPL_459": 50754,
+   "<SPL_46": 50341,
+   "<SPL_460": 50755,
+   "<SPL_461": 50756,
+   "<SPL_462": 50757,
+   "<SPL_463": 50758,
+   "<SPL_464": 50759,
+   "<SPL_465": 50760,
+   "<SPL_466": 50761,
+   "<SPL_467": 50762,
+   "<SPL_468": 50763,
+   "<SPL_469": 50764,
+   "<SPL_47": 50342,
+   "<SPL_470": 50765,
+   "<SPL_471": 50766,
+   "<SPL_472": 50767,
+   "<SPL_473": 50768,
+   "<SPL_474": 50769,
+   "<SPL_475": 50770,
+   "<SPL_476": 50771,
+   "<SPL_477": 50772,
+   "<SPL_478": 50773,
+   "<SPL_479": 50774,
+   "<SPL_48": 50343,
+   "<SPL_480": 50775,
+   "<SPL_481": 50776,
+   "<SPL_482": 50777,
+   "<SPL_483": 50778,
+   "<SPL_484": 50779,
+   "<SPL_485": 50780,
+   "<SPL_486": 50781,
+   "<SPL_487": 50782,
+   "<SPL_488": 50783,
+   "<SPL_489": 50784,
+   "<SPL_49": 50344,
+   "<SPL_490": 50785,
+   "<SPL_491": 50786,
+   "<SPL_492": 50787,
+   "<SPL_493": 50788,
+   "<SPL_494": 50789,
+   "<SPL_495": 50790,
+   "<SPL_496": 50791,
+   "<SPL_497": 50792,
+   "<SPL_498": 50793,
+   "<SPL_499": 50794,
+   "<SPL_5": 50300,
+   "<SPL_50": 50345,
+   "<SPL_500": 50795,
+   "<SPL_501": 50796,
+   "<SPL_502": 50797,
+   "<SPL_503": 50798,
+   "<SPL_504": 50799,
+   "<SPL_505": 50800,
+   "<SPL_506": 50801,
+   "<SPL_507": 50802,
+   "<SPL_508": 50803,
+   "<SPL_509": 50804,
+   "<SPL_51": 50346,
+   "<SPL_510": 50805,
+   "<SPL_511": 50806,
+   "<SPL_512": 50807,
+   "<SPL_513": 50808,
+   "<SPL_514": 50809,
+   "<SPL_515": 50810,
+   "<SPL_516": 50811,
+   "<SPL_517": 50812,
+   "<SPL_518": 50813,
+   "<SPL_519": 50814,
+   "<SPL_52": 50347,
+   "<SPL_520": 50815,
+   "<SPL_521": 50816,
+   "<SPL_522": 50817,
+   "<SPL_523": 50818,
+   "<SPL_524": 50819,
+   "<SPL_525": 50820,
+   "<SPL_526": 50821,
+   "<SPL_527": 50822,
+   "<SPL_528": 50823,
+   "<SPL_529": 50824,
+   "<SPL_53": 50348,
+   "<SPL_530": 50825,
+   "<SPL_531": 50826,
+   "<SPL_532": 50827,
+   "<SPL_533": 50828,
+   "<SPL_534": 50829,
+   "<SPL_535": 50830,
+   "<SPL_536": 50831,
+   "<SPL_537": 50832,
+   "<SPL_538": 50833,
+   "<SPL_539": 50834,
+   "<SPL_54": 50349,
+   "<SPL_540": 50835,
+   "<SPL_541": 50836,
+   "<SPL_542": 50837,
+   "<SPL_543": 50838,
+   "<SPL_544": 50839,
+   "<SPL_545": 50840,
+   "<SPL_546": 50841,
+   "<SPL_547": 50842,
+   "<SPL_548": 50843,
+   "<SPL_549": 50844,
+   "<SPL_55": 50350,
+   "<SPL_550": 50845,
+   "<SPL_551": 50846,
+   "<SPL_552": 50847,
+   "<SPL_553": 50848,
+   "<SPL_554": 50849,
+   "<SPL_555": 50850,
+   "<SPL_556": 50851,
+   "<SPL_557": 50852,
+   "<SPL_558": 50853,
+   "<SPL_559": 50854,
+   "<SPL_56": 50351,
+   "<SPL_560": 50855,
+   "<SPL_561": 50856,
+   "<SPL_562": 50857,
+   "<SPL_563": 50858,
+   "<SPL_564": 50859,
+   "<SPL_565": 50860,
+   "<SPL_566": 50861,
+   "<SPL_567": 50862,
+   "<SPL_568": 50863,
+   "<SPL_569": 50864,
+   "<SPL_57": 50352,
+   "<SPL_570": 50865,
+   "<SPL_571": 50866,
+   "<SPL_572": 50867,
+   "<SPL_573": 50868,
+   "<SPL_574": 50869,
+   "<SPL_575": 50870,
+   "<SPL_576": 50871,
+   "<SPL_577": 50872,
+   "<SPL_578": 50873,
+   "<SPL_579": 50874,
+   "<SPL_58": 50353,
+   "<SPL_580": 50875,
+   "<SPL_581": 50876,
+   "<SPL_582": 50877,
+   "<SPL_583": 50878,
+   "<SPL_584": 50879,
+   "<SPL_585": 50880,
+   "<SPL_586": 50881,
+   "<SPL_587": 50882,
+   "<SPL_588": 50883,
+   "<SPL_589": 50884,
+   "<SPL_59": 50354,
+   "<SPL_590": 50885,
+   "<SPL_591": 50886,
+   "<SPL_592": 50887,
+   "<SPL_593": 50888,
+   "<SPL_594": 50889,
+   "<SPL_595": 50890,
+   "<SPL_596": 50891,
+   "<SPL_597": 50892,
+   "<SPL_598": 50893,
+   "<SPL_599": 50894,
+   "<SPL_6": 50301,
+   "<SPL_60": 50355,
+   "<SPL_600": 50895,
+   "<SPL_601": 50896,
+   "<SPL_602": 50897,
+   "<SPL_603": 50898,
+   "<SPL_604": 50899,
+   "<SPL_605": 50900,
+   "<SPL_606": 50901,
+   "<SPL_607": 50902,
+   "<SPL_608": 50903,
+   "<SPL_609": 50904,
+   "<SPL_61": 50356,
+   "<SPL_610": 50905,
+   "<SPL_611": 50906,
+   "<SPL_612": 50907,
+   "<SPL_613": 50908,
+   "<SPL_614": 50909,
+   "<SPL_615": 50910,
+   "<SPL_616": 50911,
+   "<SPL_617": 50912,
+   "<SPL_618": 50913,
+   "<SPL_619": 50914,
+   "<SPL_62": 50357,
+   "<SPL_620": 50915,
+   "<SPL_621": 50916,
+   "<SPL_622": 50917,
+   "<SPL_623": 50918,
+   "<SPL_624": 50919,
+   "<SPL_625": 50920,
+   "<SPL_626": 50921,
+   "<SPL_627": 50922,
+   "<SPL_628": 50923,
+   "<SPL_629": 50924,
+   "<SPL_63": 50358,
+   "<SPL_630": 50925,
+   "<SPL_631": 50926,
+   "<SPL_632": 50927,
+   "<SPL_633": 50928,
+   "<SPL_634": 50929,
+   "<SPL_635": 50930,
+   "<SPL_636": 50931,
+   "<SPL_637": 50932,
+   "<SPL_638": 50933,
+   "<SPL_639": 50934,
+   "<SPL_64": 50359,
+   "<SPL_640": 50935,
+   "<SPL_641": 50936,
+   "<SPL_642": 50937,
+   "<SPL_643": 50938,
+   "<SPL_644": 50939,
+   "<SPL_645": 50940,
+   "<SPL_646": 50941,
+   "<SPL_647": 50942,
+   "<SPL_648": 50943,
+   "<SPL_649": 50944,
+   "<SPL_65": 50360,
+   "<SPL_650": 50945,
+   "<SPL_651": 50946,
+   "<SPL_652": 50947,
+   "<SPL_653": 50948,
+   "<SPL_654": 50949,
+   "<SPL_655": 50950,
+   "<SPL_656": 50951,
+   "<SPL_657": 50952,
+   "<SPL_658": 50953,
+   "<SPL_659": 50954,
+   "<SPL_66": 50361,
+   "<SPL_660": 50955,
+   "<SPL_661": 50956,
+   "<SPL_662": 50957,
+   "<SPL_663": 50958,
+   "<SPL_664": 50959,
+   "<SPL_665": 50960,
+   "<SPL_666": 50961,
+   "<SPL_667": 50962,
+   "<SPL_668": 50963,
+   "<SPL_669": 50964,
+   "<SPL_67": 50362,
+   "<SPL_670": 50965,
+   "<SPL_671": 50966,
+   "<SPL_672": 50967,
+   "<SPL_673": 50968,
+   "<SPL_674": 50969,
+   "<SPL_675": 50970,
+   "<SPL_676": 50971,
+   "<SPL_677": 50972,
+   "<SPL_678": 50973,
+   "<SPL_679": 50974,
+   "<SPL_68": 50363,
+   "<SPL_680": 50975,
+   "<SPL_681": 50976,
+   "<SPL_682": 50977,
+   "<SPL_683": 50978,
+   "<SPL_684": 50979,
+   "<SPL_685": 50980,
+   "<SPL_686": 50981,
+   "<SPL_687": 50982,
+   "<SPL_688": 50983,
+   "<SPL_689": 50984,
+   "<SPL_69": 50364,
+   "<SPL_690": 50985,
+   "<SPL_691": 50986,
+   "<SPL_692": 50987,
+   "<SPL_693": 50988,
+   "<SPL_694": 50989,
+   "<SPL_695": 50990,
+   "<SPL_696": 50991,
+   "<SPL_697": 50992,
+   "<SPL_698": 50993,
+   "<SPL_699": 50994,
+   "<SPL_7": 50302,
+   "<SPL_70": 50365,
+   "<SPL_700": 50995,
+   "<SPL_701": 50996,
+   "<SPL_702": 50997,
+   "<SPL_703": 50998,
+   "<SPL_704": 50999,
+   "<SPL_705": 51000,
+   "<SPL_706": 51001,
+   "<SPL_707": 51002,
+   "<SPL_708": 51003,
+   "<SPL_709": 51004,
+   "<SPL_71": 50366,
+   "<SPL_710": 51005,
+   "<SPL_711": 51006,
+   "<SPL_712": 51007,
+   "<SPL_713": 51008,
+   "<SPL_714": 51009,
+   "<SPL_715": 51010,
+   "<SPL_716": 51011,
+   "<SPL_717": 51012,
+   "<SPL_718": 51013,
+   "<SPL_719": 51014,
+   "<SPL_72": 50367,
+   "<SPL_720": 51015,
+   "<SPL_721": 51016,
+   "<SPL_722": 51017,
+   "<SPL_723": 51018,
+   "<SPL_724": 51019,
+   "<SPL_725": 51020,
+   "<SPL_726": 51021,
+   "<SPL_727": 51022,
+   "<SPL_728": 51023,
+   "<SPL_729": 51024,
+   "<SPL_73": 50368,
+   "<SPL_730": 51025,
+   "<SPL_731": 51026,
+   "<SPL_732": 51027,
+   "<SPL_733": 51028,
+   "<SPL_734": 51029,
+   "<SPL_735": 51030,
+   "<SPL_736": 51031,
+   "<SPL_737": 51032,
+   "<SPL_738": 51033,
+   "<SPL_739": 51034,
+   "<SPL_74": 50369,
+   "<SPL_740": 51035,
+   "<SPL_741": 51036,
+   "<SPL_742": 51037,
+   "<SPL_743": 51038,
+   "<SPL_744": 51039,
+   "<SPL_745": 51040,
+   "<SPL_746": 51041,
+   "<SPL_747": 51042,
+   "<SPL_748": 51043,
+   "<SPL_749": 51044,
+   "<SPL_75": 50370,
+   "<SPL_750": 51045,
+   "<SPL_751": 51046,
+   "<SPL_752": 51047,
+   "<SPL_753": 51048,
+   "<SPL_754": 51049,
+   "<SPL_755": 51050,
+   "<SPL_756": 51051,
+   "<SPL_757": 51052,
+   "<SPL_758": 51053,
+   "<SPL_759": 51054,
+   "<SPL_76": 50371,
+   "<SPL_760": 51055,
+   "<SPL_761": 51056,
+   "<SPL_762": 51057,
+   "<SPL_763": 51058,
+   "<SPL_764": 51059,
+   "<SPL_765": 51060,
+   "<SPL_766": 51061,
+   "<SPL_767": 51062,
+   "<SPL_768": 51063,
+   "<SPL_769": 51064,
+   "<SPL_77": 50372,
+   "<SPL_770": 51065,
+   "<SPL_771": 51066,
+   "<SPL_772": 51067,
+   "<SPL_773": 51068,
+   "<SPL_774": 51069,
+   "<SPL_775": 51070,
+   "<SPL_776": 51071,
+   "<SPL_777": 51072,
+   "<SPL_778": 51073,
+   "<SPL_779": 51074,
+   "<SPL_78": 50373,
+   "<SPL_780": 51075,
+   "<SPL_781": 51076,
+   "<SPL_782": 51077,
+   "<SPL_783": 51078,
+   "<SPL_784": 51079,
+   "<SPL_785": 51080,
+   "<SPL_786": 51081,
+   "<SPL_787": 51082,
+   "<SPL_788": 51083,
+   "<SPL_789": 51084,
+   "<SPL_79": 50374,
+   "<SPL_790": 51085,
+   "<SPL_791": 51086,
+   "<SPL_792": 51087,
+   "<SPL_793": 51088,
+   "<SPL_794": 51089,
+   "<SPL_795": 51090,
+   "<SPL_796": 51091,
+   "<SPL_797": 51092,
+   "<SPL_798": 51093,
+   "<SPL_799": 51094,
+   "<SPL_8": 50303,
+   "<SPL_80": 50375,
+   "<SPL_800": 51095,
+   "<SPL_801": 51096,
+   "<SPL_802": 51097,
+   "<SPL_803": 51098,
+   "<SPL_804": 51099,
+   "<SPL_805": 51100,
+   "<SPL_806": 51101,
+   "<SPL_807": 51102,
+   "<SPL_808": 51103,
+   "<SPL_809": 51104,
+   "<SPL_81": 50376,
+   "<SPL_810": 51105,
+   "<SPL_811": 51106,
+   "<SPL_812": 51107,
+   "<SPL_813": 51108,
+   "<SPL_814": 51109,
+   "<SPL_815": 51110,
+   "<SPL_816": 51111,
+   "<SPL_817": 51112,
+   "<SPL_818": 51113,
+   "<SPL_819": 51114,
+   "<SPL_82": 50377,
+   "<SPL_820": 51115,
+   "<SPL_821": 51116,
+   "<SPL_822": 51117,
+   "<SPL_823": 51118,
+   "<SPL_824": 51119,
+   "<SPL_825": 51120,
+   "<SPL_826": 51121,
+   "<SPL_827": 51122,
+   "<SPL_828": 51123,
+   "<SPL_829": 51124,
+   "<SPL_83": 50378,
+   "<SPL_830": 51125,
+   "<SPL_831": 51126,
+   "<SPL_832": 51127,
+   "<SPL_833": 51128,
+   "<SPL_834": 51129,
+   "<SPL_835": 51130,
+   "<SPL_836": 51131,
+   "<SPL_837": 51132,
+   "<SPL_838": 51133,
+   "<SPL_839": 51134,
+   "<SPL_84": 50379,
+   "<SPL_840": 51135,
+   "<SPL_841": 51136,
+   "<SPL_842": 51137,
+   "<SPL_843": 51138,
+   "<SPL_844": 51139,
+   "<SPL_845": 51140,
+   "<SPL_846": 51141,
+   "<SPL_847": 51142,
+   "<SPL_848": 51143,
+   "<SPL_849": 51144,
+   "<SPL_85": 50380,
+   "<SPL_850": 51145,
+   "<SPL_851": 51146,
+   "<SPL_852": 51147,
+   "<SPL_853": 51148,
+   "<SPL_854": 51149,
+   "<SPL_855": 51150,
+   "<SPL_856": 51151,
+   "<SPL_857": 51152,
+   "<SPL_858": 51153,
+   "<SPL_859": 51154,
+   "<SPL_86": 50381,
+   "<SPL_860": 51155,
+   "<SPL_861": 51156,
+   "<SPL_862": 51157,
+   "<SPL_863": 51158,
+   "<SPL_864": 51159,
+   "<SPL_865": 51160,
+   "<SPL_866": 51161,
+   "<SPL_867": 51162,
+   "<SPL_868": 51163,
+   "<SPL_869": 51164,
+   "<SPL_87": 50382,
+   "<SPL_870": 51165,
+   "<SPL_871": 51166,
+   "<SPL_872": 51167,
+   "<SPL_873": 51168,
+   "<SPL_874": 51169,
+   "<SPL_875": 51170,
+   "<SPL_876": 51171,
+   "<SPL_877": 51172,
+   "<SPL_878": 51173,
+   "<SPL_879": 51174,
+   "<SPL_88": 50383,
+   "<SPL_880": 51175,
+   "<SPL_881": 51176,
+   "<SPL_882": 51177,
+   "<SPL_883": 51178,
+   "<SPL_884": 51179,
+   "<SPL_885": 51180,
+   "<SPL_886": 51181,
+   "<SPL_887": 51182,
+   "<SPL_888": 51183,
+   "<SPL_889": 51184,
+   "<SPL_89": 50384,
+   "<SPL_890": 51185,
+   "<SPL_891": 51186,
+   "<SPL_892": 51187,
+   "<SPL_893": 51188,
+   "<SPL_894": 51189,
+   "<SPL_895": 51190,
+   "<SPL_896": 51191,
+   "<SPL_897": 51192,
+   "<SPL_898": 51193,
+   "<SPL_899": 51194,
+   "<SPL_9": 50304,
+   "<SPL_90": 50385,
+   "<SPL_900": 51195,
+   "<SPL_901": 51196,
+   "<SPL_902": 51197,
+   "<SPL_903": 51198,
+   "<SPL_904": 51199,
+   "<SPL_905": 51200,
+   "<SPL_906": 51201,
+   "<SPL_907": 51202,
+   "<SPL_908": 51203,
+   "<SPL_909": 51204,
+   "<SPL_91": 50386,
+   "<SPL_910": 51205,
+   "<SPL_911": 51206,
+   "<SPL_912": 51207,
+   "<SPL_913": 51208,
+   "<SPL_914": 51209,
+   "<SPL_915": 51210,
+   "<SPL_916": 51211,
+   "<SPL_917": 51212,
+   "<SPL_918": 51213,
+   "<SPL_919": 51214,
+   "<SPL_92": 50387,
+   "<SPL_920": 51215,
+   "<SPL_921": 51216,
+   "<SPL_922": 51217,
+   "<SPL_923": 51218,
+   "<SPL_924": 51219,
+   "<SPL_925": 51220,
+   "<SPL_926": 51221,
+   "<SPL_927": 51222,
+   "<SPL_928": 51223,
+   "<SPL_929": 51224,
+   "<SPL_93": 50388,
+   "<SPL_930": 51225,
+   "<SPL_931": 51226,
+   "<SPL_932": 51227,
+   "<SPL_933": 51228,
+   "<SPL_934": 51229,
+   "<SPL_935": 51230,
+   "<SPL_936": 51231,
+   "<SPL_937": 51232,
+   "<SPL_938": 51233,
+   "<SPL_939": 51234,
+   "<SPL_94": 50389,
+   "<SPL_940": 51235,
+   "<SPL_941": 51236,
+   "<SPL_942": 51237,
+   "<SPL_95": 50390,
+   "<SPL_96": 50391,
+   "<SPL_97": 50392,
+   "<SPL_98": 50393,
+   "<SPL_99": 50394,
+   "<|im_end|>": 51238
+ }
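
The `<SPL_n` entries in added_tokens.json appear to follow a simple rule: the id of `<SPL_n` is 50295 + n, with `<SPL_0` at 50295 and `<SPL_942` at 51237, immediately before `<|im_end|>` at 51238. The sketch below checks that rule on a small excerpt copied from the diff. The helper names `spl_index` and `check_spl_ids` and the constant `SPL_BASE_ID` are hypothetical, introduced only for illustration; they are not part of the commit or of any library.

```python
import json
from typing import Dict, List, Optional

# Excerpt of added_tokens.json, copied from entries shown in the diff.
# (The full file has 983 entries; only a few are reproduced here.)
ADDED_TOKENS_EXCERPT: Dict[str, int] = json.loads("""
{
  "\\t\\t": 50294,
  "<SPL_0": 50295,
  "<SPL_1": 50296,
  "<SPL_10": 50305,
  "<SPL_100": 50395,
  "<SPL_942": 51237,
  "<|im_end|>": 51238
}
""")

# Assumption for this sketch: ids are contiguous starting at "<SPL_0".
SPL_BASE_ID = 50295


def spl_index(token: str) -> Optional[int]:
    """Return n for a token named "<SPL_n", or None for any other token."""
    prefix = "<SPL_"
    suffix = token[len(prefix):]
    if token.startswith(prefix) and suffix.isdigit():
        return int(suffix)
    return None


def check_spl_ids(added_tokens: Dict[str, int]) -> List[str]:
    """List the "<SPL_n" tokens whose id differs from SPL_BASE_ID + n."""
    mismatches = []
    for token, token_id in added_tokens.items():
        n = spl_index(token)
        if n is not None and token_id != SPL_BASE_ID + n:
            mismatches.append(token)
    return mismatches


if __name__ == "__main__":
    # An empty list means every "<SPL_n" entry in the excerpt follows the rule.
    print(check_spl_ids(ADDED_TOKENS_EXCERPT))
```

To run the same check on the real file, load it with `json.load` and pass the resulting dict to `check_spl_ids`. The file is sorted by key as a string, which is why `<SPL_10` follows `<SPL_1` in the diff even though the ids are numeric.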
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "bos_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "<|im_end|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "</s>",
+   "unk_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
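
Note that special_tokens_map.json mixes two shapes: `bos_token`, `eos_token`, and `unk_token` are objects with a `content` field, while `pad_token` is a bare string. A minimal sketch of reading both shapes uniformly follows; the helper name `token_content` is hypothetical and used only for illustration, and the JSON literal is copied from the diff.

```python
import json

# Contents of special_tokens_map.json as shown in the diff.
SPECIAL_TOKENS_MAP = json.loads("""
{
  "bos_token": {"content": "<|endoftext|>", "lstrip": false,
                "normalized": false, "rstrip": false, "single_word": false},
  "eos_token": {"content": "<|im_end|>", "lstrip": false,
                "normalized": false, "rstrip": false, "single_word": false},
  "pad_token": "</s>",
  "unk_token": {"content": "<|endoftext|>", "lstrip": false,
                "normalized": false, "rstrip": false, "single_word": false}
}
""")


def token_content(entry):
    """Return the token string whether the entry is a bare string or an object."""
    if isinstance(entry, str):
        return entry
    return entry["content"]


# Map each special-token role to its literal token string.
SPECIAL_TOKEN_STRINGS = {
    name: token_content(entry) for name, entry in SPECIAL_TOKENS_MAP.items()
}

if __name__ == "__main__":
    for name, text in SPECIAL_TOKEN_STRINGS.items():
        print(f"{name}: {text}")
```

In this map the beginning-of-sequence and unknown tokens share the string `<|endoftext|>`, while the end-of-sequence token is `<|im_end|>`, the last entry (id 51238) in added_tokens.json.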
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff
 
vocab.json ADDED
The diff for this file is too large to render. See raw diff