rfbr committed · Commit d3e9e26 · verified · 1 Parent(s): c4a4924

Update README, tokenizer files; rename image (remove v2 suffix)

Files changed (5)
  1. README.md +10 -8
  2. kairos_seq_model.png +3 -0
  3. special_tokens_map.json +5 -22
  4. tokenizer.json +2458 -150
  5. tokenizer_config.json +8 -863
README.md CHANGED
@@ -99,7 +99,7 @@ The list of available checkpoints is disclosed below:
99
  | shuffle_eq_2024 | 2.2T | 2024 | 2020 | yes |
100
  | shuffle_eq_2025 | 2.5T | 2024| 2020 | yes |
101
 
102
- <sup>*</sup> **Note on Non-Cooldown Variants:** For these specific checkpoints, we also provide "non-cooldown" counterparts. These are extracted directly from the training process at the equivalent token count without applying a learning rate decay (cooldown phase).
103
  ## Training Details
104
 
105
  ### Training Data
@@ -141,7 +141,7 @@ While our models are primarily designed to facilitate research on LLM temporalit
141
 
142
  ### Temporal improvements
143
 
144
- We underline in the paper [Understanding Data Temporality Impact on Large Language Models Pre-training]() that our sequentially trained Helium 6B benefits from more up-to-date as tested on our [KairosQA](https://huggingface.co/datasets/kyutai/KairosQA) dataset.
145
 
146
 
147
 
@@ -156,12 +156,14 @@ Helium 6B models are licensed under the CC-BY-SA 4.0 license.
156
  If you use one of these models, please cite:
157
 
158
  ```bibtex
159
- @unpublished{kairos2025temporality,
160
- title={Understanding Data Temporality Impact on Large Language Models Pre-training},
161
- author={Pilchen, Hippolyte and Fabre, Romain and Signe Talla, Franck and Perez, Patrick and Grave, Edouard.},
162
- note={Preprint. Work in progress.},
163
- year={2025},
164
- url={https://github.com/kyutai-labs/kairos}
 
 
165
  }
166
  ```
167
 
 
99
  | shuffle_eq_2024 | 2.2T | 2024 | 2020 | yes |
100
  | shuffle_eq_2025 | 2.5T | 2024| 2020 | yes |
101
 
102
+ <sup>*</sup> **Note on Non-Cooldown Variants:** For these specific checkpoints, we can also provide "non-cooldown" counterparts. These are extracted directly from the training process at the equivalent token count without applying a learning rate decay (cooldown phase).
103
  ## Training Details
104
 
105
  ### Training Data
 
141
 
142
  ### Temporal improvements
143
 
144
+ We underline in the paper [Understanding Data Temporality Impact on Large Language Models Pre-training](https://arxiv.org/abs/2605.22769) that our sequentially trained Helium 6B benefits from more up-to-date as tested on our [KairosQA](https://huggingface.co/datasets/kyutai/KairosQA) dataset.
145
 
146
 
147
 
 
156
  If you use one of these models, please cite:
157
 
158
  ```bibtex
159
+ @misc{pilchen2026understandingdatatemporalityimpact,
160
+ title={Understanding Data Temporality Impact on Large Language Models Pre-training},
161
+ author={Hippolyte Pilchen and Romain Fabre and Franck Signe Talla and Patrick Perez and Edouard Grave},
162
+ year={2026},
163
+ eprint={2605.22769},
164
+ archivePrefix={arXiv},
165
+ primaryClass={cs.CL},
166
+ url={https://arxiv.org/abs/2605.22769},
167
  }
168
  ```
169
 
kairos_seq_model.png ADDED

Git LFS Details

  • SHA256: 7b6da9b5efeb71d2cae06f1c36b78463dcad17dfe2b9d2a1d69c2a5b552950ef
  • Pointer size: 132 Bytes
  • Size of remote file: 1.34 MB
special_tokens_map.json CHANGED
@@ -1,23 +1,6 @@
1
  {
2
- "bos_token": {
3
- "content": "<s>",
4
- "lstrip": false,
5
- "normalized": false,
6
- "rstrip": false,
7
- "single_word": false
8
- },
9
- "eos_token": {
10
- "content": "</s>",
11
- "lstrip": false,
12
- "normalized": false,
13
- "rstrip": false,
14
- "single_word": false
15
- },
16
- "unk_token": {
17
- "content": "<unk>",
18
- "lstrip": false,
19
- "normalized": false,
20
- "rstrip": false,
21
- "single_word": false
22
- }
23
- }
 
1
  {
2
+ "bos_token": "<s>",
3
+ "eos_token": "</s>",
4
+ "unk_token": "<unk>",
5
+ "pad_token": "<pad>"
6
+ }
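
The hunk above replaces the nested per-token objects with plain strings and adds a `pad_token`. Code that reads `special_tokens_map.json` directly may therefore meet either layout. The following is a minimal sketch, not part of the commit, of one way to handle both; the helper name `normalize_special_tokens` is hypothetical, and the two sample payloads are copied from the removed and added lines of this diff.

```python
import json


def normalize_special_tokens(mapping):
    """Return {token_name: token_string} for either layout shown in the diff.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    normalized = {}
    for name, value in mapping.items():
        if isinstance(value, dict):
            # Old layout: {"content": "<s>", "lstrip": false, ...}
            normalized[name] = value["content"]
        else:
            # New layout: the token is stored as a plain string
            normalized[name] = value
    return normalized


# Old layout, taken from the removed lines of the hunk (bos_token only).
old_layout = json.loads(
    '{"bos_token": {"content": "<s>", "lstrip": false, '
    '"normalized": false, "rstrip": false, "single_word": false}}'
)

# New layout, taken from the added lines of the hunk.
new_layout = json.loads(
    '{"bos_token": "<s>", "eos_token": "</s>", '
    '"unk_token": "<unk>", "pad_token": "<pad>"}'
)

old_tokens = normalize_special_tokens(old_layout)
new_tokens = normalize_special_tokens(new_layout)
```

Both layouts yield the same string for `bos_token`, so downstream code that only needs the token text does not have to care which version of the file it loaded.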
 
tokenizer.json CHANGED
@@ -46,7 +46,7 @@
46
  "lstrip": false,
47
  "rstrip": false,
48
  "normalized": false,
49
- "special": false
50
  },
51
  {
52
  "id": 5,
@@ -55,7 +55,7 @@
55
  "lstrip": false,
56
  "rstrip": false,
57
  "normalized": false,
58
- "special": false
59
  },
60
  {
61
  "id": 6,
@@ -64,7 +64,7 @@
64
  "lstrip": false,
65
  "rstrip": false,
66
  "normalized": false,
67
- "special": false
68
  },
69
  {
70
  "id": 7,
@@ -73,7 +73,7 @@
73
  "lstrip": false,
74
  "rstrip": false,
75
  "normalized": false,
76
- "special": false
77
  },
78
  {
79
  "id": 8,
@@ -82,7 +82,7 @@
82
  "lstrip": false,
83
  "rstrip": false,
84
  "normalized": false,
85
- "special": false
86
  },
87
  {
88
  "id": 9,
@@ -91,7 +91,7 @@
91
  "lstrip": false,
92
  "rstrip": false,
93
  "normalized": false,
94
- "special": false
95
  },
96
  {
97
  "id": 10,
@@ -100,7 +100,7 @@
100
  "lstrip": false,
101
  "rstrip": false,
102
  "normalized": false,
103
- "special": false
104
  },
105
  {
106
  "id": 11,
@@ -109,7 +109,7 @@
109
  "lstrip": false,
110
  "rstrip": false,
111
  "normalized": false,
112
- "special": false
113
  },
114
  {
115
  "id": 12,
@@ -118,7 +118,7 @@
118
  "lstrip": false,
119
  "rstrip": false,
120
  "normalized": false,
121
- "special": false
122
  },
123
  {
124
  "id": 13,
@@ -127,7 +127,7 @@
127
  "lstrip": false,
128
  "rstrip": false,
129
  "normalized": false,
130
- "special": false
131
  },
132
  {
133
  "id": 14,
@@ -136,7 +136,7 @@
136
  "lstrip": false,
137
  "rstrip": false,
138
  "normalized": false,
139
- "special": false
140
  },
141
  {
142
  "id": 15,
@@ -145,7 +145,7 @@
145
  "lstrip": false,
146
  "rstrip": false,
147
  "normalized": false,
148
- "special": false
149
  },
150
  {
151
  "id": 16,
@@ -154,7 +154,7 @@
154
  "lstrip": false,
155
  "rstrip": false,
156
  "normalized": false,
157
- "special": false
158
  },
159
  {
160
  "id": 17,
@@ -163,7 +163,7 @@
163
  "lstrip": false,
164
  "rstrip": false,
165
  "normalized": false,
166
- "special": false
167
  },
168
  {
169
  "id": 18,
@@ -172,7 +172,7 @@
172
  "lstrip": false,
173
  "rstrip": false,
174
  "normalized": false,
175
- "special": false
176
  },
177
  {
178
  "id": 19,
@@ -181,7 +181,7 @@
181
  "lstrip": false,
182
  "rstrip": false,
183
  "normalized": false,
184
- "special": false
185
  },
186
  {
187
  "id": 20,
@@ -190,7 +190,7 @@
190
  "lstrip": false,
191
  "rstrip": false,
192
  "normalized": false,
193
- "special": false
194
  },
195
  {
196
  "id": 21,
@@ -199,7 +199,7 @@
199
  "lstrip": false,
200
  "rstrip": false,
201
  "normalized": false,
202
- "special": false
203
  },
204
  {
205
  "id": 22,
@@ -208,7 +208,7 @@
208
  "lstrip": false,
209
  "rstrip": false,
210
  "normalized": false,
211
- "special": false
212
  },
213
  {
214
  "id": 23,
@@ -217,7 +217,7 @@
217
  "lstrip": false,
218
  "rstrip": false,
219
  "normalized": false,
220
- "special": false
221
  },
222
  {
223
  "id": 24,
@@ -226,7 +226,7 @@
226
  "lstrip": false,
227
  "rstrip": false,
228
  "normalized": false,
229
- "special": false
230
  },
231
  {
232
  "id": 25,
@@ -235,7 +235,7 @@
235
  "lstrip": false,
236
  "rstrip": false,
237
  "normalized": false,
238
- "special": false
239
  },
240
  {
241
  "id": 26,
@@ -244,7 +244,7 @@
244
  "lstrip": false,
245
  "rstrip": false,
246
  "normalized": false,
247
- "special": false
248
  },
249
  {
250
  "id": 27,
@@ -253,7 +253,7 @@
253
  "lstrip": false,
254
  "rstrip": false,
255
  "normalized": false,
256
- "special": false
257
  },
258
  {
259
  "id": 28,
@@ -262,7 +262,7 @@
262
  "lstrip": false,
263
  "rstrip": false,
264
  "normalized": false,
265
- "special": false
266
  },
267
  {
268
  "id": 29,
@@ -271,7 +271,7 @@
271
  "lstrip": false,
272
  "rstrip": false,
273
  "normalized": false,
274
- "special": false
275
  },
276
  {
277
  "id": 30,
@@ -280,7 +280,7 @@
280
  "lstrip": false,
281
  "rstrip": false,
282
  "normalized": false,
283
- "special": false
284
  },
285
  {
286
  "id": 31,
@@ -289,7 +289,7 @@
289
  "lstrip": false,
290
  "rstrip": false,
291
  "normalized": false,
292
- "special": false
293
  },
294
  {
295
  "id": 32,
@@ -298,7 +298,7 @@
298
  "lstrip": false,
299
  "rstrip": false,
300
  "normalized": false,
301
- "special": false
302
  },
303
  {
304
  "id": 33,
@@ -307,7 +307,7 @@
307
  "lstrip": false,
308
  "rstrip": false,
309
  "normalized": false,
310
- "special": false
311
  },
312
  {
313
  "id": 34,
@@ -316,7 +316,7 @@
316
  "lstrip": false,
317
  "rstrip": false,
318
  "normalized": false,
319
- "special": false
320
  },
321
  {
322
  "id": 35,
@@ -325,7 +325,7 @@
325
  "lstrip": false,
326
  "rstrip": false,
327
  "normalized": false,
328
- "special": false
329
  },
330
  {
331
  "id": 36,
@@ -334,7 +334,7 @@
334
  "lstrip": false,
335
  "rstrip": false,
336
  "normalized": false,
337
- "special": false
338
  },
339
  {
340
  "id": 37,
@@ -343,7 +343,7 @@
343
  "lstrip": false,
344
  "rstrip": false,
345
  "normalized": false,
346
- "special": false
347
  },
348
  {
349
  "id": 38,
@@ -352,7 +352,7 @@
352
  "lstrip": false,
353
  "rstrip": false,
354
  "normalized": false,
355
- "special": false
356
  },
357
  {
358
  "id": 39,
@@ -361,7 +361,7 @@
361
  "lstrip": false,
362
  "rstrip": false,
363
  "normalized": false,
364
- "special": false
365
  },
366
  {
367
  "id": 40,
@@ -370,7 +370,7 @@
370
  "lstrip": false,
371
  "rstrip": false,
372
  "normalized": false,
373
- "special": false
374
  },
375
  {
376
  "id": 41,
@@ -379,7 +379,7 @@
379
  "lstrip": false,
380
  "rstrip": false,
381
  "normalized": false,
382
- "special": false
383
  },
384
  {
385
  "id": 42,
@@ -388,7 +388,7 @@
388
  "lstrip": false,
389
  "rstrip": false,
390
  "normalized": false,
391
- "special": false
392
  },
393
  {
394
  "id": 43,
@@ -397,7 +397,7 @@
397
  "lstrip": false,
398
  "rstrip": false,
399
  "normalized": false,
400
- "special": false
401
  },
402
  {
403
  "id": 44,
@@ -406,7 +406,7 @@
406
  "lstrip": false,
407
  "rstrip": false,
408
  "normalized": false,
409
- "special": false
410
  },
411
  {
412
  "id": 45,
@@ -415,7 +415,7 @@
415
  "lstrip": false,
416
  "rstrip": false,
417
  "normalized": false,
418
- "special": false
419
  },
420
  {
421
  "id": 46,
@@ -424,7 +424,7 @@
424
  "lstrip": false,
425
  "rstrip": false,
426
  "normalized": false,
427
- "special": false
428
  },
429
  {
430
  "id": 47,
@@ -433,7 +433,7 @@
433
  "lstrip": false,
434
  "rstrip": false,
435
  "normalized": false,
436
- "special": false
437
  },
438
  {
439
  "id": 48,
@@ -442,7 +442,7 @@
442
  "lstrip": false,
443
  "rstrip": false,
444
  "normalized": false,
445
- "special": false
446
  },
447
  {
448
  "id": 49,
@@ -451,7 +451,7 @@
451
  "lstrip": false,
452
  "rstrip": false,
453
  "normalized": false,
454
- "special": false
455
  },
456
  {
457
  "id": 50,
@@ -460,7 +460,7 @@
460
  "lstrip": false,
461
  "rstrip": false,
462
  "normalized": false,
463
- "special": false
464
  },
465
  {
466
  "id": 51,
@@ -469,7 +469,7 @@
469
  "lstrip": false,
470
  "rstrip": false,
471
  "normalized": false,
472
- "special": false
473
  },
474
  {
475
  "id": 52,
@@ -478,7 +478,7 @@
478
  "lstrip": false,
479
  "rstrip": false,
480
  "normalized": false,
481
- "special": false
482
  },
483
  {
484
  "id": 53,
@@ -487,7 +487,7 @@
487
  "lstrip": false,
488
  "rstrip": false,
489
  "normalized": false,
490
- "special": false
491
  },
492
  {
493
  "id": 54,
@@ -496,7 +496,7 @@
496
  "lstrip": false,
497
  "rstrip": false,
498
  "normalized": false,
499
- "special": false
500
  },
501
  {
502
  "id": 55,
@@ -505,7 +505,7 @@
505
  "lstrip": false,
506
  "rstrip": false,
507
  "normalized": false,
508
- "special": false
509
  },
510
  {
511
  "id": 56,
@@ -514,7 +514,7 @@
514
  "lstrip": false,
515
  "rstrip": false,
516
  "normalized": false,
517
- "special": false
518
  },
519
  {
520
  "id": 57,
@@ -523,7 +523,7 @@
523
  "lstrip": false,
524
  "rstrip": false,
525
  "normalized": false,
526
- "special": false
527
  },
528
  {
529
  "id": 58,
@@ -532,7 +532,7 @@
532
  "lstrip": false,
533
  "rstrip": false,
534
  "normalized": false,
535
- "special": false
536
  },
537
  {
538
  "id": 59,
@@ -541,7 +541,7 @@
541
  "lstrip": false,
542
  "rstrip": false,
543
  "normalized": false,
544
- "special": false
545
  },
546
  {
547
  "id": 60,
@@ -550,7 +550,7 @@
550
  "lstrip": false,
551
  "rstrip": false,
552
  "normalized": false,
553
- "special": false
554
  },
555
  {
556
  "id": 61,
@@ -559,7 +559,7 @@
559
  "lstrip": false,
560
  "rstrip": false,
561
  "normalized": false,
562
- "special": false
563
  },
564
  {
565
  "id": 62,
@@ -568,7 +568,7 @@
568
  "lstrip": false,
569
  "rstrip": false,
570
  "normalized": false,
571
- "special": false
572
  },
573
  {
574
  "id": 63,
@@ -577,7 +577,7 @@
577
  "lstrip": false,
578
  "rstrip": false,
579
  "normalized": false,
580
- "special": false
581
  },
582
  {
583
  "id": 64,
@@ -586,7 +586,7 @@
586
  "lstrip": false,
587
  "rstrip": false,
588
  "normalized": false,
589
- "special": false
590
  },
591
  {
592
  "id": 65,
@@ -595,7 +595,7 @@
595
  "lstrip": false,
596
  "rstrip": false,
597
  "normalized": false,
598
- "special": false
599
  },
600
  {
601
  "id": 66,
@@ -604,7 +604,7 @@
604
  "lstrip": false,
605
  "rstrip": false,
606
  "normalized": false,
607
- "special": false
608
  },
609
  {
610
  "id": 67,
@@ -613,7 +613,7 @@
613
  "lstrip": false,
614
  "rstrip": false,
615
  "normalized": false,
616
- "special": false
617
  },
618
  {
619
  "id": 68,
@@ -622,7 +622,7 @@
622
  "lstrip": false,
623
  "rstrip": false,
624
  "normalized": false,
625
- "special": false
626
  },
627
  {
628
  "id": 69,
@@ -631,7 +631,7 @@
631
  "lstrip": false,
632
  "rstrip": false,
633
  "normalized": false,
634
- "special": false
635
  },
636
  {
637
  "id": 70,
@@ -640,7 +640,7 @@
640
  "lstrip": false,
641
  "rstrip": false,
642
  "normalized": false,
643
- "special": false
644
  },
645
  {
646
  "id": 71,
@@ -649,7 +649,7 @@
649
  "lstrip": false,
650
  "rstrip": false,
651
  "normalized": false,
652
- "special": false
653
  },
654
  {
655
  "id": 72,
@@ -658,7 +658,7 @@
658
  "lstrip": false,
659
  "rstrip": false,
660
  "normalized": false,
661
- "special": false
662
  },
663
  {
664
  "id": 73,
@@ -667,7 +667,7 @@
667
  "lstrip": false,
668
  "rstrip": false,
669
  "normalized": false,
670
- "special": false
671
  },
672
  {
673
  "id": 74,
@@ -676,7 +676,7 @@
676
  "lstrip": false,
677
  "rstrip": false,
678
  "normalized": false,
679
- "special": false
680
  },
681
  {
682
  "id": 75,
@@ -685,7 +685,7 @@
685
  "lstrip": false,
686
  "rstrip": false,
687
  "normalized": false,
688
- "special": false
689
  },
690
  {
691
  "id": 76,
@@ -694,7 +694,7 @@
694
  "lstrip": false,
695
  "rstrip": false,
696
  "normalized": false,
697
- "special": false
698
  },
699
  {
700
  "id": 77,
@@ -703,7 +703,7 @@
703
  "lstrip": false,
704
  "rstrip": false,
705
  "normalized": false,
706
- "special": false
707
  },
708
  {
709
  "id": 78,
@@ -712,7 +712,7 @@
712
  "lstrip": false,
713
  "rstrip": false,
714
  "normalized": false,
715
- "special": false
716
  },
717
  {
718
  "id": 79,
@@ -721,7 +721,7 @@
721
  "lstrip": false,
722
  "rstrip": false,
723
  "normalized": false,
724
- "special": false
725
  },
726
  {
727
  "id": 80,
@@ -730,7 +730,7 @@
730
  "lstrip": false,
731
  "rstrip": false,
732
  "normalized": false,
733
- "special": false
734
  },
735
  {
736
  "id": 81,
@@ -739,7 +739,7 @@
739
  "lstrip": false,
740
  "rstrip": false,
741
  "normalized": false,
742
- "special": false
743
  },
744
  {
745
  "id": 82,
@@ -748,7 +748,7 @@
748
  "lstrip": false,
749
  "rstrip": false,
750
  "normalized": false,
751
- "special": false
752
  },
753
  {
754
  "id": 83,
@@ -757,7 +757,7 @@
757
  "lstrip": false,
758
  "rstrip": false,
759
  "normalized": false,
760
- "special": false
761
  },
762
  {
763
  "id": 84,
@@ -766,7 +766,7 @@
766
  "lstrip": false,
767
  "rstrip": false,
768
  "normalized": false,
769
- "special": false
770
  },
771
  {
772
  "id": 85,
@@ -775,7 +775,7 @@
775
  "lstrip": false,
776
  "rstrip": false,
777
  "normalized": false,
778
- "special": false
779
  },
780
  {
781
  "id": 86,
@@ -784,7 +784,7 @@
784
  "lstrip": false,
785
  "rstrip": false,
786
  "normalized": false,
787
- "special": false
788
  },
789
  {
790
  "id": 87,
@@ -793,7 +793,7 @@
793
  "lstrip": false,
794
  "rstrip": false,
795
  "normalized": false,
796
- "special": false
797
  },
798
  {
799
  "id": 88,
@@ -802,7 +802,7 @@
802
  "lstrip": false,
803
  "rstrip": false,
804
  "normalized": false,
805
- "special": false
806
  },
807
  {
808
  "id": 89,
@@ -811,7 +811,7 @@
811
  "lstrip": false,
812
  "rstrip": false,
813
  "normalized": false,
814
- "special": false
815
  },
816
  {
817
  "id": 90,
@@ -820,7 +820,7 @@
820
  "lstrip": false,
821
  "rstrip": false,
822
  "normalized": false,
823
- "special": false
824
  },
825
  {
826
  "id": 91,
@@ -829,7 +829,7 @@
829
  "lstrip": false,
830
  "rstrip": false,
831
  "normalized": false,
832
- "special": false
833
  },
834
  {
835
  "id": 92,
@@ -838,7 +838,7 @@
838
  "lstrip": false,
839
  "rstrip": false,
840
  "normalized": false,
841
- "special": false
842
  },
843
  {
844
  "id": 93,
@@ -847,7 +847,7 @@
847
  "lstrip": false,
848
  "rstrip": false,
849
  "normalized": false,
850
- "special": false
851
  },
852
  {
853
  "id": 94,
@@ -856,7 +856,7 @@
856
  "lstrip": false,
857
  "rstrip": false,
858
  "normalized": false,
859
- "special": false
860
  },
861
  {
862
  "id": 95,
@@ -865,7 +865,7 @@
865
  "lstrip": false,
866
  "rstrip": false,
867
  "normalized": false,
868
- "special": false
869
  },
870
  {
871
  "id": 96,
@@ -874,7 +874,7 @@
874
  "lstrip": false,
875
  "rstrip": false,
876
  "normalized": false,
877
- "special": false
878
  },
879
  {
880
  "id": 97,
@@ -883,7 +883,7 @@
883
  "lstrip": false,
884
  "rstrip": false,
885
  "normalized": false,
886
- "special": false
887
  },
888
  {
889
  "id": 98,
@@ -892,7 +892,7 @@
892
  "lstrip": false,
893
  "rstrip": false,
894
  "normalized": false,
895
- "special": false
896
  },
897
  {
898
  "id": 99,
@@ -901,7 +901,7 @@
901
  "lstrip": false,
902
  "rstrip": false,
903
  "normalized": false,
904
- "special": false
905
  },
906
  {
907
  "id": 100,
@@ -910,7 +910,7 @@
910
  "lstrip": false,
911
  "rstrip": false,
912
  "normalized": false,
913
- "special": false
914
  },
915
  {
916
  "id": 101,
@@ -919,7 +919,7 @@
919
  "lstrip": false,
920
  "rstrip": false,
921
  "normalized": false,
922
- "special": false
923
  },
924
  {
925
  "id": 102,
@@ -928,7 +928,7 @@
928
  "lstrip": false,
929
  "rstrip": false,
930
  "normalized": false,
931
- "special": false
932
  },
933
  {
934
  "id": 103,
@@ -937,7 +937,7 @@
937
  "lstrip": false,
938
  "rstrip": false,
939
  "normalized": false,
940
- "special": false
941
  },
942
  {
943
  "id": 104,
@@ -946,7 +946,7 @@
946
  "lstrip": false,
947
  "rstrip": false,
948
  "normalized": false,
949
- "special": false
950
  },
951
  {
952
  "id": 105,
@@ -955,55 +955,2363 @@
955
  "lstrip": false,
956
  "rstrip": false,
957
  "normalized": false,
958
- "special": false
959
- }
960
- ],
961
- "normalizer": {
962
- "type": "Sequence",
963
- "normalizers": [
964
- {
965
- "type": "Replace",
966
- "pattern": {
967
- "String": " "
968
- },
969
- "content": "▁"
970
- }
971
- ]
972
- },
973
- "pre_tokenizer": null,
974
- "post_processor": {
975
- "type": "TemplateProcessing",
976
- "single": [
977
- {
978
- "SpecialToken": {
979
- "id": "<s>",
980
- "type_id": 0
981
- }
982
- },
983
- {
984
- "Sequence": {
985
- "id": "A",
986
- "type_id": 0
987
- }
988
- }
989
- ],
990
- "pair": [
991
- {
992
- "SpecialToken": {
993
- "id": "<s>",
994
- "type_id": 0
995
- }
996
- },
997
- {
998
- "Sequence": {
999
- "id": "A",
1000
- "type_id": 0
1001
- }
1002
- },
1003
- {
1004
- "SpecialToken": {
1005
- "id": "<s>",
1006
- "type_id": 1
1007
  }
1008
  },
1009
  {
 
46
  "lstrip": false,
47
  "rstrip": false,
48
  "normalized": false,
49
+ "special": true
50
  },
51
  {
52
  "id": 5,
 
55
  "lstrip": false,
56
  "rstrip": false,
57
  "normalized": false,
58
+ "special": true
59
  },
60
  {
61
  "id": 6,
 
64
  "lstrip": false,
65
  "rstrip": false,
66
  "normalized": false,
67
+ "special": true
68
  },
69
  {
70
  "id": 7,
 
73
  "lstrip": false,
74
  "rstrip": false,
75
  "normalized": false,
76
+ "special": true
77
  },
78
  {
79
  "id": 8,
 
82
  "lstrip": false,
83
  "rstrip": false,
84
  "normalized": false,
85
+ "special": true
86
  },
87
  {
88
  "id": 9,
 
91
  "lstrip": false,
92
  "rstrip": false,
93
  "normalized": false,
94
+ "special": true
95
  },
96
  {
97
  "id": 10,
 
100
  "lstrip": false,
101
  "rstrip": false,
102
  "normalized": false,
103
+ "special": true
104
  },
105
  {
106
  "id": 11,
 
109
  "lstrip": false,
110
  "rstrip": false,
111
  "normalized": false,
112
+ "special": true
113
  },
114
  {
115
  "id": 12,
 
118
  "lstrip": false,
119
  "rstrip": false,
120
  "normalized": false,
121
+ "special": true
122
  },
123
  {
124
  "id": 13,
 
127
  "lstrip": false,
128
  "rstrip": false,
129
  "normalized": false,
130
+ "special": true
131
  },
132
  {
133
  "id": 14,
 
136
  "lstrip": false,
137
  "rstrip": false,
138
  "normalized": false,
139
+ "special": true
140
  },
141
  {
142
  "id": 15,
 
145
  "lstrip": false,
146
  "rstrip": false,
147
  "normalized": false,
148
+ "special": true
149
  },
150
  {
151
  "id": 16,
 
154
  "lstrip": false,
155
  "rstrip": false,
156
  "normalized": false,
157
+ "special": true
158
  },
159
  {
160
  "id": 17,
 
163
  "lstrip": false,
164
  "rstrip": false,
165
  "normalized": false,
166
+ "special": true
167
  },
168
  {
169
  "id": 18,
 
172
  "lstrip": false,
173
  "rstrip": false,
174
  "normalized": false,
175
+ "special": true
176
  },
177
  {
178
  "id": 19,
 
181
  "lstrip": false,
182
  "rstrip": false,
183
  "normalized": false,
184
+ "special": true
185
  },
186
  {
187
  "id": 20,
 
190
  "lstrip": false,
191
  "rstrip": false,
192
  "normalized": false,
193
+ "special": true
194
  },
195
  {
196
  "id": 21,
 
199
  "lstrip": false,
200
  "rstrip": false,
201
  "normalized": false,
202
+ "special": true
203
  },
204
  {
205
  "id": 22,
 
208
  "lstrip": false,
209
  "rstrip": false,
210
  "normalized": false,
211
+ "special": true
212
  },
213
  {
214
  "id": 23,
 
217
  "lstrip": false,
218
  "rstrip": false,
219
  "normalized": false,
220
+ "special": true
221
  },
222
  {
223
  "id": 24,
 
226
  "lstrip": false,
227
  "rstrip": false,
228
  "normalized": false,
229
+ "special": true
230
  },
231
  {
232
  "id": 25,
 
235
  "lstrip": false,
236
  "rstrip": false,
237
  "normalized": false,
238
+ "special": true
239
  },
240
  {
241
  "id": 26,
 
244
  "lstrip": false,
245
  "rstrip": false,
246
  "normalized": false,
247
+ "special": true
248
  },
249
  {
250
  "id": 27,
 
253
  "lstrip": false,
254
  "rstrip": false,
255
  "normalized": false,
256
+ "special": true
257
  },
258
  {
259
  "id": 28,
 
262
  "lstrip": false,
263
  "rstrip": false,
264
  "normalized": false,
265
+ "special": true
266
  },
267
  {
268
  "id": 29,
 
271
  "lstrip": false,
272
  "rstrip": false,
273
  "normalized": false,
274
+ "special": true
275
  },
276
  {
277
  "id": 30,
 
280
  "lstrip": false,
281
  "rstrip": false,
282
  "normalized": false,
283
+ "special": true
284
  },
285
  {
286
  "id": 31,
 
289
  "lstrip": false,
290
  "rstrip": false,
291
  "normalized": false,
292
+ "special": true
293
  },
294
  {
295
  "id": 32,
 
298
  "lstrip": false,
299
  "rstrip": false,
300
  "normalized": false,
301
+ "special": true
302
  },
303
  {
304
  "id": 33,
 
307
  "lstrip": false,
308
  "rstrip": false,
309
  "normalized": false,
310
+ "special": true
311
  },
312
  {
313
  "id": 34,
 
316
  "lstrip": false,
317
  "rstrip": false,
318
  "normalized": false,
319
+ "special": true
320
  },
321
  {
322
  "id": 35,
 
325
  "lstrip": false,
326
  "rstrip": false,
327
  "normalized": false,
328
+ "special": true
329
  },
330
  {
331
  "id": 36,
 
334
  "lstrip": false,
335
  "rstrip": false,
336
  "normalized": false,
337
+ "special": true
338
  },
339
  {
340
  "id": 37,
 
343
  "lstrip": false,
344
  "rstrip": false,
345
  "normalized": false,
346
+ "special": true
347
  },
348
  {
349
  "id": 38,
 
352
  "lstrip": false,
353
  "rstrip": false,
354
  "normalized": false,
355
+ "special": true
356
  },
357
  {
358
  "id": 39,
 
361
  "lstrip": false,
362
  "rstrip": false,
363
  "normalized": false,
364
+ "special": true
365
  },
366
  {
367
  "id": 40,
 
370
  "lstrip": false,
371
  "rstrip": false,
372
  "normalized": false,
373
+ "special": true
374
  },
375
  {
376
  "id": 41,
 
379
  "lstrip": false,
380
  "rstrip": false,
381
  "normalized": false,
382
+ "special": true
383
  },
384
  {
385
  "id": 42,
 
388
  "lstrip": false,
389
  "rstrip": false,
390
  "normalized": false,
391
+ "special": true
392
  },
393
  {
394
  "id": 43,
 
397
  "lstrip": false,
398
  "rstrip": false,
399
  "normalized": false,
400
+ "special": true
401
  },
402
  {
403
  "id": 44,
 
406
  "lstrip": false,
407
  "rstrip": false,
408
  "normalized": false,
409
+ "special": true
410
  },
411
  {
412
  "id": 45,
 
415
  "lstrip": false,
416
  "rstrip": false,
417
  "normalized": false,
418
+ "special": true
419
  },
420
  {
421
  "id": 46,
 
424
  "lstrip": false,
425
  "rstrip": false,
426
  "normalized": false,
427
+ "special": true
428
  },
429
  {
430
  "id": 47,
 
433
  "lstrip": false,
434
  "rstrip": false,
435
  "normalized": false,
436
+ "special": true
437
  },
438
  {
439
  "id": 48,
 
442
  "lstrip": false,
443
  "rstrip": false,
444
  "normalized": false,
445
+ "special": true
446
  },
447
  {
448
  "id": 49,
 
451
  "lstrip": false,
452
  "rstrip": false,
453
  "normalized": false,
454
+ "special": true
455
  },
456
  {
457
  "id": 50,
 
460
  "lstrip": false,
461
  "rstrip": false,
462
  "normalized": false,
463
+ "special": true
464
  },
465
  {
466
  "id": 51,
 
469
  "lstrip": false,
470
  "rstrip": false,
471
  "normalized": false,
472
+ "special": true
473
  },
474
  {
475
  "id": 52,
 
478
  "lstrip": false,
479
  "rstrip": false,
480
  "normalized": false,
481
+ "special": true
482
  },
483
  {
484
  "id": 53,
 
487
  "lstrip": false,
488
  "rstrip": false,
489
  "normalized": false,
490
+ "special": true
491
  },
492
  {
493
  "id": 54,
 
496
  "lstrip": false,
497
  "rstrip": false,
498
  "normalized": false,
499
+ "special": true
500
  },
501
  {
502
  "id": 55,
 
505
  "lstrip": false,
506
  "rstrip": false,
507
  "normalized": false,
508
+ "special": true
509
  },
510
  {
511
  "id": 56,
 
514
  "lstrip": false,
515
  "rstrip": false,
516
  "normalized": false,
517
+ "special": true
518
  },
519
  {
520
  "id": 57,
 
523
  "lstrip": false,
524
  "rstrip": false,
525
  "normalized": false,
526
+ "special": true
527
  },
528
  {
529
  "id": 58,
 
532
  "lstrip": false,
533
  "rstrip": false,
534
  "normalized": false,
535
+ "special": true
536
  },
537
  {
538
  "id": 59,
 
541
  "lstrip": false,
542
  "rstrip": false,
543
  "normalized": false,
544
+ "special": true
545
  },
546
  {
547
  "id": 60,
 
550
  "lstrip": false,
551
  "rstrip": false,
552
  "normalized": false,
553
+ "special": true
554
  },
555
  {
556
  "id": 61,
 
559
  "lstrip": false,
560
  "rstrip": false,
561
  "normalized": false,
562
+ "special": true
563
  },
564
  {
565
  "id": 62,
 
568
  "lstrip": false,
569
  "rstrip": false,
570
  "normalized": false,
571
+ "special": true
572
  },
573
  {
574
  "id": 63,
 
577
  "lstrip": false,
578
  "rstrip": false,
579
  "normalized": false,
580
+ "special": true
581
  },
582
  {
583
  "id": 64,
 
586
  "lstrip": false,
587
  "rstrip": false,
588
  "normalized": false,
589
+ "special": true
590
  },
591
  {
592
  "id": 65,
 
595
  "lstrip": false,
596
  "rstrip": false,
597
  "normalized": false,
598
+ "special": true
599
  },
600
  {
601
  "id": 66,
 
604
  "lstrip": false,
605
  "rstrip": false,
606
  "normalized": false,
607
+ "special": true
608
  },
609
  {
610
  "id": 67,
 
613
  "lstrip": false,
614
  "rstrip": false,
615
  "normalized": false,
616
+ "special": true
617
  },
618
  {
619
  "id": 68,
 
622
  "lstrip": false,
623
  "rstrip": false,
624
  "normalized": false,
625
+ "special": true
626
  },
627
  {
628
  "id": 69,
 
631
  "lstrip": false,
632
  "rstrip": false,
633
  "normalized": false,
634
+ "special": true
635
  },
636
  {
637
  "id": 70,
 
640
  "lstrip": false,
641
  "rstrip": false,
642
  "normalized": false,
643
+ "special": true
644
  },
645
  {
646
  "id": 71,
 
649
  "lstrip": false,
650
  "rstrip": false,
651
  "normalized": false,
652
+ "special": true
653
  },
654
  {
655
  "id": 72,
 
658
  "lstrip": false,
659
  "rstrip": false,
660
  "normalized": false,
661
+ "special": true
662
  },
663
  {
664
  "id": 73,
 
667
  "lstrip": false,
668
  "rstrip": false,
669
  "normalized": false,
670
+ "special": true
671
  },
672
  {
673
  "id": 74,
 
676
  "lstrip": false,
677
  "rstrip": false,
678
  "normalized": false,
679
+ "special": true
680
  },
681
  {
682
  "id": 75,
 
685
  "lstrip": false,
686
  "rstrip": false,
687
  "normalized": false,
688
+ "special": true
689
  },
690
  {
691
  "id": 76,
 
694
  "lstrip": false,
695
  "rstrip": false,
696
  "normalized": false,
697
+ "special": true
698
  },
699
  {
700
  "id": 77,
 
703
  "lstrip": false,
704
  "rstrip": false,
705
  "normalized": false,
706
+ "special": true
707
  },
708
  {
709
  "id": 78,
 
712
  "lstrip": false,
713
  "rstrip": false,
714
  "normalized": false,
715
+ "special": true
716
  },
717
  {
718
  "id": 79,
 
721
  "lstrip": false,
722
  "rstrip": false,
723
  "normalized": false,
724
+ "special": true
725
  },
726
  {
727
  "id": 80,
 
730
  "lstrip": false,
731
  "rstrip": false,
732
  "normalized": false,
733
+ "special": true
734
  },
735
  {
736
  "id": 81,
 
739
  "lstrip": false,
740
  "rstrip": false,
741
  "normalized": false,
742
+ "special": true
743
  },
744
  {
745
  "id": 82,
 
748
  "lstrip": false,
749
  "rstrip": false,
750
  "normalized": false,
751
+ "special": true
752
  },
753
  {
754
  "id": 83,
 
757
  "lstrip": false,
758
  "rstrip": false,
759
  "normalized": false,
760
+ "special": true
761
  },
762
  {
763
  "id": 84,
 
766
  "lstrip": false,
767
  "rstrip": false,
768
  "normalized": false,
769
+ "special": true
770
  },
771
  {
772
  "id": 85,
 
775
  "lstrip": false,
776
  "rstrip": false,
777
  "normalized": false,
778
+ "special": true
779
  },
780
  {
781
  "id": 86,
 
784
  "lstrip": false,
785
  "rstrip": false,
786
  "normalized": false,
787
+ "special": true
788
  },
789
  {
790
  "id": 87,
 
793
  "lstrip": false,
794
  "rstrip": false,
795
  "normalized": false,
796
+ "special": true
797
  },
798
  {
799
  "id": 88,
 
802
  "lstrip": false,
803
  "rstrip": false,
804
  "normalized": false,
805
+ "special": true
806
  },
807
  {
808
  "id": 89,
 
811
  "lstrip": false,
812
  "rstrip": false,
813
  "normalized": false,
814
+ "special": true
815
  },
816
  {
817
  "id": 90,
 
820
  "lstrip": false,
821
  "rstrip": false,
822
  "normalized": false,
823
+ "special": true
824
  },
825
  {
826
  "id": 91,
 
829
  "lstrip": false,
830
  "rstrip": false,
831
  "normalized": false,
832
+ "special": true
833
  },
834
  {
835
  "id": 92,
 
838
  "lstrip": false,
839
  "rstrip": false,
840
  "normalized": false,
841
+ "special": true
842
  },
843
  {
844
  "id": 93,
 
847
  "lstrip": false,
848
  "rstrip": false,
849
  "normalized": false,
850
+ "special": true
851
  },
852
  {
853
  "id": 94,
 
856
  "lstrip": false,
857
  "rstrip": false,
858
  "normalized": false,
859
+ "special": true
860
  },
861
  {
862
  "id": 95,
 
865
  "lstrip": false,
866
  "rstrip": false,
867
  "normalized": false,
868
+ "special": true
869
  },
870
  {
871
  "id": 96,
 
874
  "lstrip": false,
875
  "rstrip": false,
876
  "normalized": false,
877
+ "special": true
878
  },
879
  {
880
  "id": 97,
 
883
  "lstrip": false,
884
  "rstrip": false,
885
  "normalized": false,
886
+ "special": true
887
  },
888
  {
889
  "id": 98,
 
892
  "lstrip": false,
893
  "rstrip": false,
894
  "normalized": false,
895
+ "special": true
896
  },
897
  {
898
  "id": 99,
 
901
  "lstrip": false,
902
  "rstrip": false,
903
  "normalized": false,
904
+ "special": true
905
  },
906
  {
907
  "id": 100,
 
910
  "lstrip": false,
911
  "rstrip": false,
912
  "normalized": false,
913
+ "special": true
914
  },
915
  {
916
  "id": 101,
 
919
  "lstrip": false,
920
  "rstrip": false,
921
  "normalized": false,
922
+ "special": true
923
  },
924
  {
925
  "id": 102,
 
928
  "lstrip": false,
929
  "rstrip": false,
930
  "normalized": false,
931
+ "special": true
932
  },
933
  {
934
  "id": 103,
 
937
  "lstrip": false,
938
  "rstrip": false,
939
  "normalized": false,
940
+ "special": true
941
  },
942
  {
943
  "id": 104,
 
946
  "lstrip": false,
947
  "rstrip": false,
948
  "normalized": false,
949
+ "special": true
950
  },
951
  {
952
  "id": 105,
 
955
  "lstrip": false,
956
  "rstrip": false,
957
  "normalized": false,
958
+ "special": true
959
+ },
960
+ {
961
+ "id": 106,
962
+ "content": "<0x00>",
963
+ "single_word": false,
964
+ "lstrip": false,
965
+ "rstrip": false,
966
+ "normalized": false,
967
+ "special": true
968
+ },
969
+ {
970
+ "id": 107,
971
+ "content": "<0x01>",
972
+ "single_word": false,
973
+ "lstrip": false,
974
+ "rstrip": false,
975
+ "normalized": false,
976
+ "special": true
977
+ },
978
+ {
979
+ "id": 108,
980
+ "content": "<0x02>",
981
+ "single_word": false,
982
+ "lstrip": false,
983
+ "rstrip": false,
984
+ "normalized": false,
985
+ "special": true
986
+ },
987
+ {
988
+ "id": 109,
989
+ "content": "<0x03>",
990
+ "single_word": false,
991
+ "lstrip": false,
992
+ "rstrip": false,
993
+ "normalized": false,
994
+ "special": true
995
+ },
996
+ {
997
+ "id": 110,
998
+ "content": "<0x04>",
999
+ "single_word": false,
1000
+ "lstrip": false,
1001
+ "rstrip": false,
1002
+ "normalized": false,
1003
+ "special": true
1004
+ },
1005
+ {
1006
+ "id": 111,
1007
+ "content": "<0x05>",
1008
+ "single_word": false,
1009
+ "lstrip": false,
1010
+ "rstrip": false,
1011
+ "normalized": false,
1012
+ "special": true
1013
+ },
1014
+ {
1015
+ "id": 112,
1016
+ "content": "<0x06>",
1017
+ "single_word": false,
1018
+ "lstrip": false,
1019
+ "rstrip": false,
1020
+ "normalized": false,
1021
+ "special": true
1022
+ },
1023
+ {
1024
+ "id": 113,
1025
+ "content": "<0x07>",
1026
+ "single_word": false,
1027
+ "lstrip": false,
1028
+ "rstrip": false,
1029
+ "normalized": false,
1030
+ "special": true
1031
+ },
1032
+ {
1033
+ "id": 114,
1034
+ "content": "<0x08>",
1035
+ "single_word": false,
1036
+ "lstrip": false,
1037
+ "rstrip": false,
1038
+ "normalized": false,
1039
+ "special": true
1040
+ },
1041
+ {
1042
+ "id": 115,
1043
+ "content": "<0x09>",
1044
+ "single_word": false,
1045
+ "lstrip": false,
1046
+ "rstrip": false,
1047
+ "normalized": false,
1048
+ "special": true
1049
+ },
1050
+ {
1051
+ "id": 116,
1052
+ "content": "<0x0A>",
1053
+ "single_word": false,
1054
+ "lstrip": false,
1055
+ "rstrip": false,
1056
+ "normalized": false,
1057
+ "special": true
1058
+ },
1059
+ {
1060
+ "id": 117,
1061
+ "content": "<0x0B>",
1062
+ "single_word": false,
1063
+ "lstrip": false,
1064
+ "rstrip": false,
1065
+ "normalized": false,
1066
+ "special": true
1067
+ },
1068
+ {
1069
+ "id": 118,
1070
+ "content": "<0x0C>",
1071
+ "single_word": false,
1072
+ "lstrip": false,
1073
+ "rstrip": false,
1074
+ "normalized": false,
1075
+ "special": true
1076
+ },
1077
+ {
1078
+ "id": 119,
1079
+ "content": "<0x0D>",
1080
+ "single_word": false,
1081
+ "lstrip": false,
1082
+ "rstrip": false,
1083
+ "normalized": false,
1084
+ "special": true
1085
+ },
1086
+ {
1087
+ "id": 120,
1088
+ "content": "<0x0E>",
1089
+ "single_word": false,
1090
+ "lstrip": false,
1091
+ "rstrip": false,
1092
+ "normalized": false,
1093
+ "special": true
1094
+ },
1095
+ {
1096
+ "id": 121,
1097
+ "content": "<0x0F>",
1098
+ "single_word": false,
1099
+ "lstrip": false,
1100
+ "rstrip": false,
1101
+ "normalized": false,
1102
+ "special": true
1103
+ },
1104
+ {
1105
+ "id": 122,
1106
+ "content": "<0x10>",
1107
+ "single_word": false,
1108
+ "lstrip": false,
1109
+ "rstrip": false,
1110
+ "normalized": false,
1111
+ "special": true
1112
+ },
1113
+ {
1114
+ "id": 123,
1115
+ "content": "<0x11>",
1116
+ "single_word": false,
1117
+ "lstrip": false,
1118
+ "rstrip": false,
1119
+ "normalized": false,
1120
+ "special": true
1121
+ },
1122
+ {
1123
+ "id": 124,
1124
+ "content": "<0x12>",
1125
+ "single_word": false,
1126
+ "lstrip": false,
1127
+ "rstrip": false,
1128
+ "normalized": false,
1129
+ "special": true
1130
+ },
1131
+ {
1132
+ "id": 125,
1133
+ "content": "<0x13>",
1134
+ "single_word": false,
1135
+ "lstrip": false,
1136
+ "rstrip": false,
1137
+ "normalized": false,
1138
+ "special": true
1139
+ },
1140
+ {
1141
+ "id": 126,
1142
+ "content": "<0x14>",
1143
+ "single_word": false,
1144
+ "lstrip": false,
1145
+ "rstrip": false,
1146
+ "normalized": false,
1147
+ "special": true
1148
+ },
1149
+ {
1150
+ "id": 127,
1151
+ "content": "<0x15>",
1152
+ "single_word": false,
1153
+ "lstrip": false,
1154
+ "rstrip": false,
1155
+ "normalized": false,
1156
+ "special": true
1157
+ },
1158
+ {
1159
+ "id": 128,
1160
+ "content": "<0x16>",
1161
+ "single_word": false,
1162
+ "lstrip": false,
1163
+ "rstrip": false,
1164
+ "normalized": false,
1165
+ "special": true
1166
+ },
1167
+ {
1168
+ "id": 129,
1169
+ "content": "<0x17>",
1170
+ "single_word": false,
1171
+ "lstrip": false,
1172
+ "rstrip": false,
1173
+ "normalized": false,
1174
+ "special": true
1175
+ },
1176
+ {
1177
+ "id": 130,
1178
+ "content": "<0x18>",
1179
+ "single_word": false,
1180
+ "lstrip": false,
1181
+ "rstrip": false,
1182
+ "normalized": false,
1183
+ "special": true
1184
+ },
1185
+ {
1186
+ "id": 131,
1187
+ "content": "<0x19>",
1188
+ "single_word": false,
1189
+ "lstrip": false,
1190
+ "rstrip": false,
1191
+ "normalized": false,
1192
+ "special": true
1193
+ },
1194
+ {
1195
+ "id": 132,
1196
+ "content": "<0x1A>",
1197
+ "single_word": false,
1198
+ "lstrip": false,
1199
+ "rstrip": false,
1200
+ "normalized": false,
1201
+ "special": true
1202
+ },
1203
+ {
1204
+ "id": 133,
1205
+ "content": "<0x1B>",
1206
+ "single_word": false,
1207
+ "lstrip": false,
1208
+ "rstrip": false,
1209
+ "normalized": false,
1210
+ "special": true
1211
+ },
1212
+ {
1213
+ "id": 134,
1214
+ "content": "<0x1C>",
1215
+ "single_word": false,
1216
+ "lstrip": false,
1217
+ "rstrip": false,
1218
+ "normalized": false,
1219
+ "special": true
1220
+ },
1221
+ {
1222
+ "id": 135,
1223
+ "content": "<0x1D>",
1224
+ "single_word": false,
1225
+ "lstrip": false,
1226
+ "rstrip": false,
1227
+ "normalized": false,
1228
+ "special": true
1229
+ },
1230
+ {
1231
+ "id": 136,
1232
+ "content": "<0x1E>",
1233
+ "single_word": false,
1234
+ "lstrip": false,
1235
+ "rstrip": false,
1236
+ "normalized": false,
1237
+ "special": true
1238
+ },
1239
+ {
1240
+ "id": 137,
1241
+ "content": "<0x1F>",
1242
+ "single_word": false,
1243
+ "lstrip": false,
1244
+ "rstrip": false,
1245
+ "normalized": false,
1246
+ "special": true
1247
+ },
1248
+ {
1249
+ "id": 138,
1250
+ "content": "<0x20>",
1251
+ "single_word": false,
1252
+ "lstrip": false,
1253
+ "rstrip": false,
1254
+ "normalized": false,
1255
+ "special": true
1256
+ },
1257
+ {
1258
+ "id": 139,
1259
+ "content": "<0x21>",
1260
+ "single_word": false,
1261
+ "lstrip": false,
1262
+ "rstrip": false,
1263
+ "normalized": false,
1264
+ "special": true
1265
+ },
1266
+ {
1267
+ "id": 140,
1268
+ "content": "<0x22>",
1269
+ "single_word": false,
1270
+ "lstrip": false,
1271
+ "rstrip": false,
1272
+ "normalized": false,
1273
+ "special": true
1274
+ },
1275
+ {
1276
+ "id": 141,
1277
+ "content": "<0x23>",
1278
+ "single_word": false,
1279
+ "lstrip": false,
1280
+ "rstrip": false,
1281
+ "normalized": false,
1282
+ "special": true
1283
+ },
1284
+ {
1285
+ "id": 142,
1286
+ "content": "<0x24>",
1287
+ "single_word": false,
1288
+ "lstrip": false,
1289
+ "rstrip": false,
1290
+ "normalized": false,
1291
+ "special": true
1292
+ },
1293
+ {
1294
+ "id": 143,
1295
+ "content": "<0x25>",
1296
+ "single_word": false,
1297
+ "lstrip": false,
1298
+ "rstrip": false,
1299
+ "normalized": false,
1300
+ "special": true
1301
+ },
1302
+ {
1303
+ "id": 144,
1304
+ "content": "<0x26>",
1305
+ "single_word": false,
1306
+ "lstrip": false,
1307
+ "rstrip": false,
1308
+ "normalized": false,
1309
+ "special": true
1310
+ },
1311
+ {
1312
+ "id": 145,
1313
+ "content": "<0x27>",
1314
+ "single_word": false,
1315
+ "lstrip": false,
1316
+ "rstrip": false,
1317
+ "normalized": false,
1318
+ "special": true
1319
+ },
1320
+ {
1321
+ "id": 146,
1322
+ "content": "<0x28>",
1323
+ "single_word": false,
1324
+ "lstrip": false,
1325
+ "rstrip": false,
1326
+ "normalized": false,
1327
+ "special": true
1328
+ },
1329
+ {
1330
+ "id": 147,
1331
+ "content": "<0x29>",
1332
+ "single_word": false,
1333
+ "lstrip": false,
1334
+ "rstrip": false,
1335
+ "normalized": false,
1336
+ "special": true
1337
+ },
1338
+ {
1339
+ "id": 148,
1340
+ "content": "<0x2A>",
1341
+ "single_word": false,
1342
+ "lstrip": false,
1343
+ "rstrip": false,
1344
+ "normalized": false,
1345
+ "special": true
1346
+ },
1347
+ {
1348
+ "id": 149,
1349
+ "content": "<0x2B>",
1350
+ "single_word": false,
1351
+ "lstrip": false,
1352
+ "rstrip": false,
1353
+ "normalized": false,
1354
+ "special": true
1355
+ },
1356
+ {
1357
+ "id": 150,
1358
+ "content": "<0x2C>",
1359
+ "single_word": false,
1360
+ "lstrip": false,
1361
+ "rstrip": false,
1362
+ "normalized": false,
1363
+ "special": true
1364
+ },
1365
+ {
1366
+ "id": 151,
1367
+ "content": "<0x2D>",
1368
+ "single_word": false,
1369
+ "lstrip": false,
1370
+ "rstrip": false,
1371
+ "normalized": false,
1372
+ "special": true
1373
+ },
1374
+ {
1375
+ "id": 152,
1376
+ "content": "<0x2E>",
1377
+ "single_word": false,
1378
+ "lstrip": false,
1379
+ "rstrip": false,
1380
+ "normalized": false,
1381
+ "special": true
1382
+ },
1383
+ {
1384
+ "id": 153,
1385
+ "content": "<0x2F>",
1386
+ "single_word": false,
1387
+ "lstrip": false,
1388
+ "rstrip": false,
1389
+ "normalized": false,
1390
+ "special": true
1391
+ },
1392
+ {
1393
+ "id": 154,
1394
+ "content": "<0x30>",
1395
+ "single_word": false,
1396
+ "lstrip": false,
1397
+ "rstrip": false,
1398
+ "normalized": false,
1399
+ "special": true
1400
+ },
1401
+ {
1402
+ "id": 155,
1403
+ "content": "<0x31>",
1404
+ "single_word": false,
1405
+ "lstrip": false,
1406
+ "rstrip": false,
1407
+ "normalized": false,
1408
+ "special": true
1409
+ },
1410
+ {
1411
+ "id": 156,
1412
+ "content": "<0x32>",
1413
+ "single_word": false,
1414
+ "lstrip": false,
1415
+ "rstrip": false,
1416
+ "normalized": false,
1417
+ "special": true
1418
+ },
1419
+ {
1420
+ "id": 157,
1421
+ "content": "<0x33>",
1422
+ "single_word": false,
1423
+ "lstrip": false,
1424
+ "rstrip": false,
1425
+ "normalized": false,
1426
+ "special": true
1427
+ },
1428
+ {
1429
+ "id": 158,
1430
+ "content": "<0x34>",
1431
+ "single_word": false,
1432
+ "lstrip": false,
1433
+ "rstrip": false,
1434
+ "normalized": false,
1435
+ "special": true
1436
+ },
1437
+ {
1438
+ "id": 159,
1439
+ "content": "<0x35>",
1440
+ "single_word": false,
1441
+ "lstrip": false,
1442
+ "rstrip": false,
1443
+ "normalized": false,
1444
+ "special": true
1445
+ },
1446
+ {
1447
+ "id": 160,
1448
+ "content": "<0x36>",
1449
+ "single_word": false,
1450
+ "lstrip": false,
1451
+ "rstrip": false,
1452
+ "normalized": false,
1453
+ "special": true
1454
+ },
1455
+ {
1456
+ "id": 161,
1457
+ "content": "<0x37>",
1458
+ "single_word": false,
1459
+ "lstrip": false,
1460
+ "rstrip": false,
1461
+ "normalized": false,
1462
+ "special": true
1463
+ },
1464
+ {
1465
+ "id": 162,
1466
+ "content": "<0x38>",
1467
+ "single_word": false,
1468
+ "lstrip": false,
1469
+ "rstrip": false,
1470
+ "normalized": false,
1471
+ "special": true
1472
+ },
1473
+ {
1474
+ "id": 163,
1475
+ "content": "<0x39>",
1476
+ "single_word": false,
1477
+ "lstrip": false,
1478
+ "rstrip": false,
1479
+ "normalized": false,
1480
+ "special": true
1481
+ },
1482
+ {
1483
+ "id": 164,
1484
+ "content": "<0x3A>",
1485
+ "single_word": false,
1486
+ "lstrip": false,
1487
+ "rstrip": false,
1488
+ "normalized": false,
1489
+ "special": true
1490
+ },
1491
+ {
1492
+ "id": 165,
1493
+ "content": "<0x3B>",
1494
+ "single_word": false,
1495
+ "lstrip": false,
1496
+ "rstrip": false,
1497
+ "normalized": false,
1498
+ "special": true
1499
+ },
1500
+ {
1501
+ "id": 166,
1502
+ "content": "<0x3C>",
1503
+ "single_word": false,
1504
+ "lstrip": false,
1505
+ "rstrip": false,
1506
+ "normalized": false,
1507
+ "special": true
1508
+ },
1509
+ {
1510
+ "id": 167,
1511
+ "content": "<0x3D>",
1512
+ "single_word": false,
1513
+ "lstrip": false,
1514
+ "rstrip": false,
1515
+ "normalized": false,
1516
+ "special": true
1517
+ },
1518
+ {
1519
+ "id": 168,
1520
+ "content": "<0x3E>",
1521
+ "single_word": false,
1522
+ "lstrip": false,
1523
+ "rstrip": false,
1524
+ "normalized": false,
1525
+ "special": true
1526
+ },
1527
+ {
1528
+ "id": 169,
1529
+ "content": "<0x3F>",
1530
+ "single_word": false,
1531
+ "lstrip": false,
1532
+ "rstrip": false,
1533
+ "normalized": false,
1534
+ "special": true
1535
+ },
1536
+ {
1537
+ "id": 170,
1538
+ "content": "<0x40>",
1539
+ "single_word": false,
1540
+ "lstrip": false,
1541
+ "rstrip": false,
1542
+ "normalized": false,
1543
+ "special": true
1544
+ },
1545
+ {
1546
+ "id": 171,
1547
+ "content": "<0x41>",
1548
+ "single_word": false,
1549
+ "lstrip": false,
1550
+ "rstrip": false,
1551
+ "normalized": false,
1552
+ "special": true
1553
+ },
1554
+ {
1555
+ "id": 172,
1556
+ "content": "<0x42>",
1557
+ "single_word": false,
1558
+ "lstrip": false,
1559
+ "rstrip": false,
1560
+ "normalized": false,
1561
+ "special": true
1562
+ },
1563
+ {
1564
+ "id": 173,
1565
+ "content": "<0x43>",
1566
+ "single_word": false,
1567
+ "lstrip": false,
1568
+ "rstrip": false,
1569
+ "normalized": false,
1570
+ "special": true
1571
+ },
1572
+ {
1573
+ "id": 174,
1574
+ "content": "<0x44>",
1575
+ "single_word": false,
1576
+ "lstrip": false,
1577
+ "rstrip": false,
1578
+ "normalized": false,
1579
+ "special": true
1580
+ },
1581
+ {
1582
+ "id": 175,
1583
+ "content": "<0x45>",
1584
+ "single_word": false,
1585
+ "lstrip": false,
1586
+ "rstrip": false,
1587
+ "normalized": false,
1588
+ "special": true
1589
+ },
1590
+ {
1591
+ "id": 176,
1592
+ "content": "<0x46>",
1593
+ "single_word": false,
1594
+ "lstrip": false,
1595
+ "rstrip": false,
1596
+ "normalized": false,
1597
+ "special": true
1598
+ },
1599
+ {
1600
+ "id": 177,
1601
+ "content": "<0x47>",
1602
+ "single_word": false,
1603
+ "lstrip": false,
1604
+ "rstrip": false,
1605
+ "normalized": false,
1606
+ "special": true
1607
+ },
1608
+ {
1609
+ "id": 178,
1610
+ "content": "<0x48>",
1611
+ "single_word": false,
1612
+ "lstrip": false,
1613
+ "rstrip": false,
1614
+ "normalized": false,
1615
+ "special": true
1616
+ },
1617
+ {
1618
+ "id": 179,
1619
+ "content": "<0x49>",
1620
+ "single_word": false,
1621
+ "lstrip": false,
1622
+ "rstrip": false,
1623
+ "normalized": false,
1624
+ "special": true
1625
+ },
1626
+ {
1627
+ "id": 180,
1628
+ "content": "<0x4A>",
1629
+ "single_word": false,
1630
+ "lstrip": false,
1631
+ "rstrip": false,
1632
+ "normalized": false,
1633
+ "special": true
1634
+ },
1635
+ {
1636
+ "id": 181,
1637
+ "content": "<0x4B>",
1638
+ "single_word": false,
1639
+ "lstrip": false,
1640
+ "rstrip": false,
1641
+ "normalized": false,
1642
+ "special": true
1643
+ },
1644
+ {
1645
+ "id": 182,
1646
+ "content": "<0x4C>",
1647
+ "single_word": false,
1648
+ "lstrip": false,
1649
+ "rstrip": false,
1650
+ "normalized": false,
1651
+ "special": true
1652
+ },
1653
+ {
1654
+ "id": 183,
1655
+ "content": "<0x4D>",
1656
+ "single_word": false,
1657
+ "lstrip": false,
1658
+ "rstrip": false,
1659
+ "normalized": false,
1660
+ "special": true
1661
+ },
1662
+ {
1663
+ "id": 184,
1664
+ "content": "<0x4E>",
1665
+ "single_word": false,
1666
+ "lstrip": false,
1667
+ "rstrip": false,
1668
+ "normalized": false,
1669
+ "special": true
1670
+ },
1671
+ {
1672
+ "id": 185,
1673
+ "content": "<0x4F>",
1674
+ "single_word": false,
1675
+ "lstrip": false,
1676
+ "rstrip": false,
1677
+ "normalized": false,
1678
+ "special": true
1679
+ },
1680
+ {
1681
+ "id": 186,
1682
+ "content": "<0x50>",
1683
+ "single_word": false,
1684
+ "lstrip": false,
1685
+ "rstrip": false,
1686
+ "normalized": false,
1687
+ "special": true
1688
+ },
1689
+ {
1690
+ "id": 187,
1691
+ "content": "<0x51>",
1692
+ "single_word": false,
1693
+ "lstrip": false,
1694
+ "rstrip": false,
1695
+ "normalized": false,
1696
+ "special": true
1697
+ },
1698
+ {
1699
+ "id": 188,
1700
+ "content": "<0x52>",
1701
+ "single_word": false,
1702
+ "lstrip": false,
1703
+ "rstrip": false,
1704
+ "normalized": false,
1705
+ "special": true
1706
+ },
1707
+ {
1708
+ "id": 189,
1709
+ "content": "<0x53>",
1710
+ "single_word": false,
1711
+ "lstrip": false,
1712
+ "rstrip": false,
1713
+ "normalized": false,
1714
+ "special": true
1715
+ },
1716
+ {
1717
+ "id": 190,
1718
+ "content": "<0x54>",
1719
+ "single_word": false,
1720
+ "lstrip": false,
1721
+ "rstrip": false,
1722
+ "normalized": false,
1723
+ "special": true
1724
+ },
1725
+ {
1726
+ "id": 191,
1727
+ "content": "<0x55>",
1728
+ "single_word": false,
1729
+ "lstrip": false,
1730
+ "rstrip": false,
1731
+ "normalized": false,
1732
+ "special": true
1733
+ },
1734
+ {
1735
+ "id": 192,
1736
+ "content": "<0x56>",
1737
+ "single_word": false,
1738
+ "lstrip": false,
1739
+ "rstrip": false,
1740
+ "normalized": false,
1741
+ "special": true
1742
+ },
1743
+ {
1744
+ "id": 193,
1745
+ "content": "<0x57>",
1746
+ "single_word": false,
1747
+ "lstrip": false,
1748
+ "rstrip": false,
1749
+ "normalized": false,
1750
+ "special": true
1751
+ },
1752
+ {
1753
+ "id": 194,
1754
+ "content": "<0x58>",
1755
+ "single_word": false,
1756
+ "lstrip": false,
1757
+ "rstrip": false,
1758
+ "normalized": false,
1759
+ "special": true
1760
+ },
1761
+ {
1762
+ "id": 195,
1763
+ "content": "<0x59>",
1764
+ "single_word": false,
1765
+ "lstrip": false,
1766
+ "rstrip": false,
1767
+ "normalized": false,
1768
+ "special": true
1769
+ },
1770
+ {
1771
+ "id": 196,
1772
+ "content": "<0x5A>",
1773
+ "single_word": false,
1774
+ "lstrip": false,
1775
+ "rstrip": false,
1776
+ "normalized": false,
1777
+ "special": true
1778
+ },
1779
+ {
1780
+ "id": 197,
1781
+ "content": "<0x5B>",
1782
+ "single_word": false,
1783
+ "lstrip": false,
1784
+ "rstrip": false,
1785
+ "normalized": false,
1786
+ "special": true
1787
+ },
1788
+ {
1789
+ "id": 198,
1790
+ "content": "<0x5C>",
1791
+ "single_word": false,
1792
+ "lstrip": false,
1793
+ "rstrip": false,
1794
+ "normalized": false,
1795
+ "special": true
1796
+ },
1797
+ {
1798
+ "id": 199,
1799
+ "content": "<0x5D>",
1800
+ "single_word": false,
1801
+ "lstrip": false,
1802
+ "rstrip": false,
1803
+ "normalized": false,
1804
+ "special": true
1805
+ },
1806
+ {
1807
+ "id": 200,
1808
+ "content": "<0x5E>",
1809
+ "single_word": false,
1810
+ "lstrip": false,
1811
+ "rstrip": false,
1812
+ "normalized": false,
1813
+ "special": true
1814
+ },
1815
+ {
1816
+ "id": 201,
1817
+ "content": "<0x5F>",
1818
+ "single_word": false,
1819
+ "lstrip": false,
1820
+ "rstrip": false,
1821
+ "normalized": false,
1822
+ "special": true
1823
+ },
1824
+ {
1825
+ "id": 202,
1826
+ "content": "<0x60>",
1827
+ "single_word": false,
1828
+ "lstrip": false,
1829
+ "rstrip": false,
1830
+ "normalized": false,
1831
+ "special": true
1832
+ },
1833
+ {
1834
+ "id": 203,
1835
+ "content": "<0x61>",
1836
+ "single_word": false,
1837
+ "lstrip": false,
1838
+ "rstrip": false,
1839
+ "normalized": false,
1840
+ "special": true
1841
+ },
1842
+ {
1843
+ "id": 204,
1844
+ "content": "<0x62>",
1845
+ "single_word": false,
1846
+ "lstrip": false,
1847
+ "rstrip": false,
1848
+ "normalized": false,
1849
+ "special": true
1850
+ },
1851
+ {
1852
+ "id": 205,
1853
+ "content": "<0x63>",
1854
+ "single_word": false,
1855
+ "lstrip": false,
1856
+ "rstrip": false,
1857
+ "normalized": false,
1858
+ "special": true
1859
+ },
1860
+ {
1861
+ "id": 206,
1862
+ "content": "<0x64>",
1863
+ "single_word": false,
1864
+ "lstrip": false,
1865
+ "rstrip": false,
1866
+ "normalized": false,
1867
+ "special": true
1868
+ },
1869
+ {
1870
+ "id": 207,
1871
+ "content": "<0x65>",
1872
+ "single_word": false,
1873
+ "lstrip": false,
1874
+ "rstrip": false,
1875
+ "normalized": false,
1876
+ "special": true
1877
+ },
1878
+ {
1879
+ "id": 208,
1880
+ "content": "<0x66>",
1881
+ "single_word": false,
1882
+ "lstrip": false,
1883
+ "rstrip": false,
1884
+ "normalized": false,
1885
+ "special": true
1886
+ },
1887
+ {
1888
+ "id": 209,
1889
+ "content": "<0x67>",
1890
+ "single_word": false,
1891
+ "lstrip": false,
1892
+ "rstrip": false,
1893
+ "normalized": false,
1894
+ "special": true
1895
+ },
1896
+ {
1897
+ "id": 210,
1898
+ "content": "<0x68>",
1899
+ "single_word": false,
1900
+ "lstrip": false,
1901
+ "rstrip": false,
1902
+ "normalized": false,
1903
+ "special": true
1904
+ },
1905
+ {
1906
+ "id": 211,
1907
+ "content": "<0x69>",
1908
+ "single_word": false,
1909
+ "lstrip": false,
1910
+ "rstrip": false,
1911
+ "normalized": false,
1912
+ "special": true
1913
+ },
1914
+ {
1915
+ "id": 212,
1916
+ "content": "<0x6A>",
1917
+ "single_word": false,
1918
+ "lstrip": false,
1919
+ "rstrip": false,
1920
+ "normalized": false,
1921
+ "special": true
1922
+ },
1923
+ {
1924
+ "id": 213,
1925
+ "content": "<0x6B>",
1926
+ "single_word": false,
1927
+ "lstrip": false,
1928
+ "rstrip": false,
1929
+ "normalized": false,
1930
+ "special": true
1931
+ },
1932
+ {
1933
+ "id": 214,
1934
+ "content": "<0x6C>",
1935
+ "single_word": false,
1936
+ "lstrip": false,
1937
+ "rstrip": false,
1938
+ "normalized": false,
1939
+ "special": true
1940
+ },
1941
+ {
1942
+ "id": 215,
1943
+ "content": "<0x6D>",
1944
+ "single_word": false,
1945
+ "lstrip": false,
1946
+ "rstrip": false,
1947
+ "normalized": false,
1948
+ "special": true
1949
+ },
1950
+ {
1951
+ "id": 216,
1952
+ "content": "<0x6E>",
1953
+ "single_word": false,
1954
+ "lstrip": false,
1955
+ "rstrip": false,
1956
+ "normalized": false,
1957
+ "special": true
1958
+ },
1959
+ {
1960
+ "id": 217,
1961
+ "content": "<0x6F>",
1962
+ "single_word": false,
1963
+ "lstrip": false,
1964
+ "rstrip": false,
1965
+ "normalized": false,
1966
+ "special": true
1967
+ },
1968
+ {
1969
+ "id": 218,
1970
+ "content": "<0x70>",
1971
+ "single_word": false,
1972
+ "lstrip": false,
1973
+ "rstrip": false,
1974
+ "normalized": false,
1975
+ "special": true
1976
+ },
1977
+ {
1978
+ "id": 219,
1979
+ "content": "<0x71>",
1980
+ "single_word": false,
1981
+ "lstrip": false,
1982
+ "rstrip": false,
1983
+ "normalized": false,
1984
+ "special": true
1985
+ },
1986
+ {
1987
+ "id": 220,
1988
+ "content": "<0x72>",
1989
+ "single_word": false,
1990
+ "lstrip": false,
1991
+ "rstrip": false,
1992
+ "normalized": false,
1993
+ "special": true
1994
+ },
1995
+ {
1996
+ "id": 221,
1997
+ "content": "<0x73>",
1998
+ "single_word": false,
1999
+ "lstrip": false,
2000
+ "rstrip": false,
2001
+ "normalized": false,
2002
+ "special": true
2003
+ },
2004
+ {
2005
+ "id": 222,
2006
+ "content": "<0x74>",
2007
+ "single_word": false,
2008
+ "lstrip": false,
2009
+ "rstrip": false,
2010
+ "normalized": false,
2011
+ "special": true
2012
+ },
2013
+ {
2014
+ "id": 223,
2015
+ "content": "<0x75>",
2016
+ "single_word": false,
2017
+ "lstrip": false,
2018
+ "rstrip": false,
2019
+ "normalized": false,
2020
+ "special": true
2021
+ },
2022
+ {
2023
+ "id": 224,
2024
+ "content": "<0x76>",
2025
+ "single_word": false,
2026
+ "lstrip": false,
2027
+ "rstrip": false,
2028
+ "normalized": false,
2029
+ "special": true
2030
+ },
2031
+ {
2032
+ "id": 225,
2033
+ "content": "<0x77>",
2034
+ "single_word": false,
2035
+ "lstrip": false,
2036
+ "rstrip": false,
2037
+ "normalized": false,
2038
+ "special": true
2039
+ },
2040
+ {
2041
+ "id": 226,
2042
+ "content": "<0x78>",
2043
+ "single_word": false,
2044
+ "lstrip": false,
2045
+ "rstrip": false,
2046
+ "normalized": false,
2047
+ "special": true
2048
+ },
2049
+ {
2050
+ "id": 227,
2051
+ "content": "<0x79>",
2052
+ "single_word": false,
2053
+ "lstrip": false,
2054
+ "rstrip": false,
2055
+ "normalized": false,
2056
+ "special": true
2057
+ },
2058
+ {
2059
+ "id": 228,
2060
+ "content": "<0x7A>",
2061
+ "single_word": false,
2062
+ "lstrip": false,
2063
+ "rstrip": false,
2064
+ "normalized": false,
2065
+ "special": true
2066
+ },
2067
+ {
2068
+ "id": 229,
2069
+ "content": "<0x7B>",
2070
+ "single_word": false,
2071
+ "lstrip": false,
2072
+ "rstrip": false,
2073
+ "normalized": false,
2074
+ "special": true
2075
+ },
2076
+ {
2077
+ "id": 230,
2078
+ "content": "<0x7C>",
2079
+ "single_word": false,
2080
+ "lstrip": false,
2081
+ "rstrip": false,
2082
+ "normalized": false,
2083
+ "special": true
2084
+ },
2085
+ {
2086
+ "id": 231,
2087
+ "content": "<0x7D>",
2088
+ "single_word": false,
2089
+ "lstrip": false,
2090
+ "rstrip": false,
2091
+ "normalized": false,
2092
+ "special": true
2093
+ },
2094
+ {
2095
+ "id": 232,
2096
+ "content": "<0x7E>",
2097
+ "single_word": false,
2098
+ "lstrip": false,
2099
+ "rstrip": false,
2100
+ "normalized": false,
2101
+ "special": true
2102
+ },
2103
+ {
2104
+ "id": 233,
2105
+ "content": "<0x7F>",
2106
+ "single_word": false,
2107
+ "lstrip": false,
2108
+ "rstrip": false,
2109
+ "normalized": false,
2110
+ "special": true
2111
+ },
2112
+ {
2113
+ "id": 234,
2114
+ "content": "<0x80>",
2115
+ "single_word": false,
2116
+ "lstrip": false,
2117
+ "rstrip": false,
2118
+ "normalized": false,
2119
+ "special": true
2120
+ },
2121
+ {
2122
+ "id": 235,
2123
+ "content": "<0x81>",
2124
+ "single_word": false,
2125
+ "lstrip": false,
2126
+ "rstrip": false,
2127
+ "normalized": false,
2128
+ "special": true
2129
+ },
2130
+ {
2131
+ "id": 236,
2132
+ "content": "<0x82>",
2133
+ "single_word": false,
2134
+ "lstrip": false,
2135
+ "rstrip": false,
2136
+ "normalized": false,
2137
+ "special": true
2138
+ },
2139
+ {
2140
+ "id": 237,
2141
+ "content": "<0x83>",
2142
+ "single_word": false,
2143
+ "lstrip": false,
2144
+ "rstrip": false,
2145
+ "normalized": false,
2146
+ "special": true
2147
+ },
2148
+ {
2149
+ "id": 238,
2150
+ "content": "<0x84>",
2151
+ "single_word": false,
2152
+ "lstrip": false,
2153
+ "rstrip": false,
2154
+ "normalized": false,
2155
+ "special": true
2156
+ },
2157
+ {
2158
+ "id": 239,
2159
+ "content": "<0x85>",
2160
+ "single_word": false,
2161
+ "lstrip": false,
2162
+ "rstrip": false,
2163
+ "normalized": false,
2164
+ "special": true
2165
+ },
2166
+ {
2167
+ "id": 240,
2168
+ "content": "<0x86>",
2169
+ "single_word": false,
2170
+ "lstrip": false,
2171
+ "rstrip": false,
2172
+ "normalized": false,
2173
+ "special": true
2174
+ },
2175
+ {
2176
+ "id": 241,
2177
+ "content": "<0x87>",
2178
+ "single_word": false,
2179
+ "lstrip": false,
2180
+ "rstrip": false,
2181
+ "normalized": false,
2182
+ "special": true
2183
+ },
2184
+ {
2185
+ "id": 242,
2186
+ "content": "<0x88>",
2187
+ "single_word": false,
2188
+ "lstrip": false,
2189
+ "rstrip": false,
2190
+ "normalized": false,
2191
+ "special": true
2192
+ },
2193
+ {
2194
+ "id": 243,
2195
+ "content": "<0x89>",
2196
+ "single_word": false,
2197
+ "lstrip": false,
2198
+ "rstrip": false,
2199
+ "normalized": false,
2200
+ "special": true
2201
+ },
2202
+ {
2203
+ "id": 244,
2204
+ "content": "<0x8A>",
2205
+ "single_word": false,
2206
+ "lstrip": false,
2207
+ "rstrip": false,
2208
+ "normalized": false,
2209
+ "special": true
2210
+ },
2211
+ {
2212
+ "id": 245,
2213
+ "content": "<0x8B>",
2214
+ "single_word": false,
2215
+ "lstrip": false,
2216
+ "rstrip": false,
2217
+ "normalized": false,
2218
+ "special": true
2219
+ },
2220
+ {
2221
+ "id": 246,
2222
+ "content": "<0x8C>",
2223
+ "single_word": false,
2224
+ "lstrip": false,
2225
+ "rstrip": false,
2226
+ "normalized": false,
2227
+ "special": true
2228
+ },
2229
+ {
2230
+ "id": 247,
2231
+ "content": "<0x8D>",
2232
+ "single_word": false,
2233
+ "lstrip": false,
2234
+ "rstrip": false,
2235
+ "normalized": false,
2236
+ "special": true
2237
+ },
2238
+ {
2239
+ "id": 248,
2240
+ "content": "<0x8E>",
2241
+ "single_word": false,
2242
+ "lstrip": false,
2243
+ "rstrip": false,
2244
+ "normalized": false,
2245
+ "special": true
2246
+ },
2247
+ {
2248
+ "id": 249,
2249
+ "content": "<0x8F>",
2250
+ "single_word": false,
2251
+ "lstrip": false,
2252
+ "rstrip": false,
2253
+ "normalized": false,
2254
+ "special": true
2255
+ },
2256
+ {
2257
+ "id": 250,
2258
+ "content": "<0x90>",
2259
+ "single_word": false,
2260
+ "lstrip": false,
2261
+ "rstrip": false,
2262
+ "normalized": false,
2263
+ "special": true
2264
+ },
2265
+ {
2266
+ "id": 251,
2267
+ "content": "<0x91>",
2268
+ "single_word": false,
2269
+ "lstrip": false,
2270
+ "rstrip": false,
2271
+ "normalized": false,
2272
+ "special": true
2273
+ },
2274
+ {
2275
+ "id": 252,
2276
+ "content": "<0x92>",
2277
+ "single_word": false,
2278
+ "lstrip": false,
2279
+ "rstrip": false,
2280
+ "normalized": false,
2281
+ "special": true
2282
+ },
2283
+ {
2284
+ "id": 253,
2285
+ "content": "<0x93>",
2286
+ "single_word": false,
2287
+ "lstrip": false,
2288
+ "rstrip": false,
2289
+ "normalized": false,
2290
+ "special": true
2291
+ },
2292
+ {
2293
+ "id": 254,
2294
+ "content": "<0x94>",
2295
+ "single_word": false,
2296
+ "lstrip": false,
2297
+ "rstrip": false,
2298
+ "normalized": false,
2299
+ "special": true
2300
+ },
2301
+ {
2302
+ "id": 255,
2303
+ "content": "<0x95>",
2304
+ "single_word": false,
2305
+ "lstrip": false,
2306
+ "rstrip": false,
2307
+ "normalized": false,
2308
+ "special": true
2309
+ },
2310
+ {
2311
+ "id": 256,
2312
+ "content": "<0x96>",
2313
+ "single_word": false,
2314
+ "lstrip": false,
2315
+ "rstrip": false,
2316
+ "normalized": false,
2317
+ "special": true
2318
+ },
2319
+ {
2320
+ "id": 257,
2321
+ "content": "<0x97>",
2322
+ "single_word": false,
2323
+ "lstrip": false,
2324
+ "rstrip": false,
2325
+ "normalized": false,
2326
+ "special": true
2327
+ },
2328
+ {
2329
+ "id": 258,
2330
+ "content": "<0x98>",
2331
+ "single_word": false,
2332
+ "lstrip": false,
2333
+ "rstrip": false,
2334
+ "normalized": false,
2335
+ "special": true
2336
+ },
2337
+ {
2338
+ "id": 259,
2339
+ "content": "<0x99>",
2340
+ "single_word": false,
2341
+ "lstrip": false,
2342
+ "rstrip": false,
2343
+ "normalized": false,
2344
+ "special": true
2345
+ },
2346
+ {
2347
+ "id": 260,
2348
+ "content": "<0x9A>",
2349
+ "single_word": false,
2350
+ "lstrip": false,
2351
+ "rstrip": false,
2352
+ "normalized": false,
2353
+ "special": true
2354
+ },
2355
+ {
2356
+ "id": 261,
2357
+ "content": "<0x9B>",
2358
+ "single_word": false,
2359
+ "lstrip": false,
2360
+ "rstrip": false,
2361
+ "normalized": false,
2362
+ "special": true
2363
+ },
2364
+ {
2365
+ "id": 262,
2366
+ "content": "<0x9C>",
2367
+ "single_word": false,
2368
+ "lstrip": false,
2369
+ "rstrip": false,
2370
+ "normalized": false,
2371
+ "special": true
2372
+ },
2373
+ {
2374
+ "id": 263,
2375
+ "content": "<0x9D>",
2376
+ "single_word": false,
2377
+ "lstrip": false,
2378
+ "rstrip": false,
2379
+ "normalized": false,
2380
+ "special": true
2381
+ },
2382
+ {
2383
+ "id": 264,
2384
+ "content": "<0x9E>",
2385
+ "single_word": false,
2386
+ "lstrip": false,
2387
+ "rstrip": false,
2388
+ "normalized": false,
2389
+ "special": true
2390
+ },
2391
+ {
2392
+ "id": 265,
2393
+ "content": "<0x9F>",
2394
+ "single_word": false,
2395
+ "lstrip": false,
2396
+ "rstrip": false,
2397
+ "normalized": false,
2398
+ "special": true
2399
+ },
2400
+ {
2401
+ "id": 266,
2402
+ "content": "<0xA0>",
2403
+ "single_word": false,
2404
+ "lstrip": false,
2405
+ "rstrip": false,
2406
+ "normalized": false,
2407
+ "special": true
2408
+ },
2409
+ {
2410
+ "id": 267,
2411
+ "content": "<0xA1>",
2412
+ "single_word": false,
2413
+ "lstrip": false,
2414
+ "rstrip": false,
2415
+ "normalized": false,
2416
+ "special": true
2417
+ },
2418
+ {
2419
+ "id": 268,
2420
+ "content": "<0xA2>",
2421
+ "single_word": false,
2422
+ "lstrip": false,
2423
+ "rstrip": false,
2424
+ "normalized": false,
2425
+ "special": true
2426
+ },
2427
+ {
2428
+ "id": 269,
2429
+ "content": "<0xA3>",
2430
+ "single_word": false,
2431
+ "lstrip": false,
2432
+ "rstrip": false,
2433
+ "normalized": false,
2434
+ "special": true
2435
+ },
2436
+ {
2437
+ "id": 270,
2438
+ "content": "<0xA4>",
2439
+ "single_word": false,
2440
+ "lstrip": false,
2441
+ "rstrip": false,
2442
+ "normalized": false,
2443
+ "special": true
2444
+ },
2445
+ {
2446
+ "id": 271,
2447
+ "content": "<0xA5>",
2448
+ "single_word": false,
2449
+ "lstrip": false,
2450
+ "rstrip": false,
2451
+ "normalized": false,
2452
+ "special": true
2453
+ },
2454
+ {
2455
+ "id": 272,
2456
+ "content": "<0xA6>",
2457
+ "single_word": false,
2458
+ "lstrip": false,
2459
+ "rstrip": false,
2460
+ "normalized": false,
2461
+ "special": true
2462
+ },
2463
+ {
2464
+ "id": 273,
2465
+ "content": "<0xA7>",
2466
+ "single_word": false,
2467
+ "lstrip": false,
2468
+ "rstrip": false,
2469
+ "normalized": false,
2470
+ "special": true
2471
+ },
2472
+ {
2473
+ "id": 274,
2474
+ "content": "<0xA8>",
2475
+ "single_word": false,
2476
+ "lstrip": false,
2477
+ "rstrip": false,
2478
+ "normalized": false,
2479
+ "special": true
2480
+ },
2481
+ {
2482
+ "id": 275,
2483
+ "content": "<0xA9>",
2484
+ "single_word": false,
2485
+ "lstrip": false,
2486
+ "rstrip": false,
2487
+ "normalized": false,
2488
+ "special": true
2489
+ },
2490
+ {
2491
+ "id": 276,
2492
+ "content": "<0xAA>",
2493
+ "single_word": false,
2494
+ "lstrip": false,
2495
+ "rstrip": false,
2496
+ "normalized": false,
2497
+ "special": true
2498
+ },
2499
+ {
2500
+ "id": 277,
2501
+ "content": "<0xAB>",
2502
+ "single_word": false,
2503
+ "lstrip": false,
2504
+ "rstrip": false,
2505
+ "normalized": false,
2506
+ "special": true
2507
+ },
2508
+ {
2509
+ "id": 278,
2510
+ "content": "<0xAC>",
2511
+ "single_word": false,
2512
+ "lstrip": false,
2513
+ "rstrip": false,
2514
+ "normalized": false,
2515
+ "special": true
2516
+ },
2517
+ {
2518
+ "id": 279,
2519
+ "content": "<0xAD>",
2520
+ "single_word": false,
2521
+ "lstrip": false,
2522
+ "rstrip": false,
2523
+ "normalized": false,
2524
+ "special": true
2525
+ },
2526
+ {
2527
+ "id": 280,
2528
+ "content": "<0xAE>",
2529
+ "single_word": false,
2530
+ "lstrip": false,
2531
+ "rstrip": false,
2532
+ "normalized": false,
2533
+ "special": true
2534
+ },
2535
+ {
2536
+ "id": 281,
2537
+ "content": "<0xAF>",
2538
+ "single_word": false,
2539
+ "lstrip": false,
2540
+ "rstrip": false,
2541
+ "normalized": false,
2542
+ "special": true
2543
+ },
2544
+ {
2545
+ "id": 282,
2546
+ "content": "<0xB0>",
2547
+ "single_word": false,
2548
+ "lstrip": false,
2549
+ "rstrip": false,
2550
+ "normalized": false,
2551
+ "special": true
2552
+ },
2553
+ {
2554
+ "id": 283,
2555
+ "content": "<0xB1>",
2556
+ "single_word": false,
2557
+ "lstrip": false,
2558
+ "rstrip": false,
2559
+ "normalized": false,
2560
+ "special": true
2561
+ },
2562
+ {
2563
+ "id": 284,
2564
+ "content": "<0xB2>",
2565
+ "single_word": false,
2566
+ "lstrip": false,
2567
+ "rstrip": false,
2568
+ "normalized": false,
2569
+ "special": true
2570
+ },
2571
+ {
2572
+ "id": 285,
2573
+ "content": "<0xB3>",
2574
+ "single_word": false,
2575
+ "lstrip": false,
2576
+ "rstrip": false,
2577
+ "normalized": false,
2578
+ "special": true
2579
+ },
2580
+ {
2581
+ "id": 286,
2582
+ "content": "<0xB4>",
2583
+ "single_word": false,
2584
+ "lstrip": false,
2585
+ "rstrip": false,
2586
+ "normalized": false,
2587
+ "special": true
2588
+ },
2589
+ {
2590
+ "id": 287,
2591
+ "content": "<0xB5>",
2592
+ "single_word": false,
2593
+ "lstrip": false,
2594
+ "rstrip": false,
2595
+ "normalized": false,
2596
+ "special": true
2597
+ },
2598
+ {
2599
+ "id": 288,
2600
+ "content": "<0xB6>",
2601
+ "single_word": false,
2602
+ "lstrip": false,
2603
+ "rstrip": false,
2604
+ "normalized": false,
2605
+ "special": true
2606
+ },
2607
+ {
2608
+ "id": 289,
2609
+ "content": "<0xB7>",
2610
+ "single_word": false,
2611
+ "lstrip": false,
2612
+ "rstrip": false,
2613
+ "normalized": false,
2614
+ "special": true
2615
+ },
2616
+ {
2617
+ "id": 290,
2618
+ "content": "<0xB8>",
2619
+ "single_word": false,
2620
+ "lstrip": false,
2621
+ "rstrip": false,
2622
+ "normalized": false,
2623
+ "special": true
2624
+ },
2625
+ {
2626
+ "id": 291,
2627
+ "content": "<0xB9>",
2628
+ "single_word": false,
2629
+ "lstrip": false,
2630
+ "rstrip": false,
2631
+ "normalized": false,
2632
+ "special": true
2633
+ },
2634
+ {
2635
+ "id": 292,
2636
+ "content": "<0xBA>",
2637
+ "single_word": false,
2638
+ "lstrip": false,
2639
+ "rstrip": false,
2640
+ "normalized": false,
2641
+ "special": true
2642
+ },
2643
+ {
2644
+ "id": 293,
2645
+ "content": "<0xBB>",
2646
+ "single_word": false,
2647
+ "lstrip": false,
2648
+ "rstrip": false,
2649
+ "normalized": false,
2650
+ "special": true
2651
+ },
2652
+ {
2653
+ "id": 294,
2654
+ "content": "<0xBC>",
2655
+ "single_word": false,
2656
+ "lstrip": false,
2657
+ "rstrip": false,
2658
+ "normalized": false,
2659
+ "special": true
2660
+ },
2661
+ {
2662
+ "id": 295,
2663
+ "content": "<0xBD>",
2664
+ "single_word": false,
2665
+ "lstrip": false,
2666
+ "rstrip": false,
2667
+ "normalized": false,
2668
+ "special": true
2669
+ },
2670
+ {
2671
+ "id": 296,
2672
+ "content": "<0xBE>",
2673
+ "single_word": false,
2674
+ "lstrip": false,
2675
+ "rstrip": false,
2676
+ "normalized": false,
2677
+ "special": true
2678
+ },
2679
+ {
2680
+ "id": 297,
2681
+ "content": "<0xBF>",
2682
+ "single_word": false,
2683
+ "lstrip": false,
2684
+ "rstrip": false,
2685
+ "normalized": false,
2686
+ "special": true
2687
+ },
2688
+ {
2689
+ "id": 298,
2690
+ "content": "<0xC0>",
2691
+ "single_word": false,
2692
+ "lstrip": false,
2693
+ "rstrip": false,
2694
+ "normalized": false,
2695
+ "special": true
2696
+ },
2697
+ {
2698
+ "id": 299,
2699
+ "content": "<0xC1>",
2700
+ "single_word": false,
2701
+ "lstrip": false,
2702
+ "rstrip": false,
2703
+ "normalized": false,
2704
+ "special": true
2705
+ },
2706
+ {
2707
+ "id": 300,
2708
+ "content": "<0xC2>",
2709
+ "single_word": false,
2710
+ "lstrip": false,
2711
+ "rstrip": false,
2712
+ "normalized": false,
2713
+ "special": true
2714
+ },
2715
+ {
2716
+ "id": 301,
2717
+ "content": "<0xC3>",
2718
+ "single_word": false,
2719
+ "lstrip": false,
2720
+ "rstrip": false,
2721
+ "normalized": false,
2722
+ "special": true
2723
+ },
2724
+ {
2725
+ "id": 302,
2726
+ "content": "<0xC4>",
2727
+ "single_word": false,
2728
+ "lstrip": false,
2729
+ "rstrip": false,
2730
+ "normalized": false,
2731
+ "special": true
2732
+ },
2733
+ {
2734
+ "id": 303,
2735
+ "content": "<0xC5>",
2736
+ "single_word": false,
2737
+ "lstrip": false,
2738
+ "rstrip": false,
2739
+ "normalized": false,
2740
+ "special": true
2741
+ },
2742
+ {
2743
+ "id": 304,
2744
+ "content": "<0xC6>",
2745
+ "single_word": false,
2746
+ "lstrip": false,
2747
+ "rstrip": false,
2748
+ "normalized": false,
2749
+ "special": true
2750
+ },
2751
+ {
2752
+ "id": 305,
2753
+ "content": "<0xC7>",
2754
+ "single_word": false,
2755
+ "lstrip": false,
2756
+ "rstrip": false,
2757
+ "normalized": false,
2758
+ "special": true
2759
+ },
2760
+ {
2761
+ "id": 306,
2762
+ "content": "<0xC8>",
2763
+ "single_word": false,
2764
+ "lstrip": false,
2765
+ "rstrip": false,
2766
+ "normalized": false,
2767
+ "special": true
2768
+ },
2769
+ {
2770
+ "id": 307,
2771
+ "content": "<0xC9>",
2772
+ "single_word": false,
2773
+ "lstrip": false,
2774
+ "rstrip": false,
2775
+ "normalized": false,
2776
+ "special": true
2777
+ },
2778
+ {
2779
+ "id": 308,
2780
+ "content": "<0xCA>",
2781
+ "single_word": false,
2782
+ "lstrip": false,
2783
+ "rstrip": false,
2784
+ "normalized": false,
2785
+ "special": true
2786
+ },
2787
+ {
2788
+ "id": 309,
2789
+ "content": "<0xCB>",
2790
+ "single_word": false,
2791
+ "lstrip": false,
2792
+ "rstrip": false,
2793
+ "normalized": false,
2794
+ "special": true
2795
+ },
2796
+ {
2797
+ "id": 310,
2798
+ "content": "<0xCC>",
2799
+ "single_word": false,
2800
+ "lstrip": false,
2801
+ "rstrip": false,
2802
+ "normalized": false,
2803
+ "special": true
2804
+ },
2805
+ {
2806
+ "id": 311,
2807
+ "content": "<0xCD>",
2808
+ "single_word": false,
2809
+ "lstrip": false,
2810
+ "rstrip": false,
2811
+ "normalized": false,
2812
+ "special": true
2813
+ },
2814
+ {
2815
+ "id": 312,
2816
+ "content": "<0xCE>",
2817
+ "single_word": false,
2818
+ "lstrip": false,
2819
+ "rstrip": false,
2820
+ "normalized": false,
2821
+ "special": true
2822
+ },
2823
+ {
2824
+ "id": 313,
2825
+ "content": "<0xCF>",
2826
+ "single_word": false,
2827
+ "lstrip": false,
2828
+ "rstrip": false,
2829
+ "normalized": false,
2830
+ "special": true
2831
+ },
2832
+ {
2833
+ "id": 314,
2834
+ "content": "<0xD0>",
2835
+ "single_word": false,
2836
+ "lstrip": false,
2837
+ "rstrip": false,
2838
+ "normalized": false,
2839
+ "special": true
2840
+ },
2841
+ {
2842
+ "id": 315,
2843
+ "content": "<0xD1>",
2844
+ "single_word": false,
2845
+ "lstrip": false,
2846
+ "rstrip": false,
2847
+ "normalized": false,
2848
+ "special": true
2849
+ },
2850
+ {
2851
+ "id": 316,
2852
+ "content": "<0xD2>",
2853
+ "single_word": false,
2854
+ "lstrip": false,
2855
+ "rstrip": false,
2856
+ "normalized": false,
2857
+ "special": true
2858
+ },
2859
+ {
2860
+ "id": 317,
2861
+ "content": "<0xD3>",
2862
+ "single_word": false,
2863
+ "lstrip": false,
2864
+ "rstrip": false,
2865
+ "normalized": false,
2866
+ "special": true
2867
+ },
2868
+ {
2869
+ "id": 318,
2870
+ "content": "<0xD4>",
2871
+ "single_word": false,
2872
+ "lstrip": false,
2873
+ "rstrip": false,
2874
+ "normalized": false,
2875
+ "special": true
2876
+ },
2877
+ {
2878
+ "id": 319,
2879
+ "content": "<0xD5>",
2880
+ "single_word": false,
2881
+ "lstrip": false,
2882
+ "rstrip": false,
2883
+ "normalized": false,
2884
+ "special": true
2885
+ },
2886
+ {
2887
+ "id": 320,
2888
+ "content": "<0xD6>",
2889
+ "single_word": false,
2890
+ "lstrip": false,
2891
+ "rstrip": false,
2892
+ "normalized": false,
2893
+ "special": true
2894
+ },
2895
+ {
2896
+ "id": 321,
2897
+ "content": "<0xD7>",
2898
+ "single_word": false,
2899
+ "lstrip": false,
2900
+ "rstrip": false,
2901
+ "normalized": false,
2902
+ "special": true
2903
+ },
2904
+ {
2905
+ "id": 322,
2906
+ "content": "<0xD8>",
2907
+ "single_word": false,
2908
+ "lstrip": false,
2909
+ "rstrip": false,
2910
+ "normalized": false,
2911
+ "special": true
2912
+ },
2913
+ {
2914
+ "id": 323,
2915
+ "content": "<0xD9>",
2916
+ "single_word": false,
2917
+ "lstrip": false,
2918
+ "rstrip": false,
2919
+ "normalized": false,
2920
+ "special": true
2921
+ },
2922
+ {
2923
+ "id": 324,
2924
+ "content": "<0xDA>",
2925
+ "single_word": false,
2926
+ "lstrip": false,
2927
+ "rstrip": false,
2928
+ "normalized": false,
2929
+ "special": true
2930
+ },
2931
+ {
2932
+ "id": 325,
2933
+ "content": "<0xDB>",
2934
+ "single_word": false,
2935
+ "lstrip": false,
2936
+ "rstrip": false,
2937
+ "normalized": false,
2938
+ "special": true
2939
+ },
2940
+ {
2941
+ "id": 326,
2942
+ "content": "<0xDC>",
2943
+ "single_word": false,
2944
+ "lstrip": false,
2945
+ "rstrip": false,
2946
+ "normalized": false,
2947
+ "special": true
2948
+ },
2949
+ {
2950
+ "id": 327,
2951
+ "content": "<0xDD>",
2952
+ "single_word": false,
2953
+ "lstrip": false,
2954
+ "rstrip": false,
2955
+ "normalized": false,
2956
+ "special": true
2957
+ },
2958
+ {
2959
+ "id": 328,
2960
+ "content": "<0xDE>",
2961
+ "single_word": false,
2962
+ "lstrip": false,
2963
+ "rstrip": false,
2964
+ "normalized": false,
2965
+ "special": true
2966
+ },
2967
+ {
2968
+ "id": 329,
2969
+ "content": "<0xDF>",
2970
+ "single_word": false,
2971
+ "lstrip": false,
2972
+ "rstrip": false,
2973
+ "normalized": false,
2974
+ "special": true
2975
+ },
2976
+ {
2977
+ "id": 330,
2978
+ "content": "<0xE0>",
2979
+ "single_word": false,
2980
+ "lstrip": false,
2981
+ "rstrip": false,
2982
+ "normalized": false,
2983
+ "special": true
2984
+ },
2985
+ {
2986
+ "id": 331,
2987
+ "content": "<0xE1>",
2988
+ "single_word": false,
2989
+ "lstrip": false,
2990
+ "rstrip": false,
2991
+ "normalized": false,
2992
+ "special": true
2993
+ },
2994
+ {
2995
+ "id": 332,
2996
+ "content": "<0xE2>",
2997
+ "single_word": false,
2998
+ "lstrip": false,
2999
+ "rstrip": false,
3000
+ "normalized": false,
3001
+ "special": true
3002
+ },
3003
+ {
3004
+ "id": 333,
3005
+ "content": "<0xE3>",
3006
+ "single_word": false,
3007
+ "lstrip": false,
3008
+ "rstrip": false,
3009
+ "normalized": false,
3010
+ "special": true
3011
+ },
3012
+ {
3013
+ "id": 334,
3014
+ "content": "<0xE4>",
3015
+ "single_word": false,
3016
+ "lstrip": false,
3017
+ "rstrip": false,
3018
+ "normalized": false,
3019
+ "special": true
3020
+ },
3021
+ {
3022
+ "id": 335,
3023
+ "content": "<0xE5>",
3024
+ "single_word": false,
3025
+ "lstrip": false,
3026
+ "rstrip": false,
3027
+ "normalized": false,
3028
+ "special": true
3029
+ },
3030
+ {
3031
+ "id": 336,
3032
+ "content": "<0xE6>",
3033
+ "single_word": false,
3034
+ "lstrip": false,
3035
+ "rstrip": false,
3036
+ "normalized": false,
3037
+ "special": true
3038
+ },
3039
+ {
3040
+ "id": 337,
3041
+ "content": "<0xE7>",
3042
+ "single_word": false,
3043
+ "lstrip": false,
3044
+ "rstrip": false,
3045
+ "normalized": false,
3046
+ "special": true
3047
+ },
3048
+ {
3049
+ "id": 338,
3050
+ "content": "<0xE8>",
3051
+ "single_word": false,
3052
+ "lstrip": false,
3053
+ "rstrip": false,
3054
+ "normalized": false,
3055
+ "special": true
3056
+ },
3057
+ {
3058
+ "id": 339,
3059
+ "content": "<0xE9>",
3060
+ "single_word": false,
3061
+ "lstrip": false,
3062
+ "rstrip": false,
3063
+ "normalized": false,
3064
+ "special": true
3065
+ },
3066
+ {
3067
+ "id": 340,
3068
+ "content": "<0xEA>",
3069
+ "single_word": false,
3070
+ "lstrip": false,
3071
+ "rstrip": false,
3072
+ "normalized": false,
3073
+ "special": true
3074
+ },
3075
+ {
3076
+ "id": 341,
3077
+ "content": "<0xEB>",
3078
+ "single_word": false,
3079
+ "lstrip": false,
3080
+ "rstrip": false,
3081
+ "normalized": false,
3082
+ "special": true
3083
+ },
3084
+ {
3085
+ "id": 342,
3086
+ "content": "<0xEC>",
3087
+ "single_word": false,
3088
+ "lstrip": false,
3089
+ "rstrip": false,
3090
+ "normalized": false,
3091
+ "special": true
3092
+ },
3093
+ {
3094
+ "id": 343,
3095
+ "content": "<0xED>",
3096
+ "single_word": false,
3097
+ "lstrip": false,
3098
+ "rstrip": false,
3099
+ "normalized": false,
3100
+ "special": true
3101
+ },
3102
+ {
3103
+ "id": 344,
3104
+ "content": "<0xEE>",
3105
+ "single_word": false,
3106
+ "lstrip": false,
3107
+ "rstrip": false,
3108
+ "normalized": false,
3109
+ "special": true
3110
+ },
3111
+ {
3112
+ "id": 345,
3113
+ "content": "<0xEF>",
3114
+ "single_word": false,
3115
+ "lstrip": false,
3116
+ "rstrip": false,
3117
+ "normalized": false,
3118
+ "special": true
3119
+ },
3120
+ {
3121
+ "id": 346,
3122
+ "content": "<0xF0>",
3123
+ "single_word": false,
3124
+ "lstrip": false,
3125
+ "rstrip": false,
3126
+ "normalized": false,
3127
+ "special": true
3128
+ },
3129
+ {
3130
+ "id": 347,
3131
+ "content": "<0xF1>",
3132
+ "single_word": false,
3133
+ "lstrip": false,
3134
+ "rstrip": false,
3135
+ "normalized": false,
3136
+ "special": true
3137
+ },
3138
+ {
3139
+ "id": 348,
3140
+ "content": "<0xF2>",
3141
+ "single_word": false,
3142
+ "lstrip": false,
3143
+ "rstrip": false,
3144
+ "normalized": false,
3145
+ "special": true
3146
+ },
3147
+ {
3148
+ "id": 349,
3149
+ "content": "<0xF3>",
3150
+ "single_word": false,
3151
+ "lstrip": false,
3152
+ "rstrip": false,
3153
+ "normalized": false,
3154
+ "special": true
3155
+ },
3156
+ {
3157
+ "id": 350,
3158
+ "content": "<0xF4>",
3159
+ "single_word": false,
3160
+ "lstrip": false,
3161
+ "rstrip": false,
3162
+ "normalized": false,
3163
+ "special": true
3164
+ },
3165
+ {
3166
+ "id": 351,
3167
+ "content": "<0xF5>",
3168
+ "single_word": false,
3169
+ "lstrip": false,
3170
+ "rstrip": false,
3171
+ "normalized": false,
3172
+ "special": true
3173
+ },
3174
+ {
3175
+ "id": 352,
3176
+ "content": "<0xF6>",
3177
+ "single_word": false,
3178
+ "lstrip": false,
3179
+ "rstrip": false,
3180
+ "normalized": false,
3181
+ "special": true
3182
+ },
3183
+ {
3184
+ "id": 353,
3185
+ "content": "<0xF7>",
3186
+ "single_word": false,
3187
+ "lstrip": false,
3188
+ "rstrip": false,
3189
+ "normalized": false,
3190
+ "special": true
3191
+ },
3192
+ {
3193
+ "id": 354,
3194
+ "content": "<0xF8>",
3195
+ "single_word": false,
3196
+ "lstrip": false,
3197
+ "rstrip": false,
3198
+ "normalized": false,
3199
+ "special": true
3200
+ },
3201
+ {
3202
+ "id": 355,
3203
+ "content": "<0xF9>",
3204
+ "single_word": false,
3205
+ "lstrip": false,
3206
+ "rstrip": false,
3207
+ "normalized": false,
3208
+ "special": true
3209
+ },
3210
+ {
3211
+ "id": 356,
3212
+ "content": "<0xFA>",
3213
+ "single_word": false,
3214
+ "lstrip": false,
3215
+ "rstrip": false,
3216
+ "normalized": false,
3217
+ "special": true
3218
+ },
3219
+ {
3220
+ "id": 357,
3221
+ "content": "<0xFB>",
3222
+ "single_word": false,
3223
+ "lstrip": false,
3224
+ "rstrip": false,
3225
+ "normalized": false,
3226
+ "special": true
3227
+ },
3228
+ {
3229
+ "id": 358,
3230
+ "content": "<0xFC>",
3231
+ "single_word": false,
3232
+ "lstrip": false,
3233
+ "rstrip": false,
3234
+ "normalized": false,
3235
+ "special": true
3236
+ },
3237
+ {
3238
+ "id": 359,
3239
+ "content": "<0xFD>",
3240
+ "single_word": false,
3241
+ "lstrip": false,
3242
+ "rstrip": false,
3243
+ "normalized": false,
3244
+ "special": true
3245
+ },
3246
+ {
3247
+ "id": 360,
3248
+ "content": "<0xFE>",
3249
+ "single_word": false,
3250
+ "lstrip": false,
3251
+ "rstrip": false,
3252
+ "normalized": false,
3253
+ "special": true
3254
+ },
3255
+ {
3256
+ "id": 361,
3257
+ "content": "<0xFF>",
3258
+ "single_word": false,
3259
+ "lstrip": false,
3260
+ "rstrip": false,
3261
+ "normalized": false,
3262
+ "special": true
3263
+ }
3264
+ ],
3265
+ "normalizer": {
3266
+ "type": "Sequence",
3267
+ "normalizers": [
3268
+ {
3269
+ "type": "Prepend",
3270
+ "prepend": "▁"
3271
+ },
3272
+ {
3273
+ "type": "Replace",
3274
+ "pattern": {
3275
+ "String": " "
3276
+ },
3277
+ "content": "▁"
3278
+ }
3279
+ ]
3280
+ },
3281
+ "pre_tokenizer": null,
3282
+ "post_processor": {
3283
+ "type": "TemplateProcessing",
3284
+ "single": [
3285
+ {
3286
+ "SpecialToken": {
3287
+ "id": "<s>",
3288
+ "type_id": 0
3289
+ }
3290
+ },
3291
+ {
3292
+ "Sequence": {
3293
+ "id": "A",
3294
+ "type_id": 0
3295
+ }
3296
+ }
3297
+ ],
3298
+ "pair": [
3299
+ {
3300
+ "SpecialToken": {
3301
+ "id": "<s>",
3302
+ "type_id": 0
3303
+ }
3304
+ },
3305
+ {
3306
+ "Sequence": {
3307
+ "id": "A",
3308
+ "type_id": 0
3309
+ }
3310
+ },
3311
+ {
3312
+ "SpecialToken": {
3313
+ "id": "<s>",
3314
+ "type_id": 0
3315
  }
3316
  },
3317
  {
tokenizer_config.json CHANGED
@@ -1,867 +1,12 @@
1
  {
2
- "add_bos_token": true,
3
- "add_eos_token": false,
4
- "add_prefix_space": null,
5
- "added_tokens_decoder": {
6
- "0": {
7
- "content": "<unk>",
8
- "lstrip": false,
9
- "normalized": false,
10
- "rstrip": false,
11
- "single_word": false,
12
- "special": true
13
- },
14
- "1": {
15
- "content": "<s>",
16
- "lstrip": false,
17
- "normalized": false,
18
- "rstrip": false,
19
- "single_word": false,
20
- "special": true
21
- },
22
- "2": {
23
- "content": "</s>",
24
- "lstrip": false,
25
- "normalized": false,
26
- "rstrip": false,
27
- "single_word": false,
28
- "special": true
29
- },
30
- "3": {
31
- "content": "<pad>",
32
- "lstrip": false,
33
- "normalized": false,
34
- "rstrip": false,
35
- "single_word": false,
36
- "special": true
37
- },
38
- "4": {
39
- "content": "<|im_start|>",
40
- "lstrip": false,
41
- "normalized": false,
42
- "rstrip": false,
43
- "single_word": false,
44
- "special": false
45
- },
46
- "5": {
47
- "content": "<|im_end|>",
48
- "lstrip": false,
49
- "normalized": false,
50
- "rstrip": false,
51
- "single_word": false,
52
- "special": false
53
- },
54
- "6": {
55
- "content": "<|im_sp_00|>",
56
- "lstrip": false,
57
- "normalized": false,
58
- "rstrip": false,
59
- "single_word": false,
60
- "special": false
61
- },
62
- "7": {
63
- "content": "<|im_sp_01|>",
64
- "lstrip": false,
65
- "normalized": false,
66
- "rstrip": false,
67
- "single_word": false,
68
- "special": false
69
- },
70
- "8": {
71
- "content": "<|im_sp_02|>",
72
- "lstrip": false,
73
- "normalized": false,
74
- "rstrip": false,
75
- "single_word": false,
76
- "special": false
77
- },
78
- "9": {
79
- "content": "<|im_sp_03|>",
80
- "lstrip": false,
81
- "normalized": false,
82
- "rstrip": false,
83
- "single_word": false,
84
- "special": false
85
- },
86
- "10": {
87
- "content": "<|im_sp_04|>",
88
- "lstrip": false,
89
- "normalized": false,
90
- "rstrip": false,
91
- "single_word": false,
92
- "special": false
93
- },
94
- "11": {
95
- "content": "<|im_sp_05|>",
96
- "lstrip": false,
97
- "normalized": false,
98
- "rstrip": false,
99
- "single_word": false,
100
- "special": false
101
- },
102
- "12": {
103
- "content": "<|im_sp_06|>",
104
- "lstrip": false,
105
- "normalized": false,
106
- "rstrip": false,
107
- "single_word": false,
108
- "special": false
109
- },
110
- "13": {
111
- "content": "<|im_sp_07|>",
112
- "lstrip": false,
113
- "normalized": false,
114
- "rstrip": false,
115
- "single_word": false,
116
- "special": false
117
- },
118
- "14": {
119
- "content": "<|im_sp_08|>",
120
- "lstrip": false,
121
- "normalized": false,
122
- "rstrip": false,
123
- "single_word": false,
124
- "special": false
125
- },
126
- "15": {
127
- "content": "<|im_sp_09|>",
128
- "lstrip": false,
129
- "normalized": false,
130
- "rstrip": false,
131
- "single_word": false,
132
- "special": false
133
- },
134
- "16": {
135
- "content": "<|im_sp_10|>",
136
- "lstrip": false,
137
- "normalized": false,
138
- "rstrip": false,
139
- "single_word": false,
140
- "special": false
141
- },
142
- "17": {
143
- "content": "<|im_sp_11|>",
144
- "lstrip": false,
145
- "normalized": false,
146
- "rstrip": false,
147
- "single_word": false,
148
- "special": false
149
- },
150
- "18": {
151
- "content": "<|im_sp_12|>",
152
- "lstrip": false,
153
- "normalized": false,
154
- "rstrip": false,
155
- "single_word": false,
156
- "special": false
157
- },
158
- "19": {
159
- "content": "<|im_sp_13|>",
160
- "lstrip": false,
161
- "normalized": false,
162
- "rstrip": false,
163
- "single_word": false,
164
- "special": false
165
- },
166
- "20": {
167
- "content": "<|im_sp_14|>",
168
- "lstrip": false,
169
- "normalized": false,
170
- "rstrip": false,
171
- "single_word": false,
172
- "special": false
173
- },
174
- "21": {
175
- "content": "<|im_sp_15|>",
176
- "lstrip": false,
177
- "normalized": false,
178
- "rstrip": false,
179
- "single_word": false,
180
- "special": false
181
- },
182
- "22": {
183
- "content": "<|im_sp_16|>",
184
- "lstrip": false,
185
- "normalized": false,
186
- "rstrip": false,
187
- "single_word": false,
188
- "special": false
189
- },
190
- "23": {
191
- "content": "<|im_sp_17|>",
192
- "lstrip": false,
193
- "normalized": false,
194
- "rstrip": false,
195
- "single_word": false,
196
- "special": false
197
- },
198
- "24": {
199
- "content": "<|im_sp_18|>",
200
- "lstrip": false,
201
- "normalized": false,
202
- "rstrip": false,
203
- "single_word": false,
204
- "special": false
205
- },
206
- "25": {
207
- "content": "<|im_sp_19|>",
208
- "lstrip": false,
209
- "normalized": false,
210
- "rstrip": false,
211
- "single_word": false,
212
- "special": false
213
- },
214
- "26": {
215
- "content": "<|im_sp_20|>",
216
- "lstrip": false,
217
- "normalized": false,
218
- "rstrip": false,
219
- "single_word": false,
220
- "special": false
221
- },
222
- "27": {
223
- "content": "<|im_sp_21|>",
224
- "lstrip": false,
225
- "normalized": false,
226
- "rstrip": false,
227
- "single_word": false,
228
- "special": false
229
- },
230
- "28": {
231
- "content": "<|im_sp_22|>",
232
- "lstrip": false,
233
- "normalized": false,
234
- "rstrip": false,
235
- "single_word": false,
236
- "special": false
237
- },
238
- "29": {
239
- "content": "<|im_sp_23|>",
240
- "lstrip": false,
241
- "normalized": false,
242
- "rstrip": false,
243
- "single_word": false,
244
- "special": false
245
- },
246
- "30": {
247
- "content": "<|im_sp_24|>",
248
- "lstrip": false,
249
- "normalized": false,
250
- "rstrip": false,
251
- "single_word": false,
252
- "special": false
253
- },
254
- "31": {
255
- "content": "<|im_sp_25|>",
256
- "lstrip": false,
257
- "normalized": false,
258
- "rstrip": false,
259
- "single_word": false,
260
- "special": false
261
- },
262
- "32": {
263
- "content": "<|im_sp_26|>",
264
- "lstrip": false,
265
- "normalized": false,
266
- "rstrip": false,
267
- "single_word": false,
268
- "special": false
269
- },
270
- "33": {
271
- "content": "<|im_sp_27|>",
272
- "lstrip": false,
273
- "normalized": false,
274
- "rstrip": false,
275
- "single_word": false,
276
- "special": false
277
- },
278
- "34": {
279
- "content": "<|im_sp_28|>",
280
- "lstrip": false,
281
- "normalized": false,
282
- "rstrip": false,
283
- "single_word": false,
284
- "special": false
285
- },
286
- "35": {
287
- "content": "<|im_sp_29|>",
288
- "lstrip": false,
289
- "normalized": false,
290
- "rstrip": false,
291
- "single_word": false,
292
- "special": false
293
- },
294
- "36": {
295
- "content": "<|im_sp_30|>",
296
- "lstrip": false,
297
- "normalized": false,
298
- "rstrip": false,
299
- "single_word": false,
300
- "special": false
301
- },
302
- "37": {
303
- "content": "<|im_sp_31|>",
304
- "lstrip": false,
305
- "normalized": false,
306
- "rstrip": false,
307
- "single_word": false,
308
- "special": false
309
- },
310
- "38": {
311
- "content": "<|im_sp_32|>",
312
- "lstrip": false,
313
- "normalized": false,
314
- "rstrip": false,
315
- "single_word": false,
316
- "special": false
317
- },
318
- "39": {
319
- "content": "<|im_sp_33|>",
320
- "lstrip": false,
321
- "normalized": false,
322
- "rstrip": false,
323
- "single_word": false,
324
- "special": false
325
- },
326
- "40": {
327
- "content": "<|im_sp_34|>",
328
- "lstrip": false,
329
- "normalized": false,
330
- "rstrip": false,
331
- "single_word": false,
332
- "special": false
333
- },
334
- "41": {
335
- "content": "<|im_sp_35|>",
336
- "lstrip": false,
337
- "normalized": false,
338
- "rstrip": false,
339
- "single_word": false,
340
- "special": false
341
- },
342
- "42": {
343
- "content": "<|im_sp_36|>",
344
- "lstrip": false,
345
- "normalized": false,
346
- "rstrip": false,
347
- "single_word": false,
348
- "special": false
349
- },
350
- "43": {
351
- "content": "<|im_sp_37|>",
352
- "lstrip": false,
353
- "normalized": false,
354
- "rstrip": false,
355
- "single_word": false,
356
- "special": false
357
- },
358
- "44": {
359
- "content": "<|im_sp_38|>",
360
- "lstrip": false,
361
- "normalized": false,
362
- "rstrip": false,
363
- "single_word": false,
364
- "special": false
365
- },
366
- "45": {
367
- "content": "<|im_sp_39|>",
368
- "lstrip": false,
369
- "normalized": false,
370
- "rstrip": false,
371
- "single_word": false,
372
- "special": false
373
- },
374
- "46": {
375
- "content": "<|im_sp_40|>",
376
- "lstrip": false,
377
- "normalized": false,
378
- "rstrip": false,
379
- "single_word": false,
380
- "special": false
381
- },
382
- "47": {
383
- "content": "<|im_sp_41|>",
384
- "lstrip": false,
385
- "normalized": false,
386
- "rstrip": false,
387
- "single_word": false,
388
- "special": false
389
- },
390
- "48": {
391
- "content": "<|im_sp_42|>",
392
- "lstrip": false,
393
- "normalized": false,
394
- "rstrip": false,
395
- "single_word": false,
396
- "special": false
397
- },
398
- "49": {
399
- "content": "<|im_sp_43|>",
400
- "lstrip": false,
401
- "normalized": false,
402
- "rstrip": false,
403
- "single_word": false,
404
- "special": false
405
- },
406
- "50": {
407
- "content": "<|im_sp_44|>",
408
- "lstrip": false,
409
- "normalized": false,
410
- "rstrip": false,
411
- "single_word": false,
412
- "special": false
413
- },
414
- "51": {
415
- "content": "<|im_sp_45|>",
416
- "lstrip": false,
417
- "normalized": false,
418
- "rstrip": false,
419
- "single_word": false,
420
- "special": false
421
- },
422
- "52": {
423
- "content": "<|im_sp_46|>",
424
- "lstrip": false,
425
- "normalized": false,
426
- "rstrip": false,
427
- "single_word": false,
428
- "special": false
429
- },
430
- "53": {
431
- "content": "<|im_sp_47|>",
432
- "lstrip": false,
433
- "normalized": false,
434
- "rstrip": false,
435
- "single_word": false,
436
- "special": false
437
- },
438
- "54": {
439
- "content": "<|im_sp_48|>",
440
- "lstrip": false,
441
- "normalized": false,
442
- "rstrip": false,
443
- "single_word": false,
444
- "special": false
445
- },
446
- "55": {
447
- "content": "<|im_sp_49|>",
448
- "lstrip": false,
449
- "normalized": false,
450
- "rstrip": false,
451
- "single_word": false,
452
- "special": false
453
- },
454
- "56": {
455
- "content": "<|im_sp_50|>",
456
- "lstrip": false,
457
- "normalized": false,
458
- "rstrip": false,
459
- "single_word": false,
460
- "special": false
461
- },
462
- "57": {
463
- "content": "<|im_sp_51|>",
464
- "lstrip": false,
465
- "normalized": false,
466
- "rstrip": false,
467
- "single_word": false,
468
- "special": false
469
- },
470
- "58": {
471
- "content": "<|im_sp_52|>",
472
- "lstrip": false,
473
- "normalized": false,
474
- "rstrip": false,
475
- "single_word": false,
476
- "special": false
477
- },
478
- "59": {
479
- "content": "<|im_sp_53|>",
480
- "lstrip": false,
481
- "normalized": false,
482
- "rstrip": false,
483
- "single_word": false,
484
- "special": false
485
- },
486
- "60": {
487
- "content": "<|im_sp_54|>",
488
- "lstrip": false,
489
- "normalized": false,
490
- "rstrip": false,
491
- "single_word": false,
492
- "special": false
493
- },
494
- "61": {
495
- "content": "<|im_sp_55|>",
496
- "lstrip": false,
497
- "normalized": false,
498
- "rstrip": false,
499
- "single_word": false,
500
- "special": false
501
- },
502
- "62": {
503
- "content": "<|im_sp_56|>",
504
- "lstrip": false,
505
- "normalized": false,
506
- "rstrip": false,
507
- "single_word": false,
508
- "special": false
509
- },
510
- "63": {
511
- "content": "<|im_sp_57|>",
512
- "lstrip": false,
513
- "normalized": false,
514
- "rstrip": false,
515
- "single_word": false,
516
- "special": false
517
- },
518
- "64": {
519
- "content": "<|im_sp_58|>",
520
- "lstrip": false,
521
- "normalized": false,
522
- "rstrip": false,
523
- "single_word": false,
524
- "special": false
525
- },
526
- "65": {
527
- "content": "<|im_sp_59|>",
528
- "lstrip": false,
529
- "normalized": false,
530
- "rstrip": false,
531
- "single_word": false,
532
- "special": false
533
- },
534
- "66": {
535
- "content": "<|im_sp_60|>",
536
- "lstrip": false,
537
- "normalized": false,
538
- "rstrip": false,
539
- "single_word": false,
540
- "special": false
541
- },
542
- "67": {
543
- "content": "<|im_sp_61|>",
544
- "lstrip": false,
545
- "normalized": false,
546
- "rstrip": false,
547
- "single_word": false,
548
- "special": false
549
- },
550
- "68": {
551
- "content": "<|im_sp_62|>",
552
- "lstrip": false,
553
- "normalized": false,
554
- "rstrip": false,
555
- "single_word": false,
556
- "special": false
557
- },
558
- "69": {
559
- "content": "<|im_sp_63|>",
560
- "lstrip": false,
561
- "normalized": false,
562
- "rstrip": false,
563
- "single_word": false,
564
- "special": false
565
- },
566
- "70": {
567
- "content": "<|im_sp_64|>",
568
- "lstrip": false,
569
- "normalized": false,
570
- "rstrip": false,
571
- "single_word": false,
572
- "special": false
573
- },
574
- "71": {
575
- "content": "<|im_sp_65|>",
576
- "lstrip": false,
577
- "normalized": false,
578
- "rstrip": false,
579
- "single_word": false,
580
- "special": false
581
- },
582
- "72": {
583
- "content": "<|im_sp_66|>",
584
- "lstrip": false,
585
- "normalized": false,
586
- "rstrip": false,
587
- "single_word": false,
588
- "special": false
589
- },
590
- "73": {
591
- "content": "<|im_sp_67|>",
592
- "lstrip": false,
593
- "normalized": false,
594
- "rstrip": false,
595
- "single_word": false,
596
- "special": false
597
- },
598
- "74": {
599
- "content": "<|im_sp_68|>",
600
- "lstrip": false,
601
- "normalized": false,
602
- "rstrip": false,
603
- "single_word": false,
604
- "special": false
605
- },
606
- "75": {
607
- "content": "<|im_sp_69|>",
608
- "lstrip": false,
609
- "normalized": false,
610
- "rstrip": false,
611
- "single_word": false,
612
- "special": false
613
- },
614
- "76": {
615
- "content": "<|im_sp_70|>",
616
- "lstrip": false,
617
- "normalized": false,
618
- "rstrip": false,
619
- "single_word": false,
620
- "special": false
621
- },
622
- "77": {
623
- "content": "<|im_sp_71|>",
624
- "lstrip": false,
625
- "normalized": false,
626
- "rstrip": false,
627
- "single_word": false,
628
- "special": false
629
- },
630
- "78": {
631
- "content": "<|im_sp_72|>",
632
- "lstrip": false,
633
- "normalized": false,
634
- "rstrip": false,
635
- "single_word": false,
636
- "special": false
637
- },
638
- "79": {
639
- "content": "<|im_sp_73|>",
640
- "lstrip": false,
641
- "normalized": false,
642
- "rstrip": false,
643
- "single_word": false,
644
- "special": false
645
- },
646
- "80": {
647
- "content": "<|im_sp_74|>",
648
- "lstrip": false,
649
- "normalized": false,
650
- "rstrip": false,
651
- "single_word": false,
652
- "special": false
653
- },
654
- "81": {
655
- "content": "<|im_sp_75|>",
656
- "lstrip": false,
657
- "normalized": false,
658
- "rstrip": false,
659
- "single_word": false,
660
- "special": false
661
- },
662
- "82": {
663
- "content": "<|im_sp_76|>",
664
- "lstrip": false,
665
- "normalized": false,
666
- "rstrip": false,
667
- "single_word": false,
668
- "special": false
669
- },
670
- "83": {
671
- "content": "<|im_sp_77|>",
672
- "lstrip": false,
673
- "normalized": false,
674
- "rstrip": false,
675
- "single_word": false,
676
- "special": false
677
- },
678
- "84": {
679
- "content": "<|im_sp_78|>",
680
- "lstrip": false,
681
- "normalized": false,
682
- "rstrip": false,
683
- "single_word": false,
684
- "special": false
685
- },
686
- "85": {
687
- "content": "<|im_sp_79|>",
688
- "lstrip": false,
689
- "normalized": false,
690
- "rstrip": false,
691
- "single_word": false,
692
- "special": false
693
- },
694
- "86": {
695
- "content": "<|im_sp_80|>",
696
- "lstrip": false,
697
- "normalized": false,
698
- "rstrip": false,
699
- "single_word": false,
700
- "special": false
701
- },
702
- "87": {
703
- "content": "<|im_sp_81|>",
704
- "lstrip": false,
705
- "normalized": false,
706
- "rstrip": false,
707
- "single_word": false,
708
- "special": false
709
- },
710
- "88": {
711
- "content": "<|im_sp_82|>",
712
- "lstrip": false,
713
- "normalized": false,
714
- "rstrip": false,
715
- "single_word": false,
716
- "special": false
717
- },
718
- "89": {
719
- "content": "<|im_sp_83|>",
720
- "lstrip": false,
721
- "normalized": false,
722
- "rstrip": false,
723
- "single_word": false,
724
- "special": false
725
- },
726
- "90": {
727
- "content": "<|im_sp_84|>",
728
- "lstrip": false,
729
- "normalized": false,
730
- "rstrip": false,
731
- "single_word": false,
732
- "special": false
733
- },
734
- "91": {
735
- "content": "<|im_sp_85|>",
736
- "lstrip": false,
737
- "normalized": false,
738
- "rstrip": false,
739
- "single_word": false,
740
- "special": false
741
- },
742
- "92": {
743
- "content": "<|im_sp_86|>",
744
- "lstrip": false,
745
- "normalized": false,
746
- "rstrip": false,
747
- "single_word": false,
748
- "special": false
749
- },
750
- "93": {
751
- "content": "<|im_sp_87|>",
752
- "lstrip": false,
753
- "normalized": false,
754
- "rstrip": false,
755
- "single_word": false,
756
- "special": false
757
- },
758
- "94": {
759
- "content": "<|im_sp_88|>",
760
- "lstrip": false,
761
- "normalized": false,
762
- "rstrip": false,
763
- "single_word": false,
764
- "special": false
765
- },
766
- "95": {
767
- "content": "<|im_sp_89|>",
768
- "lstrip": false,
769
- "normalized": false,
770
- "rstrip": false,
771
- "single_word": false,
772
- "special": false
773
- },
774
- "96": {
775
- "content": "<|im_sp_90|>",
776
- "lstrip": false,
777
- "normalized": false,
778
- "rstrip": false,
779
- "single_word": false,
780
- "special": false
781
- },
782
- "97": {
783
- "content": "<|im_sp_91|>",
784
- "lstrip": false,
785
- "normalized": false,
786
- "rstrip": false,
787
- "single_word": false,
788
- "special": false
789
- },
790
- "98": {
791
- "content": "<|im_sp_92|>",
792
- "lstrip": false,
793
- "normalized": false,
794
- "rstrip": false,
795
- "single_word": false,
796
- "special": false
797
- },
798
- "99": {
799
- "content": "<|im_sp_93|>",
800
- "lstrip": false,
801
- "normalized": false,
802
- "rstrip": false,
803
- "single_word": false,
804
- "special": false
805
- },
806
- "100": {
807
- "content": "<|im_sp_94|>",
808
- "lstrip": false,
809
- "normalized": false,
810
- "rstrip": false,
811
- "single_word": false,
812
- "special": false
813
- },
814
- "101": {
815
- "content": "<|im_sp_95|>",
816
- "lstrip": false,
817
- "normalized": false,
818
- "rstrip": false,
819
- "single_word": false,
820
- "special": false
821
- },
822
- "102": {
823
- "content": "<|im_sp_96|>",
824
- "lstrip": false,
825
- "normalized": false,
826
- "rstrip": false,
827
- "single_word": false,
828
- "special": false
829
- },
830
- "103": {
831
- "content": "<|im_sp_97|>",
832
- "lstrip": false,
833
- "normalized": false,
834
- "rstrip": false,
835
- "single_word": false,
836
- "special": false
837
- },
838
- "104": {
839
- "content": "<|im_sp_98|>",
840
- "lstrip": false,
841
- "normalized": false,
842
- "rstrip": false,
843
- "single_word": false,
844
- "special": false
845
- },
846
- "105": {
847
- "content": "<|im_sp_99|>",
848
- "lstrip": false,
849
- "normalized": false,
850
- "rstrip": false,
851
- "single_word": false,
852
- "special": false
853
- }
854
- },
855
  "bos_token": "<s>",
856
- "clean_up_tokenization_spaces": false,
857
  "eos_token": "</s>",
858
- "extra_special_tokens": {},
859
- "legacy": true,
860
- "model_max_length": 1000000000000000019884624838656,
861
- "pad_token": null,
862
- "sp_model_kwargs": {},
863
- "spaces_between_special_tokens": false,
864
- "tokenizer_class": "LlamaTokenizer",
865
  "unk_token": "<unk>",
866
- "use_default_system_prompt": false
867
- }
1
  {
2
+ "tokenizer_class": "PreTrainedTokenizerFast",
3
+ "legacy": false,
4
+ "model_max_length": 4096,
5
  "bos_token": "<s>",
6
  "eos_token": "</s>",
7
  "unk_token": "<unk>",
8
+ "pad_token": "<pad>",
9
+ "add_bos_token": true,
10
+ "add_eos_token": false,
11
+ "clean_up_tokenization_spaces": false
12
+ }
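
To make the net effect of this hunk easier to read, here is a minimal sketch that compares the top-level keys before and after the change. Both JSON snippets are transcribed by hand from the diff lines above rather than loaded from the repository, and `OLD_TAIL` covers only the scalar keys visible at the end of the removed block (the large per-token table removed earlier in the file is omitted), so treat the names `OLD_TAIL` and `NEW` as illustrative.

```python
import json

# Top-level keys of the old tokenizer_config.json, as far as they are visible
# in the removed ("-") and unchanged lines of this hunk. Assumption: the
# removed per-token table is intentionally left out of this comparison.
OLD_TAIL = json.loads("""
{
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "extra_special_tokens": {},
  "legacy": true,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": false
}
""")

# New tokenizer_config.json, assembled from the added ("+") and unchanged lines.
NEW = json.loads("""
{
  "tokenizer_class": "PreTrainedTokenizerFast",
  "legacy": false,
  "model_max_length": 4096,
  "bos_token": "<s>",
  "eos_token": "</s>",
  "unk_token": "<unk>",
  "pad_token": "<pad>",
  "add_bos_token": true,
  "add_eos_token": false,
  "clean_up_tokenization_spaces": false
}
""")

# Keys present on both sides whose value differs.
changed = {
    key: (OLD_TAIL[key], NEW[key])
    for key in OLD_TAIL.keys() & NEW.keys()
    if OLD_TAIL[key] != NEW[key]
}
# Keys that exist only on one side.
removed = sorted(OLD_TAIL.keys() - NEW.keys())
added = sorted(NEW.keys() - OLD_TAIL.keys())

for key in sorted(changed):
    before, after = changed[key]
    print(f"changed {key}: {before!r} -> {after!r}")
print("removed:", removed)
print("added:", added)
```

Read this way, the hunk switches the tokenizer class, sets a finite `model_max_length`, defines a pad token, and makes BOS/EOS insertion explicit, while the three special tokens `<s>`, `</s>` and `<unk>` stay unchanged.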