iszoke committed on
Commit 7688296 · verified · 1 Parent(s): baabc5d

Upload tokenizer

Files changed (4)
  1. README.md +199 -0
  2. special_tokens_map.json +7 -0
  3. tokenizer.json +2130 -0
  4. tokenizer_config.json +52 -0
README.md ADDED
@@ -0,0 +1,199 @@
+ ---
+ library_name: transformers
+ tags: []
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "bos_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "pad_token": "<pad>",
+   "unk_token": "<unk>"
+ }
tokenizer.json ADDED
@@ -0,0 +1,2130 @@
1
+ {
2
+ "version": "1.0",
3
+ "truncation": null,
4
+ "padding": null,
5
+ "added_tokens": [
6
+ {
7
+ "id": 0,
8
+ "content": "<s>",
9
+ "single_word": false,
10
+ "lstrip": false,
11
+ "rstrip": false,
12
+ "normalized": false,
13
+ "special": true
14
+ },
15
+ {
16
+ "id": 1,
17
+ "content": "</s>",
18
+ "single_word": false,
19
+ "lstrip": false,
20
+ "rstrip": false,
21
+ "normalized": false,
22
+ "special": true
23
+ },
24
+ {
25
+ "id": 2,
26
+ "content": "<unk>",
27
+ "single_word": false,
28
+ "lstrip": false,
29
+ "rstrip": false,
30
+ "normalized": false,
31
+ "special": true
32
+ },
33
+ {
34
+ "id": 3,
35
+ "content": "<pad>",
36
+ "single_word": false,
37
+ "lstrip": false,
38
+ "rstrip": false,
39
+ "normalized": false,
40
+ "special": true
41
+ },
42
+ {
43
+ "id": 4,
44
+ "content": "<mask>",
45
+ "single_word": false,
46
+ "lstrip": false,
47
+ "rstrip": false,
48
+ "normalized": false,
49
+ "special": true
50
+ }
51
+ ],
52
+ "normalizer": null,
53
+ "pre_tokenizer": {
54
+ "type": "ByteLevel",
55
+ "add_prefix_space": true,
56
+ "trim_offsets": true,
57
+ "use_regex": true
58
+ },
59
+ "post_processor": {
60
+ "type": "TemplateProcessing",
61
+ "single": [
62
+ {
63
+ "Sequence": {
64
+ "id": "A",
65
+ "type_id": 0
66
+ }
67
+ },
68
+ {
69
+ "SpecialToken": {
70
+ "id": "</s>",
71
+ "type_id": 0
72
+ }
73
+ }
74
+ ],
75
+ "pair": [
76
+ {
77
+ "Sequence": {
78
+ "id": "A",
79
+ "type_id": 0
80
+ }
81
+ },
82
+ {
83
+ "SpecialToken": {
84
+ "id": "</s>",
85
+ "type_id": 0
86
+ }
87
+ },
88
+ {
89
+ "Sequence": {
90
+ "id": "B",
91
+ "type_id": 1
92
+ }
93
+ },
94
+ {
95
+ "SpecialToken": {
96
+ "id": "</s>",
97
+ "type_id": 1
98
+ }
99
+ }
100
+ ],
101
+ "special_tokens": {
102
+ "</s>": {
103
+ "id": "</s>",
104
+ "ids": [
105
+ 1
106
+ ],
107
+ "tokens": [
108
+ "</s>"
109
+ ]
110
+ },
111
+ "<s>": {
112
+ "id": "<s>",
113
+ "ids": [
114
+ 0
115
+ ],
116
+ "tokens": [
117
+ "<s>"
118
+ ]
119
+ }
120
+ }
121
+ },
122
+ "decoder": {
123
+ "type": "ByteLevel",
124
+ "add_prefix_space": true,
125
+ "trim_offsets": true,
126
+ "use_regex": true
127
+ },
128
+ "model": {
129
+ "type": "BPE",
130
+ "dropout": null,
131
+ "unk_token": "<unk>",
132
+ "continuing_subword_prefix": null,
133
+ "end_of_word_suffix": null,
134
+ "fuse_unk": false,
135
+ "byte_fallback": false,
136
+ "ignore_merges": false,
137
+ "vocab": {
138
+ "<s>": 0,
139
+ "</s>": 1,
140
+ "<unk>": 2,
141
+ "<pad>": 3,
142
+ "<mask>": 4,
143
+ "!": 5,
144
+ "%": 6,
145
+ "'": 7,
146
+ "(": 8,
147
+ ")": 9,
148
+ "+": 10,
149
+ ",": 11,
150
+ "-": 12,
151
+ ".": 13,
152
+ "/": 14,
153
+ "0": 15,
154
+ "1": 16,
155
+ "2": 17,
156
+ "3": 18,
157
+ "4": 19,
158
+ "5": 20,
159
+ "6": 21,
160
+ "7": 22,
161
+ "8": 23,
162
+ "9": 24,
163
+ ":": 25,
164
+ "<": 26,
165
+ ">": 27,
166
+ "?": 28,
167
+ "A": 29,
168
+ "B": 30,
169
+ "C": 31,
170
+ "D": 32,
171
+ "E": 33,
172
+ "F": 34,
173
+ "G": 35,
174
+ "H": 36,
175
+ "I": 37,
176
+ "J": 38,
177
+ "K": 39,
178
+ "L": 40,
179
+ "M": 41,
180
+ "N": 42,
181
+ "O": 43,
182
+ "P": 44,
183
+ "Q": 45,
184
+ "R": 46,
185
+ "S": 47,
186
+ "T": 48,
187
+ "U": 49,
188
+ "V": 50,
189
+ "W": 51,
190
+ "X": 52,
191
+ "Y": 53,
192
+ "Z": 54,
193
+ "a": 55,
194
+ "b": 56,
195
+ "c": 57,
196
+ "d": 58,
197
+ "e": 59,
198
+ "f": 60,
199
+ "g": 61,
200
+ "h": 62,
201
+ "i": 63,
202
+ "j": 64,
203
+ "k": 65,
204
+ "l": 66,
205
+ "m": 67,
206
+ "n": 68,
207
+ "o": 69,
208
+ "p": 70,
209
+ "q": 71,
210
+ "r": 72,
211
+ "s": 73,
212
+ "t": 74,
213
+ "u": 75,
214
+ "v": 76,
215
+ "w": 77,
216
+ "x": 78,
217
+ "y": 79,
218
+ "z": 80,
219
+ "¡": 81,
220
+ "£": 82,
221
+ "¤": 83,
222
+ "¥": 84,
223
+ "§": 85,
224
+ "¨": 86,
225
+ "©": 87,
226
+ "ª": 88,
227
+ "«": 89,
228
+ "®": 90,
229
+ "¯": 91,
230
+ "±": 92,
231
+ "³": 93,
232
+ "´": 94,
233
+ "¶": 95,
234
+ "º": 96,
235
+ "»": 97,
236
+ "¼": 98,
237
+ "½": 99,
238
+ "¾": 100,
239
+ "Â": 101,
240
+ "Ã": 102,
241
+ "Ä": 103,
242
+ "Å": 104,
243
+ "È": 105,
244
+ "â": 106,
245
+ "Ġ": 107,
246
+ "Ģ": 108,
247
+ "ģ": 109,
248
+ "Ĥ": 110,
249
+ "ĥ": 111,
250
+ "Ħ": 112,
251
+ "ĩ": 113,
252
+ "Ī": 114,
253
+ "Į": 115,
254
+ "į": 116,
255
+ "İ": 117,
256
+ "ı": 118,
257
+ "ĵ": 119,
258
+ "ĺ": 120,
259
+ "Ļ": 121,
260
+ "ļ": 122,
261
+ "Ľ": 123,
262
+ "ŀ": 124,
263
+ "Ł": 125,
264
+ "ł": 126,
265
+ "Ń": 127,
266
+ "ÃŃ": 128,
267
+ "á": 129,
268
+ "Ġp": 130,
269
+ "Ġ,": 131,
270
+ "Ġt": 132,
271
+ "Ġs": 133,
272
+ "ÄĽ": 134,
273
+ "Ġn": 135,
274
+ "Ġv": 136,
275
+ "Ġa": 137,
276
+ "Ġj": 138,
277
+ "é": 139,
278
+ "ÅĻ": 140,
279
+ "ro": 141,
280
+ "ž": 142,
281
+ "nÃŃ": 143,
282
+ "ch": 144,
283
+ "Ġz": 145,
284
+ "Ġd": 146,
285
+ "Ġm": 147,
286
+ "st": 148,
287
+ "Ġk": 149,
288
+ "ov": 150,
289
+ "le": 151,
290
+ "Ġo": 152,
291
+ "Å¡": 153,
292
+ "Äį": 154,
293
+ "ý": 155,
294
+ "li": 156,
295
+ "Ġpo": 157,
296
+ "me": 158,
297
+ "at": 159,
298
+ "Ġb": 160,
299
+ "Ġ.": 161,
300
+ "Ġje": 162,
301
+ "Ġto": 163,
302
+ "ou": 164,
303
+ "ÅĻe": 165,
304
+ "Ġna": 166,
305
+ "že": 167,
306
+ "te": 168,
307
+ "ak": 169,
308
+ "an": 170,
309
+ "ů": 171,
310
+ "la": 172,
311
+ "Ġpro": 173,
312
+ "it": 174,
313
+ "Ġne": 175,
314
+ "ce": 176,
315
+ "sk": 177,
316
+ "Ġse": 178,
317
+ "to": 179,
318
+ "in": 180,
319
+ "ho": 181,
320
+ "no": 182,
321
+ "ra": 183,
322
+ "rá": 184,
323
+ "de": 185,
324
+ "ku": 186,
325
+ "nÄĽ": 187,
326
+ "un": 188,
327
+ "em": 189,
328
+ "ci": 190,
329
+ "Ġza": 191,
330
+ "Ġže": 192,
331
+ "Ġdo": 193,
332
+ "Ġtak": 194,
333
+ "ÅĻÃŃ": 195,
334
+ "dy": 196,
335
+ "po": 197,
336
+ "mi": 198,
337
+ "en": 199,
338
+ "Ġu": 200,
339
+ "Ġko": 201,
340
+ "Ġby": 202,
341
+ "Ġkte": 203,
342
+ "ji": 204,
343
+ "unk": 205,
344
+ "Ġ<": 206,
345
+ "lo": 207,
346
+ "Ġjs": 208,
347
+ "rop": 209,
348
+ "je": 210,
349
+ "vrop": 211,
350
+ "by": 212,
351
+ "ĠE": 213,
352
+ "ĠpÅĻe": 214,
353
+ "sÃŃ": 215,
354
+ "Ġst": 216,
355
+ "jÃŃ": 217,
356
+ "ti": 218,
357
+ "Å¡ÃŃ": 219,
358
+ "Ġvy": 220,
359
+ "se": 221,
360
+ "ĠdÄĽ": 222,
361
+ "ých": 223,
362
+ "ni": 224,
363
+ "Ġná": 225,
364
+ "né": 226,
365
+ "Ġbu": 227,
366
+ "ÅĻi": 228,
367
+ "Ġve": 229,
368
+ "ky": 230,
369
+ "Ġaby": 231,
370
+ "Ġob": 232,
371
+ "ná": 233,
372
+ "še": 234,
373
+ "re": 235,
374
+ "ovat": 236,
375
+ "ĠÄį": 237,
376
+ "cÃŃ": 238,
377
+ "ĠEvrop": 239,
378
+ "ru": 240,
379
+ "Ġh": 241,
380
+ "Ġro": 242,
381
+ "na": 243,
382
+ "vo": 244,
383
+ "Ġe": 245,
384
+ "ko": 246,
385
+ "Ġmu": 247,
386
+ "Ġjak": 248,
387
+ "nost": 249,
388
+ "ské": 250,
389
+ "ĠdÄĽku": 251,
390
+ "Ġkter": 252,
391
+ "Ġpa": 253,
392
+ "Ġpod": 254,
393
+ "vá": 255,
394
+ "ck": 256,
395
+ "ĠdÄĽkuji": 257,
396
+ "al": 258,
397
+ "ĠpÅĻed": 259,
398
+ "Ġvá": 260,
399
+ "ne": 261,
400
+ "cho": 262,
401
+ "ová": 263,
402
+ "mÄĽ": 264,
403
+ "ĠmusÃŃ": 265,
404
+ "du": 266,
405
+ "Ġzá": 267,
406
+ "Ġta": 268,
407
+ "dÄĽ": 269,
408
+ "ráv": 270,
409
+ "Ġch": 271,
410
+ "mu": 272,
411
+ "Ġte": 273,
412
+ "tu": 274,
413
+ "or": 275,
414
+ "Ġmo": 276,
415
+ "ka": 277,
416
+ "lá": 278,
417
+ "tÄĽ": 279,
418
+ "Ġale": 280,
419
+ "ran": 281,
420
+ "da": 282,
421
+ "ÃŃm": 283,
422
+ "ĠpÅĻi": 284,
423
+ "sl": 285,
424
+ "Ġpan": 286,
425
+ "do": 287,
426
+ "Ġtaké": 288,
427
+ "Ġsi": 289,
428
+ "Ġbude": 290,
429
+ "Ġmá": 291,
430
+ "uje": 292,
431
+ "Ġi": 293,
432
+ "ú": 294,
433
+ "mo": 295,
434
+ "ve": 296,
435
+ "ál": 297,
436
+ "co": 298,
437
+ "ny": 299,
438
+ "vÄĽ": 300,
439
+ "Ġkteré": 301,
440
+ "Ġod": 302,
441
+ "dnÃŃ": 303,
442
+ "Ġú": 304,
443
+ "Ġstá": 305,
444
+ "Ġspo": 306,
445
+ "ze": 307,
446
+ "Ġjsou": 308,
447
+ "Ġmy": 309,
448
+ "ÅĻeb": 310,
449
+ "vÃŃ": 311,
450
+ "ĠpÅĻÃŃ": 312,
451
+ "Ġre": 313,
452
+ "ĠmÄĽ": 314,
453
+ "Ġf": 315,
454
+ "ĠnÄĽ": 316,
455
+ "lu": 317,
456
+ "leg": 318,
457
+ "vi": 319,
458
+ "bo": 320,
459
+ "ovo": 321,
460
+ "Ġroz": 322,
461
+ "va": 323,
462
+ "chom": 324,
463
+ "Ġjsme": 325,
464
+ "ly": 326,
465
+ "ové": 327,
466
+ "Ġin": 328,
467
+ "sti": 329,
468
+ "Ġjed": 330,
469
+ "ez": 331,
470
+ "Ġpráv": 332,
471
+ "pra": 333,
472
+ "žit": 334,
473
+ "er": 335,
474
+ "ty": 336,
475
+ "men": 337,
476
+ "ovánÃŃ": 338,
477
+ "ši": 339,
478
+ "ým": 340,
479
+ "Ġproto": 341,
480
+ "hod": 342,
481
+ "ži": 343,
482
+ "Ġco": 344,
483
+ "ĠmusÃŃme": 345,
484
+ "Ġvý": 346,
485
+ "Ġli": 347,
486
+ "ova": 348,
487
+ "ĠpanÃŃ": 349,
488
+ "Ġpot": 350,
489
+ "dá": 351,
490
+ "sa": 352,
491
+ "nu": 353,
492
+ "ká": 354,
493
+ "ÃŃt": 355,
494
+ "ar": 356,
495
+ "lé": 357,
496
+ "Ġkon": 358,
497
+ "Ġté": 359,
498
+ "Ġtady": 360,
499
+ "pe": 361,
500
+ "ĠtÄĽ": 362,
501
+ "si": 363,
502
+ "dou": 364,
503
+ "ri": 365,
504
+ "Ġjá": 366,
505
+ "álnÃŃ": 367,
506
+ "nou": 368,
507
+ "ie": 369,
508
+ "Äįe": 370,
509
+ "Ġdi": 371,
510
+ "ati": 372,
511
+ "vnÃŃ": 373,
512
+ "ĠvÄĽ": 374,
513
+ "Ġce": 375,
514
+ "Ġty": 376,
515
+ "Ġdů": 377,
516
+ "Ġ2": 378,
517
+ "sta": 379,
518
+ "ĠpÅĻedse": 380,
519
+ "Ġkoleg": 381,
520
+ "lÃŃ": 382,
521
+ "dÃŃ": 383,
522
+ "ÅĪ": 384,
523
+ "ma": 385,
524
+ "kla": 386,
525
+ "Ġzem": 387,
526
+ "Ġevrop": 388,
527
+ "ĠvÅ¡e": 389,
528
+ "len": 390,
529
+ "ležit": 391,
530
+ "Ġpane": 392,
531
+ "ĠÅĻe": 393,
532
+ "di": 394,
533
+ "Ġun": 395,
534
+ "stu": 396,
535
+ "Ġsou": 397,
536
+ "ný": 398,
537
+ "Ġpra": 399,
538
+ "Ġnej": 400,
539
+ "nosti": 401,
540
+ "ůže": 402,
541
+ "Ġkdy": 403,
542
+ "cké": 404,
543
+ "vr": 405,
544
+ "my": 406,
545
+ "itu": 407,
546
+ "so": 408,
547
+ "Ġabychom": 409,
548
+ "Äı": 410,
549
+ "Ġ1": 411,
550
+ "Ġjako": 412,
551
+ "ta": 413,
552
+ "ĠpotÅĻeb": 414,
553
+ "ÄįnÃŃ": 415,
554
+ "Ġbý": 416,
555
+ "tÃŃ": 417,
556
+ "ĠP": 418,
557
+ "Ġtedy": 419,
558
+ "Ġsl": 420,
559
+ "pa": 421,
560
+ "Ġpoli": 422,
561
+ "Ġbudou": 423,
562
+ "Ġtom": 424,
563
+ "kra": 425,
564
+ "ĠvÅ¡ech": 426,
565
+ "vé": 427,
566
+ "Ġdal": 428,
567
+ "Ġbýt": 429,
568
+ "ĠK": 430,
569
+ "Ġmi": 431,
570
+ "as": 432,
571
+ "Ġbo": 433,
572
+ "ĠdalÅ¡ÃŃ": 434,
573
+ "leÄį": 435,
574
+ "Äįan": 436,
575
+ "ům": 437,
576
+ "Ġtoho": 438,
577
+ "Ġpoliti": 439,
578
+ "ská": 440,
579
+ "ate": 441,
580
+ "žÃŃ": 442,
581
+ "Ġjsem": 443,
582
+ "sle": 444,
583
+ "Å¡tÄĽ": 445,
584
+ "ĠEvropské": 446,
585
+ "Ġten": 447,
586
+ "za": 448,
587
+ "ĠvÃŃ": 449,
588
+ "mÃŃ": 450,
589
+ "ry": 451,
590
+ "zi": 452,
591
+ "ĠobÄįan": 453,
592
+ "Ġzp": 454,
593
+ "Ġkomi": 455,
594
+ "ĠspoleÄį": 456,
595
+ "Ġpr": 457,
596
+ "Ġtu": 458,
597
+ "tnÃŃ": 459,
598
+ "ĠÅĻÃŃ": 460,
599
+ "sku": 461,
600
+ "Ġmin": 462,
601
+ "tá": 463,
602
+ "Ġvel": 464,
603
+ "lamen": 465,
604
+ "Ġhla": 466,
605
+ "stup": 467,
606
+ "Ġprotože": 468,
607
+ "Ġdá": 469,
608
+ "ských": 470,
609
+ "gi": 471,
610
+ "nit": 472,
611
+ "Ġpar": 473,
612
+ "dky": 474,
613
+ "ynÃŃ": 475,
614
+ "Ġsku": 476,
615
+ "Ġmáme": 477,
616
+ "ĠnynÃŃ": 478,
617
+ "rov": 479,
618
+ "Ġdob": 480,
619
+ "ĠÄįlen": 481,
620
+ "ste": 482,
621
+ "Ġrá": 483,
622
+ "for": 484,
623
+ "ĠS": 485,
624
+ "ĠnaÅ¡e": 486,
625
+ "Ġvám": 487,
626
+ "lat": 488,
627
+ "Ġpodpo": 489,
628
+ "Ġni": 490,
629
+ "Ġkterá": 491,
630
+ "Ġpoku": 492,
631
+ "ĠR": 493,
632
+ "Ġzd": 494,
633
+ "Ġmno": 495,
634
+ "Ġváže": 496,
635
+ "sto": 497,
636
+ "ĠkteÅĻÃŃ": 498,
637
+ "Ġdůležit": 499
638
+ },
639
+ "merges": [
640
+ [
641
+ "Ã",
642
+ "Ń"
643
+ ],
644
+ [
645
+ "Ã",
646
+ "¡"
647
+ ],
648
+ [
649
+ "Ġ",
650
+ "p"
651
+ ],
652
+ [
653
+ "Ġ",
654
+ ","
655
+ ],
656
+ [
657
+ "Ġ",
658
+ "t"
659
+ ],
660
+ [
661
+ "Ġ",
662
+ "s"
663
+ ],
664
+ [
665
+ "Ä",
666
+ "Ľ"
667
+ ],
668
+ [
669
+ "Ġ",
670
+ "n"
671
+ ],
672
+ [
673
+ "Ġ",
674
+ "v"
675
+ ],
676
+ [
677
+ "Ġ",
678
+ "a"
679
+ ],
680
+ [
681
+ "Ġ",
682
+ "j"
683
+ ],
684
+ [
685
+ "Ã",
686
+ "©"
687
+ ],
688
+ [
689
+ "Å",
690
+ "Ļ"
691
+ ],
692
+ [
693
+ "r",
694
+ "o"
695
+ ],
696
+ [
697
+ "Å",
698
+ "¾"
699
+ ],
700
+ [
701
+ "n",
702
+ "ÃŃ"
703
+ ],
704
+ [
705
+ "c",
706
+ "h"
707
+ ],
708
+ [
709
+ "Ġ",
710
+ "z"
711
+ ],
712
+ [
713
+ "Ġ",
714
+ "d"
715
+ ],
716
+ [
717
+ "Ġ",
718
+ "m"
719
+ ],
720
+ [
721
+ "s",
722
+ "t"
723
+ ],
724
+ [
725
+ "Ġ",
726
+ "k"
727
+ ],
728
+ [
729
+ "o",
730
+ "v"
731
+ ],
732
+ [
733
+ "l",
734
+ "e"
735
+ ],
736
+ [
737
+ "Ġ",
738
+ "o"
739
+ ],
740
+ [
741
+ "Å",
742
+ "¡"
743
+ ],
744
+ [
745
+ "Ä",
746
+ "į"
747
+ ],
748
+ [
749
+ "Ã",
750
+ "½"
751
+ ],
752
+ [
753
+ "l",
754
+ "i"
755
+ ],
756
+ [
757
+ "Ġp",
758
+ "o"
759
+ ],
760
+ [
761
+ "m",
762
+ "e"
763
+ ],
764
+ [
765
+ "a",
766
+ "t"
767
+ ],
768
+ [
769
+ "Ġ",
770
+ "b"
771
+ ],
772
+ [
773
+ "Ġ",
774
+ "."
775
+ ],
776
+ [
777
+ "Ġj",
778
+ "e"
779
+ ],
780
+ [
781
+ "Ġt",
782
+ "o"
783
+ ],
784
+ [
785
+ "o",
786
+ "u"
787
+ ],
788
+ [
789
+ "ÅĻ",
790
+ "e"
791
+ ],
792
+ [
793
+ "Ġn",
794
+ "a"
795
+ ],
796
+ [
797
+ "ž",
798
+ "e"
799
+ ],
800
+ [
801
+ "t",
802
+ "e"
803
+ ],
804
+ [
805
+ "a",
806
+ "k"
807
+ ],
808
+ [
809
+ "a",
810
+ "n"
811
+ ],
812
+ [
813
+ "Å",
814
+ "¯"
815
+ ],
816
+ [
817
+ "l",
818
+ "a"
819
+ ],
820
+ [
821
+ "Ġp",
822
+ "ro"
823
+ ],
824
+ [
825
+ "i",
826
+ "t"
827
+ ],
828
+ [
829
+ "Ġn",
830
+ "e"
831
+ ],
832
+ [
833
+ "c",
834
+ "e"
835
+ ],
836
+ [
837
+ "s",
838
+ "k"
839
+ ],
840
+ [
841
+ "Ġs",
842
+ "e"
843
+ ],
844
+ [
845
+ "t",
846
+ "o"
847
+ ],
848
+ [
849
+ "i",
850
+ "n"
851
+ ],
852
+ [
853
+ "h",
854
+ "o"
855
+ ],
856
+ [
857
+ "n",
858
+ "o"
859
+ ],
860
+ [
861
+ "r",
862
+ "a"
863
+ ],
864
+ [
865
+ "r",
866
+ "á"
867
+ ],
868
+ [
869
+ "d",
870
+ "e"
871
+ ],
872
+ [
873
+ "k",
874
+ "u"
875
+ ],
876
+ [
877
+ "n",
878
+ "ÄĽ"
879
+ ],
880
+ [
881
+ "u",
882
+ "n"
883
+ ],
884
+ [
885
+ "e",
886
+ "m"
887
+ ],
888
+ [
889
+ "c",
890
+ "i"
891
+ ],
892
+ [
893
+ "Ġz",
894
+ "a"
895
+ ],
896
+ [
897
+ "Ġ",
898
+ "že"
899
+ ],
900
+ [
901
+ "Ġd",
902
+ "o"
903
+ ],
904
+ [
905
+ "Ġt",
906
+ "ak"
907
+ ],
908
+ [
909
+ "ÅĻ",
910
+ "ÃŃ"
911
+ ],
912
+ [
913
+ "d",
914
+ "y"
915
+ ],
916
+ [
917
+ "p",
918
+ "o"
919
+ ],
920
+ [
921
+ "m",
922
+ "i"
923
+ ],
924
+ [
925
+ "e",
926
+ "n"
927
+ ],
928
+ [
929
+ "Ġ",
930
+ "u"
931
+ ],
932
+ [
933
+ "Ġk",
934
+ "o"
935
+ ],
936
+ [
937
+ "Ġb",
938
+ "y"
939
+ ],
940
+ [
941
+ "Ġk",
942
+ "te"
943
+ ],
944
+ [
945
+ "j",
946
+ "i"
947
+ ],
948
+ [
949
+ "un",
950
+ "k"
951
+ ],
952
+ [
953
+ "Ġ",
954
+ "<"
955
+ ],
956
+ [
957
+ "l",
958
+ "o"
959
+ ],
960
+ [
961
+ "Ġj",
962
+ "s"
963
+ ],
964
+ [
965
+ "ro",
966
+ "p"
967
+ ],
968
+ [
969
+ "j",
970
+ "e"
971
+ ],
972
+ [
973
+ "v",
974
+ "rop"
975
+ ],
976
+ [
977
+ "b",
978
+ "y"
979
+ ],
980
+ [
981
+ "Ġ",
982
+ "E"
983
+ ],
984
+ [
985
+ "Ġp",
986
+ "ÅĻe"
987
+ ],
988
+ [
989
+ "s",
990
+ "ÃŃ"
991
+ ],
992
+ [
993
+ "Ġs",
994
+ "t"
995
+ ],
996
+ [
997
+ "j",
998
+ "ÃŃ"
999
+ ],
1000
+ [
1001
+ "t",
1002
+ "i"
1003
+ ],
1004
+ [
1005
+ "Å¡",
1006
+ "ÃŃ"
1007
+ ],
1008
+ [
1009
+ "Ġv",
1010
+ "y"
1011
+ ],
1012
+ [
1013
+ "s",
1014
+ "e"
1015
+ ],
1016
+ [
1017
+ "Ġd",
1018
+ "ÄĽ"
1019
+ ],
1020
+ [
1021
+ "ý",
1022
+ "ch"
1023
+ ],
1024
+ [
1025
+ "n",
1026
+ "i"
1027
+ ],
1028
+ [
1029
+ "Ġn",
1030
+ "á"
1031
+ ],
1032
+ [
1033
+ "n",
1034
+ "é"
1035
+ ],
1036
+ [
1037
+ "Ġb",
1038
+ "u"
1039
+ ],
1040
+ [
1041
+ "ÅĻ",
1042
+ "i"
1043
+ ],
1044
+ [
1045
+ "Ġv",
1046
+ "e"
1047
+ ],
1048
+ [
1049
+ "k",
1050
+ "y"
1051
+ ],
1052
+ [
1053
+ "Ġa",
1054
+ "by"
1055
+ ],
1056
+ [
1057
+ "Ġo",
1058
+ "b"
1059
+ ],
1060
+ [
1061
+ "n",
1062
+ "á"
1063
+ ],
1064
+ [
1065
+ "Å¡",
1066
+ "e"
1067
+ ],
1068
+ [
1069
+ "r",
1070
+ "e"
1071
+ ],
1072
+ [
1073
+ "ov",
1074
+ "at"
1075
+ ],
1076
+ [
1077
+ "Ġ",
1078
+ "Äį"
1079
+ ],
1080
+ [
1081
+ "c",
1082
+ "ÃŃ"
1083
+ ],
1084
+ [
1085
+ "ĠE",
1086
+ "vrop"
1087
+ ],
1088
+ [
1089
+ "r",
1090
+ "u"
1091
+ ],
1092
+ [
1093
+ "Ġ",
1094
+ "h"
1095
+ ],
1096
+ [
1097
+ "Ġ",
1098
+ "ro"
1099
+ ],
1100
+ [
1101
+ "n",
1102
+ "a"
1103
+ ],
1104
+ [
1105
+ "v",
1106
+ "o"
1107
+ ],
1108
+ [
1109
+ "Ġ",
1110
+ "e"
1111
+ ],
1112
+ [
1113
+ "k",
1114
+ "o"
1115
+ ],
1116
+ [
1117
+ "Ġm",
1118
+ "u"
1119
+ ],
1120
+ [
1121
+ "Ġj",
1122
+ "ak"
1123
+ ],
1124
+ [
1125
+ "no",
1126
+ "st"
1127
+ ],
1128
+ [
1129
+ "sk",
1130
+ "é"
1131
+ ],
1132
+ [
1133
+ "ĠdÄĽ",
1134
+ "ku"
1135
+ ],
1136
+ [
1137
+ "Ġkte",
1138
+ "r"
1139
+ ],
1140
+ [
1141
+ "Ġp",
1142
+ "a"
1143
+ ],
1144
+ [
1145
+ "Ġpo",
1146
+ "d"
1147
+ ],
1148
+ [
1149
+ "v",
1150
+ "á"
1151
+ ],
1152
+ [
1153
+ "c",
1154
+ "k"
1155
+ ],
1156
+ [
1157
+ "ĠdÄĽku",
1158
+ "ji"
1159
+ ],
1160
+ [
1161
+ "a",
1162
+ "l"
1163
+ ],
1164
+ [
1165
+ "ĠpÅĻe",
1166
+ "d"
1167
+ ],
1168
+ [
1169
+ "Ġv",
1170
+ "á"
1171
+ ],
1172
+ [
1173
+ "n",
1174
+ "e"
1175
+ ],
1176
+ [
1177
+ "ch",
1178
+ "o"
1179
+ ],
1180
+ [
1181
+ "ov",
1182
+ "á"
1183
+ ],
1184
+ [
1185
+ "m",
1186
+ "ÄĽ"
1187
+ ],
1188
+ [
1189
+ "Ġmu",
1190
+ "sÃŃ"
1191
+ ],
1192
+ [
1193
+ "d",
1194
+ "u"
1195
+ ],
1196
+ [
1197
+ "Ġz",
1198
+ "á"
1199
+ ],
1200
+ [
1201
+ "Ġt",
1202
+ "a"
1203
+ ],
1204
+ [
1205
+ "d",
1206
+ "ÄĽ"
1207
+ ],
1208
+ [
1209
+ "rá",
1210
+ "v"
1211
+ ],
1212
+ [
1213
+ "Ġ",
1214
+ "ch"
1215
+ ],
1216
+ [
1217
+ "m",
1218
+ "u"
1219
+ ],
1220
+ [
1221
+ "Ġt",
1222
+ "e"
1223
+ ],
1224
+ [
1225
+ "t",
1226
+ "u"
1227
+ ],
1228
+ [
1229
+ "o",
1230
+ "r"
1231
+ ],
1232
+ [
1233
+ "Ġm",
1234
+ "o"
1235
+ ],
1236
+ [
1237
+ "k",
1238
+ "a"
1239
+ ],
1240
+ [
1241
+ "l",
1242
+ "á"
1243
+ ],
1244
+ [
1245
+ "t",
1246
+ "ÄĽ"
1247
+ ],
1248
+ [
1249
+ "Ġa",
1250
+ "le"
1251
+ ],
1252
+ [
1253
+ "r",
1254
+ "an"
1255
+ ],
1256
+ [
1257
+ "d",
1258
+ "a"
1259
+ ],
1260
+ [
1261
+ "ÃŃ",
1262
+ "m"
1263
+ ],
1264
+ [
1265
+ "Ġp",
1266
+ "ÅĻi"
1267
+ ],
1268
+ [
1269
+ "s",
1270
+ "l"
1271
+ ],
1272
+ [
1273
+ "Ġp",
1274
+ "an"
1275
+ ],
1276
+ [
1277
+ "d",
1278
+ "o"
1279
+ ],
1280
+ [
1281
+ "Ġtak",
1282
+ "é"
1283
+ ],
1284
+ [
1285
+ "Ġs",
1286
+ "i"
1287
+ ],
1288
+ [
1289
+ "Ġbu",
1290
+ "de"
1291
+ ],
1292
+ [
1293
+ "Ġm",
1294
+ "á"
1295
+ ],
1296
+ [
1297
+ "u",
1298
+ "je"
1299
+ ],
1300
+ [
1301
+ "Ġ",
1302
+ "i"
1303
+ ],
1304
+ [
1305
+ "Ã",
1306
+ "º"
1307
+ ],
1308
+ [
1309
+ "m",
1310
+ "o"
1311
+ ],
1312
+ [
1313
+ "v",
1314
+ "e"
1315
+ ],
1316
+ [
1317
+ "á",
1318
+ "l"
1319
+ ],
1320
+ [
1321
+ "c",
1322
+ "o"
1323
+ ],
1324
+ [
1325
+ "n",
1326
+ "y"
1327
+ ],
1328
+ [
1329
+ "v",
1330
+ "ÄĽ"
1331
+ ],
1332
+ [
1333
+ "Ġkter",
1334
+ "é"
1335
+ ],
1336
+ [
1337
+ "Ġo",
1338
+ "d"
1339
+ ],
1340
+ [
1341
+ "d",
1342
+ "nÃŃ"
1343
+ ],
1344
+ [
1345
+ "Ġ",
1346
+ "ú"
1347
+ ],
1348
+ [
1349
+ "Ġst",
1350
+ "á"
1351
+ ],
1352
+ [
1353
+ "Ġs",
1354
+ "po"
1355
+ ],
1356
+ [
1357
+ "z",
1358
+ "e"
1359
+ ],
1360
+ [
1361
+ "Ġjs",
1362
+ "ou"
1363
+ ],
1364
+ [
1365
+ "Ġm",
1366
+ "y"
1367
+ ],
1368
+ [
1369
+ "ÅĻe",
1370
+ "b"
1371
+ ],
1372
+ [
1373
+ "v",
1374
+ "ÃŃ"
1375
+ ],
1376
+ [
1377
+ "Ġp",
1378
+ "ÅĻÃŃ"
1379
+ ],
1380
+ [
1381
+ "Ġ",
1382
+ "re"
1383
+ ],
1384
+ [
1385
+ "Ġm",
1386
+ "ÄĽ"
1387
+ ],
1388
+ [
1389
+ "Ġ",
1390
+ "f"
1391
+ ],
1392
+ [
1393
+ "Ġn",
1394
+ "ÄĽ"
1395
+ ],
1396
+ [
1397
+ "l",
1398
+ "u"
1399
+ ],
1400
+ [
1401
+ "le",
1402
+ "g"
1403
+ ],
1404
+ [
1405
+ "v",
1406
+ "i"
1407
+ ],
1408
+ [
1409
+ "b",
1410
+ "o"
1411
+ ],
1412
+ [
1413
+ "ov",
1414
+ "o"
1415
+ ],
1416
+ [
1417
+ "Ġro",
1418
+ "z"
1419
+ ],
1420
+ [
1421
+ "v",
1422
+ "a"
1423
+ ],
1424
+ [
1425
+ "cho",
1426
+ "m"
1427
+ ],
1428
+ [
1429
+ "Ġjs",
1430
+ "me"
1431
+ ],
1432
+ [
1433
+ "l",
1434
+ "y"
1435
+ ],
1436
+ [
1437
+ "ov",
1438
+ "é"
1439
+ ],
1440
+ [
1441
+ "Ġ",
1442
+ "in"
1443
+ ],
1444
+ [
1445
+ "st",
1446
+ "i"
1447
+ ],
1448
+ [
1449
+ "Ġje",
1450
+ "d"
1451
+ ],
1452
+ [
1453
+ "e",
1454
+ "z"
1455
+ ],
1456
+ [
1457
+ "Ġp",
1458
+ "ráv"
1459
+ ],
1460
+ [
1461
+ "p",
1462
+ "ra"
1463
+ ],
1464
+ [
1465
+ "ž",
1466
+ "it"
1467
+ ],
1468
+ [
1469
+ "e",
1470
+ "r"
1471
+ ],
1472
+ [
1473
+ "t",
1474
+ "y"
1475
+ ],
1476
+ [
1477
+ "me",
1478
+ "n"
1479
+ ],
1480
+ [
1481
+ "ová",
1482
+ "nÃŃ"
1483
+ ],
1484
+ [
1485
+ "Å¡",
1486
+ "i"
1487
+ ],
1488
+ [
1489
+ "ý",
1490
+ "m"
1491
+ ],
1492
+ [
1493
+ "Ġpro",
1494
+ "to"
1495
+ ],
1496
+ [
1497
+ "ho",
1498
+ "d"
1499
+ ],
1500
+ [
1501
+ "ž",
1502
+ "i"
1503
+ ],
1504
+ [
1505
+ "Ġ",
1506
+ "co"
1507
+ ],
1508
+ [
1509
+ "ĠmusÃŃ",
1510
+ "me"
1511
+ ],
1512
+ [
1513
+ "Ġv",
1514
+ "ý"
1515
+ ],
1516
+ [
1517
+ "Ġ",
1518
+ "li"
1519
+ ],
1520
+ [
1521
+ "ov",
1522
+ "a"
1523
+ ],
1524
+ [
1525
+ "Ġpa",
1526
+ "nÃŃ"
1527
+ ],
1528
+ [
1529
+ "Ġpo",
1530
+ "t"
1531
+ ],
1532
+ [
1533
+ "d",
1534
+ "á"
1535
+ ],
1536
+ [
1537
+ "s",
1538
+ "a"
1539
+ ],
1540
+ [
1541
+ "n",
1542
+ "u"
1543
+ ],
1544
+ [
1545
+ "k",
1546
+ "á"
1547
+ ],
1548
+ [
1549
+ "ÃŃ",
1550
+ "t"
1551
+ ],
1552
+ [
1553
+ "a",
1554
+ "r"
1555
+ ],
1556
+ [
1557
+ "l",
1558
+ "é"
1559
+ ],
1560
+ [
1561
+ "Ġko",
1562
+ "n"
1563
+ ],
1564
+ [
1565
+ "Ġt",
1566
+ "é"
1567
+ ],
1568
+ [
1569
+ "Ġta",
1570
+ "dy"
1571
+ ],
1572
+ [
1573
+ "p",
1574
+ "e"
1575
+ ],
1576
+ [
1577
+ "Ġt",
1578
+ "ÄĽ"
1579
+ ],
1580
+ [
1581
+ "s",
1582
+ "i"
1583
+ ],
1584
+ [
1585
+ "d",
1586
+ "ou"
1587
+ ],
1588
+ [
1589
+ "r",
1590
+ "i"
1591
+ ],
1592
+ [
1593
+ "Ġj",
1594
+ "á"
1595
+ ],
1596
+ [
1597
+ "ál",
1598
+ "nÃŃ"
1599
+ ],
1600
+ [
1601
+ "n",
1602
+ "ou"
1603
+ ],
1604
+ [
1605
+ "i",
1606
+ "e"
1607
+ ],
1608
+ [
1609
+ "Äį",
1610
+ "e"
1611
+ ],
1612
+ [
1613
+ "Ġd",
1614
+ "i"
1615
+ ],
1616
+ [
1617
+ "at",
1618
+ "i"
1619
+ ],
1620
+ [
1621
+ "v",
1622
+ "nÃŃ"
1623
+ ],
1624
+ [
1625
+ "Ġv",
1626
+ "ÄĽ"
1627
+ ],
1628
+ [
1629
+ "Ġ",
1630
+ "ce"
1631
+ ],
1632
+ [
1633
+ "Ġt",
1634
+ "y"
1635
+ ],
1636
+ [
1637
+ "Ġd",
1638
+ "ů"
1639
+ ],
1640
+ [
1641
+ "Ġ",
1642
+ "2"
1643
+ ],
1644
+ [
1645
+ "st",
1646
+ "a"
1647
+ ],
1648
+ [
1649
+ "ĠpÅĻed",
1650
+ "se"
1651
+ ],
1652
+ [
1653
+ "Ġko",
1654
+ "leg"
1655
+ ],
1656
+ [
1657
+ "l",
1658
+ "ÃŃ"
1659
+ ],
1660
+ [
1661
+ "d",
1662
+ "ÃŃ"
1663
+ ],
1664
+ [
1665
+ "Å",
1666
+ "Ī"
1667
+ ],
1668
+ [
1669
+ "m",
1670
+ "a"
1671
+ ],
1672
+ [
1673
+ "k",
1674
+ "la"
1675
+ ],
1676
+ [
1677
+ "Ġz",
1678
+ "em"
1679
+ ],
1680
+ [
1681
+ "Ġe",
1682
+ "vrop"
1683
+ ],
1684
+ [
1685
+ "Ġv",
1686
+ "še"
1687
+ ],
1688
+ [
1689
+ "le",
1690
+ "n"
1691
+ ],
1692
+ [
1693
+ "le",
1694
+ "žit"
1695
+ ],
1696
+ [
1697
+ "Ġpan",
1698
+ "e"
1699
+ ],
1700
+ [
1701
+ "Ġ",
1702
+ "ÅĻe"
1703
+ ],
1704
+ [
1705
+ "d",
1706
+ "i"
1707
+ ],
1708
+ [
1709
+ "Ġ",
1710
+ "un"
1711
+ ],
1712
+ [
1713
+ "st",
1714
+ "u"
1715
+ ],
1716
+ [
1717
+ "Ġs",
1718
+ "ou"
1719
+ ],
1720
+ [
1721
+ "n",
1722
+ "ý"
1723
+ ],
1724
+ [
1725
+ "Ġp",
1726
+ "ra"
1727
+ ],
1728
+ [
1729
+ "Ġne",
1730
+ "j"
1731
+ ],
1732
+ [
1733
+ "nost",
1734
+ "i"
1735
+ ],
1736
+ [
1737
+ "ů",
1738
+ "že"
1739
+ ],
1740
+ [
1741
+ "Ġk",
1742
+ "dy"
1743
+ ],
1744
+ [
1745
+ "ck",
1746
+ "é"
1747
+ ],
1748
+ [
1749
+ "v",
1750
+ "r"
1751
+ ],
1752
+ [
1753
+ "m",
1754
+ "y"
1755
+ ],
1756
+ [
1757
+ "it",
1758
+ "u"
1759
+ ],
1760
+ [
1761
+ "s",
1762
+ "o"
1763
+ ],
1764
+ [
1765
+ "Ġaby",
1766
+ "chom"
1767
+ ],
1768
+ [
1769
+ "Ä",
1770
+ "ı"
1771
+ ],
1772
+ [
1773
+ "Ġ",
1774
+ "1"
1775
+ ],
1776
+ [
1777
+ "Ġjak",
1778
+ "o"
1779
+ ],
1780
+ [
1781
+ "t",
1782
+ "a"
1783
+ ],
1784
+ [
1785
+ "Ġpot",
1786
+ "ÅĻeb"
1787
+ ],
1788
+ [
1789
+ "Äį",
1790
+ "nÃŃ"
1791
+ ],
1792
+ [
1793
+ "Ġb",
1794
+ "ý"
1795
+ ],
1796
+ [
1797
+ "t",
1798
+ "ÃŃ"
1799
+ ],
1800
+ [
1801
+ "Ġ",
1802
+ "P"
1803
+ ],
1804
+ [
1805
+ "Ġte",
1806
+ "dy"
1807
+ ],
1808
+ [
1809
+ "Ġs",
1810
+ "l"
1811
+ ],
1812
+ [
1813
+ "p",
1814
+ "a"
1815
+ ],
1816
+ [
1817
+ "Ġpo",
1818
+ "li"
1819
+ ],
1820
+ [
1821
+ "Ġbu",
1822
+ "dou"
1823
+ ],
1824
+ [
1825
+ "Ġto",
1826
+ "m"
1827
+ ],
1828
+ [
1829
+ "k",
1830
+ "ra"
1831
+ ],
1832
+ [
1833
+ "ĠvÅ¡e",
1834
+ "ch"
1835
+ ],
1836
+ [
1837
+ "v",
1838
+ "é"
1839
+ ],
1840
+ [
1841
+ "Ġd",
1842
+ "al"
1843
+ ],
1844
+ [
1845
+ "Ġbý",
1846
+ "t"
1847
+ ],
1848
+ [
1849
+ "Ġ",
1850
+ "K"
1851
+ ],
1852
+ [
1853
+ "Ġm",
1854
+ "i"
1855
+ ],
1856
+ [
1857
+ "a",
1858
+ "s"
1859
+ ],
1860
+ [
1861
+ "Ġb",
1862
+ "o"
1863
+ ],
1864
+ [
1865
+ "Ġdal",
1866
+ "Å¡ÃŃ"
1867
+ ],
1868
+ [
1869
+ "le",
1870
+ "Äį"
1871
+ ],
1872
+ [
1873
+ "Äį",
1874
+ "an"
1875
+ ],
1876
+ [
1877
+ "ů",
1878
+ "m"
1879
+ ],
1880
+ [
1881
+ "Ġto",
1882
+ "ho"
1883
+ ],
1884
+ [
1885
+ "Ġpoli",
1886
+ "ti"
1887
+ ],
1888
+ [
1889
+ "sk",
1890
+ "á"
1891
+ ],
1892
+ [
1893
+ "at",
1894
+ "e"
1895
+ ],
1896
+ [
1897
+ "ž",
1898
+ "ÃŃ"
1899
+ ],
1900
+ [
1901
+ "Ġjs",
1902
+ "em"
1903
+ ],
1904
+ [
1905
+ "s",
1906
+ "le"
1907
+ ],
1908
+ [
1909
+ "Å¡",
1910
+ "tÄĽ"
1911
+ ],
1912
+ [
1913
+ "ĠEvrop",
1914
+ "ské"
1915
+ ],
1916
+ [
1917
+ "Ġt",
1918
+ "en"
1919
+ ],
1920
+ [
1921
+ "z",
1922
+ "a"
1923
+ ],
1924
+ [
1925
+ "Ġv",
1926
+ "ÃŃ"
1927
+ ],
1928
+ [
1929
+ "m",
1930
+ "ÃŃ"
1931
+ ],
1932
+ [
1933
+ "r",
1934
+ "y"
1935
+ ],
1936
+ [
1937
+ "z",
1938
+ "i"
1939
+ ],
1940
+ [
1941
+ "Ġob",
1942
+ "Äįan"
1943
+ ],
1944
+ [
1945
+ "Ġz",
1946
+ "p"
1947
+ ],
1948
+ [
1949
+ "Ġko",
1950
+ "mi"
1951
+ ],
1952
+ [
1953
+ "Ġspo",
1954
+ "leÄį"
1955
+ ],
1956
+ [
1957
+ "Ġp",
1958
+ "r"
1959
+ ],
1960
+ [
1961
+ "Ġt",
1962
+ "u"
1963
+ ],
1964
+ [
1965
+ "t",
1966
+ "nÃŃ"
1967
+ ],
1968
+ [
1969
+ "Ġ",
1970
+ "ÅĻÃŃ"
1971
+ ],
1972
+ [
1973
+ "sk",
1974
+ "u"
1975
+ ],
1976
+ [
1977
+ "Ġm",
1978
+ "in"
1979
+ ],
1980
+ [
1981
+ "t",
1982
+ "á"
1983
+ ],
1984
+ [
1985
+ "Ġve",
1986
+ "l"
1987
+ ],
1988
+ [
1989
+ "la",
1990
+ "men"
1991
+ ],
1992
+ [
1993
+ "Ġh",
1994
+ "la"
1995
+ ],
1996
+ [
1997
+ "stu",
1998
+ "p"
1999
+ ],
2000
+ [
2001
+ "Ġproto",
2002
+ "že"
2003
+ ],
2004
+ [
2005
+ "Ġd",
2006
+ "á"
2007
+ ],
2008
+ [
2009
+ "sk",
2010
+ "ých"
2011
+ ],
2012
+ [
2013
+ "g",
2014
+ "i"
2015
+ ],
2016
+ [
2017
+ "n",
2018
+ "it"
2019
+ ],
2020
+ [
2021
+ "Ġpa",
2022
+ "r"
2023
+ ],
2024
+ [
2025
+ "d",
2026
+ "ky"
2027
+ ],
2028
+ [
2029
+ "y",
2030
+ "nÃŃ"
2031
+ ],
2032
+ [
2033
+ "Ġs",
2034
+ "ku"
2035
+ ],
2036
+ [
2037
+ "Ġmá",
2038
+ "me"
2039
+ ],
2040
+ [
2041
+ "Ġn",
2042
+ "ynÃŃ"
2043
+ ],
2044
+ [
2045
+ "ro",
2046
+ "v"
2047
+ ],
2048
+ [
2049
+ "Ġdo",
2050
+ "b"
2051
+ ],
2052
+ [
2053
+ "ĠÄį",
2054
+ "len"
2055
+ ],
2056
+ [
2057
+ "st",
2058
+ "e"
2059
+ ],
2060
+ [
2061
+ "Ġ",
2062
+ "rá"
2063
+ ],
2064
+ [
2065
+ "f",
2066
+ "or"
2067
+ ],
2068
+ [
2069
+ "Ġ",
2070
+ "S"
2071
+ ],
2072
+ [
2073
+ "Ġna",
2074
+ "še"
2075
+ ],
2076
+ [
2077
+ "Ġvá",
2078
+ "m"
2079
+ ],
2080
+ [
2081
+ "l",
2082
+ "at"
2083
+ ],
2084
+ [
2085
+ "Ġpod",
2086
+ "po"
2087
+ ],
2088
+ [
2089
+ "Ġn",
2090
+ "i"
2091
+ ],
2092
+ [
2093
+ "Ġkte",
2094
+ "rá"
2095
+ ],
2096
+ [
2097
+ "Ġpo",
2098
+ "ku"
2099
+ ],
2100
+ [
2101
+ "Ġ",
2102
+ "R"
2103
+ ],
2104
+ [
2105
+ "Ġz",
2106
+ "d"
2107
+ ],
2108
+ [
2109
+ "Ġm",
2110
+ "no"
2111
+ ],
2112
+ [
2113
+ "Ġvá",
2114
+ "že"
2115
+ ],
2116
+ [
2117
+ "st",
2118
+ "o"
2119
+ ],
2120
+ [
2121
+ "Ġkte",
2122
+ "ÅĻÃŃ"
2123
+ ],
2124
+ [
2125
+ "Ġdů",
2126
+ "ležit"
2127
+ ]
2128
+ ]
2129
+ }
2130
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,52 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "<s>",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "1": {
12
+ "content": "</s>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "2": {
20
+ "content": "<unk>",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "3": {
28
+ "content": "<pad>",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "4": {
36
+ "content": "<mask>",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "bos_token": "<s>",
45
+ "clean_up_tokenization_spaces": false,
46
+ "eos_token": "</s>",
47
+ "mask_token": "<mask>",
48
+ "model_max_length": 1000000000000000019884624838656,
49
+ "pad_token": "<pad>",
50
+ "tokenizer_class": "PreTrainedTokenizerFast",
51
+ "unk_token": "<unk>"
52
+ }
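Note on the config above: the `model_max_length` value is `int(1e30)`, which `transformers` writes as a placeholder when no maximum length was set, so it should not be read as a real sequence limit. The sketch below parses an abridged copy of the config using only the standard library. The `lstrip`, `rstrip`, `normalized` and `single_word` flags are omitted for brevity, and the variable names are illustrative.

```python
import json

# Abridged copy of tokenizer_config.json from the diff above.
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "0": {"content": "<s>", "special": true},
    "1": {"content": "</s>", "special": true},
    "2": {"content": "<unk>", "special": true},
    "3": {"content": "<pad>", "special": true},
    "4": {"content": "<mask>", "special": true}
  },
  "bos_token": "<s>",
  "eos_token": "</s>",
  "mask_token": "<mask>",
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "<pad>",
  "tokenizer_class": "PreTrainedTokenizerFast",
  "unk_token": "<unk>"
}
"""

config = json.loads(CONFIG_EXCERPT)

# Map reserved token ids to their string form.
id_to_token = {
    int(token_id): entry["content"]
    for token_id, entry in config["added_tokens_decoder"].items()
}

# int(1e30) is the "no limit configured" placeholder, not a usable length.
has_real_limit = config["model_max_length"] < int(1e30)

print(id_to_token)
print("explicit max length set:", has_real_limit)
```

Because no explicit limit is stored, callers that truncate or pad would need to pass `max_length` themselves, matching whatever context size the eventual model supports.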