E-katrin committed · verified
Commit 53b1bdc · 1 parent: d1978f6

Model save
README.md ADDED
@@ -0,0 +1,74 @@
+ ---
+ base_model: xlm-roberta-base
+ datasets: E-katrin/train20
+ language: sv
+ library_name: transformers
+ license: gpl-3.0
+ metrics:
+ - accuracy
+ - f1
+ pipeline_tag: token-classification
+ tags:
+ - pytorch
+ model-index:
+ - name: E-katrin/train20_10e-5_10ep
+   results:
+   - task:
+       type: token-classification
+     dataset:
+       name: train20
+       type: E-katrin/train20
+       split: validation
+     metrics:
+     - type: f1
+       value: 0.7334744654028211
+       name: Null F1
+     - type: f1
+       value: 0.014846159776685144
+       name: Lemma F1
+     - type: f1
+       value: 0.04934241130226303
+       name: Morphology F1
+     - type: accuracy
+       value: 0.5646359583952452
+       name: Ud Jaccard
+     - type: accuracy
+       value: 0.39341205717837163
+       name: Eud Jaccard
+     - type: f1
+       value: 0.7448370725028419
+       name: Miscs F1
+     - type: f1
+       value: 0.427309181058314
+       name: Deepslot F1
+     - type: f1
+       value: 0.3632536407434294
+       name: Semclass F1
+ ---
+
+ # Model Card for train20_10e-5_10ep
+
+ A transformer-based multihead parser for CoBaLD annotation.
+
+ This model parses pre-tokenized CoNLL-U text and jointly labels each token with three tiers of tags:
+ * Grammatical tags (lemma, UPOS, XPOS, morphological features),
+ * Syntactic tags (basic and enhanced Universal Dependencies),
+ * Semantic tags (deep slot and semantic class).
+
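The three tiers can be pictured on a single token. The sketch below uses hypothetical tag values; the field names follow CoNLL-U plus the CoBaLD semantic columns and are illustrative, not the model's exact output format:

```python
# Joint annotation of one token across the three tiers (hypothetical values).
token_annotation = {
    "form": "katten",  # surface token
    # Grammatical tier
    "lemma": "katt",
    "upos": "NOUN",
    "feats": "Definite=Def|Gender=Com|Number=Sing",
    # Syntactic tier: basic and enhanced Universal Dependencies
    "deprel": "nsubj",
    "deps": "2:nsubj",
    # Semantic tier
    "deepslot": "Agent",
    "semclass": "ANIMAL",
}

# All tiers are predicted jointly for each token by the multihead parser.
print(sorted(token_annotation))
```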
+ ## Model Sources
+
+ - **Repository:** https://github.com/CobaldAnnotation/CobaldParser
+ - **Paper:** https://dialogue-conf.org/wp-content/uploads/2025/04/BaiukIBaiukAPetrovaM.009.pdf
+ - **Demo:** [coming soon]
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{baiuk2025cobald,
+   title={CoBaLD Parser: Joint Morphosyntactic and Semantic Annotation},
+   author={Baiuk, Ilia and Baiuk, Alexandra and Petrova, Maria},
+   booktitle={Proceedings of the International Conference "Dialogue"},
+   volume={I},
+   year={2025}
+ }
+ ```
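The `Ud Jaccard` and `Eud Jaccard` figures reported in the model card metadata are accuracy-style set-overlap scores between predicted and gold dependency structures. A minimal sketch of Jaccard similarity over dependency arcs (the exact arc representation used in evaluation is an assumption):

```python
def jaccard(pred, gold):
    """Jaccard similarity between two sets of dependency arcs."""
    pred, gold = set(pred), set(gold)
    if not pred and not gold:
        return 1.0
    return len(pred & gold) / len(pred | gold)

# Arcs as (head, dependent, relation) triples; hypothetical example
# where the parser mislabels one relation.
gold_arcs = {(0, 2, "root"), (2, 1, "nsubj"), (2, 3, "obj")}
pred_arcs = {(0, 2, "root"), (2, 1, "nsubj"), (2, 3, "iobj")}
print(jaccard(pred_arcs, gold_arcs))  # 2 shared / 4 distinct = 0.5
```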
config.json ADDED
@@ -0,0 +1,1928 @@
1
+ {
2
+ "activation": "relu",
3
+ "architectures": [
4
+ "CobaldParser"
5
+ ],
6
+ "auto_map": {
7
+ "AutoConfig": "configuration.CobaldParserConfig",
8
+ "AutoModel": "modeling_parser.CobaldParser"
9
+ },
10
+ "consecutive_null_limit": 3,
11
+ "deepslot_classifier_hidden_size": 256,
12
+ "dependency_classifier_hidden_size": 128,
13
+ "dropout": 0.1,
14
+ "encoder_model_name": "xlm-roberta-base",
15
+ "lemma_classifier_hidden_size": 512,
16
+ "misc_classifier_hidden_size": 512,
17
+ "model_type": "cobald_parser",
18
+ "morphology_classifier_hidden_size": 512,
19
+ "null_classifier_hidden_size": 512,
20
+ "semclass_classifier_hidden_size": 512,
21
+ "torch_dtype": "float32",
22
+ "transformers_version": "4.52.2",
23
+ "vocabulary": {
24
+ "deepslot": {
25
+ "0": "$Dislocation",
26
+ "1": "Addition",
27
+ "2": "AdditionalParticipant",
28
+ "3": "Addressee",
29
+ "4": "Addressee_Metaphoric",
30
+ "5": "Agent",
31
+ "6": "Agent_Metaphoric",
32
+ "7": "AttachedProperty",
33
+ "8": "BehalfOfEntity",
34
+ "9": "BeneMalefactive",
35
+ "10": "Causator",
36
+ "11": "Cause",
37
+ "12": "Ch_Parameter",
38
+ "13": "Ch_Reference",
39
+ "14": "Characteristic",
40
+ "15": "Chemical_Composite",
41
+ "16": "ClassifiedEntity",
42
+ "17": "Comparison",
43
+ "18": "ComparisonBase",
44
+ "19": "Comparison_Symmetrical",
45
+ "20": "Composition",
46
+ "21": "Concession",
47
+ "22": "ConcessiveCondition",
48
+ "23": "Concurrent",
49
+ "24": "Concurrent_Complement",
50
+ "25": "Condition",
51
+ "26": "Consequence",
52
+ "27": "ContentOfContainer",
53
+ "28": "ContrAgent",
54
+ "29": "ContrAgent_Metaphoric",
55
+ "30": "ContrObject",
56
+ "31": "Core_Hyphen_Component",
57
+ "32": "Correlative",
58
+ "33": "Criterion",
59
+ "34": "Degree",
60
+ "35": "DegreeNumerative",
61
+ "36": "Dependent_Hyphen_Component",
62
+ "37": "Elective",
63
+ "38": "Empty_Subject_It",
64
+ "39": "Experiencer",
65
+ "40": "Experiencer_Metaphoric",
66
+ "41": "Explication",
67
+ "42": "Fabricative",
68
+ "43": "FormOfRepresentation",
69
+ "44": "Function",
70
+ "45": "GappingRemnant",
71
+ "46": "Instrument",
72
+ "47": "Instrument_Situation",
73
+ "48": "Interval_Beginning",
74
+ "49": "Interval_End",
75
+ "50": "Landmark",
76
+ "51": "Limitation",
77
+ "52": "Locative",
78
+ "53": "Locative_Distance",
79
+ "54": "Locative_FinalPoint",
80
+ "55": "Locative_InitialPoint",
81
+ "56": "Locative_Route",
82
+ "57": "Manner",
83
+ "58": "MannerOfPositionAndMotion",
84
+ "59": "Manner_Configuration",
85
+ "60": "Manner_Reduplication",
86
+ "61": "MathCharacteristic",
87
+ "62": "MeasureSpecification",
88
+ "63": "Member",
89
+ "64": "MetaphoricLocative",
90
+ "65": "Metaphoric_FinalPoint",
91
+ "66": "Metaphoric_InitialPoint",
92
+ "67": "Metaphoric_Route",
93
+ "68": "Motive",
94
+ "69": "Motive_Warranty",
95
+ "70": "MovingLandmark",
96
+ "71": "Name_Title",
97
+ "72": "Object",
98
+ "73": "Object_Relation",
99
+ "74": "Object_Situation",
100
+ "75": "OneAnother",
101
+ "76": "Opposition",
102
+ "77": "OrderInTimeAndSpace",
103
+ "78": "Original_Object",
104
+ "79": "Original_Situation",
105
+ "80": "Parenthetical",
106
+ "81": "Part",
107
+ "82": "PartAsOrientation",
108
+ "83": "Part_Situation",
109
+ "84": "ParticipleRelativeClause",
110
+ "85": "Particles_Accentuation",
111
+ "86": "PaymentBy_NonMonetaryUnits",
112
+ "87": "PersonImplicit",
113
+ "88": "PlaceOfContact",
114
+ "89": "Possessor",
115
+ "90": "Possessor_Locative",
116
+ "91": "Possessor_Metaphoric",
117
+ "92": "Possessor_Situational",
118
+ "93": "PragmaticEvaluation",
119
+ "94": "Predicate",
120
+ "95": "Predicate_Adverb",
121
+ "96": "Predicate_DiscoursiveUnits",
122
+ "97": "Predicate_Noun",
123
+ "98": "PrincipleOfOrganization",
124
+ "99": "Proportion_FirstComponent",
125
+ "100": "Proportion_To",
126
+ "101": "Purpose",
127
+ "102": "Purpose_Distributive",
128
+ "103": "QuantifiedEntity",
129
+ "104": "Quantity",
130
+ "105": "Quantity_Pragmatic",
131
+ "106": "Raising_Target",
132
+ "107": "Relative",
133
+ "108": "Resultative",
134
+ "109": "Route_Situation",
135
+ "110": "SetEnvironment",
136
+ "111": "Set_Classification",
137
+ "112": "Set_General",
138
+ "113": "Source",
139
+ "114": "Specification",
140
+ "115": "Specifier_Number",
141
+ "116": "Spectator",
142
+ "117": "SpeechEtiquette",
143
+ "118": "Sphere",
144
+ "119": "StaffOfPossessors",
145
+ "120": "Standpoint",
146
+ "121": "State",
147
+ "122": "Stimulus",
148
+ "123": "SupportedEntity",
149
+ "124": "TagQuestion",
150
+ "125": "TagSubject",
151
+ "126": "Theme",
152
+ "127": "ThemeRhematic",
153
+ "128": "Time",
154
+ "129": "Vocative",
155
+ "130": "Vocative_Metaphoric",
156
+ "131": "Whole",
157
+ "132": "Whole_Complement",
158
+ "133": "_"
159
+ },
160
+ "eud_deprel": {
161
+ "0": "acl",
162
+ "1": "acl:about",
163
+ "2": "acl:about_whether",
164
+ "3": "acl:after",
165
+ "4": "acl:against",
166
+ "5": "acl:as",
167
+ "6": "acl:as_if",
168
+ "7": "acl:as_to",
169
+ "8": "acl:at",
170
+ "9": "acl:att",
171
+ "10": "acl:before",
172
+ "11": "acl:behind",
173
+ "12": "acl:between",
174
+ "13": "acl:beyond",
175
+ "14": "acl:but",
176
+ "15": "acl:but_to",
177
+ "16": "acl:cleft",
178
+ "17": "acl:concerning",
179
+ "18": "acl:except_that",
180
+ "19": "acl:for",
181
+ "20": "acl:for_to",
182
+ "21": "acl:from",
183
+ "22": "acl:if",
184
+ "23": "acl:in",
185
+ "24": "acl:including",
186
+ "25": "acl:including_whether",
187
+ "26": "acl:inside",
188
+ "27": "acl:instead_of",
189
+ "28": "acl:into",
190
+ "29": "acl:like",
191
+ "30": "acl:med",
192
+ "31": "acl:mot",
193
+ "32": "acl:of",
194
+ "33": "acl:of_if",
195
+ "34": "acl:of_why",
196
+ "35": "acl:om",
197
+ "36": "acl:on",
198
+ "37": "acl:once",
199
+ "38": "acl:over",
200
+ "39": "acl:prior_to",
201
+ "40": "acl:p\u00e5",
202
+ "41": "acl:regarding",
203
+ "42": "acl:relcl",
204
+ "43": "acl:relcl:to",
205
+ "44": "acl:since",
206
+ "45": "acl:som",
207
+ "46": "acl:such_as",
208
+ "47": "acl:than",
209
+ "48": "acl:that",
210
+ "49": "acl:though",
211
+ "50": "acl:to",
212
+ "51": "acl:toward",
213
+ "52": "acl:towards",
214
+ "53": "acl:under",
215
+ "54": "acl:until",
216
+ "55": "acl:upon",
217
+ "56": "acl:when",
218
+ "57": "acl:where",
219
+ "58": "acl:whether",
220
+ "59": "acl:why",
221
+ "60": "acl:with",
222
+ "61": "acl:\u00e4n",
223
+ "62": "advcl",
224
+ "63": "advcl:about",
225
+ "64": "advcl:about_whether",
226
+ "65": "advcl:after",
227
+ "66": "advcl:against",
228
+ "67": "advcl:albeit",
229
+ "68": "advcl:along_with",
230
+ "69": "advcl:although",
231
+ "70": "advcl:as",
232
+ "71": "advcl:as_if",
233
+ "72": "advcl:as_in",
234
+ "73": "advcl:as_long_as",
235
+ "74": "advcl:as_soon_as",
236
+ "75": "advcl:as_though",
237
+ "76": "advcl:as_to",
238
+ "77": "advcl:as_well_as",
239
+ "78": "advcl:as_with",
240
+ "79": "advcl:at",
241
+ "80": "advcl:att",
242
+ "81": "advcl:because",
243
+ "82": "advcl:before",
244
+ "83": "advcl:behind",
245
+ "84": "advcl:besides",
246
+ "85": "advcl:between",
247
+ "86": "advcl:beyond",
248
+ "87": "advcl:but",
249
+ "88": "advcl:by",
250
+ "89": "advcl:cause",
251
+ "90": "advcl:despite",
252
+ "91": "advcl:due_to",
253
+ "92": "advcl:d\u00e4rf\u00f6r_att",
254
+ "93": "advcl:d\u00e5",
255
+ "94": "advcl:eftersom",
256
+ "95": "advcl:except",
257
+ "96": "advcl:except_for",
258
+ "97": "advcl:except_that",
259
+ "98": "advcl:for",
260
+ "99": "advcl:for_if",
261
+ "100": "advcl:for_to",
262
+ "101": "advcl:from",
263
+ "102": "advcl:f\u00f6r_att",
264
+ "103": "advcl:f\u00f6rutsatt_att",
265
+ "104": "advcl:given",
266
+ "105": "advcl:if",
267
+ "106": "advcl:if_to",
268
+ "107": "advcl:in",
269
+ "108": "advcl:in_between",
270
+ "109": "advcl:in_case",
271
+ "110": "advcl:in_order",
272
+ "111": "advcl:in_order_for",
273
+ "112": "advcl:in_order_to",
274
+ "113": "advcl:in_that",
275
+ "114": "advcl:including_by",
276
+ "115": "advcl:innan",
277
+ "116": "advcl:inside",
278
+ "117": "advcl:insofar_as",
279
+ "118": "advcl:instead_of",
280
+ "119": "advcl:into",
281
+ "120": "advcl:lest",
282
+ "121": "advcl:like",
283
+ "122": "advcl:liksom",
284
+ "123": "advcl:med_att",
285
+ "124": "advcl:n\u00e4r",
286
+ "125": "advcl:of",
287
+ "126": "advcl:of_whether",
288
+ "127": "advcl:om",
289
+ "128": "advcl:on",
290
+ "129": "advcl:on_whether",
291
+ "130": "advcl:once",
292
+ "131": "advcl:out",
293
+ "132": "advcl:over",
294
+ "133": "advcl:past",
295
+ "134": "advcl:prior_to",
296
+ "135": "advcl:provided",
297
+ "136": "advcl:p\u00e5",
298
+ "137": "advcl:rather_than",
299
+ "138": "advcl:relcl",
300
+ "139": "advcl:relcl:because",
301
+ "140": "advcl:samtidigt_som",
302
+ "141": "advcl:sedan",
303
+ "142": "advcl:since",
304
+ "143": "advcl:so",
305
+ "144": "advcl:so_as_to",
306
+ "145": "advcl:so_that",
307
+ "146": "advcl:som",
308
+ "147": "advcl:such_as",
309
+ "148": "advcl:than",
310
+ "149": "advcl:than_if",
311
+ "150": "advcl:that",
312
+ "151": "advcl:the",
313
+ "152": "advcl:though",
314
+ "153": "advcl:through",
315
+ "154": "advcl:till",
316
+ "155": "advcl:to",
317
+ "156": "advcl:toward",
318
+ "157": "advcl:towards",
319
+ "158": "advcl:under",
320
+ "159": "advcl:unless",
321
+ "160": "advcl:until",
322
+ "161": "advcl:upon",
323
+ "162": "advcl:when",
324
+ "163": "advcl:where",
325
+ "164": "advcl:whereas",
326
+ "165": "advcl:whether",
327
+ "166": "advcl:while",
328
+ "167": "advcl:whilst",
329
+ "168": "advcl:whither",
330
+ "169": "advcl:with",
331
+ "170": "advcl:without",
332
+ "171": "advcl:\u00e4n",
333
+ "172": "advmod",
334
+ "173": "amod",
335
+ "174": "appos",
336
+ "175": "aux",
337
+ "176": "aux:pass",
338
+ "177": "case",
339
+ "178": "case:of",
340
+ "179": "cc",
341
+ "180": "cc:preconj",
342
+ "181": "ccomp",
343
+ "182": "ccomp:whether",
344
+ "183": "compound",
345
+ "184": "compound:prt",
346
+ "185": "conj",
347
+ "186": "conj:and",
348
+ "187": "conj:and_or",
349
+ "188": "conj:and_yet",
350
+ "189": "conj:as_well_as",
351
+ "190": "conj:but",
352
+ "191": "conj:eller",
353
+ "192": "conj:et",
354
+ "193": "conj:fast",
355
+ "194": "conj:for",
356
+ "195": "conj:let_alone",
357
+ "196": "conj:men",
358
+ "197": "conj:minus",
359
+ "198": "conj:nor",
360
+ "199": "conj:not",
361
+ "200": "conj:not_to_mention",
362
+ "201": "conj:och",
363
+ "202": "conj:or",
364
+ "203": "conj:plus",
365
+ "204": "conj:plus_minus",
366
+ "205": "conj:rather_than",
367
+ "206": "conj:respektive",
368
+ "207": "conj:samt",
369
+ "208": "conj:slash",
370
+ "209": "conj:som",
371
+ "210": "conj:though",
372
+ "211": "conj:ty",
373
+ "212": "conj:utan",
374
+ "213": "conj:yet",
375
+ "214": "cop",
376
+ "215": "csubj",
377
+ "216": "csubj:outer",
378
+ "217": "csubj:pass",
379
+ "218": "csubj:xsubj",
380
+ "219": "dep",
381
+ "220": "det",
382
+ "221": "det:predet",
383
+ "222": "discourse",
384
+ "223": "dislocated",
385
+ "224": "expl",
386
+ "225": "fixed",
387
+ "226": "flat",
388
+ "227": "flat:foreign",
389
+ "228": "flat:name",
390
+ "229": "flatname",
391
+ "230": "goeswith",
392
+ "231": "iobj",
393
+ "232": "list",
394
+ "233": "mark",
395
+ "234": "nmod",
396
+ "235": "nmod:a_la",
397
+ "236": "nmod:aboard",
398
+ "237": "nmod:about",
399
+ "238": "nmod:above",
400
+ "239": "nmod:according_to",
401
+ "240": "nmod:across",
402
+ "241": "nmod:after",
403
+ "242": "nmod:against",
404
+ "243": "nmod:along",
405
+ "244": "nmod:alongside",
406
+ "245": "nmod:amidst",
407
+ "246": "nmod:among",
408
+ "247": "nmod:amongst",
409
+ "248": "nmod:around",
410
+ "249": "nmod:as",
411
+ "250": "nmod:as_for",
412
+ "251": "nmod:as_in",
413
+ "252": "nmod:as_opposed_to",
414
+ "253": "nmod:as_to",
415
+ "254": "nmod:astride",
416
+ "255": "nmod:at",
417
+ "256": "nmod:atop",
418
+ "257": "nmod:av",
419
+ "258": "nmod:barring",
420
+ "259": "nmod:because_of",
421
+ "260": "nmod:before",
422
+ "261": "nmod:behind",
423
+ "262": "nmod:below",
424
+ "263": "nmod:besides",
425
+ "264": "nmod:between",
426
+ "265": "nmod:beyond",
427
+ "266": "nmod:but",
428
+ "267": "nmod:by",
429
+ "268": "nmod:circa",
430
+ "269": "nmod:colon",
431
+ "270": "nmod:concerning",
432
+ "271": "nmod:desc",
433
+ "272": "nmod:despite",
434
+ "273": "nmod:down",
435
+ "274": "nmod:due_to",
436
+ "275": "nmod:during",
437
+ "276": "nmod:efter",
438
+ "277": "nmod:except",
439
+ "278": "nmod:except_for",
440
+ "279": "nmod:excluding",
441
+ "280": "nmod:following",
442
+ "281": "nmod:for",
443
+ "282": "nmod:from",
444
+ "283": "nmod:from_across",
445
+ "284": "nmod:from_below",
446
+ "285": "nmod:from_outside",
447
+ "286": "nmod:from_over",
448
+ "287": "nmod:fr\u00e5n",
449
+ "288": "nmod:f\u00f6r",
450
+ "289": "nmod:hos",
451
+ "290": "nmod:i",
452
+ "291": "nmod:in",
453
+ "292": "nmod:in_front_of",
454
+ "293": "nmod:include",
455
+ "294": "nmod:including",
456
+ "295": "nmod:inom",
457
+ "296": "nmod:inside",
458
+ "297": "nmod:instead_of",
459
+ "298": "nmod:into",
460
+ "299": "nmod:like",
461
+ "300": "nmod:med",
462
+ "301": "nmod:mellan",
463
+ "302": "nmod:minus",
464
+ "303": "nmod:mot",
465
+ "304": "nmod:near",
466
+ "305": "nmod:next_to",
467
+ "306": "nmod:npmod",
468
+ "307": "nmod:oavsett",
469
+ "308": "nmod:of",
470
+ "309": "nmod:off",
471
+ "310": "nmod:om",
472
+ "311": "nmod:on",
473
+ "312": "nmod:onto",
474
+ "313": "nmod:opposite",
475
+ "314": "nmod:other_than",
476
+ "315": "nmod:out",
477
+ "316": "nmod:out_of",
478
+ "317": "nmod:outside",
479
+ "318": "nmod:over",
480
+ "319": "nmod:past",
481
+ "320": "nmod:per",
482
+ "321": "nmod:plus",
483
+ "322": "nmod:poss",
484
+ "323": "nmod:post",
485
+ "324": "nmod:prior_to",
486
+ "325": "nmod:pro",
487
+ "326": "nmod:p\u00e5",
488
+ "327": "nmod:rather_than",
489
+ "328": "nmod:re",
490
+ "329": "nmod:regarding",
491
+ "330": "nmod:round",
492
+ "331": "nmod:save",
493
+ "332": "nmod:since",
494
+ "333": "nmod:slash",
495
+ "334": "nmod:such_as",
496
+ "335": "nmod:than",
497
+ "336": "nmod:through",
498
+ "337": "nmod:throughout",
499
+ "338": "nmod:thru",
500
+ "339": "nmod:till",
501
+ "340": "nmod:times",
502
+ "341": "nmod:tmod",
503
+ "342": "nmod:to",
504
+ "343": "nmod:toward",
505
+ "344": "nmod:towards",
506
+ "345": "nmod:under",
507
+ "346": "nmod:unlike",
508
+ "347": "nmod:unmarked",
509
+ "348": "nmod:until",
510
+ "349": "nmod:up",
511
+ "350": "nmod:up_to",
512
+ "351": "nmod:up_until",
513
+ "352": "nmod:upon",
514
+ "353": "nmod:utanf\u00f6r",
515
+ "354": "nmod:versus",
516
+ "355": "nmod:via",
517
+ "356": "nmod:vid",
518
+ "357": "nmod:whether",
519
+ "358": "nmod:with",
520
+ "359": "nmod:within",
521
+ "360": "nmod:without",
522
+ "361": "nmod:x",
523
+ "362": "nmod:\u00e5t",
524
+ "363": "nsubj",
525
+ "364": "nsubj:outer",
526
+ "365": "nsubj:pass",
527
+ "366": "nsubj:pass:xsubj",
528
+ "367": "nsubj:xsubj",
529
+ "368": "nummod",
530
+ "369": "nummod:gov",
531
+ "370": "obj",
532
+ "371": "obl",
533
+ "372": "obl:aboard",
534
+ "373": "obl:about",
535
+ "374": "obl:above",
536
+ "375": "obl:according_to",
537
+ "376": "obl:across",
538
+ "377": "obl:after",
539
+ "378": "obl:against",
540
+ "379": "obl:agent",
541
+ "380": "obl:along",
542
+ "381": "obl:along_with",
543
+ "382": "obl:alongside",
544
+ "383": "obl:amid",
545
+ "384": "obl:amidst",
546
+ "385": "obl:among",
547
+ "386": "obl:amongst",
548
+ "387": "obl:apart_from",
549
+ "388": "obl:around",
550
+ "389": "obl:as",
551
+ "390": "obl:as_for",
552
+ "391": "obl:as_in",
553
+ "392": "obl:as_of",
554
+ "393": "obl:as_opposed_to",
555
+ "394": "obl:as_to",
556
+ "395": "obl:aside",
557
+ "396": "obl:aside_from",
558
+ "397": "obl:at",
559
+ "398": "obl:atop",
560
+ "399": "obl:av",
561
+ "400": "obl:because_of",
562
+ "401": "obl:before",
563
+ "402": "obl:behind",
564
+ "403": "obl:below",
565
+ "404": "obl:beneath",
566
+ "405": "obl:beside",
567
+ "406": "obl:besides",
568
+ "407": "obl:between",
569
+ "408": "obl:beyond",
570
+ "409": "obl:bland",
571
+ "410": "obl:but",
572
+ "411": "obl:by",
573
+ "412": "obl:circa",
574
+ "413": "obl:concerning",
575
+ "414": "obl:depending",
576
+ "415": "obl:depending_on",
577
+ "416": "obl:depending_upon",
578
+ "417": "obl:despite",
579
+ "418": "obl:down",
580
+ "419": "obl:due_to",
581
+ "420": "obl:during",
582
+ "421": "obl:efter",
583
+ "422": "obl:enligt",
584
+ "423": "obl:except",
585
+ "424": "obl:except_for",
586
+ "425": "obl:excluding",
587
+ "426": "obl:following",
588
+ "427": "obl:for",
589
+ "428": "obl:for_post",
590
+ "429": "obl:from",
591
+ "430": "obl:from_across",
592
+ "431": "obl:from_among",
593
+ "432": "obl:from_behind",
594
+ "433": "obl:from_over",
595
+ "434": "obl:fr\u00e5n",
596
+ "435": "obl:f\u00f6r",
597
+ "436": "obl:genom",
598
+ "437": "obl:given",
599
+ "438": "obl:hos",
600
+ "439": "obl:i",
601
+ "440": "obl:in",
602
+ "441": "obl:in_between",
603
+ "442": "obl:in_case_of",
604
+ "443": "obl:in_front_of",
605
+ "444": "obl:in_lieu_of",
606
+ "445": "obl:in_to",
607
+ "446": "obl:including",
608
+ "447": "obl:including_before",
609
+ "448": "obl:including_for",
610
+ "449": "obl:including_in",
611
+ "450": "obl:inom",
612
+ "451": "obl:inside",
613
+ "452": "obl:instead_of",
614
+ "453": "obl:into",
615
+ "454": "obl:like",
616
+ "455": "obl:med",
617
+ "456": "obl:med_avseende_p\u00e5",
618
+ "457": "obl:mellan",
619
+ "458": "obl:minus",
620
+ "459": "obl:mot",
621
+ "460": "obl:near",
622
+ "461": "obl:nearby",
623
+ "462": "obl:nigh",
624
+ "463": "obl:notwithstanding",
625
+ "464": "obl:npmod",
626
+ "465": "obl:of",
627
+ "466": "obl:off",
628
+ "467": "obl:off_of",
629
+ "468": "obl:om",
630
+ "469": "obl:omkring",
631
+ "470": "obl:on",
632
+ "471": "obl:on_board",
633
+ "472": "obl:on_to",
634
+ "473": "obl:onto",
635
+ "474": "obl:opposite",
636
+ "475": "obl:other_than",
637
+ "476": "obl:out",
638
+ "477": "obl:out_of",
639
+ "478": "obl:outside",
640
+ "479": "obl:over",
641
+ "480": "obl:past",
642
+ "481": "obl:per",
643
+ "482": "obl:plus",
644
+ "483": "obl:post",
645
+ "484": "obl:prior_to",
646
+ "485": "obl:p\u00e5",
647
+ "486": "obl:rather_than",
648
+ "487": "obl:re",
649
+ "488": "obl:regarding",
650
+ "489": "obl:round",
651
+ "490": "obl:runtomkring",
652
+ "491": "obl:since",
653
+ "492": "obl:som",
654
+ "493": "obl:such_as",
655
+ "494": "obl:than",
656
+ "495": "obl:through",
657
+ "496": "obl:throughout",
658
+ "497": "obl:thru",
659
+ "498": "obl:till",
660
+ "499": "obl:tmod",
661
+ "500": "obl:to",
662
+ "501": "obl:to_before",
663
+ "502": "obl:toward",
664
+ "503": "obl:towards",
665
+ "504": "obl:trots",
666
+ "505": "obl:under",
667
+ "506": "obl:underneath",
668
+ "507": "obl:unlike",
669
+ "508": "obl:unmarked",
670
+ "509": "obl:until",
671
+ "510": "obl:unto",
672
+ "511": "obl:up",
673
+ "512": "obl:up_on",
674
+ "513": "obl:up_to",
675
+ "514": "obl:up_until",
676
+ "515": "obl:upon",
677
+ "516": "obl:ur",
678
+ "517": "obl:utan",
679
+ "518": "obl:utanf\u00f6r",
680
+ "519": "obl:versus",
681
+ "520": "obl:via",
682
+ "521": "obl:vid",
683
+ "522": "obl:with",
684
+ "523": "obl:within",
685
+ "524": "obl:without",
686
+ "525": "obl:\u00e4n",
687
+ "526": "obl:\u00e5",
688
+ "527": "obl:\u00e5t",
689
+ "528": "parataxis",
690
+ "529": "punct",
691
+ "530": "ref",
692
+ "531": "reparandum",
693
+ "532": "root",
694
+ "533": "vocative",
695
+ "534": "xcomp"
696
+ },
697
+ "joint_feats": {
698
+ "0": "ADJ#Adjective#Abbr=Yes",
699
+ "1": "ADJ#Adjective#Abbr=Yes|Degree=Pos",
700
+ "2": "ADJ#Adjective#Case=Nom|Definite=Def|Degree=Pos",
701
+ "3": "ADJ#Adjective#Case=Nom|Definite=Def|Degree=Pos|Gender=Com|Number=Sing",
702
+ "4": "ADJ#Adjective#Case=Nom|Definite=Def|Degree=Pos|Tense=Past|VerbForm=Part",
703
+ "5": "ADJ#Adjective#Case=Nom|Definite=Def|Degree=Sup",
704
+ "6": "ADJ#Adjective#Case=Nom|Definite=Ind|Degree=Pos",
705
+ "7": "ADJ#Adjective#Case=Nom|Definite=Ind|Degree=Pos|Gender=Com|Number=Sing",
706
+ "8": "ADJ#Adjective#Case=Nom|Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|Tense=Past|VerbForm=Part",
707
+ "9": "ADJ#Adjective#Case=Nom|Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing",
708
+ "10": "ADJ#Adjective#Case=Nom|Definite=Ind|Degree=Pos|Number=Plur",
709
+ "11": "ADJ#Adjective#Case=Nom|Definite=Ind|Degree=Pos|Number=Sing",
710
+ "12": "ADJ#Adjective#Case=Nom|Degree=Cmp",
711
+ "13": "ADJ#Adjective#Case=Nom|Degree=Pos",
712
+ "14": "ADJ#Adjective#Case=Nom|Degree=Pos|Number=Plur",
713
+ "15": "ADJ#Adjective#Case=Nom|Degree=Pos|Tense=Pres|VerbForm=Part",
714
+ "16": "ADJ#Adjective#Case=Nom|Number=Plur|Tense=Past|VerbForm=Part",
715
+ "17": "ADJ#Adjective#Degree=Cmp",
716
+ "18": "ADJ#Adjective#Degree=Pos",
717
+ "19": "ADJ#Adjective#Degree=Pos|Foreign=Yes",
718
+ "20": "ADJ#Adjective#Degree=Sup",
719
+ "21": "ADJ#Adverb#Case=Nom|Definite=Ind|Degree=Pos|Gender=Com|Number=Sing",
720
+ "22": "ADJ#Adverb#Case=Nom|Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing",
721
+ "23": "ADJ#Adverb#Case=Nom|Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|Tense=Past|VerbForm=Part",
722
+ "24": "ADJ#Adverb#Case=Nom|Definite=Ind|Degree=Pos|Number=Plur",
723
+ "25": "ADJ#Noun#Case=Nom|Definite=Def|Degree=Pos",
724
+ "26": "ADJ#Noun#Case=Nom|Degree=Pos",
725
+ "27": "ADJ#Numeral#Case=Nom|Definite=Def|Degree=Pos",
726
+ "28": "ADJ#Numeral#Case=Nom|NumType=Ord",
727
+ "29": "ADJ#Numeral#Degree=Pos|NumForm=Digit|NumType=Ord",
728
+ "30": "ADJ#Numeral#Degree=Pos|NumForm=Word|NumType=Ord",
729
+ "31": "ADJ#Prefixoid#_",
730
+ "32": "ADJ#Verb#Case=Nom|Definite=Def|Degree=Pos|Tense=Past|VerbForm=Part",
731
+ "33": "ADJ#Verb#Case=Nom|Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|Tense=Past|VerbForm=Part",
732
+ "34": "ADJ#Verb#Case=Nom|Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|Tense=Past|VerbForm=Part",
733
+ "35": "ADJ#Verb#Case=Nom|Definite=Ind|Degree=Pos|Number=Plur",
734
+ "36": "ADJ#Verb#Case=Nom|Definite=Ind|Degree=Pos|Number=Plur|Tense=Past|VerbForm=Part",
735
+ "37": "ADJ#Verb#Case=Nom|Definite=Ind|Gender=Neut|Number=Sing|Tense=Past|VerbForm=Part",
736
+ "38": "ADJ#Verb#Case=Nom|Degree=Pos|Tense=Pres|VerbForm=Part",
737
+ "39": "ADJ#_#Case=Nom|Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing",
738
+ "40": "ADJ#_#Case=Nom|Definite=Ind|Degree=Pos|Number=Plur",
739
+ "41": "ADJ#_#Case=Nom|Definite=Ind|Degree=Pos|Number=Sing|Tense=Past|VerbForm=Part",
740
+ "42": "ADJ#_#Case=Nom|Degree=Pos",
741
+ "43": "ADJ#_#Degree=Cmp",
742
+ "44": "ADJ#_#Degree=Pos",
743
+ "45": "ADJ#_#Degree=Pos|NumType=Ord",
744
+ "46": "ADJ#_#Degree=Sup",
745
+ "47": "ADJ#_#_",
746
+ "48": "ADP#Adjective#_",
747
+ "49": "ADP#Adverb#_",
748
+ "50": "ADP#Conjunction#_",
749
+ "51": "ADP#Preposition#_",
750
+ "52": "ADP#_#_",
751
+ "53": "ADV#Adjective#Degree=Pos",
752
+ "54": "ADV#Adjective#_",
753
+ "55": "ADV#Adverb#Abbr=Yes",
754
+ "56": "ADV#Adverb#Degree=Cmp",
755
+ "57": "ADV#Adverb#Degree=Pos",
756
+ "58": "ADV#Adverb#Degree=Pos|NumType=Mult",
757
+ "59": "ADV#Adverb#Degree=Sup",
758
+ "60": "ADV#Adverb#Degree=Sup|Polarity=Neg",
759
+ "61": "ADV#Adverb#NumType=Mult",
760
+ "62": "ADV#Adverb#Polarity=Neg",
761
+ "63": "ADV#Adverb#PronType=Dem",
762
+ "64": "ADV#Adverb#_",
763
+ "65": "ADV#Conjunction#_",
764
+ "66": "ADV#Invariable#Degree=Cmp",
765
+ "67": "ADV#Invariable#Degree=Sup",
766
+ "68": "ADV#Invariable#_",
767
+ "69": "ADV#Noun#_",
768
+ "70": "ADV#Prefixoid#_",
769
+ "71": "ADV#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Prs",
770
+ "72": "ADV#Pronoun#_",
771
+ "73": "ADV#_#Degree=Cmp",
772
+ "74": "ADV#_#Degree=Pos",
773
+ "75": "ADV#_#Degree=Sup",
774
+ "76": "ADV#_#NumType=Mult",
775
+ "77": "ADV#_#PronType=Dem",
776
+ "78": "ADV#_#PronType=Int",
777
+ "79": "ADV#_#_",
778
+ "80": "AUX#Verb#Mood=Ind|Number=Plur|Person=1|Tense=Past|VerbForm=Fin",
779
+ "81": "AUX#Verb#Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
780
+ "82": "AUX#Verb#Mood=Ind|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
781
+ "83": "AUX#Verb#Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin",
+ "84": "AUX#Verb#Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "85": "AUX#Verb#Mood=Ind|Number=Sing|Person=1|Tense=Past|VerbForm=Fin",
+ "86": "AUX#Verb#Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "87": "AUX#Verb#Mood=Ind|Number=Sing|Person=2|Tense=Past|VerbForm=Fin",
+ "88": "AUX#Verb#Mood=Ind|Number=Sing|Person=2|Tense=Pres|VerbForm=Fin",
+ "89": "AUX#Verb#Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
+ "90": "AUX#Verb#Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "91": "AUX#Verb#Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Act",
+ "92": "AUX#Verb#Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Act",
+ "93": "AUX#Verb#Mood=Sub|Number=Plur|Person=1|Tense=Past|VerbForm=Fin",
+ "94": "AUX#Verb#Mood=Sub|Number=Plur|Tense=Past|VerbForm=Part",
+ "95": "AUX#Verb#Number=Plur|Tense=Past|VerbForm=Part",
+ "96": "AUX#Verb#Number=Plur|Tense=Pres|VerbForm=Part",
+ "97": "AUX#Verb#VerbForm=Fin",
+ "98": "AUX#Verb#VerbForm=Ger",
+ "99": "AUX#Verb#VerbForm=Inf",
+ "100": "AUX#Verb#VerbForm=Inf|Voice=Act",
+ "101": "AUX#Verb#VerbForm=Sup|Voice=Act",
+ "102": "AUX#_#Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "103": "CCONJ#Conjunction#_",
+ "104": "CCONJ#_#_",
+ "105": "DET#Adjective#Gender=Com|Number=Sing|PronType=Tot",
+ "106": "DET#Adjective#Gender=Neut|Number=Sing|PronType=Tot",
+ "107": "DET#Adjective#Number=Plur|PronType=Tot",
+ "108": "DET#Adjective#PronType=Tot",
+ "109": "DET#Article#Definite=Def|Gender=Com|Number=Sing|PronType=Art",
+ "110": "DET#Article#Definite=Def|Gender=Neut|Number=Sing|PronType=Art",
+ "111": "DET#Article#Definite=Def|Number=Plur|PronType=Art",
+ "112": "DET#Article#Definite=Def|PronType=Art",
+ "113": "DET#Article#Definite=Ind|Gender=Com|Number=Sing|PronType=Art",
+ "114": "DET#Article#Definite=Ind|Gender=Neut|Number=Sing|PronType=Art",
+ "115": "DET#Article#Definite=Ind|Gender=Neut|Number=Sing|PronType=Artt",
+ "116": "DET#Article#Definite=Ind|PronType=Art",
+ "117": "DET#Conjunction#Definite=Def|PronType=Art",
+ "118": "DET#Numeral#Definite=Ind|Gender=Neut|Number=Sing|PronType=Art",
+ "119": "DET#Prefixoid#_",
+ "120": "DET#Pronoun#Definite=Def|Gender=Com|Number=Sing|PronType=Art",
+ "121": "DET#Pronoun#Definite=Def|Gender=Com|Number=Sing|PronType=Dem",
+ "122": "DET#Pronoun#Definite=Def|Gender=Neut|Number=Sing|PronType=Art",
+ "123": "DET#Pronoun#Definite=Def|Gender=Neut|Number=Sing|PronType=Dem",
+ "124": "DET#Pronoun#Definite=Def|Number=Plur|PronType=Art",
+ "125": "DET#Pronoun#Definite=Def|Number=Plur|PronType=Dem",
+ "126": "DET#Pronoun#Definite=Def|Number=Plur|PronType=Tot",
+ "127": "DET#Pronoun#Definite=Ind|Gender=Com|Number=Sing|PronType=Ind",
+ "128": "DET#Pronoun#Definite=Ind|Gender=Com|Number=Sing|PronType=Int",
+ "129": "DET#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Ind",
+ "130": "DET#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Int",
+ "131": "DET#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Tot",
+ "132": "DET#Pronoun#Definite=Ind|Number=Plur|PronType=Ind",
+ "133": "DET#Pronoun#Definite=Ind|Number=Sing|PronType=Tot",
+ "134": "DET#Pronoun#Number=Plur|PronType=Dem",
+ "135": "DET#Pronoun#Number=Sing|PronType=Dem",
+ "136": "DET#Pronoun#Polarity=Neg",
+ "137": "DET#Pronoun#PronType=Ind",
+ "138": "DET#Pronoun#PronType=Int",
+ "139": "DET#Pronoun#PronType=Rel",
+ "140": "DET#Pronoun#PronType=Tot",
+ "141": "DET#Pronoun#_",
+ "142": "DET#_#Definite=Def|PronType=Art",
+ "143": "DET#_#Definite=EMPTY",
+ "144": "DET#_#Definite=Ind|PronType=Art",
+ "145": "DET#_#Gender=Neut|Number=Sing|PronType=Tot",
+ "146": "DET#_#Number=Sing|PronType=Dem",
+ "147": "DET#_#PronType=Int",
+ "148": "DET#_#PronType=Neg",
+ "149": "DET#_#PronType=Rcp",
+ "150": "DET#_#PronType=Tot",
+ "151": "DET#_#_",
+ "152": "INTJ#Interjection#_",
+ "153": "NOUN#Adverb#Number=Sing",
+ "154": "NOUN#Noun#Abbr=Yes",
+ "155": "NOUN#Noun#Abbr=Yes|Number=Plur",
+ "156": "NOUN#Noun#Abbr=Yes|Number=Sing",
+ "157": "NOUN#Noun#Case=Gen|Definite=Def|Gender=Com|Number=Plur",
+ "158": "NOUN#Noun#Case=Gen|Definite=Def|Gender=Com|Number=Sing",
+ "159": "NOUN#Noun#Case=Gen|Definite=Def|Gender=Neut|Number=Plur",
+ "160": "NOUN#Noun#Case=Gen|Definite=Def|Gender=Neut|Number=Sing",
+ "161": "NOUN#Noun#Case=Gen|Definite=Ind|Gender=Com|Number=Plur",
+ "162": "NOUN#Noun#Case=Gen|Definite=Ind|Gender=Neut|Number=Plur",
+ "163": "NOUN#Noun#Case=Gen|Definite=Ind|Gender=Neut|Number=Sing",
+ "164": "NOUN#Noun#Case=Nom|Definite=Def|Gender=Com|Number=Plur",
+ "165": "NOUN#Noun#Case=Nom|Definite=Def|Gender=Com|Number=Sing",
+ "166": "NOUN#Noun#Case=Nom|Definite=Def|Gender=Neut|Number=Plur",
+ "167": "NOUN#Noun#Case=Nom|Definite=Def|Gender=Neut|Number=Sing",
+ "168": "NOUN#Noun#Case=Nom|Definite=Ind|Gender=Com|Number=Plur",
+ "169": "NOUN#Noun#Case=Nom|Definite=Ind|Gender=Com|Number=Sing",
+ "170": "NOUN#Noun#Case=Nom|Definite=Ind|Gender=Neut|Number=Plur",
+ "171": "NOUN#Noun#Case=Nom|Definite=Ind|Gender=Neut|Number=Sing",
+ "172": "NOUN#Noun#Case=Nom|Definite=Ind|Gender=Neut|Number=Singg",
+ "173": "NOUN#Noun#Gender=Com",
+ "174": "NOUN#Noun#NumType=Frac|Number=Sing",
+ "175": "NOUN#Noun#Number=Plur",
+ "176": "NOUN#Noun#Number=Sing",
+ "177": "NOUN#Noun#Number=Sing|Polarity=Neg",
+ "178": "NOUN#Noun#VerbForm=Fin",
+ "179": "NOUN#Noun#_",
+ "180": "NOUN#Prefixoid#Number=Sing",
+ "181": "NOUN#Prefixoid#_",
+ "182": "NOUN#_#Case=Nom|Definite=Def|Gender=Com|Number=Sing",
+ "183": "NOUN#_#Case=Nom|Definite=Def|Gender=Neut|Number=Sing",
+ "184": "NOUN#_#Case=Nom|Definite=Ind|Gender=Com|Number=Sing",
+ "185": "NOUN#_#Case=Nom|Definite=Ind|Gender=Neut|Number=Sing",
+ "186": "NOUN#_#Number=Plur",
+ "187": "NOUN#_#Number=Sing",
+ "188": "NUM#Article#Case=Nom|Definite=Ind|Gender=Com|Number=Sing|NumType=Card",
+ "189": "NUM#Noun#Case=Nom|NumType=Card",
+ "190": "NUM#Noun#NumForm=Word|NumType=Card",
+ "191": "NUM#Numeral#Case=Nom|Definite=Ind|Gender=Com|Number=Sing|NumType=Card",
+ "192": "NUM#Numeral#Case=Nom|NumType=Card",
+ "193": "NUM#Numeral#NumForm=Digit|NumType=Card",
+ "194": "NUM#Numeral#NumForm=Digit|NumType=Frac",
+ "195": "NUM#Numeral#NumForm=Roman|NumType=Card",
+ "196": "NUM#Numeral#NumForm=Word|NumType=Card",
+ "197": "NUM#Numeral#NumType=Card",
+ "198": "NUM#Numeral#_",
+ "199": "NUM#_#Degree=Pos|NumType=Ord",
+ "200": "NUM#_#NumType=Card",
+ "201": "PART#Particle#Polarity=Neg",
+ "202": "PART#Particle#_",
+ "203": "PART#Preposition#_",
+ "204": "PART#_#Polarity=Neg",
+ "205": "PART#_#_",
+ "206": "PPROPN#_#Number=Plur",
+ "207": "PRON#Adjective#Definite=Ind|Number=Plur|PronType=Ind",
+ "208": "PRON#Adjective#Definite=Ind|Number=Plur|PronType=Tot",
+ "209": "PRON#Adverb#Definite=Def|Gender=Neut|Number=Sing|PronType=Prs",
+ "210": "PRON#Adverb#Definite=Ind|Gender=Neut|Number=Sing|PronType=Ind",
+ "211": "PRON#Adverb#_",
+ "212": "PRON#Article#Case=Nom|Definite=Def|Number=Plur|PronType=Prs",
+ "213": "PRON#Conjunction#Definite=Ind|Gender=Neut|Number=Sing|PronType=Int",
+ "214": "PRON#Conjunction#PronType=Rel",
+ "215": "PRON#Noun#Case=Nom|Definite=Ind|Gender=Com|Number=Sing|PronType=Ind",
+ "216": "PRON#Noun#Definite=Def|Gender=Com|Number=Sing|PronType=Prs",
+ "217": "PRON#Noun#Definite=Def|Number=Plur|PronType=Prs",
+ "218": "PRON#Noun#Definite=Ind|Number=Plur|PronType=Ind",
+ "219": "PRON#Numeral#Definite=Ind|Gender=Com|Number=Sing|PronType=Prs",
+ "220": "PRON#Numeral#Definite=Ind|Gender=Neut|Number=Sing|PronType=Prs",
+ "221": "PRON#Pronoun#Case=Acc|Definite=Def|Gender=Com|Number=Plur|PronType=Prs",
+ "222": "PRON#Pronoun#Case=Acc|Definite=Def|Gender=Com|Number=Sing|PronType=Prs",
+ "223": "PRON#Pronoun#Case=Acc|Definite=Def|Number=Plur|PronType=Prs",
+ "224": "PRON#Pronoun#Case=Acc|Definite=Def|PronType=Prs",
+ "225": "PRON#Pronoun#Case=Acc|Gender=Fem|Number=Sing|Person=3|PronType=Prs",
+ "226": "PRON#Pronoun#Case=Acc|Gender=Fem|Number=Sing|Person=3|PronType=Prs|Reflex=Yes",
+ "227": "PRON#Pronoun#Case=Acc|Gender=Masc|Number=Sing|Person=3|PronType=Prs",
+ "228": "PRON#Pronoun#Case=Acc|Gender=Masc|Number=Sing|Person=3|PronType=Prs|Reflex=Yes",
+ "229": "PRON#Pronoun#Case=Acc|Gender=Neut|Number=Sing|Person=3|PronType=Prs",
+ "230": "PRON#Pronoun#Case=Acc|Gender=Neut|Number=Sing|Person=3|PronType=Prs|Reflex=Yes",
+ "231": "PRON#Pronoun#Case=Acc|Number=Plur|Person=1|PronType=Prs",
+ "232": "PRON#Pronoun#Case=Acc|Number=Plur|Person=1|PronType=Prs|Reflex=Yes",
+ "233": "PRON#Pronoun#Case=Acc|Number=Plur|Person=2|PronType=Prs",
+ "234": "PRON#Pronoun#Case=Acc|Number=Plur|Person=3|PronType=Prs",
+ "235": "PRON#Pronoun#Case=Acc|Number=Plur|Person=3|PronType=Prs|Reflex=Yes",
+ "236": "PRON#Pronoun#Case=Acc|Number=Sing|Person=1|PronType=Prs",
+ "237": "PRON#Pronoun#Case=Acc|Number=Sing|Person=2|PronType=Prs",
+ "238": "PRON#Pronoun#Case=Acc|Number=Sing|Person=2|PronType=Prs|Reflex=Yes",
+ "239": "PRON#Pronoun#Case=Gen|Definite=Def|Gender=Com|Number=Sing|Poss=Yes|PronType=Prs",
+ "240": "PRON#Pronoun#Case=Gen|Gender=Fem|Number=Sing|Person=3|Poss=Yes|PronType=Prs",
+ "241": "PRON#Pronoun#Case=Gen|Gender=Masc|Number=Sing|Person=3|Poss=Yes|PronType=Prs",
+ "242": "PRON#Pronoun#Case=Gen|Gender=Neut|Number=Sing|Person=3|Poss=Yes|PronType=Prs",
+ "243": "PRON#Pronoun#Case=Gen|Number=Plur|Person=1|Poss=Yes|PronType=Prs",
+ "244": "PRON#Pronoun#Case=Gen|Number=Plur|Person=3|Poss=Yes|PronType=Prs",
+ "245": "PRON#Pronoun#Case=Gen|Number=Sing|Person=1|Poss=Yes|PronType=Prs",
+ "246": "PRON#Pronoun#Case=Gen|Number=Sing|Person=2|Poss=Yes|PronType=Prs",
+ "247": "PRON#Pronoun#Case=Nom|Definite=Def|Gender=Com|Number=Plur|PronType=Prs",
+ "248": "PRON#Pronoun#Case=Nom|Definite=Def|Gender=Com|Number=Sing|PronType=Prs",
+ "249": "PRON#Pronoun#Case=Nom|Definite=Def|Number=Plur|PronType=Prs",
+ "250": "PRON#Pronoun#Case=Nom|Definite=Ind|Gender=Com|Number=Sing|PronType=Ind",
+ "251": "PRON#Pronoun#Case=Nom|Definite=Ind|Gender=Com|Number=Sing|PronType=Rel",
+ "252": "PRON#Pronoun#Case=Nom|Gender=Fem|Number=Sing|Person=3|PronType=Prs",
+ "253": "PRON#Pronoun#Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs",
+ "254": "PRON#Pronoun#Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs|Reflex=Yes",
+ "255": "PRON#Pronoun#Case=Nom|Gender=Neut|Number=Sing|Person=3|PronType=Prs",
+ "256": "PRON#Pronoun#Case=Nom|Gender=Neut|Number=Sing|Person=3|PronType=Prs|Reflex=Yes",
+ "257": "PRON#Pronoun#Case=Nom|Number=Plur|Person=1|PronType=Prs",
+ "258": "PRON#Pronoun#Case=Nom|Number=Plur|Person=2|PronType=Prs",
+ "259": "PRON#Pronoun#Case=Nom|Number=Plur|Person=3|PronType=Prs",
+ "260": "PRON#Pronoun#Case=Nom|Number=Plur|Person=3|PronType=Prs|Reflex=Yes",
+ "261": "PRON#Pronoun#Case=Nom|Number=Sing|Person=1|PronType=Prs",
+ "262": "PRON#Pronoun#Case=Nom|Number=Sing|Person=2|PronType=Prs",
+ "263": "PRON#Pronoun#Definite=Def|Gender=Com|Number=Sing|Poss=Yes|PronType=Prs",
+ "264": "PRON#Pronoun#Definite=Def|Gender=Com|Number=Sing|PronType=Prs",
+ "265": "PRON#Pronoun#Definite=Def|Gender=Neut|Number=Sing|Poss=Yes|PronType=Prs",
+ "266": "PRON#Pronoun#Definite=Def|Gender=Neut|Number=Sing|PronType=Dem",
+ "267": "PRON#Pronoun#Definite=Def|Gender=Neut|Number=Sing|PronType=Prs",
+ "268": "PRON#Pronoun#Definite=Def|Number=Plur|Poss=Yes|PronType=Prs",
+ "269": "PRON#Pronoun#Definite=Def|Number=Plur|PronType=Dem",
+ "270": "PRON#Pronoun#Definite=Def|Number=Plur|PronType=Prs",
+ "271": "PRON#Pronoun#Definite=Def|Poss=Yes|PronType=Prs",
+ "272": "PRON#Pronoun#Definite=Ind|Gender=Com|Number=Sing|PronType=Ind",
+ "273": "PRON#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Ind",
+ "274": "PRON#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Int",
+ "275": "PRON#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Neg",
+ "276": "PRON#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Prs",
+ "277": "PRON#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Rel",
+ "278": "PRON#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Tot",
+ "279": "PRON#Pronoun#Definite=Ind|Number=Plur|PronType=Rel",
+ "280": "PRON#Pronoun#Number=Plur",
+ "281": "PRON#Pronoun#Number=Plur|PronType=Dem",
+ "282": "PRON#Pronoun#Number=Plur|PronType=Tot",
+ "283": "PRON#Pronoun#Number=Sing",
+ "284": "PRON#Pronoun#Number=Sing|Polarity=Neg|PronType=Neg",
+ "285": "PRON#Pronoun#Number=Sing|PronType=Dem",
+ "286": "PRON#Pronoun#Number=Sing|PronType=Ind",
+ "287": "PRON#Pronoun#Number=Sing|PronType=Neg",
+ "288": "PRON#Pronoun#Number=Sing|Reflex=Yes",
+ "289": "PRON#Pronoun#PronType=Ind",
+ "290": "PRON#Pronoun#PronType=Int",
+ "291": "PRON#Pronoun#PronType=Rel",
+ "292": "PRON#Pronoun#_",
+ "293": "PRON#Verb#Definite=Def|Gender=Neut|Number=Sing|Poss=Yes|PronType=Prs",
+ "294": "PRON#_#Case=Acc|Definite=Def|PronType=Prs",
+ "295": "PRON#_#Definite=Ind|Gender=Neut|Number=Sing|PronType=Ind",
+ "296": "PRON#_#Definite=Ind|Gender=Neut|Number=Sing|PronType=Prs",
+ "297": "PRON#_#Gender=Neut|Number=Sing|Person=3|Poss=Yes|PronType=Prs",
+ "298": "PRON#_#Number=Sing",
+ "299": "PRON#_#Number=Sing|PronType=Dem",
+ "300": "PRON#_#Number=Sing|PronType=Ind",
+ "301": "PRON#_#PronType=Int",
+ "302": "PRON#_#PronType=Rel",
+ "303": "PROPN#Noun#Abbr=Yes|Number=Plur",
+ "304": "PROPN#Noun#Abbr=Yes|Number=Sing",
+ "305": "PROPN#Noun#Case=Gen",
+ "306": "PROPN#Noun#Case=Nom",
+ "307": "PROPN#Noun#Case=Nom|Definite=Ind|Gender=Com|Number=Sing",
+ "308": "PROPN#Noun#Number=Plur",
+ "309": "PROPN#Noun#Number=Sing",
+ "310": "PROPN#Noun#Number=Sing|Polarity=Neg",
+ "311": "PROPN#Noun#PronType=Dem",
+ "312": "PROPN#Noun#VerbForm=Fin",
+ "313": "PROPN#Prefixoid#Number=Sing",
+ "314": "PROPN#_#Abbr=Yes",
+ "315": "PROPN#_#Number=Plur",
+ "316": "PROPN#_#Number=Sing",
+ "317": "PUNCT#PUNCT#_",
+ "318": "PUNCT#_#_",
+ "319": "Prefixoid#Prefixoid#_",
+ "320": "SCONJ#Conjunction#_",
+ "321": "SCONJ#Preposition#_",
+ "322": "SCONJ#Pronoun#Definite=Ind|Gender=Neut|Number=Sing|PronType=Int",
+ "323": "SCONJ#_#_",
+ "324": "SYM#Conjunction#_",
+ "325": "SYM#Noun#Number=Sing",
+ "326": "SYM#Noun#_",
+ "327": "VERB#Adjective#Case=Nom|Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
+ "328": "VERB#Verb#Case=Nom|Number=Plur|Tense=Past|VerbForm=Part|Voice=Pass",
+ "329": "VERB#Verb#Mood=Imp|VerbForm=Fin|Voice=Act",
+ "330": "VERB#Verb#Mood=Imp|VerbForm=Inf",
+ "331": "VERB#Verb#Mood=Ind|Number=Plur|Person=1|Tense=Past|VerbForm=Fin",
+ "332": "VERB#Verb#Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
+ "333": "VERB#Verb#Mood=Ind|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
+ "334": "VERB#Verb#Mood=Ind|Number=Plur|Person=3|Tense=Past|VerbForm=Fin",
+ "335": "VERB#Verb#Mood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
+ "336": "VERB#Verb#Mood=Ind|Number=Sing|Person=1|Tense=Past|VerbForm=Fin",
+ "337": "VERB#Verb#Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin",
+ "338": "VERB#Verb#Mood=Ind|Number=Sing|Person=2|Tense=Past|VerbForm=Fin",
+ "339": "VERB#Verb#Mood=Ind|Number=Sing|Person=2|Tense=Pres|VerbForm=Fin",
+ "340": "VERB#Verb#Mood=Ind|Number=Sing|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin",
+ "341": "VERB#Verb#Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
+ "342": "VERB#Verb#Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
+ "343": "VERB#Verb#Mood=Ind|Tense=Past|VerbForm=Fin",
+ "344": "VERB#Verb#Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Act",
+ "345": "VERB#Verb#Mood=Ind|Tense=Past|VerbForm=Fin|Voice=Pass",
+ "346": "VERB#Verb#Mood=Ind|Tense=Pres|VerbForm=Fin",
+ "347": "VERB#Verb#Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Act",
+ "348": "VERB#Verb#Mood=Ind|Tense=Pres|VerbForm=Fin|Voice=Pass",
+ "349": "VERB#Verb#Mood=Sub|Number=Plur|Person=1|Tense=Past|VerbForm=Fin",
+ "350": "VERB#Verb#Mood=Sub|Tense=Past|VerbForm=Part",
+ "351": "VERB#Verb#Mood=Sub|Tense=Past|VerbForm=Part|Voice=Pass",
+ "352": "VERB#Verb#Mood=Sub|VerbForm=Inf",
+ "353": "VERB#Verb#Person=1|Tense=Past|VerbForm=Part",
+ "354": "VERB#Verb#Person=1|Tense=Past|VerbForm=Part|Voice=Pass",
+ "355": "VERB#Verb#Person=1|Tense=Pres|VerbForm=Ger",
+ "356": "VERB#Verb#Person=1|Tense=Pres|VerbForm=Inf",
+ "357": "VERB#Verb#Person=1|Tense=Pres|VerbForm=Part",
+ "358": "VERB#Verb#Person=2|Tense=Pres|VerbForm=Inf",
+ "359": "VERB#Verb#Tense=Past|VerbForm=Part",
+ "360": "VERB#Verb#Tense=Past|VerbForm=Part|Voice=Pass",
+ "361": "VERB#Verb#Tense=Pres|VerbForm=Part",
+ "362": "VERB#Verb#VerbForm=Fin",
+ "363": "VERB#Verb#VerbForm=Ger",
+ "364": "VERB#Verb#VerbForm=Inf",
+ "365": "VERB#Verb#VerbForm=Inf|Voice=Act",
+ "366": "VERB#Verb#VerbForm=Inf|Voice=Pass",
+ "367": "VERB#Verb#VerbForm=Sup",
+ "368": "VERB#Verb#VerbForm=Sup|Voice=Act",
+ "369": "VERB#Verb#VerbForm=Sup|Voice=Pass",
+ "370": "VERB#_#Mood=Ind|Tense=Past|VerbForm=Fin",
+ "371": "VERB#_#Tense=Past|VerbForm=Part",
+ "372": "VERB#_#VerbForm=Ger",
+ "373": "VERB#_#VerbForm=Inf",
+ "374": "X#_#Foreign=Yes",
+ "375": "X#_#Typo=Yes",
+ "376": "X#_#_",
+ "377": "X#_#foreign=Yes"
+ },
+ "lemma_rule": {
+ "0": "cut_prefix=0|cut_suffix=0|append_suffix=",
+ "1": "cut_prefix=0|cut_suffix=0|append_suffix='",
+ "2": "cut_prefix=0|cut_suffix=0|append_suffix=.",
+ "3": "cut_prefix=0|cut_suffix=0|append_suffix=a",
+ "4": "cut_prefix=0|cut_suffix=0|append_suffix=d",
+ "5": "cut_prefix=0|cut_suffix=0|append_suffix=e",
+ "6": "cut_prefix=0|cut_suffix=0|append_suffix=ma",
+ "7": "cut_prefix=0|cut_suffix=0|append_suffix=n",
+ "8": "cut_prefix=0|cut_suffix=0|append_suffix=o",
+ "9": "cut_prefix=0|cut_suffix=0|append_suffix=s",
+ "10": "cut_prefix=0|cut_suffix=0|append_suffix=t",
+ "11": "cut_prefix=0|cut_suffix=0|append_suffix=y",
+ "12": "cut_prefix=0|cut_suffix=11|append_suffix=#url",
+ "13": "cut_prefix=0|cut_suffix=12|append_suffix=#url",
+ "14": "cut_prefix=0|cut_suffix=14|append_suffix=#url",
+ "15": "cut_prefix=0|cut_suffix=1|append_suffix=",
+ "16": "cut_prefix=0|cut_suffix=1|append_suffix=a",
+ "17": "cut_prefix=0|cut_suffix=1|append_suffix=ad",
+ "18": "cut_prefix=0|cut_suffix=1|append_suffix=as",
+ "19": "cut_prefix=0|cut_suffix=1|append_suffix=be",
+ "20": "cut_prefix=0|cut_suffix=1|append_suffix=d",
+ "21": "cut_prefix=0|cut_suffix=1|append_suffix=e",
+ "22": "cut_prefix=0|cut_suffix=1|append_suffix=ed",
+ "23": "cut_prefix=0|cut_suffix=1|append_suffix=en",
+ "24": "cut_prefix=0|cut_suffix=1|append_suffix=et",
+ "25": "cut_prefix=0|cut_suffix=1|append_suffix=g",
+ "26": "cut_prefix=0|cut_suffix=1|append_suffix=ght",
+ "27": "cut_prefix=0|cut_suffix=1|append_suffix=have",
+ "28": "cut_prefix=0|cut_suffix=1|append_suffix=ill",
+ "29": "cut_prefix=0|cut_suffix=1|append_suffix=ja",
+ "30": "cut_prefix=0|cut_suffix=1|append_suffix=n",
+ "31": "cut_prefix=0|cut_suffix=1|append_suffix=na",
+ "32": "cut_prefix=0|cut_suffix=1|append_suffix=o",
+ "33": "cut_prefix=0|cut_suffix=1|append_suffix=ola",
+ "34": "cut_prefix=0|cut_suffix=1|append_suffix=on",
+ "35": "cut_prefix=0|cut_suffix=1|append_suffix=ot",
+ "36": "cut_prefix=0|cut_suffix=1|append_suffix=um",
+ "37": "cut_prefix=0|cut_suffix=1|append_suffix=ve",
+ "38": "cut_prefix=0|cut_suffix=1|append_suffix=y",
+ "39": "cut_prefix=0|cut_suffix=1|append_suffix=ym",
+ "40": "cut_prefix=0|cut_suffix=1|append_suffix=\u00e9",
+ "41": "cut_prefix=0|cut_suffix=1|append_suffix=\u014d",
+ "42": "cut_prefix=0|cut_suffix=20|append_suffix=",
+ "43": "cut_prefix=0|cut_suffix=2|append_suffix=",
+ "44": "cut_prefix=0|cut_suffix=2|append_suffix=$",
+ "45": "cut_prefix=0|cut_suffix=2|append_suffix=a",
+ "46": "cut_prefix=0|cut_suffix=2|append_suffix=an",
+ "47": "cut_prefix=0|cut_suffix=2|append_suffix=ara",
+ "48": "cut_prefix=0|cut_suffix=2|append_suffix=ave",
+ "49": "cut_prefix=0|cut_suffix=2|append_suffix=aw",
+ "50": "cut_prefix=0|cut_suffix=2|append_suffix=be",
+ "51": "cut_prefix=0|cut_suffix=2|append_suffix=dd",
+ "52": "cut_prefix=0|cut_suffix=2|append_suffix=e",
+ "53": "cut_prefix=0|cut_suffix=2|append_suffix=ee",
+ "54": "cut_prefix=0|cut_suffix=2|append_suffix=el",
+ "55": "cut_prefix=0|cut_suffix=2|append_suffix=en",
+ "56": "cut_prefix=0|cut_suffix=2|append_suffix=ep",
+ "57": "cut_prefix=0|cut_suffix=2|append_suffix=er",
+ "58": "cut_prefix=0|cut_suffix=2|append_suffix=et",
+ "59": "cut_prefix=0|cut_suffix=2|append_suffix=g",
+ "60": "cut_prefix=0|cut_suffix=2|append_suffix=have",
+ "61": "cut_prefix=0|cut_suffix=2|append_suffix=i",
+ "62": "cut_prefix=0|cut_suffix=2|append_suffix=ig",
+ "63": "cut_prefix=0|cut_suffix=2|append_suffix=igga",
+ "64": "cut_prefix=0|cut_suffix=2|append_suffix=in",
+ "65": "cut_prefix=0|cut_suffix=2|append_suffix=is",
+ "66": "cut_prefix=0|cut_suffix=2|append_suffix=it",
+ "67": "cut_prefix=0|cut_suffix=2|append_suffix=ja",
+ "68": "cut_prefix=0|cut_suffix=2|append_suffix=ke",
+ "69": "cut_prefix=0|cut_suffix=2|append_suffix=l",
+ "70": "cut_prefix=0|cut_suffix=2|append_suffix=mal",
+ "71": "cut_prefix=0|cut_suffix=2|append_suffix=n",
+ "72": "cut_prefix=0|cut_suffix=2|append_suffix=na",
+ "73": "cut_prefix=0|cut_suffix=2|append_suffix=ny",
+ "74": "cut_prefix=0|cut_suffix=2|append_suffix=o",
+ "75": "cut_prefix=0|cut_suffix=2|append_suffix=on",
+ "76": "cut_prefix=0|cut_suffix=2|append_suffix=ose",
+ "77": "cut_prefix=0|cut_suffix=2|append_suffix=ot",
+ "78": "cut_prefix=0|cut_suffix=2|append_suffix=ow",
+ "79": "cut_prefix=0|cut_suffix=2|append_suffix=u",
+ "80": "cut_prefix=0|cut_suffix=2|append_suffix=um",
+ "81": "cut_prefix=0|cut_suffix=2|append_suffix=un",
+ "82": "cut_prefix=0|cut_suffix=2|append_suffix=unna",
+ "83": "cut_prefix=0|cut_suffix=2|append_suffix=we",
+ "84": "cut_prefix=0|cut_suffix=2|append_suffix=y",
+ "85": "cut_prefix=0|cut_suffix=2|append_suffix=ycket",
+ "86": "cut_prefix=0|cut_suffix=2|append_suffix=yda",
+ "87": "cut_prefix=0|cut_suffix=2|append_suffix=yta",
+ "88": "cut_prefix=0|cut_suffix=2|append_suffix=\u00e5",
+ "89": "cut_prefix=0|cut_suffix=2|append_suffix=\u00e5ta",
+ "90": "cut_prefix=0|cut_suffix=2|append_suffix=\u00e8s",
+ "91": "cut_prefix=0|cut_suffix=2|append_suffix=\u00e9o",
+ "92": "cut_prefix=0|cut_suffix=3|append_suffix=",
+ "93": "cut_prefix=0|cut_suffix=3|append_suffix=-up",
+ "94": "cut_prefix=0|cut_suffix=3|append_suffix=a",
+ "95": "cut_prefix=0|cut_suffix=3|append_suffix=ake",
+ "96": "cut_prefix=0|cut_suffix=3|append_suffix=an",
+ "97": "cut_prefix=0|cut_suffix=3|append_suffix=and",
+ "98": "cut_prefix=0|cut_suffix=3|append_suffix=and_annat",
+ "99": "cut_prefix=0|cut_suffix=3|append_suffix=any",
+ "100": "cut_prefix=0|cut_suffix=3|append_suffix=as",
+ "101": "cut_prefix=0|cut_suffix=3|append_suffix=at",
+ "102": "cut_prefix=0|cut_suffix=3|append_suffix=be",
+ "103": "cut_prefix=0|cut_suffix=3|append_suffix=e",
+ "104": "cut_prefix=0|cut_suffix=3|append_suffix=eak",
+ "105": "cut_prefix=0|cut_suffix=3|append_suffix=eal",
+ "106": "cut_prefix=0|cut_suffix=3|append_suffix=ear",
+ "107": "cut_prefix=0|cut_suffix=3|append_suffix=ell",
+ "108": "cut_prefix=0|cut_suffix=3|append_suffix=er",
+ "109": "cut_prefix=0|cut_suffix=3|append_suffix=f",
+ "110": "cut_prefix=0|cut_suffix=3|append_suffix=fe",
+ "111": "cut_prefix=0|cut_suffix=3|append_suffix=i",
+ "112": "cut_prefix=0|cut_suffix=3|append_suffix=ick",
+ "113": "cut_prefix=0|cut_suffix=3|append_suffix=ike",
+ "114": "cut_prefix=0|cut_suffix=3|append_suffix=ine",
+ "115": "cut_prefix=0|cut_suffix=3|append_suffix=ink",
+ "116": "cut_prefix=0|cut_suffix=3|append_suffix=is",
+ "117": "cut_prefix=0|cut_suffix=3|append_suffix=ite",
+ "118": "cut_prefix=0|cut_suffix=3|append_suffix=ive",
+ "119": "cut_prefix=0|cut_suffix=3|append_suffix=jag",
+ "120": "cut_prefix=0|cut_suffix=3|append_suffix=liten",
+ "121": "cut_prefix=0|cut_suffix=3|append_suffix=m",
+ "122": "cut_prefix=0|cut_suffix=3|append_suffix=nan",
+ "123": "cut_prefix=0|cut_suffix=3|append_suffix=nna",
+ "124": "cut_prefix=0|cut_suffix=3|append_suffix=ola",
+ "125": "cut_prefix=0|cut_suffix=3|append_suffix=ome",
+ "126": "cut_prefix=0|cut_suffix=3|append_suffix=oot",
+ "127": "cut_prefix=0|cut_suffix=3|append_suffix=ose",
+ "128": "cut_prefix=0|cut_suffix=3|append_suffix=r",
+ "129": "cut_prefix=0|cut_suffix=3|append_suffix=ra",
+ "130": "cut_prefix=0|cut_suffix=3|append_suffix=sia",
+ "131": "cut_prefix=0|cut_suffix=3|append_suffix=uch",
+ "132": "cut_prefix=0|cut_suffix=3|append_suffix=vi",
+ "133": "cut_prefix=0|cut_suffix=3|append_suffix=y",
+ "134": "cut_prefix=0|cut_suffix=3|append_suffix=ycket",
+ "135": "cut_prefix=0|cut_suffix=3|append_suffix=ze",
+ "136": "cut_prefix=0|cut_suffix=3|append_suffix=\u00e4ga",
+ "137": "cut_prefix=0|cut_suffix=3|append_suffix=\u00e4gga",
+ "138": "cut_prefix=0|cut_suffix=3|append_suffix=\u00e5",
+ "139": "cut_prefix=0|cut_suffix=3|append_suffix=\u00e5_kallad",
+ "140": "cut_prefix=0|cut_suffix=3|append_suffix=\u00e8ne",
+ "141": "cut_prefix=0|cut_suffix=3|append_suffix=\u00e8re",
+ "142": "cut_prefix=0|cut_suffix=4|append_suffix=",
+ "143": "cut_prefix=0|cut_suffix=4|append_suffix=#url",
+ "144": "cut_prefix=0|cut_suffix=4|append_suffix=-up",
+ "145": "cut_prefix=0|cut_suffix=4|append_suffix=a",
+ "146": "cut_prefix=0|cut_suffix=4|append_suffix=ader",
+ "147": "cut_prefix=0|cut_suffix=4|append_suffix=all",
+ "148": "cut_prefix=0|cut_suffix=4|append_suffix=an",
+ "149": "cut_prefix=0|cut_suffix=4|append_suffix=ay",
+ "150": "cut_prefix=0|cut_suffix=4|append_suffix=e",
+ "151": "cut_prefix=0|cut_suffix=4|append_suffix=eak",
+ "152": "cut_prefix=0|cut_suffix=4|append_suffix=eal",
+ "153": "cut_prefix=0|cut_suffix=4|append_suffix=eeze",
+ "154": "cut_prefix=0|cut_suffix=4|append_suffix=go",
+ "155": "cut_prefix=0|cut_suffix=4|append_suffix=good",
+ "156": "cut_prefix=0|cut_suffix=4|append_suffix=ie",
+ "157": "cut_prefix=0|cut_suffix=4|append_suffix=ill",
+ "158": "cut_prefix=0|cut_suffix=4|append_suffix=ind",
+ "159": "cut_prefix=0|cut_suffix=4|append_suffix=ingly",
+ "160": "cut_prefix=0|cut_suffix=4|append_suffix=ke",
+ "161": "cut_prefix=0|cut_suffix=4|append_suffix=nment",
+ "162": "cut_prefix=0|cut_suffix=4|append_suffix=ola",
+ "163": "cut_prefix=0|cut_suffix=4|append_suffix=on",
+ "164": "cut_prefix=0|cut_suffix=4|append_suffix=or",
+ "165": "cut_prefix=0|cut_suffix=4|append_suffix=ot",
+ "166": "cut_prefix=0|cut_suffix=4|append_suffix=r",
+ "167": "cut_prefix=0|cut_suffix=4|append_suffix=ra",
+ "168": "cut_prefix=0|cut_suffix=4|append_suffix=t",
+ "169": "cut_prefix=0|cut_suffix=4|append_suffix=tch",
+ "170": "cut_prefix=0|cut_suffix=4|append_suffix=y",
+ "171": "cut_prefix=0|cut_suffix=4|append_suffix=\u00e5g",
+ "172": "cut_prefix=0|cut_suffix=4|append_suffix=\u00edtez",
+ "173": "cut_prefix=0|cut_suffix=4|append_suffix=\u00f6ra",
+ "174": "cut_prefix=0|cut_suffix=5|append_suffix=",
+ "175": "cut_prefix=0|cut_suffix=5|append_suffix=-chat",
+ "176": "cut_prefix=0|cut_suffix=5|append_suffix=a",
+ "177": "cut_prefix=0|cut_suffix=5|append_suffix=an",
+ "178": "cut_prefix=0|cut_suffix=5|append_suffix=bad",
+ "179": "cut_prefix=0|cut_suffix=5|append_suffix=badly",
+ "180": "cut_prefix=0|cut_suffix=5|append_suffix=be",
+ "181": "cut_prefix=0|cut_suffix=5|append_suffix=d\u00e5lig",
+ "182": "cut_prefix=0|cut_suffix=5|append_suffix=each",
+ "183": "cut_prefix=0|cut_suffix=5|append_suffix=ead",
+ "184": "cut_prefix=0|cut_suffix=5|append_suffix=eek",
+ "185": "cut_prefix=0|cut_suffix=5|append_suffix=er",
+ "186": "cut_prefix=0|cut_suffix=5|append_suffix=esto",
+ "187": "cut_prefix=0|cut_suffix=5|append_suffix=et",
+ "188": "cut_prefix=0|cut_suffix=5|append_suffix=etts",
+ "189": "cut_prefix=0|cut_suffix=5|append_suffix=g\u00e4rna",
+ "190": "cut_prefix=0|cut_suffix=5|append_suffix=he",
+ "191": "cut_prefix=0|cut_suffix=5|append_suffix=ician",
+ "192": "cut_prefix=0|cut_suffix=5|append_suffix=ill",
+ "193": "cut_prefix=0|cut_suffix=5|append_suffix=ing",
+ "194": "cut_prefix=0|cut_suffix=5|append_suffix=ink",
+ "195": "cut_prefix=0|cut_suffix=5|append_suffix=kick",
+ "196": "cut_prefix=0|cut_suffix=5|append_suffix=lation",
+ "197": "cut_prefix=0|cut_suffix=5|append_suffix=oder",
+ "198": "cut_prefix=0|cut_suffix=5|append_suffix=on",
+ "199": "cut_prefix=0|cut_suffix=5|append_suffix=r",
+ "200": "cut_prefix=0|cut_suffix=5|append_suffix=ra",
+ "201": "cut_prefix=0|cut_suffix=5|append_suffix=ry",
+ "202": "cut_prefix=0|cut_suffix=5|append_suffix=seek",
+ "203": "cut_prefix=0|cut_suffix=5|append_suffix=uy",
+ "204": "cut_prefix=0|cut_suffix=5|append_suffix=\u00e9r\u00e8se",
+ "205": "cut_prefix=0|cut_suffix=6|append_suffix=ar",
+ "206": "cut_prefix=0|cut_suffix=6|append_suffix=er",
+ "207": "cut_prefix=0|cut_suffix=6|append_suffix=good",
+ "208": "cut_prefix=0|cut_suffix=6|append_suffix=pany",
+ "209": "cut_prefix=0|cut_suffix=6|append_suffix=rule",
+ "210": "cut_prefix=0|cut_suffix=6|append_suffix=zation",
+ "211": "cut_prefix=0|cut_suffix=7|append_suffix=efine",
+ "212": "cut_prefix=0|cut_suffix=8|append_suffix=or",
+ "213": "cut_prefix=1|cut_suffix=0|append_suffix=",
+ "214": "cut_prefix=1|cut_suffix=0|append_suffix=a",
+ "215": "cut_prefix=1|cut_suffix=2|append_suffix=",
+ "216": "cut_prefix=1|cut_suffix=2|append_suffix=ll",
+ "217": "cut_prefix=1|cut_suffix=3|append_suffix=",
+ "218": "cut_prefix=1|cut_suffix=3|append_suffix=te",
+ "219": "cut_prefix=1|cut_suffix=4|append_suffix=ll",
+ "220": "cut_prefix=1|cut_suffix=6|append_suffix=url",
+ "221": "cut_prefix=2|cut_suffix=0|append_suffix=",
+ "222": "cut_prefix=2|cut_suffix=0|append_suffix=a",
+ "223": "cut_prefix=2|cut_suffix=1|append_suffix=",
+ "224": "cut_prefix=2|cut_suffix=1|append_suffix=empel",
+ "225": "cut_prefix=2|cut_suffix=1|append_suffix=n",
+ "226": "cut_prefix=2|cut_suffix=2|append_suffix=",
+ "227": "cut_prefix=2|cut_suffix=2|append_suffix=a",
+ "228": "cut_prefix=2|cut_suffix=3|append_suffix=",
+ "229": "cut_prefix=2|cut_suffix=3|append_suffix=as",
+ "230": "cut_prefix=2|cut_suffix=3|append_suffix=n",
+ "231": "cut_prefix=3|cut_suffix=0|append_suffix=",
+ "232": "cut_prefix=3|cut_suffix=1|append_suffix=",
+ "233": "cut_prefix=3|cut_suffix=1|append_suffix=e",
+ "234": "cut_prefix=3|cut_suffix=2|append_suffix=",
+ "235": "cut_prefix=4|cut_suffix=0|append_suffix=",
+ "236": "cut_prefix=4|cut_suffix=1|append_suffix=g",
+ "237": "cut_prefix=4|cut_suffix=20|append_suffix=rl",
+ "238": "cut_prefix=5|cut_suffix=0|append_suffix=",
+ "239": "cut_prefix=5|cut_suffix=4|append_suffix=",
+ "240": "cut_prefix=6|cut_suffix=0|append_suffix=",
+ "241": "cut_prefix=7|cut_suffix=0|append_suffix="
+ },
+ "misc": {
+ "0": "Cxn=rc-that-nsubj",
+ "1": "Cxn=rc-that-obj",
+ "2": "Cxn=rc-wh-nsubj",
+ "3": "Cxn=rc-wh-obl",
+ "4": "Cxn=rc-wh-obl-pfront",
+ "5": "Promoted=Yes|SpaceAfter=No",
+ "6": "SpaceAfter=No",
+ "7": "_",
+ "8": "ellipsis"
+ },
+ "semclass": {
+ "0": "ABILITY_OF_BEING",
+ "1": "ACCESSORY",
+ "2": "ACT",
+ "3": "ACTIVITY",
+ "4": "ACTIVITY_BY_INTEREST",
+ "5": "ADMINISTRATIVE_REGION",
+ "6": "ADVENTURE",
+ "7": "AGGREGATE",
+ "8": "AGGREGATE_OF_LIVING_OBJECTS",
+ "9": "AGGREGATE_OF_MACHINERY_OR_TRANSPORT",
+ "10": "AGGRESSIVE_ACTIONS",
+ "11": "AGREEMENT_VERBS",
+ "12": "AGRICULTURAL_PROCESSING",
+ "13": "AMBIENCE_ENVIRONMENT",
+ "14": "APPARATUS",
+ "15": "AREA_OF_HUMAN_ACTIVITY",
+ "16": "ARRANGEMENTS",
+ "17": "ARTEFACT",
+ "18": "ARTICLES",
+ "19": "ATTRIBUTIVE",
+ "20": "AUXILIARY_VERBS",
+ "21": "BAD_DANGEROUS_EVENT",
+ "22": "BE",
+ "23": "BEGIN_TO_TAKE_PLACE",
+ "24": "BEHAVIOUR",
+ "25": "BEING",
+ "26": "BEVERAGE",
+ "27": "BE_STATE",
+ "28": "BIJOUTERIE_AND_JEWELLERY",
+ "29": "BODY",
+ "30": "BOOM",
+ "31": "BUSINESS",
+ "32": "BUSY_FREE_OCCUPIED",
+ "33": "CARGO",
+ "34": "CHANGE_OF_MATTER_PHYSICAL_STATE",
+ "35": "CHANGE_OF_ORGANIC_OBJECTS",
+ "36": "CHANGE_OF_POST_AND_JOB",
+ "37": "CHARACTERISTIC_GENERAL",
+ "38": "CHEMICAL_CHANGES",
+ "39": "CHOOSING_SORTING",
+ "40": "CH_ABSTRACT_GENERALIZED",
+ "41": "CH_APPEARANCE",
+ "42": "CH_ASPECT",
+ "43": "CH_BENEFIT",
+ "44": "CH_BY_RESIDENCE",
+ "45": "CH_BY_SENSORY_PERCEPTION",
+ "46": "CH_BY_WORLD_OUTLOOK_EDUCATION_AESTHETIC",
+ "47": "CH_CLASSIFICATION",
+ "48": "CH_COMPOSITION",
+ "49": "CH_CONFIGURATION_AND_FORM",
+ "50": "CH_COVERING",
+ "51": "CH_CRIMINAL_ACTIVITY",
+ "52": "CH_DEGREE",
+ "53": "CH_DEGREE_AND_INTENSITY",
+ "54": "CH_DISPOSITION_AND_MOTION",
+ "55": "CH_DISTRIBUTION",
+ "56": "CH_EVALUATION",
+ "57": "CH_EVALUATION_OF_HUMAN_TEMPER_AND_ACTIVITY",
+ "58": "CH_FULLNESS",
+ "59": "CH_FUNCTIONING_OF_ENTITY",
+ "60": "CH_INFORMATION",
+ "61": "CH_INTENTION_CONCENTRATION",
+ "62": "CH_LANGUAGE",
+ "63": "CH_MAGNITUDE",
+ "64": "CH_MEASURE",
+ "65": "CH_OF_CONNECTIONS",
+ "66": "CH_OF_INTENSITY",
+ "67": "CH_OF_LOCATION",
+ "68": "CH_OF_VISUAL_AUDIBLE_REPRESENTATION",
+ "69": "CH_PARAMETER_OF_MATTER",
+ "70": "CH_PARAMETER_OF_OBJECT_AND_SUBSTANCE",
+ "71": "CH_PARAMETER_SPEED",
+ "72": "CH_PERCEPTIBILITY",
+ "73": "CH_PERSON_IDENTITY",
+ "74": "CH_PHYSICAL_STATE",
+ "75": "CH_POWER_AND_EFFECT",
+ "76": "CH_PRICE_AND_SUMS",
+ "77": "CH_REFERENCE_AND_QUANTIFICATION",
+ "78": "CH_RENOWN",
+ "79": "CH_RESISTANCE_TO_IMPACT",
+ "80": "CH_RHYTHM",
+ "81": "CH_SALIENCE",
+ "82": "CH_SCALE",
+ "83": "CH_SOCIAL_CHARACTERISTIC",
+ "84": "CH_SPHERE_OF_COVERAGE",
+ "85": "CH_STYLE",
+ "86": "CH_SURFACE_EDGE",
+ "87": "CH_SYSTEM_STRUCTURE",
+ "88": "CH_TYPE_OF_POSSESSION_AND_PARTICIPATION",
1422
+ "89": "CIRCUMSTANCE",
1423
+ "90": "CLASSIFICATION_TYPES",
1424
+ "91": "CLASSIFICATION_UNIT",
1425
+ "92": "CLOTHES",
1426
+ "93": "COGNITIVE_OBJECT",
1427
+ "94": "COMMUNICATIONS",
1428
+ "95": "COMPOSITE_PARTICLES",
1429
+ "96": "COMPOSITE_SUFFIXES",
1430
+ "97": "CONDITIONS_IN_NATURE",
1431
+ "98": "CONDITION_IN_ECONOMICS",
1432
+ "99": "CONDITION_OF_EXPERIENCER_AND_NATURE",
1433
+ "100": "CONDITION_SITUATION",
1434
+ "101": "CONDITION_STATE",
1435
+ "102": "CONFLICT_INTERACTION",
1436
+ "103": "CONJUNCTIONS",
1437
+ "104": "CONSTRUCTION_AS_WHOLE",
1438
+ "105": "CONTACT_VERBS",
1439
+ "106": "CONTACT_WITH_CONTRAGENT",
1440
+ "107": "CONTAINER",
1441
+ "108": "CONTAIN_INCLUDE_FORM",
1442
+ "109": "CONTINUE_TO_HAVE",
1443
+ "110": "CONTINUE_TO_TAKE_PLACE",
1444
+ "111": "COORDINATING_CONJUNCTIONS",
1445
+ "112": "CORRELATIVES",
1446
+ "113": "COSMOS_AND_COSMIC_OBJECTS",
1447
+ "114": "COST",
1448
+ "115": "COUNTRY_AS_ADMINISTRATIVE_UNIT",
1449
+ "116": "CREATION_VERBS",
1450
+ "117": "CREATIVE_WORK",
1451
+ "118": "CREATIVE_WORK_BY_GENRE",
1452
+ "119": "CRISIS",
1453
+ "120": "CULTURE",
1454
+ "121": "DECLINE",
1455
+ "122": "DECORATING_AND_FINISHING",
1456
+ "123": "DEFEND_SAVE",
1457
+ "124": "DEGREE_OF_FIT",
1458
+ "125": "DEGREE_OF_SIZE_OR_SCALE",
1459
+ "126": "DESTRUCTION_VERBS",
1460
+ "127": "DEVICE",
1461
+ "128": "DEVICE_FOR_ANIMALS",
1462
+ "129": "DEVICE_FOR_CLOSING_AND_LOCKING",
1463
+ "130": "DEVICE_FOR_HEATING",
1464
+ "131": "DEVICE_FOR_LIFTING_OBJECTS",
1465
+ "132": "DEVICE_FOR_MEASURING_AND_COUNTING",
1466
+ "133": "DIFFICULTIES",
1467
+ "134": "DIFFICULT_AND_EASY",
1468
+ "135": "DIMENSION",
1469
+ "136": "DIMENSIONS_CHAR",
1470
+ "137": "DISCOURSIVE_UNITS",
1471
+ "138": "DISTANT_CONTACT",
1472
+ "139": "DOCUMENT",
1473
+ "140": "DYNAMIC_ARTS",
1474
+ "141": "ECONOMIC_CHANGES",
1475
+ "142": "ECONOMY",
1476
+ "143": "EFFICIENCY_PRODUCTIVITY",
1477
+ "144": "ELECTIONS",
1478
+ "145": "EMBARGO",
1479
+ "146": "EMOTIONS_AND_THEIR_EXPRESSION",
1480
+ "147": "EMPTY_SUBJECT",
1481
+ "148": "ENDINGS",
1482
+ "149": "END_TO_TAKE_PLACE",
1483
+ "150": "ENGINEERING_COMMUNICATIONS",
1484
+ "151": "ENTITY_AS_RESULT_OF_ACTIVITY",
1485
+ "152": "ENTITY_BY_FUNCTION_AND_PROPERTY",
1486
+ "153": "ENTITY_BY_RELATION_TO_MAIN_PART",
1487
+ "154": "ENTITY_BY_VALUE",
1488
+ "155": "ENTITY_GENERAL",
1489
+ "156": "ENTITY_OR_SITUATION_PRONOUN",
1490
+ "157": "ETIQUETTE_COMMUNICATION",
1491
+ "158": "EVENT",
1492
+ "159": "EVERYDAY_PROCESSING",
1493
+ "160": "EXISTENCE_AND_POSSESSION",
1494
+ "161": "FACT_INCIDENT",
1495
+ "162": "FATE",
1496
+ "163": "FEELING_AS_CONDITION",
1497
+ "164": "FINE_ARTS_OBJECTS",
1498
+ "165": "FOOD",
1499
+ "166": "FORCE_IN_PHYSICS",
1500
+ "167": "FREQUENCY_CHAR",
1501
+ "168": "FURNISHINGS_AND_DECORATION",
1502
+ "169": "GENERAL_ACTION",
1503
+ "170": "GOOD_BAD_CONDITION",
1504
+ "171": "GRAMMATICAL_ELEMENTS",
1505
+ "172": "GROUP",
1506
+ "173": "HAVE_CLOTHING_ON",
1507
+ "174": "HERITAGE",
1508
+ "175": "HIERARCHICAL_VERBS",
1509
+ "176": "HISTORICAL_LOCALITY_BY_NAME",
1510
+ "177": "IDENTIFYING_ATTRIBUTE",
1511
+ "178": "IDIOMATICAL_ELEMENTS",
1512
+ "179": "INFORMATION",
1513
+ "180": "INFORMATION_BEARER",
1514
+ "181": "INFORMATION_COMMUNICATIONS",
1515
+ "182": "INHABITED_LOCALITY",
1516
+ "183": "INNOVATION",
1517
+ "184": "INSTRUMENT",
1518
+ "185": "INTELLECTUAL_ACTIVITY",
1519
+ "186": "INTERPERSONAL_RELATIONS",
1520
+ "187": "KIND",
1521
+ "188": "KITCHENWARE_AND_TABLEWARE",
1522
+ "189": "KNOWLEDGE",
1523
+ "190": "KNOWLEDGE_FROM_EXPERIENCE",
1524
+ "191": "KNOWLEDGE_FROM_EXPERIENCE_AND_DEDUCTION",
1525
+ "192": "LACK_AND_PLENTY",
1526
+ "193": "LAWS_AND_STANDARDS",
1527
+ "194": "LINES",
1528
+ "195": "LINE_FOR_COMMUNICATION",
1529
+ "196": "LINGUISTIC_OBJECTS",
1530
+ "197": "MAKE_EFFORTS",
1531
+ "198": "MANAGE_FAIL_CONDITION",
1532
+ "199": "MARKET_AS_AREA_OF_ACTIVITY",
1533
+ "200": "MATERIALITY_CHAR",
1534
+ "201": "MATHEMATICAL_OBJECTS",
1535
+ "202": "MEANING_SENSE",
1536
+ "203": "MEDICAL_OPERATIONS",
1537
+ "204": "MENTAL_OBJECT",
1538
+ "205": "METHOD_APPROACH_TECHNIQUE",
1539
+ "206": "MIX_AS_AGGREGATE",
1540
+ "207": "MODALITY",
1541
+ "208": "MODE_OF_EXPRESSIVENESS",
1542
+ "209": "MONEY",
1543
+ "210": "MOTION",
1544
+ "211": "MOTION_ACTIVITY",
1545
+ "212": "MOTIVATE",
1546
+ "213": "MOVEMENT_AS_ACTIVITY",
1547
+ "214": "MULTIMEDIA",
1548
+ "215": "MUSICAL_INSTRUMENT",
1549
+ "216": "MYSTERY_SECRET",
1550
+ "217": "NATURALNESS_GENUINENESS_CHAR",
1551
+ "218": "NETWORK",
1552
+ "219": "NONPRODUCTIVE_AREA",
1553
+ "220": "NORMATIVE_LEGAL_ACTIVITY",
1554
+ "221": "OBJECTS_BY_FORM_OF_MANIFESTATION",
1555
+ "222": "OBJECTS_BY_FUNCTION",
1556
+ "223": "OBJECT_BY_FUNCTION_AND_PROPERTY",
1557
+ "224": "OBJECT_BY_SHAPE",
1558
+ "225": "OBJECT_IN_NATURE",
1559
+ "226": "OCCUPATIONS",
1560
+ "227": "OPERATING_STATE",
1561
+ "228": "OPTICAL_DEVICE_AND_ITS_PARTS",
1562
+ "229": "ORDER_DISORDER",
1563
+ "230": "ORGANIC_NON_ORGANIC",
1564
+ "231": "ORGANIC_OBJECTS",
1565
+ "232": "ORGANIZATION",
1566
+ "233": "ORGANIZED_AGGREGATE",
1567
+ "234": "ORIENTATION_IN_SPACE",
1568
+ "235": "OUTFIT",
1569
+ "236": "PARTICLES",
1570
+ "237": "PART_OF_ARTEFACT",
1571
+ "238": "PART_OF_CLOTHES",
1572
+ "239": "PART_OF_CONSTRUCTION",
1573
+ "240": "PART_OF_CREATIVE_WORK",
1574
+ "241": "PART_OF_FOOTWEAR",
1575
+ "242": "PART_OF_ORGANISM",
1576
+ "243": "PART_OF_WORLD",
1577
+ "244": "PART_OR_PORTION_OF_ENTITY",
1578
+ "245": "PATH_AS_DIRECTION_OF_ACTIVITY",
1579
+ "246": "PEACE",
1580
+ "247": "PERCEPTION_ACTIVITY",
1581
+ "248": "PHENOMENON",
1582
+ "249": "PHRASAL_PARTICLES",
1583
+ "250": "PHYSICAL_AND_BIOLOGICAL_PROPERTIES",
1584
+ "251": "PHYSICAL_CHEMICAL_DAMAGE",
1585
+ "252": "PHYSICAL_OBJECT",
1586
+ "253": "PHYSICAL_OBJECT_AND_SUBSTANCE_CHAR",
1587
+ "254": "PHYSICAL_PSYCHIC_CONDITION",
1588
+ "255": "PHYSIOLOGICAL_PROCESSES",
1589
+ "256": "PLACE",
1590
+ "257": "PLANT",
1591
+ "258": "POINTS_AS_PLACE",
1592
+ "259": "POSITION_AS_STATUS",
1593
+ "260": "POSITION_IN_HIERARCHY",
1594
+ "261": "POSITION_IN_SPACE",
1595
+ "262": "POWER_CHAR",
1596
+ "263": "POWER_RIGHT",
1597
+ "264": "PREMISES",
1598
+ "265": "PREPOSITION",
1599
+ "266": "PRESSURE_CHAR",
1600
+ "267": "PROBLEMS_TO_SOLVE",
1601
+ "268": "PROCESSING",
1602
+ "269": "PROCESS_AND_ITS_STAGES",
1603
+ "270": "PROCESS_PARAMETER",
1604
+ "271": "PRODUCT",
1605
+ "272": "PRODUCTION_AS_TIME_ART",
1606
+ "273": "PRODUCTIVE_AREA",
1607
+ "274": "PUBLIC_ACTIVITY",
1608
+ "275": "PUBLIC_AND_POLITICAL_ACTIVITY",
1609
+ "276": "QUIETNESS",
1610
+ "277": "READINESS",
1611
+ "278": "REALITY",
1612
+ "279": "RELATIVE_ENTITY",
1613
+ "280": "RELATIVE_PART_OF_INHABITED_LOCALITY",
1614
+ "281": "RELATIVE_SPACE",
1615
+ "282": "RELIGIOUS_OBJECT",
1616
+ "283": "REMOVING_DESTRUCTION",
1617
+ "284": "RESERVE",
1618
+ "285": "RESULTS_OF_GIVING_INFORMATION_AND_SPEECH_ACTIVITY",
1619
+ "286": "RESULTS_OF_MAKING_DECISIONS",
1620
+ "287": "RESULTS_OF_MENTAL_ACTIVITY",
1621
+ "288": "RESULT_CONSEQUENCE",
1622
+ "289": "REVEAL_CONCEAL_INFORMATION",
1623
+ "290": "REWARD_AS_ENTITY",
1624
+ "291": "RISK_DANGER",
1625
+ "292": "SAMPLE_AS_AGGREGATE",
1626
+ "293": "SCALE_DIVISION",
1627
+ "294": "SCHEDULE_FOR_ACTIVITY",
1628
+ "295": "SCIENCE",
1629
+ "296": "SCIENTIFIC_AND_LITERARY_WORK",
1630
+ "297": "SEPARATION_PROCESSING",
1631
+ "298": "SERIES_IN_SCIENCE",
1632
+ "299": "SEXUAL_ACTIVITIES",
1633
+ "300": "SILENCE_AS_SOUNDLESSNESS",
1634
+ "301": "SITUATION",
1635
+ "302": "SOCIAL_CONDITIONS_OF_BEING",
1636
+ "303": "SPACE_AND_SPATIAL_OBJECTS",
1637
+ "304": "SPACE_BY_PARTICULAR_PROPERTIES",
1638
+ "305": "SPACE_BY_RELIGIOUS_BELIEFS",
1639
+ "306": "SPACE_TIME_ART",
1640
+ "307": "SPHERE_OF_ACTIVITY_GENERAL",
1641
+ "308": "SPORT",
1642
+ "309": "SPORT_DEVICE",
1643
+ "310": "STAGNATION",
1644
+ "311": "STATE_AREA",
1645
+ "312": "STATE_OF_MIND",
1646
+ "313": "STEADINESS_OF_FORM_OR_POSITION",
1647
+ "314": "STREET_OR_TOWN_SUFFIXES",
1648
+ "315": "SUBSTANCE",
1649
+ "316": "SURFACE_AND_ITS_SPECIALITIES",
1650
+ "317": "SYMBOLS_FOR_INFORMATION_TRANSFER",
1651
+ "318": "SYSTEM_AS_AGGREGATE",
1652
+ "319": "TEETH_AND_TONGUE_CONTACT",
1653
+ "320": "TEMPERATURE_CHAR",
1654
+ "321": "TENDENCY_AND_DISPOSITION",
1655
+ "322": "TERRITORY_AREA",
1656
+ "323": "TEST_FOR_EXPERIENCER",
1657
+ "324": "TEXTS_OF_PROGRAMS",
1658
+ "325": "TEXT_OBJECTS_AND_DOCUMENTS",
1659
+ "326": "TEXT_WITH_ADDRESSEE",
1660
+ "327": "THE_EARTH_AND_ITS_SPATIAL_PARTS",
1661
+ "328": "THE_GOOD_BAD",
1662
+ "329": "THE_MAGIC",
1663
+ "330": "TIME",
1664
+ "331": "TOPIC_SUBJECT",
1665
+ "332": "TOTALITY_OF_DEGREE",
1666
+ "333": "TO_ACCOMPANY_WITH",
1667
+ "334": "TO_ACCUSE_AND_VINDICATE",
1668
+ "335": "TO_ADAPT",
1669
+ "336": "TO_ADD",
1670
+ "337": "TO_ADJUST_AND_REPAIR",
1671
+ "338": "TO_AIM",
1672
+ "339": "TO_ANALYSE_AND_RESEARCH",
1673
+ "340": "TO_ANIMATE_PICTURE",
1674
+ "341": "TO_APPLAUD",
1675
+ "342": "TO_APPLY_COAT",
1676
+ "343": "TO_APPROACH_COME_TO_SOME_POINT_OR_STATE",
1677
+ "344": "TO_ARREST",
1678
+ "345": "TO_ASSEMBLE",
1679
+ "346": "TO_ATTRIBUTE_AS_TO_ADD",
1680
+ "347": "TO_AVOID",
1681
+ "348": "TO_BEAT_AND_PRICK",
1682
+ "349": "TO_BETRAY_AND_LEAVE",
1683
+ "350": "TO_BE_ABOUT_TO_HAPPEN",
1684
+ "351": "TO_BE_A_SIGN_OF",
1685
+ "352": "TO_BE_BASED",
1686
+ "353": "TO_BE_DESCENDED",
1687
+ "354": "TO_BE_GUIDED",
1688
+ "355": "TO_BE_SEEN_IN_FIELD_OF_VIEW",
1689
+ "356": "TO_BLOW_UP",
1690
+ "357": "TO_BREAK",
1691
+ "358": "TO_BUILD",
1692
+ "359": "TO_CALL_AND_DESIGNATE",
1693
+ "360": "TO_CANCEL",
1694
+ "361": "TO_CARE_AND_BRING_UP",
1695
+ "362": "TO_CAUSE_OR_STOP_MOVEMENT",
1696
+ "363": "TO_CAUSE_SUCCESS",
1697
+ "364": "TO_CELEBRATE",
1698
+ "365": "TO_CERTIFY",
1699
+ "366": "TO_CHALLENGE_TO_INVITE",
1700
+ "367": "TO_CHANGE",
1701
+ "368": "TO_CHANGE_FORM",
1702
+ "369": "TO_CHARACTERIZE",
1703
+ "370": "TO_CITE",
1704
+ "371": "TO_CLOSE",
1705
+ "372": "TO_COME_OR_TO_LEAVE_SPHERE_OF_ACTIVITY",
1706
+ "373": "TO_COMMENT",
1707
+ "374": "TO_COMMIT",
1708
+ "375": "TO_COMMUNICATE",
1709
+ "376": "TO_COMPEL_AND_EVOKE",
1710
+ "377": "TO_COMPEL_TO_ACCEPT",
1711
+ "378": "TO_COMPOSE_SYMBOLS",
1712
+ "379": "TO_CONCLUDE",
1713
+ "380": "TO_CONNIVE",
1714
+ "381": "TO_CONTRIBUTE_AND_HINDER",
1715
+ "382": "TO_CORRECT",
1716
+ "383": "TO_COUNT",
1717
+ "384": "TO_COURT_AND_FLIRT",
1718
+ "385": "TO_CREATE_HOLE",
1719
+ "386": "TO_DECIDE",
1720
+ "387": "TO_DESTINE",
1721
+ "388": "TO_DEVELOP",
1722
+ "389": "TO_DIG_PROCESS",
1723
+ "390": "TO_DIRECT_CREATIVE_WORK",
1724
+ "391": "TO_DISAPPEAR_LOSE_GET_RID_OF",
1725
+ "392": "TO_DISTRACT_DEFLECT",
1726
+ "393": "TO_DIVIDE",
1727
+ "394": "TO_ECONOMIZE",
1728
+ "395": "TO_EMIT",
1729
+ "396": "TO_EXIST",
1730
+ "397": "TO_FABRICATE",
1731
+ "398": "TO_FEEL_AND_EXPRESS_MENTAL_ATTITUDE_TO",
1732
+ "399": "TO_FLOW_IN_TIME",
1733
+ "400": "TO_FORGIVE",
1734
+ "401": "TO_FORM",
1735
+ "402": "TO_FORMULATE",
1736
+ "403": "TO_GENERATE",
1737
+ "404": "TO_GESTURE",
1738
+ "405": "TO_GET",
1739
+ "406": "TO_GET_INFORMATION",
1740
+ "407": "TO_GIVE",
1741
+ "408": "TO_GIVE_SIGNALS",
1742
+ "409": "TO_GO_ON_STRIKE",
1743
+ "410": "TO_GUESS",
1744
+ "411": "TO_HIDE",
1745
+ "412": "TO_HURRY_TO_TARRY",
1746
+ "413": "TO_INDEX",
1747
+ "414": "TO_INDUCE_PHYSICAL_PROPERTIES",
1748
+ "415": "TO_INTERACT",
1749
+ "416": "TO_INTERCHANGE",
1750
+ "417": "TO_INTERPRET",
1751
+ "418": "TO_INVENT",
1752
+ "419": "TO_INVOLVE",
1753
+ "420": "TO_JOIN",
1754
+ "421": "TO_JOIN_PHYSICAL_OBJECTS",
1755
+ "422": "TO_KEEP_VIOLATE_NORMS",
1756
+ "423": "TO_LEARN_AND_RESEARCH",
1757
+ "424": "TO_LET_DOWN",
1758
+ "425": "TO_LIQUIDATE",
1759
+ "426": "TO_MAKE",
1760
+ "427": "TO_MARRY_DIVORCE_ENGAGE",
1761
+ "428": "TO_MEAN",
1762
+ "429": "TO_MEASURE",
1763
+ "430": "TO_MIX",
1764
+ "431": "TO_MOVE_IN_GAMES",
1765
+ "432": "TO_OPEN",
1766
+ "433": "TO_ORGANIZE_EVENT",
1767
+ "434": "TO_OVERTHROW",
1768
+ "435": "TO_PARTICIPATE",
1769
+ "436": "TO_PERCEIVE",
1770
+ "437": "TO_PERFORM",
1771
+ "438": "TO_PERFORM_MATHS_OPERATIONS",
1772
+ "439": "TO_PERSUADE_SMB_TO_DO_SMTH",
1773
+ "440": "TO_PICKET",
1774
+ "441": "TO_PICTURE_DRAW",
1775
+ "442": "TO_PLAN_CREATIVE_AND_PHYSICAL_OBJECTS",
1776
+ "443": "TO_PLAY_GAMES",
1777
+ "444": "TO_POSSESS",
1778
+ "445": "TO_PRESS",
1779
+ "446": "TO_PRESS_AS_TOUCH",
1780
+ "447": "TO_PREVENT_SMTH",
1781
+ "448": "TO_PRINT_TEXT_PHOTO",
1782
+ "449": "TO_PROCESS_INFORMATION",
1783
+ "450": "TO_PROCESS_PHYSICAL_OBJECT",
1784
+ "451": "TO_PRODUCE_CERTAIN_SOUNDS",
1785
+ "452": "TO_PROGRAM",
1786
+ "453": "TO_PRONOUNCE",
1787
+ "454": "TO_PROPOSE",
1788
+ "455": "TO_PUNISH",
1789
+ "456": "TO_RATIFY",
1790
+ "457": "TO_REACT",
1791
+ "458": "TO_READ_READABLE",
1792
+ "459": "TO_REBEL",
1793
+ "460": "TO_RECEIVE_CALLERS",
1794
+ "461": "TO_REFLECT",
1795
+ "462": "TO_REGISTER",
1796
+ "463": "TO_REIGN_AS_TO_TAKE_PLACE",
1797
+ "464": "TO_RELEASE",
1798
+ "465": "TO_RESTORE",
1799
+ "466": "TO_REVENGE",
1800
+ "467": "TO_RUB_AND_SCRATCH",
1801
+ "468": "TO_SABOTAGE",
1802
+ "469": "TO_SCREEN",
1803
+ "470": "TO_SEDUCE",
1804
+ "471": "TO_SEEK_FIND",
1805
+ "472": "TO_SEND_TO_DELIVER",
1806
+ "473": "TO_SET",
1807
+ "474": "TO_SHARE",
1808
+ "475": "TO_SHINE",
1809
+ "476": "TO_SHOOT_PHOTO_OR_FILM",
1810
+ "477": "TO_SHOW",
1811
+ "478": "TO_SMOKE",
1812
+ "479": "TO_SOUND",
1813
+ "480": "TO_SPEND",
1814
+ "481": "TO_SPEND_INEFFECTIVELY",
1815
+ "482": "TO_SPEND_TIME",
1816
+ "483": "TO_SPOIL",
1817
+ "484": "TO_STOP_SPEAKING",
1818
+ "485": "TO_SUBSCRIBE",
1819
+ "486": "TO_SUBSTITUTE_AND_EXCHANGE",
1820
+ "487": "TO_SUMMARIZE",
1821
+ "488": "TO_SUPPORT_AND_OPPOSE",
1822
+ "489": "TO_SYMBOLIZE",
1823
+ "490": "TO_TAKE",
1824
+ "491": "TO_TAKE_FOOD_OR_MEDICINE",
1825
+ "492": "TO_TAKE_INTO_CONSIDERATION",
1826
+ "493": "TO_TAKE_PLACE_IN_NATURE",
1827
+ "494": "TO_TEASE_AND_JOKE",
1828
+ "495": "TO_TELEPHONE",
1829
+ "496": "TO_TERRORIZE",
1830
+ "497": "TO_THINK_ABOUT",
1831
+ "498": "TO_THINK_OUT",
1832
+ "499": "TO_TORTURE",
1833
+ "500": "TO_TOUCH",
1834
+ "501": "TO_TRADE",
1835
+ "502": "TO_TURN_INTO",
1836
+ "503": "TO_UNDERSTATE_TO_EXAGGERATE",
1837
+ "504": "TO_USE",
1838
+ "505": "TO_UTTER_ANIMAL_SOUNDS",
1839
+ "506": "TO_VISUALIZE",
1840
+ "507": "TO_WAIT",
1841
+ "508": "TO_WORK",
1842
+ "509": "TO_WRITE",
1843
+ "510": "TRANSPORT",
1844
+ "511": "TRANSPORT_COMMUNICATIONS",
1845
+ "512": "TRIAL",
1846
+ "513": "TRICK_MACHINATION",
1847
+ "514": "UNCERTAINTY",
1848
+ "515": "UNDERTAKING",
1849
+ "516": "UNIT_OF_INFORMATION_QUANTITY",
1850
+ "517": "UNKNOWN_SUBSTANTIVE_CLASS",
1851
+ "518": "URBAN_SPACE_AND_ROADS",
1852
+ "519": "VALUABLE",
1853
+ "520": "VERBAL_COMMUNICATION",
1854
+ "521": "VIOLENCE",
1855
+ "522": "VIRTUAL_OBJECT",
1856
+ "523": "VIRTUAL_TRANSFERENCE",
1857
+ "524": "VISUAL_CHARACTERISTICS",
1858
+ "525": "VISUAL_REPRESENTATION",
1859
+ "526": "WEAPON_AND_ITS_PART",
1860
+ "527": "WEIGHT_CHAR",
1861
+ "528": "WORLD_OUTLOOK",
1862
+ "529": "YES_NO_VERBS",
1863
+ "530": "_"
1864
+ },
1865
+ "ud_deprel": {
1866
+ "0": "acl",
1867
+ "1": "acl:cleft",
1868
+ "2": "acl:relcl",
1869
+ "3": "advcl",
1870
+ "4": "advcl:relcl",
1871
+ "5": "advmod",
1872
+ "6": "amod",
1873
+ "7": "appos",
1874
+ "8": "aux",
1875
+ "9": "aux:pass",
1876
+ "10": "case",
1877
+ "11": "cc",
1878
+ "12": "cc:preconj",
1879
+ "13": "ccomp",
1880
+ "14": "compound",
1881
+ "15": "compound:prt",
1882
+ "16": "conj",
1883
+ "17": "cop",
1884
+ "18": "csubj",
1885
+ "19": "csubj:outer",
1886
+ "20": "csubj:pass",
1887
+ "21": "dep",
1888
+ "22": "det",
1889
+ "23": "det:predet",
1890
+ "24": "discourse",
1891
+ "25": "dislocated",
1892
+ "26": "expl",
1893
+ "27": "fixed",
1894
+ "28": "flat",
1895
+ "29": "flat:foreign",
1896
+ "30": "flat:name",
1897
+ "31": "flatname",
1898
+ "32": "goeswith",
1899
+ "33": "iobj",
1900
+ "34": "list",
1901
+ "35": "mark",
1902
+ "36": "nmod",
1903
+ "37": "nmod:desc",
1904
+ "38": "nmod:npmod",
1905
+ "39": "nmod:poss",
1906
+ "40": "nmod:tmod",
1907
+ "41": "nmod:unmarked",
1908
+ "42": "nsubj",
1909
+ "43": "nsubj:outer",
1910
+ "44": "nsubj:pass",
1911
+ "45": "nummod",
1912
+ "46": "nummod:gov",
1913
+ "47": "obj",
1914
+ "48": "obl",
1915
+ "49": "obl:agent",
1916
+ "50": "obl:npmod",
1917
+ "51": "obl:tmod",
1918
+ "52": "obl:unmarked",
1919
+ "53": "orphan",
1920
+ "54": "parataxis",
1921
+ "55": "punct",
1922
+ "56": "reparandum",
1923
+ "57": "root",
1924
+ "58": "vocative",
1925
+ "59": "xcomp"
1926
+ }
1927
+ }
1928
+ }
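The `lemma` vocabulary above encodes lemmatization as edit-script rules such as `cut_prefix=0|cut_suffix=2|append_suffix=e`. A minimal sketch of applying such a rule, assuming the semantics implied by the field names (cut N characters from the front/back of the wordform, then append a suffix) — the exact semantics live in the parser code, so treat this as illustrative:

```python
def apply_lemma_rule(word: str, rule: str) -> str:
    """Apply an edit-script rule like 'cut_prefix=0|cut_suffix=2|append_suffix=e'
    to a wordform to recover its lemma (assumed semantics)."""
    params = dict(part.split("=", 1) for part in rule.split("|"))
    cut_prefix = int(params["cut_prefix"])
    cut_suffix = int(params["cut_suffix"])
    # Trim the affixes, then append the replacement suffix.
    stem = word[cut_prefix:len(word) - cut_suffix] if cut_suffix else word[cut_prefix:]
    return stem + params["append_suffix"]

print(apply_lemma_rule("walked", "cut_prefix=0|cut_suffix=2|append_suffix="))  # walk
```

Such rules let the lemma head be a plain classifier over a small rule inventory instead of a character-level generator.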
configuration.py ADDED
@@ -0,0 +1,40 @@
+ from transformers import PretrainedConfig
+
+
+ class CobaldParserConfig(PretrainedConfig):
+     model_type = "cobald_parser"
+
+     def __init__(
+         self,
+         encoder_model_name: str = None,
+         null_classifier_hidden_size: int = 0,
+         lemma_classifier_hidden_size: int = 0,
+         morphology_classifier_hidden_size: int = 0,
+         dependency_classifier_hidden_size: int = 0,
+         misc_classifier_hidden_size: int = 0,
+         deepslot_classifier_hidden_size: int = 0,
+         semclass_classifier_hidden_size: int = 0,
+         activation: str = 'relu',
+         dropout: float = 0.1,
+         consecutive_null_limit: int = 0,
+         vocabulary: dict[str, dict[int, str]] = None,
+         **kwargs
+     ):
+         self.encoder_model_name = encoder_model_name
+         self.null_classifier_hidden_size = null_classifier_hidden_size
+         self.consecutive_null_limit = consecutive_null_limit
+         self.lemma_classifier_hidden_size = lemma_classifier_hidden_size
+         self.morphology_classifier_hidden_size = morphology_classifier_hidden_size
+         self.dependency_classifier_hidden_size = dependency_classifier_hidden_size
+         self.misc_classifier_hidden_size = misc_classifier_hidden_size
+         self.deepslot_classifier_hidden_size = deepslot_classifier_hidden_size
+         self.semclass_classifier_hidden_size = semclass_classifier_hidden_size
+         self.activation = activation
+         self.dropout = dropout
+         # The serialized config stores mappings with string keys,
+         # e.g. {"0": "acl", "1": "conj"}, so we have to convert them back to int.
+         self.vocabulary = {
+             column: {int(k): v for k, v in labels.items()}
+             for column, labels in (vocabulary or {}).items()
+         }
+         super().__init__(**kwargs)
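The key conversion in `__init__` exists because JSON object keys are always strings. A quick round-trip illustration in plain Python (no transformers needed):

```python
import json

vocab = {"ud_deprel": {0: "acl", 1: "conj"}}
# Serializing and reloading turns the int keys into strings...
restored = json.loads(json.dumps(vocab))
# ...so they must be converted back, exactly as the config does:
fixed = {col: {int(k): v for k, v in labels.items()}
         for col, labels in restored.items()}
```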
dependency_classifier.py ADDED
@@ -0,0 +1,305 @@
1
+
2
+ from copy import deepcopy
3
+
4
+ import numpy as np
5
+
6
+ import torch
7
+ from torch import nn
8
+ from torch import Tensor, FloatTensor, BoolTensor, LongTensor
9
+ import torch.nn.functional as F
10
+
11
+ from transformers.activations import ACT2FN
12
+
13
+ from cobald_parser.bilinear_matrix_attention import BilinearMatrixAttention
14
+ from cobald_parser.chu_liu_edmonds import decode_mst
15
+ from cobald_parser.utils import pairwise_mask, replace_masked_values
16
+
17
+
18
+ class DependencyHeadBase(nn.Module):
19
+ """
20
+ Base class for scoring arcs and relations between tokens in a dependency tree/graph.
21
+ """
22
+
23
+ def __init__(self, hidden_size: int, n_rels: int):
24
+ super().__init__()
25
+
26
+ self.arc_attention = BilinearMatrixAttention(
27
+ hidden_size,
28
+ hidden_size,
29
+ use_input_biases=True,
30
+ n_labels=1
31
+ )
32
+ self.rel_attention = BilinearMatrixAttention(
33
+ hidden_size,
34
+ hidden_size,
35
+ use_input_biases=True,
36
+ n_labels=n_rels
37
+ )
38
+
39
+ def forward(
40
+ self,
41
+ h_arc_head: Tensor, # [batch_size, seq_len, hidden_size]
42
+ h_arc_dep: Tensor, # ...
43
+ h_rel_head: Tensor, # ...
44
+ h_rel_dep: Tensor, # ...
45
+ gold_arcs: LongTensor, # [batch_size, seq_len, seq_len]
46
+ null_mask: BoolTensor, # [batch_size, seq_len]
47
+ padding_mask: BoolTensor # [batch_size, seq_len]
48
+ ) -> dict[str, Tensor]:
49
+
50
+ # Score arcs.
51
+ # s_arc[:, i, j] = score of edge i -> j.
52
+ s_arc = self.arc_attention(h_arc_head, h_arc_dep)
53
+ # Mask undesirable values (padding, nulls, etc.) with a large negative value.
54
+ mask2d = pairwise_mask(null_mask & padding_mask)
55
+ replace_masked_values(s_arc, mask2d, replace_with=-1e8)
56
+ # Score arcs' relations.
57
+ # [batch_size, seq_len, seq_len, num_labels]
58
+ s_rel = self.rel_attention(h_rel_head, h_rel_dep).permute(0, 2, 3, 1)
59
+
60
+ # Calculate loss.
61
+ loss = 0.0
62
+ if gold_arcs is not None:
63
+ loss += self.calc_arc_loss(s_arc, gold_arcs)
64
+ loss += self.calc_rel_loss(s_rel, gold_arcs)
65
+
66
+ # Predict arcs based on the scores.
67
+ # [batch_size, seq_len, seq_len]
68
+ pred_arcs_matrix = self.predict_arcs(s_arc, null_mask, padding_mask)
69
+ # [batch_size, seq_len, seq_len]
70
+ pred_rels_matrix = self.predict_rels(s_rel)
71
+ # [n_pred_arcs, 4]
72
+ preds_combined = self.combine_arcs_rels(pred_arcs_matrix, pred_rels_matrix)
73
+ return {
74
+ 'preds': preds_combined,
75
+ 'loss': loss
76
+ }
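The masking step above relies on the repo-internal `pairwise_mask` helper. Its assumed behaviour — an arc (i, j) is valid only if both endpoint tokens are valid — can be sketched in NumPy:

```python
import numpy as np

def pairwise_mask_np(mask_1d: np.ndarray) -> np.ndarray:
    """Assumed semantics of pairwise_mask: lift a [batch, seq_len] token mask
    to a [batch, seq_len, seq_len] arc mask where arc (i, j) is valid only
    when both token i and token j are valid."""
    return mask_1d[:, :, None] & mask_1d[:, None, :]

mask = np.array([[True, True, False]])  # last token is padding
arc_mask = pairwise_mask_np(mask)
```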
77
+
78
+ @staticmethod
79
+ def calc_arc_loss(
80
+ s_arc: Tensor, # [batch_size, seq_len, seq_len]
81
+ gold_arcs: LongTensor # [n_arcs, 4]
82
+ ) -> Tensor:
83
+ """Calculate arc loss."""
84
+ raise NotImplementedError
85
+
86
+ @staticmethod
87
+ def calc_rel_loss(
88
+ s_rel: Tensor, # [batch_size, seq_len, seq_len, num_labels]
89
+ gold_arcs: LongTensor # [n_arcs, 4]
90
+ ) -> Tensor:
91
+ batch_idxs, arcs_from, arcs_to, rels = gold_arcs.T
92
+ return F.cross_entropy(s_rel[batch_idxs, arcs_from, arcs_to], rels)
93
+
94
+ def predict_arcs(
95
+ self,
96
+ s_arc: Tensor, # [batch_size, seq_len, seq_len]
97
+ null_mask: BoolTensor, # [batch_size, seq_len]
98
+ padding_mask: BoolTensor # [batch_size, seq_len]
99
+ ) -> LongTensor:
100
+ """Predict arcs from scores."""
101
+ raise NotImplementedError
102
+
103
+ def predict_rels(
104
+ self,
105
+ s_rel: FloatTensor
106
+ ) -> LongTensor:
107
+ return s_rel.argmax(dim=-1).long()
108
+
109
+ @staticmethod
110
+ def combine_arcs_rels(
111
+ pred_arcs: LongTensor,
112
+ pred_rels: LongTensor
113
+ ) -> LongTensor:
114
+ """Select relations towards predicted arcs."""
115
+ assert pred_arcs.shape == pred_rels.shape
116
+ # Get indices where arcs exist
117
+ indices = pred_arcs.nonzero(as_tuple=True)
118
+ batch_idxs, from_idxs, to_idxs = indices
119
+ # Get corresponding relation types
120
+ rel_types = pred_rels[batch_idxs, from_idxs, to_idxs]
121
+ # Stack as [batch_idx, from_idx, to_idx, rel_type]
122
+ return torch.stack([batch_idxs, from_idxs, to_idxs, rel_types], dim=1)
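`combine_arcs_rels` turns a dense arc adjacency matrix plus a relation matrix into a sparse list of labelled arcs. The same logic in a self-contained NumPy analogue:

```python
import numpy as np

def combine_arcs_rels_np(pred_arcs: np.ndarray, pred_rels: np.ndarray) -> np.ndarray:
    """For every predicted arc (a 1 in the adjacency matrix), pick its relation id
    and emit a row [batch_idx, from_idx, to_idx, rel_type]."""
    batch_idxs, from_idxs, to_idxs = pred_arcs.nonzero()
    rel_types = pred_rels[batch_idxs, from_idxs, to_idxs]
    return np.stack([batch_idxs, from_idxs, to_idxs, rel_types], axis=1)

arcs = np.zeros((1, 3, 3), dtype=np.int64)
arcs[0, 0, 1] = 1  # arc 0 -> 1
arcs[0, 1, 2] = 1  # arc 1 -> 2
rels = np.zeros((1, 3, 3), dtype=np.int64)
rels[0, 0, 1] = 5
rels[0, 1, 2] = 7
```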
123
+
124
+
125
+ class DependencyHead(DependencyHeadBase):
126
+ """
127
+ Basic UD syntax specialization that predicts a single head for each token.
128
+ """
129
+
130
+
131
+ def predict_arcs(
132
+ self,
133
+ s_arc: Tensor, # [batch_size, seq_len, seq_len]
134
+ null_mask: BoolTensor, # [batch_size, seq_len]
135
+ padding_mask: BoolTensor # [batch_size, seq_len]
136
+ ) -> Tensor:
137
+
138
+ if self.training:
139
+ # During training, use fast greedy decoding.
140
+ # - [batch_size, seq_len]
141
+ pred_arcs_seq = s_arc.argmax(dim=1)
142
+ else:
143
+ # FIXME
144
+ # During inference, decode Maximum Spanning Tree.
145
+ # pred_arcs_seq = self._mst_decode(s_arc, padding_mask)
146
+ pred_arcs_seq = s_arc.argmax(dim=1)
147
+
148
+ # Upscale arcs sequence of shape [batch_size, seq_len]
149
+ # to matrix of shape [batch_size, seq_len, seq_len].
150
+ pred_arcs = F.one_hot(pred_arcs_seq, num_classes=pred_arcs_seq.size(1)).long().transpose(1, 2)
151
+ # Apply mask one more time (even though s_arc is already masked),
152
+ # because argmax erases information about masked values.
153
+ mask2d = pairwise_mask(null_mask & padding_mask)
154
+ replace_masked_values(pred_arcs, mask2d, replace_with=0)
155
+ return pred_arcs
156
+
157
+ def _mst_decode(
158
+ self,
159
+ s_arc: Tensor, # [batch_size, seq_len, seq_len]
160
+ padding_mask: Tensor
161
+ ) -> tuple[Tensor, Tensor]:
162
+
163
+ batch_size = s_arc.size(0)
164
+ device = s_arc.device
165
+ s_arc = s_arc.cpu()
166
+
167
+ # Convert scores to probabilities, as `decode_mst` expects non-negative values.
168
+ arc_probs = nn.functional.softmax(s_arc, dim=1)
169
+
170
+ # `decode_mst` knows nothing about UD and ROOT, so we have to manually
171
+ # zero probabilities of arcs leading to ROOT to make sure ROOT is a source node
172
+ # of a graph.
173
+
174
+ # Decode ROOT positions from diagonals.
175
+ # shape: [batch_size]
176
+ root_idxs = arc_probs.diagonal(dim1=1, dim2=2).argmax(dim=-1)
177
+ # Zero out arcs leading to ROOTs.
178
+ arc_probs[torch.arange(batch_size), :, root_idxs] = 0.0
179
+
180
+ pred_arcs = []
181
+ for sample_idx in range(batch_size):
182
+ energy = arc_probs[sample_idx]
183
+ length = padding_mask[sample_idx].sum()
184
+ heads = decode_mst(energy, length)
185
+ # Some nodes may be isolated. Pick heads greedily in this case.
186
+ heads[heads <= 0] = s_arc[sample_idx].argmax(dim=1)[heads <= 0]
187
+ pred_arcs.append(heads)
188
+
189
+ # shape: [batch_size, seq_len]
190
+ pred_arcs = torch.from_numpy(np.stack(pred_arcs)).long().to(device)
191
+ return pred_arcs
192
+
193
+ @staticmethod
+ def calc_arc_loss(
+ s_arc: Tensor, # [batch_size, seq_len, seq_len]
+ gold_arcs: LongTensor # [n_arcs, 4]
+ ) -> Tensor:
+ batch_idxs, from_idxs, to_idxs, _ = gold_arcs.T
+ return F.cross_entropy(s_arc[batch_idxs, :, to_idxs], from_idxs)
201
+
202
+
203
+ class MultiDependencyHead(DependencyHeadBase):
204
+ """
205
+ Enhanced UD syntax specialization that predicts multiple edges for each token.
206
+ """
207
+
208
+
209
+ def predict_arcs(
210
+ self,
211
+ s_arc: Tensor, # [batch_size, seq_len, seq_len]
212
+ null_mask: BoolTensor, # [batch_size, seq_len]
213
+ padding_mask: BoolTensor # [batch_size, seq_len]
214
+ ) -> Tensor:
215
+ # Convert scores to probabilities.
216
+ arc_probs = torch.sigmoid(s_arc)
217
+ # Find confident arcs (with prob > 0.5).
218
+ return arc_probs.round().long()
219
+
220
+ @staticmethod
+ def calc_arc_loss(
+ s_arc: Tensor, # [batch_size, seq_len, seq_len]
+ gold_arcs: LongTensor # [n_arcs, 4]
+ ) -> Tensor:
+ batch_idxs, from_idxs, to_idxs, _ = gold_arcs.T
+ # Gold arcs as a matrix, where matrix[i, arc_from, arc_to] = 1.0 if the arc is present.
+ gold_arcs_matrix = torch.zeros_like(s_arc)
+ gold_arcs_matrix[batch_idxs, from_idxs, to_idxs] = 1.0
+ # Padded arcs' logits are large negative values that do not contribute to the loss.
+ return F.binary_cross_entropy_with_logits(s_arc, gold_arcs_matrix)
232
+
233
+
234
+ class DependencyClassifier(nn.Module):
235
+ """
236
+ Dozat and Manning's biaffine dependency classifier.
237
+ """
238
+
239
+ def __init__(
240
+ self,
241
+ input_size: int,
242
+ hidden_size: int,
243
+ n_rels_ud: int,
244
+ n_rels_eud: int,
245
+ activation: str,
246
+ dropout: float,
247
+ ):
248
+ super().__init__()
249
+
250
+ self.arc_dep_mlp = nn.Sequential(
251
+ nn.Dropout(dropout),
252
+ nn.Linear(input_size, hidden_size),
253
+ ACT2FN[activation],
254
+ nn.Dropout(dropout)
255
+ )
256
+ # All mlps are equal.
257
+ self.arc_head_mlp = deepcopy(self.arc_dep_mlp)
258
+ self.rel_dep_mlp = deepcopy(self.arc_dep_mlp)
259
+ self.rel_head_mlp = deepcopy(self.arc_dep_mlp)
260
+
261
+ self.dependency_head_ud = DependencyHead(hidden_size, n_rels_ud)
262
+ self.dependency_head_eud = MultiDependencyHead(hidden_size, n_rels_eud)
263
+
264
+ def forward(
265
+ self,
266
+ embeddings: Tensor, # [batch_size, seq_len, embedding_size]
267
+ gold_ud: Tensor, # [n_ud_arcs, 4]
268
+ gold_eud: Tensor, # [n_eud_arcs, 4]
269
+ null_mask: Tensor, # [batch_size, seq_len]
270
+ padding_mask: Tensor # [batch_size, seq_len]
271
+ ) -> dict[str, Tensor]:
272
+
273
+ # - [batch_size, seq_len, hidden_size]
274
+ h_arc_head = self.arc_head_mlp(embeddings)
275
+ h_arc_dep = self.arc_dep_mlp(embeddings)
276
+ h_rel_head = self.rel_head_mlp(embeddings)
277
+ h_rel_dep = self.rel_dep_mlp(embeddings)
278
+
279
+ # Share the h vectors between dependency and multi-dependency heads.
280
+ output_ud = self.dependency_head_ud(
281
+ h_arc_head,
282
+ h_arc_dep,
283
+ h_rel_head,
284
+ h_rel_dep,
285
+ gold_arcs=gold_ud,
286
+ null_mask=null_mask,
287
+ padding_mask=padding_mask
288
+ )
289
+ output_eud = self.dependency_head_eud(
290
+ h_arc_head,
291
+ h_arc_dep,
292
+ h_rel_head,
293
+ h_rel_dep,
294
+ gold_arcs=gold_eud,
295
+ # Ignore null mask in E-UD
296
+ null_mask=torch.ones_like(padding_mask),
297
+ padding_mask=padding_mask
298
+ )
299
+
300
+ return {
301
+ 'preds_ud': output_ud["preds"],
302
+ 'preds_eud': output_eud["preds"],
303
+ 'loss_ud': output_ud["loss"],
304
+ 'loss_eud': output_eud["loss"]
305
+ }
encoder.py ADDED
@@ -0,0 +1,109 @@
+ import torch
+ from torch import nn
+ from torch import Tensor, LongTensor
+
+ from transformers import AutoTokenizer, AutoModel
+
+
+ class WordTransformerEncoder(nn.Module):
+     """
+     Encodes sentences into word-level embeddings using a pretrained MLM transformer.
+     """
+     def __init__(self, model_name: str):
+         super().__init__()
+         self.tokenizer = AutoTokenizer.from_pretrained(model_name)
+         # A model like BERT, RoBERTa, etc.
+         self.model = AutoModel.from_pretrained(model_name)
+
+     def forward(self, words: list[list[str]]) -> Tensor:
+         """
+         Build word embeddings.
+
+         - Tokenizes input sentences into subtokens.
+         - Passes the subtokens through the pretrained transformer model.
+         - Aggregates subtoken embeddings into word embeddings using mean pooling.
+         """
+         batch_size = len(words)
+
+         # BPE tokenization: split words into subtokens, e.g. ['kidding'] -> ['▁ki', 'dding'].
+         subtokens = self.tokenizer(
+             words,
+             padding=True,
+             truncation=True,
+             is_split_into_words=True,
+             return_tensors='pt'
+         )
+         subtokens = subtokens.to(self.model.device)
+         # Index words from 1 and reserve 0 for special subtokens (e.g. <s>, </s>, padding).
+         # This numbering makes the subsequent aggregation easier.
+         words_ids = torch.stack([
+             torch.tensor(
+                 [word_id + 1 if word_id is not None else 0 for word_id in subtokens.word_ids(batch_idx)],
+                 dtype=torch.long,
+                 device=self.model.device
+             )
+             for batch_idx in range(batch_size)
+         ])
+
+         # Run the model and extract subtoken embeddings from the last layer.
+         subtokens_embeddings = self.model(**subtokens).last_hidden_state
+
+         # Aggregate subtoken embeddings into word embeddings.
+         # [batch_size, n_words, embedding_size]
+         words_embeddings = self._aggregate_subtokens_embeddings(subtokens_embeddings, words_ids)
+         return words_embeddings
+
+     def _aggregate_subtokens_embeddings(
+         self,
+         subtokens_embeddings: Tensor,  # [batch_size, n_subtokens, embedding_size]
+         words_ids: LongTensor          # [batch_size, n_subtokens]
+     ) -> Tensor:
+         """
+         Aggregate subtoken embeddings into word embeddings by averaging.
+
+         This method ensures that multiple subtokens corresponding to a single word are combined
+         into a single embedding.
+         """
+         batch_size, n_subtokens, embedding_size = subtokens_embeddings.shape
+         # The number of words in a sentence plus an "auxiliary" word at the beginning.
+         n_words = torch.max(words_ids) + 1
+
+         words_embeddings = torch.zeros(
+             size=(batch_size, n_words, embedding_size),
+             dtype=subtokens_embeddings.dtype,
+             device=self.model.device
+         )
+         words_ids_expanded = words_ids.unsqueeze(-1).expand(batch_size, n_subtokens, embedding_size)
+
+         # Use scatter_reduce_ to average embeddings of subtokens corresponding to the same word.
+         # All padding and special subtokens are aggregated into the "auxiliary" first embedding,
+         # namely words_embeddings[:, 0, :].
+         words_embeddings.scatter_reduce_(
+             dim=1,
+             index=words_ids_expanded,
+             src=subtokens_embeddings,
+             reduce="mean",
+             include_self=False
+         )
+         # Now remove the auxiliary word at the beginning.
+         words_embeddings = words_embeddings[:, 1:, :]
+         return words_embeddings
+
+     def get_embedding_size(self) -> int:
+         """Returns the embedding size of the transformer model, e.g. 768 for BERT."""
+         return self.model.config.hidden_size
+
+     def get_embeddings_layer(self):
+         """Returns the embeddings module."""
+         return self.model.embeddings
+
+     def get_transformer_layers(self) -> list[nn.Module]:
+         """
+         Return a flat list of all transformer *block* layers, excluding embeddings, poolers, etc.
+         """
+         layers = []
+         for sub in self.model.modules():
+             # Find all ModuleLists (these hold the actual block layers).
+             if isinstance(sub, nn.ModuleList):
+                 layers.extend(list(sub))
+         return layers
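The `word_id + 1` trick and the mean pooling in `_aggregate_subtokens_embeddings` can be illustrated without torch. A minimal pure-Python sketch on hypothetical toy data: bucket 0 collects all special subtokens and is dropped at the end, mirroring `words_embeddings[:, 1:, :]`:

```python
def mean_pool_words(subtoken_embeddings, word_ids):
    """Average subtoken vectors that share a word id; id 0 is a bucket
    for special tokens (CLS/SEP/padding) and is dropped at the end."""
    n_words = max(word_ids) + 1
    dim = len(subtoken_embeddings[0])
    sums = [[0.0] * dim for _ in range(n_words)]
    counts = [0] * n_words
    for vec, wid in zip(subtoken_embeddings, word_ids):
        for d in range(dim):
            sums[wid][d] += vec[d]
        counts[wid] += 1
    means = [
        [s / counts[w] for s in sums[w]] if counts[w] else [0.0] * dim
        for w in range(n_words)
    ]
    # Drop the auxiliary bucket, mirroring words_embeddings[:, 1:, :].
    return means[1:]

# 'kidding' split into two subtokens (word id 1); specials map to id 0.
embs = [[0.0, 0.0], [1.0, 3.0], [3.0, 1.0], [5.0, 5.0], [0.0, 0.0]]
ids  = [0,          1,          1,          2,          0]
print(mean_pool_words(embs, ids))  # [[2.0, 2.0], [5.0, 5.0]]
```

The batched torch version does the same thing in one `scatter_reduce_(reduce="mean")` call over dim 1.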
mlp_classifier.py ADDED
@@ -0,0 +1,46 @@
+ import torch
+ from torch import nn
+ from torch import Tensor, LongTensor
+
+ from transformers.activations import ACT2FN
+
+
+ class MlpClassifier(nn.Module):
+     """Simple feed-forward multilayer perceptron classifier."""
+
+     def __init__(
+         self,
+         input_size: int,
+         hidden_size: int,
+         n_classes: int,
+         activation: str,
+         dropout: float,
+         class_weights: list[float] = None,
+     ):
+         super().__init__()
+
+         self.n_classes = n_classes
+         self.classifier = nn.Sequential(
+             nn.Dropout(dropout),
+             nn.Linear(input_size, hidden_size),
+             ACT2FN[activation],
+             nn.Dropout(dropout),
+             nn.Linear(hidden_size, n_classes)
+         )
+         if class_weights is not None:
+             # CrossEntropyLoss expects floating-point class weights.
+             class_weights = torch.tensor(class_weights, dtype=torch.float)
+         self.cross_entropy = nn.CrossEntropyLoss(weight=class_weights)
+
+     def forward(self, embeddings: Tensor, labels: LongTensor = None) -> dict:
+         logits = self.classifier(embeddings)
+         # Calculate loss.
+         loss = 0.0
+         if labels is not None:
+             # Flatten batch and sequence dimensions to match the expected shapes.
+             loss = self.cross_entropy(
+                 logits.view(-1, self.n_classes),
+                 labels.view(-1)
+             )
+         # Predictions.
+         preds = logits.argmax(dim=-1)
+         return {'preds': preds, 'loss': loss}
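The `view(-1, n_classes)` reshape above simply flattens the batch and sequence dimensions so that every token becomes one classification example; the mean loss is the same either way. A hypothetical pure-Python check of that equivalence, using a hand-rolled softmax cross-entropy in place of `nn.CrossEntropyLoss`:

```python
import math

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single token."""
    z = max(logits)
    log_sum = z + math.log(sum(math.exp(l - z) for l in logits))
    return log_sum - logits[label]

# Toy [batch=2, seq=2, n_classes=3] logits and [2, 2] labels.
batch_logits = [[[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]],
                [[1.0, 1.0, 1.0], [0.1, 0.2, 3.0]]]
batch_labels = [[0, 1], [2, 2]]

# Flatten to [batch*seq, n_classes] and [batch*seq] -- what view(-1, C) does.
flat_logits = [tok for sent in batch_logits for tok in sent]
flat_labels = [lab for sent in batch_labels for lab in sent]

mean_loss = sum(
    cross_entropy(l, y) for l, y in zip(flat_logits, flat_labels)
) / len(flat_labels)
print(round(mean_loss, 4))
```

`nn.CrossEntropyLoss` with its default `reduction="mean"` computes exactly this per-token average over the flattened tensors.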
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:febee3c2fe78451b6d0779baefe969f989827af002913b4f53382d6ec1220fee
+ size 1164706348
modeling_parser.py ADDED
@@ -0,0 +1,171 @@
+ from torch import nn
+ from torch import LongTensor
+ from transformers import PreTrainedModel
+
+ from .configuration import CobaldParserConfig
+ from .encoder import WordTransformerEncoder
+ from .mlp_classifier import MlpClassifier
+ from .dependency_classifier import DependencyClassifier
+ from .utils import (
+     build_padding_mask,
+     build_null_mask,
+     prepend_cls,
+     remove_nulls,
+     add_nulls
+ )
+
+
+ class CobaldParser(PreTrainedModel):
+     """Morpho-syntax-semantic parser."""
+
+     config_class = CobaldParserConfig
+
+     def __init__(self, config: CobaldParserConfig):
+         super().__init__(config)
+
+         self.encoder = WordTransformerEncoder(
+             model_name=config.encoder_model_name
+         )
+         embedding_size = self.encoder.get_embedding_size()
+
+         self.classifiers = nn.ModuleDict()
+         self.classifiers["null"] = MlpClassifier(
+             input_size=embedding_size,
+             hidden_size=config.null_classifier_hidden_size,
+             n_classes=config.consecutive_null_limit + 1,
+             activation=config.activation,
+             dropout=config.dropout
+         )
+         if "lemma_rule" in config.vocabulary:
+             self.classifiers["lemma_rule"] = MlpClassifier(
+                 input_size=embedding_size,
+                 hidden_size=config.lemma_classifier_hidden_size,
+                 n_classes=len(config.vocabulary["lemma_rule"]),
+                 activation=config.activation,
+                 dropout=config.dropout
+             )
+         if "joint_feats" in config.vocabulary:
+             self.classifiers["joint_feats"] = MlpClassifier(
+                 input_size=embedding_size,
+                 hidden_size=config.morphology_classifier_hidden_size,
+                 n_classes=len(config.vocabulary["joint_feats"]),
+                 activation=config.activation,
+                 dropout=config.dropout
+             )
+         if "ud_deprel" in config.vocabulary or "eud_deprel" in config.vocabulary:
+             self.classifiers["syntax"] = DependencyClassifier(
+                 input_size=embedding_size,
+                 hidden_size=config.dependency_classifier_hidden_size,
+                 n_rels_ud=len(config.vocabulary["ud_deprel"]),
+                 n_rels_eud=len(config.vocabulary["eud_deprel"]),
+                 activation=config.activation,
+                 dropout=config.dropout
+             )
+         if "misc" in config.vocabulary:
+             self.classifiers["misc"] = MlpClassifier(
+                 input_size=embedding_size,
+                 hidden_size=config.misc_classifier_hidden_size,
+                 n_classes=len(config.vocabulary["misc"]),
+                 activation=config.activation,
+                 dropout=config.dropout
+             )
+         if "deepslot" in config.vocabulary:
+             self.classifiers["deepslot"] = MlpClassifier(
+                 input_size=embedding_size,
+                 hidden_size=config.deepslot_classifier_hidden_size,
+                 n_classes=len(config.vocabulary["deepslot"]),
+                 activation=config.activation,
+                 dropout=config.dropout
+             )
+         if "semclass" in config.vocabulary:
+             self.classifiers["semclass"] = MlpClassifier(
+                 input_size=embedding_size,
+                 hidden_size=config.semclass_classifier_hidden_size,
+                 n_classes=len(config.vocabulary["semclass"]),
+                 activation=config.activation,
+                 dropout=config.dropout
+             )
+
+     def forward(
+         self,
+         words: list[list[str]],
+         counting_masks: LongTensor = None,
+         lemma_rules: LongTensor = None,
+         joint_feats: LongTensor = None,
+         deps_ud: LongTensor = None,
+         deps_eud: LongTensor = None,
+         miscs: LongTensor = None,
+         deepslots: LongTensor = None,
+         semclasses: LongTensor = None,
+         sent_ids: list[str] = None,
+         texts: list[str] = None,
+         inference_mode: bool = False
+     ) -> dict:
+         output = {}
+
+         # The extra [CLS] token accounts for the case when #NULL is the first token in a sentence.
+         words_with_cls = prepend_cls(words)
+         words_without_nulls = remove_nulls(words_with_cls)
+         # Embeddings of words without nulls.
+         embeddings_without_nulls = self.encoder(words_without_nulls)
+         # Predict nulls.
+         null_output = self.classifiers["null"](embeddings_without_nulls, counting_masks)
+         output["counting_mask"] = null_output['preds']
+         output["loss"] = null_output["loss"]
+
+         # "Teacher forcing": during training, pass the original words (with gold nulls)
+         # to the classification heads, so that they are trained on correct sentences.
+         if inference_mode:
+             # Restore predicted nulls in the original sentences.
+             output["words"] = add_nulls(words, null_output["preds"])
+         else:
+             output["words"] = words
+
+         # Encode words with nulls.
+         # [batch_size, seq_len, embedding_size]
+         embeddings = self.encoder(output["words"])
+
+         # Predict lemmas and morphological features.
+         if "lemma_rule" in self.classifiers:
+             lemma_output = self.classifiers["lemma_rule"](embeddings, lemma_rules)
+             output["lemma_rules"] = lemma_output['preds']
+             output["loss"] += lemma_output['loss']
+
+         if "joint_feats" in self.classifiers:
+             joint_feats_output = self.classifiers["joint_feats"](embeddings, joint_feats)
+             output["joint_feats"] = joint_feats_output['preds']
+             output["loss"] += joint_feats_output['loss']
+
+         # Predict syntax.
+         if "syntax" in self.classifiers:
+             padding_mask = build_padding_mask(output["words"], self.device)
+             null_mask = build_null_mask(output["words"], self.device)
+             deps_output = self.classifiers["syntax"](
+                 embeddings,
+                 deps_ud,
+                 deps_eud,
+                 null_mask,
+                 padding_mask
+             )
+             output["deps_ud"] = deps_output['preds_ud']
+             output["deps_eud"] = deps_output['preds_eud']
+             output["loss"] += deps_output['loss_ud'] + deps_output['loss_eud']
+
+         # Predict miscellaneous features.
+         if "misc" in self.classifiers:
+             misc_output = self.classifiers["misc"](embeddings, miscs)
+             output["miscs"] = misc_output['preds']
+             output["loss"] += misc_output['loss']
+
+         # Predict semantics.
+         if "deepslot" in self.classifiers:
+             deepslot_output = self.classifiers["deepslot"](embeddings, deepslots)
+             output["deepslots"] = deepslot_output['preds']
+             output["loss"] += deepslot_output['loss']
+
+         if "semclass" in self.classifiers:
+             semclass_output = self.classifiers["semclass"](embeddings, semclasses)
+             output["semclasses"] = semclass_output['preds']
+             output["loss"] += semclass_output['loss']
+
+         return output
runs/Jun02_11-26-31_b20c304d4aee/events.out.tfevents.1748863678.b20c304d4aee.2886.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eddd2088faf8cac4a436178cb64c6c356d5827c7c000965e281975fa7a538139
+ size 75520
runs/Jun02_11-29-35_b20c304d4aee/events.out.tfevents.1748863798.b20c304d4aee.3759.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:30258dcc4cdce035babfb0897cfd1fce99d5062bc55d625d7adf6ef1a6512840
+ size 75520
runs/Jun02_11-31-40_b20c304d4aee/events.out.tfevents.1748863923.b20c304d4aee.4331.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:edaf96e6cfce003aec189d98126bd8fb9edf21f3d141d263f4db1346b0c6f6ac
+ size 75520
runs/Jun02_11-39-26_b20c304d4aee/events.out.tfevents.1748864395.b20c304d4aee.6344.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9585e236154a2879b08dbc15b3237c6e656e60bbf31302cbdcc32993a9a5add8
+ size 79755
runs/Jun02_11-41-53_b20c304d4aee/events.out.tfevents.1748864550.b20c304d4aee.7023.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e3e2af7a645cc472b5e54ea09fb372999aaa1783101f270edf6081a31ecaa33b
+ size 81553
runs/Jun02_11-56-41_b20c304d4aee/events.out.tfevents.1748865428.b20c304d4aee.10833.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5e66c27937281332273136813d0fc467f224d417b54fa1cc597d499a546436f9
+ size 81553
runs/Jun02_12-01-23_b20c304d4aee/events.out.tfevents.1748865720.b20c304d4aee.12053.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0be88191e4962ed67156cae4d329e9e09d27353bb4417d6ac99e949ea5d92564
+ size 81553
runs/Jun02_12-03-50_b20c304d4aee/events.out.tfevents.1748865865.b20c304d4aee.12757.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:709139cf6f00317f591bd1da7d16d662f7d608c7aa91d298ee9b02112a53d51c
+ size 79757
runs/Jun02_12-05-59_b20c304d4aee/events.out.tfevents.1748865998.b20c304d4aee.13334.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3b5f75e015208d98ca0878082206cf1579de9258480fa6d1559cf29462ba7c64
+ size 88206
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:51f7e6a2220ee8f18ef289db60b31fa8ec735bb4c9ccd3fafebd3d7a812071a1
+ size 5496
utils.py ADDED
@@ -0,0 +1,69 @@
+ import torch
+ from torch import Tensor
+
+
+ def pad_sequences(sequences: list[Tensor], padding_value: int) -> Tensor:
+     """
+     Stack 1d tensors (sequences) into a single 2d tensor, padding each sequence on the right.
+     """
+     return torch.nn.utils.rnn.pad_sequence(sequences, padding_value=padding_value, batch_first=True)
+
+
+ def _build_condition_mask(sentences: list[list[str]], condition_fn: callable, device) -> Tensor:
+     masks = [
+         torch.tensor([condition_fn(word) for word in sentence], dtype=bool, device=device)
+         for sentence in sentences
+     ]
+     return pad_sequences(masks, padding_value=False)
+
+ def build_padding_mask(sentences: list[list[str]], device) -> Tensor:
+     return _build_condition_mask(sentences, condition_fn=lambda word: True, device=device)
+
+ def build_null_mask(sentences: list[list[str]], device) -> Tensor:
+     return _build_condition_mask(sentences, condition_fn=lambda word: word != "#NULL", device=device)
+
+
+ def pairwise_mask(masks1d: Tensor) -> Tensor:
+     """
+     Calculate the outer product of a mask, i.e. masks2d[:, i, j] = masks1d[:, i] & masks1d[:, j].
+     """
+     return masks1d[:, None, :] & masks1d[:, :, None]
+
+
+ # Credits: https://docs.allennlp.org/main/api/nn/util/#replace_masked_values
+ def replace_masked_values(tensor: Tensor, mask: Tensor, replace_with: float):
+     """
+     Replace all masked values in tensor with `replace_with` (in place).
+     """
+     assert tensor.dim() == mask.dim(), f"tensor.dim() of {tensor.dim()} != mask.dim() of {mask.dim()}"
+     tensor.masked_fill_(~mask, replace_with)
+
+
+ def prepend_cls(sentences: list[list[str]]) -> list[list[str]]:
+     """
+     Return a copy of sentences with a [CLS] token prepended.
+     """
+     return [["[CLS]", *sentence] for sentence in sentences]
+
+ def remove_nulls(sentences: list[list[str]]) -> list[list[str]]:
+     """
+     Return a copy of sentences with nulls removed.
+     """
+     return [[word for word in sentence if word != "#NULL"] for sentence in sentences]
+
+ def add_nulls(sentences: list[list[str]], counting_masks) -> list[list[str]]:
+     """
+     Return a copy of sentences with nulls restored according to the counting masks.
+     """
+     sentences_with_nulls = []
+     for sentence, counting_mask in zip(sentences, counting_masks, strict=True):
+         sentence_with_nulls = []
+         assert 0 < len(counting_mask)
+         # Account for the leading auxiliary ([CLS]) token: counting_mask[0] counts
+         # the nulls that precede the first word.
+         sentence_with_nulls.extend(["#NULL"] * counting_mask[0])
+         for word, n_nulls_to_insert in zip(sentence, counting_mask[1:], strict=True):
+             sentence_with_nulls.append(word)
+             sentence_with_nulls.extend(["#NULL"] * n_nulls_to_insert)
+         sentences_with_nulls.append(sentence_with_nulls)
+     return sentences_with_nulls
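The counting mask consumed by `add_nulls` has one slot per word plus a leading slot for nulls before the first word (courtesy of the prepended [CLS] token). A minimal single-sentence sketch of the same restoration logic, on hypothetical data:

```python
def restore_nulls(sentence, counting_mask):
    """counting_mask[0] = number of #NULL tokens before the first word;
    counting_mask[i + 1] = number of #NULL tokens after word i."""
    restored = ["#NULL"] * counting_mask[0]
    for word, n_nulls in zip(sentence, counting_mask[1:]):
        restored.append(word)
        restored.extend(["#NULL"] * n_nulls)
    return restored

# One elided token predicted after 'tea' (e.g. an elided verb in a gapping construction).
print(restore_nulls(["Mary", "wants", "tea"], [0, 0, 0, 1]))
# ['Mary', 'wants', 'tea', '#NULL']
```

At inference time the parser predicts one such mask per sentence and re-encodes the restored sentences before running the remaining classification heads.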