Changwei0921 ylfeng committed on
Commit 188a123 · 0 Parent(s)

Duplicate from LTP/base

Co-authored-by: Feng YunLong <ylfeng@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,32 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ model.safetensors filter=lfs diff=lfs merge=lfs -text
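
Each rule above sends files matching a pattern through Git LFS. As a rough illustration that is not part of this commit, the sketch below checks which file names a few of those patterns would catch. The names `LFS_PATTERNS` and `is_lfs_tracked` are hypothetical, and `fnmatchcase` on the basename only approximates gitattributes matching; the path rule `saved_model/**/*` is left out because it needs full path semantics.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the basename patterns from the .gitattributes diff (assumption:
# basename matching via fnmatchcase is close enough for these simple globs).
LFS_PATTERNS = [
    "*.bin", "*.pt", "*.pth", "*.h5", "*.onnx",
    "*.tar.*", "*tfevents*", "model.safetensors",
]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: match the file's basename against each pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


for candidate in ["pytorch_model.bin", "model.safetensors", "config.json", "vocab.txt"]:
    print(candidate, is_lfs_tracked(candidate))
```

Under these assumptions the two weight files in this commit match, while `config.json` and `vocab.txt` stay as ordinary Git blobs.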
README.md ADDED
@@ -0,0 +1,160 @@
+ ![CODE SIZE](https://img.shields.io/github/languages/code-size/HIT-SCIR/ltp)
+ ![CONTRIBUTORS](https://img.shields.io/github/contributors/HIT-SCIR/ltp)
+ ![LAST COMMIT](https://img.shields.io/github/last-commit/HIT-SCIR/ltp)
+
+ | Language | Version |
+ | ------------------------------------ | --- |
+ | [Python](python/interface/README.md) | [![LTP](https://img.shields.io/pypi/v/ltp?label=LTP)](https://pypi.org/project/ltp) [![LTP-Core](https://img.shields.io/pypi/v/ltp-core?label=LTP-Core)](https://pypi.org/project/ltp-core) [![LTP-Extension](https://img.shields.io/pypi/v/ltp-extension?label=LTP-Extension)](https://pypi.org/project/ltp-extension) |
+ | [Rust](rust/ltp/README.md) | [![LTP](https://img.shields.io/crates/v/ltp?label=LTP)](https://crates.io/crates/ltp) |
+
+ # LTP 4
+
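+ LTP (Language Technology Platform) provides a suite of Chinese natural language processing tools that users can apply to Chinese text for word segmentation, part-of-speech tagging, syntactic parsing, and related tasks.
+
+ ## Citation
+
+ If you use LTP in your work, you can cite this paper:
+
+ ```bibtex
+ @article{che2020n,
+   title={N-LTP: A Open-source Neural Chinese Language Technology Platform with Pretrained Models},
+   author={Che, Wanxiang and Feng, Yunlong and Qin, Libo and Liu, Ting},
+   journal={arXiv preprint arXiv:2009.11616},
+   year={2020}
+ }
+ ```
+
+ **Reference book:**
+ The book "[自然语言处理:基于预训练模型的方法](https://item.jd.com/13344628.html)" (Natural Language Processing: A Pre-trained Model Approach), co-authored by several researchers at the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology (HIT-SCIR),
+ has now been published (authors: Che Wanxiang, Guo Jiang, Cui Yiming; chief reviewer: Liu Ting). It focuses on recent NLP techniques based on pre-trained models and covers three parts: fundamentals, pre-trained word vectors, and pre-trained models. LTP users may find it a useful reference.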
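+
+ ### Release notes
+
+ - 4.2.0
+   - \[Structural change\] Split LTP into two parts, making maintenance and training easier and the structure clearer
+     - \[Legacy models\] To meet users' demand for **inference speed**, the perceptron-based algorithms were rewritten in Rust. Accuracy is on par with LTP v3, speed is **3.55** times that of LTP v3, and enabling multithreading yields a **17.17**-fold speedup. Currently only three tasks are supported: word segmentation, part-of-speech tagging, and named entity recognition
+     - \[Deep learning models\] Deep learning models implemented in PyTorch, supporting all six tasks (word segmentation / part-of-speech tagging / named entity recognition / semantic role labeling / dependency parsing / semantic dependency parsing)
+   - \[Other improvements\] Improved the model training method
+     - \[Common\] Training scripts and training examples are provided so that users can more easily train customized models on their own private data
+     - \[Deep learning models\] Training is configured with hydra, making it easy to modify training parameters and to extend LTP (for example, by using a Module from another package)
+   - \[Other changes\] The decoding algorithms for word segmentation, dependency parsing (Eisner), and semantic dependency parsing (Eisner) are implemented in Rust and run faster
+   - \[New feature\] Models are uploaded to [Huggingface Hub](https://huggingface.co/LTP), with automatic download and faster download speeds; users can also upload their own trained models for LTP to use at inference time
+   - \[Breaking change\] Inference now goes through the Pipeline API, which eases deeper performance optimization later (for example, SDP and SDPG overlap substantially, and reusing the shared work can speed up inference). For usage, see the [Quick Start section on GitHub](https://github.com/hit-scir/ltp)
+ - 4.1.0
+   - Added custom word segmentation and other features
+   - Fixed some bugs
+ - 4.0.0
+   - Built on PyTorch, with a native Python interface
+   - Models with different speed and accuracy trade-offs can be chosen as needed
+   - Six tasks: word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, semantic role labeling, and semantic dependency parsing
+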
+ ## Quick Start
+
+ ### [Python](python/interface/README.md)
+
+ ```bash
+ pip install -U ltp ltp-core ltp-extension -i https://pypi.org/simple # install ltp
+ ```
+
+ **Note:** If you run into any error, try reinstalling ltp with the command above. If the error persists, please report it in GitHub issues.
+
+ ```python
+ import torch
+ from ltp import LTP
+
+ ltp = LTP("LTP/small")  # loads the Small model by default
+
+ # Move the model to the GPU
+ if torch.cuda.is_available():
+     # ltp.cuda()
+     ltp.to("cuda")
+
+ output = ltp.pipeline(["他叫汤姆去拿外衣。"], tasks=["cws", "pos", "ner", "srl", "dep", "sdp"])
+ # Results are returned in a dict-like format
+ print(output.cws)  # print(output[0]) / print(output['cws'])  # index access also works
+ print(output.pos)
+ print(output.sdp)
+
+ # Word segmentation, POS tagging, and NER implemented with the perceptron algorithm: faster, but slightly less accurate
+ ltp = LTP("LTP/legacy")
+ # cws, pos, ner = ltp.pipeline(["他叫汤姆去拿外衣。"], tasks=["cws", "ner"]).to_tuple()  # error: NER requires the POS tagging results
+ cws, pos, ner = ltp.pipeline(["他叫汤姆去拿外衣。"], tasks=["cws", "pos", "ner"]).to_tuple()  # to_tuple converts the result to a tuple
+ # Results are returned as a tuple
+ print(cws, pos, ner)
+ ```
+
+ **[Detailed documentation](python/interface/docs/quickstart.rst)**
+
+ ### [Rust](rust/ltp/README.md)
+
+ ```rust
+ use std::fs::File;
+ use itertools::multizip;
+ use ltp::{CWSModel, POSModel, NERModel, ModelSerde, Format, Codec};
+
+ fn main() -> Result<(), Box<dyn std::error::Error>> {
+     let file = File::open("data/legacy-models/cws_model.bin")?;
+     let cws: CWSModel = ModelSerde::load(file, Format::AVRO(Codec::Deflate))?;
+     let file = File::open("data/legacy-models/pos_model.bin")?;
+     let pos: POSModel = ModelSerde::load(file, Format::AVRO(Codec::Deflate))?;
+     let file = File::open("data/legacy-models/ner_model.bin")?;
+     let ner: NERModel = ModelSerde::load(file, Format::AVRO(Codec::Deflate))?;
+
+     let words = cws.predict("他叫汤姆去拿外衣。")?;
+     let pos = pos.predict(&words)?;
+     let ner = ner.predict((&words, &pos))?;
+
+     for (w, p, n) in multizip((words, pos, ner)) {
+         println!("{}/{}/{}", w, p, n);
+     }
+
+     Ok(())
+ }
+ ```
+
+ ## Model performance and download links
+
+ | Deep learning model | Segmentation | POS | NER | SRL | Dependency parsing | Semantic dependency parsing | Speed (sentences/s) |
+ | :---------------------------------------: | :---: | :---: | :---: | :---: | :---: | :---: | :-----: |
+ | [Base](https://huggingface.co/LTP/base) | 98.7 | 98.5 | 95.4 | 80.6 | 89.5 | 75.2 | 39.12 |
+ | [Base1](https://huggingface.co/LTP/base1) | 99.22 | 98.73 | 96.39 | 79.28 | 89.57 | 76.57 | --.-- |
+ | [Base2](https://huggingface.co/LTP/base2) | 99.18 | 98.69 | 95.97 | 79.49 | 90.19 | 76.62 | --.-- |
+ | [Small](https://huggingface.co/LTP/small) | 98.4 | 98.2 | 94.3 | 78.4 | 88.3 | 74.7 | 43.13 |
+ | [Tiny](https://huggingface.co/LTP/tiny) | 96.8 | 97.1 | 91.6 | 70.9 | 83.8 | 70.1 | 53.22 |
+
+ | Perceptron algorithm | Segmentation | POS | NER | Speed (sentences/s) | Notes |
+ | :-----------------------------------------: | :---: | :---: | :---: | :------: | :------------------------: |
+ | [Legacy](https://huggingface.co/LTP/legacy) | 97.93 | 98.41 | 94.28 | 21581.48 | [Performance details](rust/ltp/README.md) |
+
+ **Note: the perceptron speed was measured with 16 threads enabled.**
+
+ ## Building the Wheel package
+
+ ```shell script
+ make bdist
+ ```
+
+ ## Bindings for other languages
+
+ **Perceptron algorithm**
+
+ - [Rust](rust/ltp)
+ - [C/C++](rust/ltp-cffi)
+
+ **Deep learning algorithm**
+
+ - [Rust](https://github.com/HIT-SCIR/libltp/tree/master/ltp-rs)
+ - [C++](https://github.com/HIT-SCIR/libltp/tree/master/ltp-cpp)
+ - [Java](https://github.com/HIT-SCIR/libltp/tree/master/ltp-java)
+
+ ## Author
+
+ - Feng Yunlong \<\<[ylfeng@ir.hit.edu.cn](mailto:ylfeng@ir.hit.edu.cn)>>
+
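+ ## License
+
+ 1. The Language Technology Platform is open-sourced free of charge to universities in China and abroad, institutes of the Chinese Academy of Sciences, and individual researchers. If these institutions or individuals use the platform for commercial purposes (such as corporate cooperation projects), a fee is required.
+ 2. Enterprises and public institutions other than those listed above must pay a fee if they apply to use the platform.
+ 3. For any payment-related matters, please email car@ir.hit.edu.cn.
+ 4. If you publish a paper or obtain research results based on LTP, please state in the paper or the results application that you "used the Language Technology Platform (LTP) developed by the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology".
+    Please also email car@ir.hit.edu.cn with the title and venue of the paper or results.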
added_tokens.json ADDED
@@ -0,0 +1 @@
+ {}
config.json ADDED
@@ -0,0 +1,352 @@
+ {
+   "model": {
+     "_ltp_target_": "ltp_core.models.ltp_model.LTPModule",
+     "backbone": {
+       "_ltp_target_": "ltp_core.models.utils.load_transformers",
+       "config": {
+         "output_attentions": false,
+         "output_hidden_states": false,
+         "use_cache": true,
+         "torchscript": false,
+         "use_bfloat16": false,
+         "pruned_heads": {},
+         "is_encoder_decoder": false,
+         "is_decoder": false,
+         "max_length": 20,
+         "min_length": 0,
+         "do_sample": false,
+         "early_stopping": false,
+         "num_beams": 1,
+         "temperature": 1.0,
+         "top_k": 50,
+         "top_p": 1.0,
+         "repetition_penalty": 1.0,
+         "length_penalty": 1.0,
+         "no_repeat_ngram_size": 0,
+         "bad_words_ids": null,
+         "num_return_sequences": 1,
+         "architectures": null,
+         "finetuning_task": null,
+         "id2label": {
+           "0": "LABEL_0",
+           "1": "LABEL_1"
+         },
+         "label2id": {
+           "LABEL_0": 0,
+           "LABEL_1": 1
+         },
+         "prefix": null,
+         "bos_token_id": null,
+         "pad_token_id": 0,
+         "eos_token_id": null,
+         "decoder_start_token_id": null,
+         "task_specific_params": null,
+         "xla_device": null,
+         "directionality": "bidi",
+         "vocab_size": 21128,
+         "embedding_size": 768,
+         "hidden_size": 768,
+         "num_hidden_layers": 12,
+         "num_attention_heads": 12,
+         "intermediate_size": 3072,
+         "hidden_act": "gelu",
+         "hidden_dropout_prob": 0.1,
+         "attention_probs_dropout_prob": 0.1,
+         "max_position_embeddings": 512,
+         "type_vocab_size": 2,
+         "initializer_range": 0.02,
+         "layer_norm_eps": 1e-12,
+         "model_type": "electra"
+       }
+     },
+     "processor": {
+       "cws": {
+         "_ltp_target_": "ltp_core.models.processor.TokenOnly"
+       },
+       "pos": {
+         "_ltp_target_": "ltp_core.models.processor.WordsOnly"
+       },
+       "ner": {
+         "_ltp_target_": "ltp_core.models.processor.WordsOnly"
+       },
+       "srl": {
+         "_ltp_target_": "ltp_core.models.processor.WordsOnly"
+       },
+       "dep": {
+         "_ltp_target_": "ltp_core.models.processor.WordsWithHead"
+       },
+       "sdp": {
+         "_ltp_target_": "ltp_core.models.processor.WordsWithHead"
+       }
+     },
+     "heads": {
+       "cws": {
+         "_ltp_target_": "ltp_core.models.components.token.MLPTokenClassifier",
+         "input_size": 768,
+         "num_labels": 2,
+         "dropout": 0.1
+       },
+       "pos": {
+         "_ltp_target_": "ltp_core.models.components.token.MLPTokenClassifier",
+         "input_size": 768,
+         "num_labels": 27,
+         "dropout": 0.1
+       },
+       "ner": {
+         "_ltp_target_": "ltp_core.models.components.token.RelTransformerTokenClassifier",
+         "input_size": 768,
+         "num_labels": 13,
+         "dropout": 0.1,
+         "num_heads": 4,
+         "num_layers": 2
+       },
+       "srl": {
+         "_ltp_target_": "ltp_core.models.components.token.BiaffineTokenClassifier",
+         "input_size": 768,
+         "hidden_size": 300,
+         "num_labels": 97,
+         "dropout": 0.1,
+         "use_crf": true
+       },
+       "dep": {
+         "_ltp_target_": "ltp_core.models.components.graph.BiaffineClassifier",
+         "input_size": 768,
+         "num_labels": 14,
+         "dropout": 0.1,
+         "arc_hidden_size": 500,
+         "rel_hidden_size": 100
+       },
+       "sdp": {
+         "_ltp_target_": "ltp_core.models.components.graph.BiaffineClassifier",
+         "input_size": 768,
+         "num_labels": 56,
+         "arc_hidden_size": 600,
+         "rel_hidden_size": 600
+       }
+     }
+   },
+   "nerual": true,
+   "vocabs": {
+     "cws": [
+       "B-W",
+       "I-W"
+     ],
+     "pos": [
+       "n",
+       "v",
+       "wp",
+       "u",
+       "d",
+       "a",
+       "m",
+       "p",
+       "r",
+       "ns",
+       "c",
+       "q",
+       "nt",
+       "nh",
+       "nd",
+       "j",
+       "i",
+       "b",
+       "ni",
+       "nz",
+       "nl",
+       "z",
+       "k",
+       "ws",
+       "o",
+       "h",
+       "e"
+     ],
+     "ner": [
+       "O",
+       "S-Ns",
+       "S-Nh",
+       "B-Ni",
+       "E-Ni",
+       "I-Ni",
+       "S-Ni",
+       "B-Ns",
+       "E-Ns",
+       "I-Ns",
+       "B-Nh",
+       "E-Nh",
+       "I-Nh"
+     ],
+     "srl": [
+       "O",
+       "B-A0",
+       "B-A0-ADV",
+       "B-A0-CND",
+       "B-A0-CRD",
+       "B-A0-MNR",
+       "B-A0-PRD",
+       "B-A0-PSE",
+       "B-A0-PSR",
+       "B-A0-QTY",
+       "B-A1",
+       "B-A1-CRD",
+       "B-A1-DIS",
+       "B-A1-FRQ",
+       "B-A1-PRD",
+       "B-A1-PSE",
+       "B-A1-PSR",
+       "B-A1-QTY",
+       "B-A1-TPC",
+       "B-A2",
+       "B-A2-CRD",
+       "B-A2-PRD",
+       "B-A2-PSE",
+       "B-A2-PSR",
+       "B-A2-QTY",
+       "B-A3",
+       "B-A3-TMP",
+       "B-A4",
+       "B-ARGM-ADV",
+       "B-ARGM-BNF",
+       "B-ARGM-CND",
+       "B-ARGM-CRD",
+       "B-ARGM-DGR",
+       "B-ARGM-DIR",
+       "B-ARGM-DIS",
+       "B-ARGM-EXT",
+       "B-ARGM-FRQ",
+       "B-ARGM-LOC",
+       "B-ARGM-MNR",
+       "B-ARGM-PRD",
+       "B-ARGM-PRP",
+       "B-ARGM-QTY",
+       "B-ARGM-T",
+       "B-ARGM-TMP",
+       "B-ARGM-TPC",
+       "B-rel-ADV",
+       "B-rel-DIS",
+       "B-rel-EXT",
+       "B-rel-MNR",
+       "I-A0",
+       "I-A0-ADV",
+       "I-A0-CND",
+       "I-A0-CRD",
+       "I-A0-MNR",
+       "I-A0-PRD",
+       "I-A0-PSE",
+       "I-A0-PSR",
+       "I-A0-QTY",
+       "I-A1",
+       "I-A1-CRD",
+       "I-A1-DIS",
+       "I-A1-FRQ",
+       "I-A1-PRD",
+       "I-A1-PSE",
+       "I-A1-PSR",
+       "I-A1-QTY",
+       "I-A1-TPC",
+       "I-A2",
+       "I-A2-CRD",
+       "I-A2-PRD",
+       "I-A2-PSE",
+       "I-A2-PSR",
+       "I-A2-QTY",
+       "I-A3",
+       "I-A3-TMP",
+       "I-A4",
+       "I-ARGM-ADV",
+       "I-ARGM-BNF",
+       "I-ARGM-CND",
+       "I-ARGM-CRD",
+       "I-ARGM-DGR",
+       "I-ARGM-DIR",
+       "I-ARGM-DIS",
+       "I-ARGM-EXT",
+       "I-ARGM-FRQ",
+       "I-ARGM-LOC",
+       "I-ARGM-MNR",
+       "I-ARGM-PRD",
+       "I-ARGM-PRP",
+       "I-ARGM-QTY",
+       "I-ARGM-T",
+       "I-ARGM-TMP",
+       "I-ARGM-TPC",
+       "I-rel-ADV",
+       "I-rel-DIS",
+       "I-rel-EXT",
+       "I-rel-MNR"
+     ],
+     "dep": [
+       "ATT",
+       "WP",
+       "ADV",
+       "VOB",
+       "SBV",
+       "COO",
+       "RAD",
+       "HED",
+       "POB",
+       "CMP",
+       "LAD",
+       "FOB",
+       "DBL",
+       "IOB"
+     ],
+     "sdp": [
+       "mDEPD",
+       "mPUNC",
+       "FEAT",
+       "mRELA",
+       "Root",
+       "AGT",
+       "eSUCC",
+       "EXP",
+       "MEAS",
+       "eCOO",
+       "CONT",
+       "LOC",
+       "DATV",
+       "LINK",
+       "PAT",
+       "TIME",
+       "dCONT",
+       "SCO",
+       "MANN",
+       "mNEG",
+       "ePREC",
+       "dFEAT",
+       "rEXP",
+       "dEXP",
+       "dTIME",
+       "rCONT",
+       "rAGT",
+       "dLINK",
+       "STAT",
+       "REAS",
+       "rPAT",
+       "TOOL",
+       "dSTAT",
+       "dMANN",
+       "rTIME",
+       "rLOC",
+       "dDATV",
+       "rFEAT",
+       "MATL",
+       "rDATV",
+       "dREAS",
+       "dLOC",
+       "rLINK",
+       "dPAT",
+       "rMANN",
+       "rREAS",
+       "rTOOL",
+       "rMEAS",
+       "dSCO",
+       "dMEAS",
+       "rSCO",
+       "dAGT",
+       "rMATL",
+       "rSTAT",
+       "dTOOL",
+       "dMATL"
+     ]
+   }
+ }
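
In this config, each head's `num_labels` lines up with the length of the matching `vocabs` list (2 for `cws`, 13 for `ner`, and so on). A minimal sketch of a consistency check follows. That the two must agree is an assumption about how LTP uses these fields, not something the diff states, and `label_mismatches` is a hypothetical helper rather than an LTP function.

```python
def label_mismatches(config: dict) -> dict:
    """Return {task: (num_labels, vocab_size)} for tasks whose sizes disagree."""
    heads = config["model"]["heads"]
    vocabs = config["vocabs"]
    return {
        task: (head["num_labels"], len(vocabs[task]))
        for task, head in heads.items()
        if head["num_labels"] != len(vocabs[task])
    }


# Trimmed copy of the values shown in the diff (only the cws and ner tasks).
sample = {
    "model": {"heads": {"cws": {"num_labels": 2}, "ner": {"num_labels": 13}}},
    "vocabs": {
        "cws": ["B-W", "I-W"],
        "ner": ["O", "S-Ns", "S-Nh", "B-Ni", "E-Ni", "I-Ni", "S-Ni",
                "B-Ns", "E-Ns", "I-Ns", "B-Nh", "E-Nh", "I-Nh"],
    },
}
print(label_mismatches(sample))
```

For the trimmed sample the result is an empty dict. Running the same helper on the full file loaded with `json.load` would cover all six tasks.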
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a097964f7b7ed18cb469b88bfaaacd0fd2e74868788142e49070fce7a35bf073
+ size 553109320
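
The three lines above are a Git LFS pointer, not the weights themselves: a spec version, the SHA-256 object ID of the real file, and its size in bytes. As an illustration only, the sketch below parses such a pointer. `parse_lfs_pointer` is a hypothetical helper, and it assumes the simple `key value` layout seen in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    # Each non-empty line is "key value"; split once on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the model.safetensors diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a097964f7b7ed18cb469b88bfaaacd0fd2e74868788142e49070fce7a35bf073
size 553109320
"""
info = parse_lfs_pointer(pointer)
print(info["algo"], f"{info['size'] / 1e6:.1f} MB")
```

The parsed size can be compared against a downloaded file's byte count, and the digest against its SHA-256 hash, to confirm the download is complete.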
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:30f651d5031f694e6320ceffc5fdba86c7681f0d9fe18dacab482e9c5e037bfb
+ size 557512529
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"init_inputs": []}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff