---
language:
- zh
---
# 中文预训练Longformer模型 | Longformer_ZH with PyTorch

相比于Transformer的O(n^2)复杂度,Longformer提供了一种以线性复杂度处理最长4K字符级别文档序列的方法。Longformer Attention包括了标准的自注意力与全局注意力机制,方便模型更好地学习超长序列的信息。

Compared with the O(n^2) complexity of the Transformer, Longformer provides a way to process document sequences of up to the 4K-character level with linear complexity. Longformer attention combines standard self-attention with a global attention mechanism, helping the model better learn from very long sequences.
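The complexity difference can be illustrated with a back-of-the-envelope cost count (a toy sketch, not the actual implementation; the window size of 512 is an assumed setting, not one stated here):

```python
def full_attention_cost(n: int) -> int:
    # Standard self-attention: every token attends to every token.
    return n * n

def longformer_attention_cost(n: int, window: int, n_global: int = 0) -> int:
    # Sliding-window attention: each token attends to a fixed local window,
    # plus optional global tokens that attend to / are attended by all tokens.
    return n * window + 2 * n_global * n

n = 4096      # a 4K-token document
window = 512  # assumed local window size
print(full_attention_cost(n))                # 16777216 -> grows as O(n^2)
print(longformer_attention_cost(n, window))  # 2097152  -> grows as O(n)
```

Doubling the document length doubles the sliding-window cost but quadruples the full-attention cost, which is what makes 4K-level sequences practical.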
## 关于预训练 | About Pretraining
- 我们的预训练语料来自 https://github.com/brightmart/nlp_chinese_corpus, 根据Longformer原文的设置,采用了多种语料混合的预训练数据。
- Our pretraining corpus comes from https://github.com/brightmart/nlp_chinese_corpus. Following the settings in the Longformer paper, we use a mixture of four different Chinese corpora for pretraining.

- 我们的模型是基于[Roberta_zh_mid](https://github.com/brightmart/roberta_zh),训练脚本参考https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb
- Our model is based on [Roberta_zh_mid](https://github.com/brightmart/roberta_zh). The pretraining script is adapted from https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb.

- 同时我们在原版基础上,引入了 `Whole-Word-Masking` 机制,以便更好地适应中文特性。
- We also introduce the `Whole-Word-Masking` mechanism into pretraining, on top of the original recipe, to better fit the characteristics of Chinese.
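The idea behind whole-word masking can be sketched as follows. This is a minimal illustration only, not the actual pretraining code: it assumes the text has already been segmented into words (e.g. by a Chinese word segmenter), and masks every character of a selected word together, so the model never sees a partially masked word.

```python
import random

def whole_word_mask(words, mask_ratio=0.15, mask_token="[MASK]", seed=0):
    """Mask at the word level: when a word is selected, all of its
    characters are replaced by the mask token together."""
    rng = random.Random(seed)
    # Select roughly mask_ratio of the words (at least one).
    n_to_mask = max(1, round(len(words) * mask_ratio))
    masked = set(rng.sample(range(len(words)), n_to_mask))
    out = []
    for i, word in enumerate(words):
        out.extend(mask_token if i in masked else ch for ch in word)
    return out

# Toy sentence, pre-segmented into words: 自然 / 语言 / 处理
print(whole_word_mask(["自然", "语言", "处理"], mask_ratio=0.34))
```

With character-level masking, one character of a two-character word could be masked alone, making the target trivially guessable from the other half; masking whole words forces the model to use wider context.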

## 致谢
感谢东京工业大学 奥村·船越研究室 提供算力。

Thanks to the Okumura-Funakoshi Lab at Tokyo Institute of Technology for providing the computing resources and the opportunity to finish this project.