Commit · a148e41
1 Parent(s): 72e7501
Update README.md
README.md CHANGED
@@ -7,4 +7,10 @@ sdk: static
 pinned: false
 ---
 
-
+Starting from the recent work of Chalkidis, Garneau, et al., "LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development", we release resources to broaden legal NLP research and to help practitioners who aim to build assistive legal NLP technologies.
+
+As of May 2023, we released:
+
+- LeXFiles (https://huggingface.co/datasets/lexlms/lex_files), a new, diverse English legal corpus of 11 sub-corpora covering legislation and case law from 6 primarily English-speaking legal systems (EU, CoE, Canada, US, UK, India). The corpus comprises approx. 6 million documents totaling approx. 19 billion tokens.
+- LegalLAMA (https://huggingface.co/datasets/lexlms/legal_lama), a diverse probing benchmark suite of 8 sub-tasks that aims to assess the legal knowledge that pre-trained language models (PLMs) acquired during pre-training.
+- 2 new legal-oriented PLMs, dubbed LexLMs (https://huggingface.co/models?search=lexlms/legal-roberta), warm-started from the RoBERTa models and further pre-trained on LeXFiles for 1M additional steps.
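The LegalLAMA entry describes a probing benchmark, so a small illustration of how such probing could be scored may help. The sketch below assumes a cloze-style setup in which a model ranks candidate labels for a masked legal statement, and it scores the rankings with mean reciprocal rank. Both the setup and the metric are assumptions made for illustration, since the README does not specify them. The function names and the toy data are hypothetical and are not part of the released resources.

```python
from typing import Sequence


def reciprocal_rank(ranked_candidates: Sequence[str], gold: str) -> float:
    """Return 1/rank of the gold label in a best-first ranking, or 0.0 if it is absent."""
    for rank, candidate in enumerate(ranked_candidates, start=1):
        if candidate == gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[str]], golds: Sequence[str]
) -> float:
    """Average the reciprocal rank over all probing examples."""
    if len(rankings) != len(golds):
        raise ValueError("rankings and golds must have the same length")
    if not golds:
        return 0.0
    total = sum(reciprocal_rank(r, g) for r, g in zip(rankings, golds))
    return total / len(golds)


# Hypothetical toy rankings for two masked legal statements (not real model output).
toy_rankings = [
    ["negligence", "fraud", "trespass"],      # gold label ranked first
    ["article 6", "article 8", "article 3"],  # gold label ranked second
]
toy_golds = ["negligence", "article 8"]

mrr = mean_reciprocal_rank(toy_rankings, toy_golds)
print(f"MRR on toy data: {mrr:.2f}")
```

In an actual evaluation, the rankings would come from a PLM's predictions on the benchmark examples rather than from hand-written lists.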