Update README.md
README.md
CHANGED
|
@@ -11,8 +11,10 @@ metrics:
|
|
| 11 |
---
|
| 12 |
# Web register classification (multilingual model)
|
| 13 |
|
| 14 |
-
|
| 15 |
-
|
| 16 |
## Model Details
|
| 17 |
|
| 18 |
### Model Description
|
|
@@ -30,13 +32,6 @@ A multilingual web register classification model fine-tuned from XLM-RoBERTa-lar
|
|
| 30 |
- **Repository:** https://github.com/TurkuNLP/pytorch-registerlabeling
|
| 31 |
- **Paper:** Coming soon!
|
| 32 |
|
| 33 |
-
## Uses
|
| 34 |
-
|
| 35 |
-
This model is designed for classifying texts scraped from the unrestricted web into 25 pre-defined categories based on a hierarchical register taxonomy.
|
| 36 |
-
The taxonomy, based on the [CORE taxonomy](https://www.cambridge.org/core/books/register-variation-online/D1D0F0E0BFEA077107F4686C357AA66B), is detailed [here](https://turkunlp.org/register-annotation-docs/abbreviations).
|
| 37 |
-
It is trained on English, Finnish, French, Swedish, and Turkish, and performs well in zero-shot labeling for other languages.
|
| 38 |
-
It is designed to support the development of open language models and for linguists analyzing register variation.
|
| 39 |
-
|
| 40 |
## How to Get Started with the Model
|
| 41 |
|
| 42 |
Use the code below to get started with the model.
|
|
|
|
| 11 |
---
|
| 12 |
# Web register classification (multilingual model)
|
| 13 |
|
| 14 |
+
A multilingual web register classifier, fine-tuned from XLM-RoBERTa-large.
|
| 15 |
+
The model is trained with the multilingual CORE corpora across five languages (English, Finnish, French, Swedish, Turkish) to classify documents based on the [CORE taxonomy](https://www.cambridge.org/core/books/register-variation-online/D1D0F0E0BFEA077107F4686C357AA66B), detailed [here](https://turkunlp.org/register-annotation-docs/abbreviations).
|
| 16 |
+
The model demonstrates state-of-the-art performance in classifying web registers and achieves good zero-shot performance for additional languages.
|
| 17 |
+
It is designed to support the development of open language models and to serve linguists analyzing register variation.
|
| 18 |
## Model Details
|
| 19 |
|
| 20 |
### Model Description
|
|
|
|
| 32 |
- **Repository:** https://github.com/TurkuNLP/pytorch-registerlabeling
|
| 33 |
- **Paper:** Coming soon!
|
| 34 |
|
| 35 |
## How to Get Started with the Model
|
| 36 |
|
| 37 |
Use the code below to get started with the model.
|