Update README.md
README.md
CHANGED
@@ -13,8 +13,9 @@ metrics:
 
 A multilingual web register classifier, fine-tuned from XLM-RoBERTa-large.
 The model is trained with the multilingual CORE corpora across five languages (English, Finnish, French, Swedish, Turkish) to classify documents based on the CORE taxonomy, detailed below.
-The model demonstrates state-of-the-art performance in classifying web registers and achieves good zero-shot performance for additional languages.
+The model demonstrates state-of-the-art performance in classifying web registers and achieves good zero-shot performance for additional languages (see Evaluation below).
 It is designed to support the development of open language models and for linguists analyzing register variation.
+
 ## Model Details
 
 ### Model Description
 
@@ -69,7 +70,7 @@ The main labels are uppercase. To only include these main labels in the predicti
 
 Use the code below to get started with the model.
 
-```
+```python
 import torch
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
 
@@ -133,7 +134,7 @@ Average inference time (across 1000 iterations), using a single NVIDIA A100 GPU
 
 | Language | F1 (All labels) | F1 (Main labels) |
 | -------- | --------------- | ---------------- |
-| English | 0.72 |
+| English | 0.72 | 0.75
 | Finnish | 0.79 |
 | French | 0.75 |
 | Swedish | 0.81 |