Update README.md
README.md
CHANGED
|
@@ -27,12 +27,16 @@ This Named Entity Recognition (NER) model is designed to extract book titles fro
|
|
| 27 |
The model was fine-tuned and evaluated on a Dutch dataset of 12,535 book reviews from the Leeuwarder Courant, in which 23,529 book titles were identified. The dataset uses the IO tagging scheme. The data was split into a training set (70%), a validation set (15%), and a test set (15%). Training used the Majority or Minority loss function; on the test set the model achieved an F1 score of 84.3%, a precision of 83.4%, and a recall of 85.2%.
|
| 28 |

|
| 29 |
|
| 30 |
-
|
| 31 |
|
| 32 |
- **Model type:** XLM-RoBERTa
|
| 33 |
- **Language(s):** Dutch
|
| 34 |
- **Fine-tuned from model:** [FacebookAI/xlm-roberta-large-finetuned-conll03-english](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-english)
|
| 35 |
|
| 36 |
## Uses
|
| 37 |
|
| 38 |
This model is intended for extracting book titles from Dutch texts. It is particularly useful for text analysis applications in the literary domain.
|
| 27 |
The model was fine-tuned and evaluated on a Dutch dataset of 12,535 book reviews from the Leeuwarder Courant, in which 23,529 book titles were identified. The dataset uses the IO tagging scheme. The data was split into a training set (70%), a validation set (15%), and a test set (15%). Training used the Majority or Minority loss function; on the test set the model achieved an F1 score of 84.3%, a precision of 83.4%, and a recall of 85.2%.
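As an illustration of the IO scheme mentioned above, the following minimal sketch groups consecutive inside-tagged tokens into title spans. The label strings `"I"` and `"O"`, the helper name `io_tags_to_spans`, and the example sentence are assumptions made for illustration; the labels actually used in the dataset may differ.

```python
from typing import List, Tuple


def io_tags_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Group runs of consecutive "I" tags into (start, end_exclusive, text) spans."""
    spans = []
    start = None  # index where the current run of "I" tags began, if any
    for i, tag in enumerate(tags):
        if tag == "I" and start is None:
            start = i  # a new title span begins here
        elif tag != "I" and start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
    if start is not None:
        # The sequence ended while still inside a span.
        spans.append((start, len(tags), " ".join(tokens[start:])))
    return spans


# Made-up example sentence; the tags are hand-written, not model output.
tokens = ["Ik", "las", "De", "avonden", "van", "Gerard", "Reve", "."]
tags = ["O", "O", "I", "I", "O", "O", "O", "O"]
print(io_tags_to_spans(tokens, tags))  # [(2, 4, 'De avonden')]
```

Because IO has no begin tag, two titles that directly follow each other would be merged into a single span by this grouping.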
|
| 28 |

|
| 29 |
|
| 30 |
+
## Model Description
|
| 31 |
|
| 32 |
- **Model type:** XLM-RoBERTa
|
| 33 |
- **Language(s):** Dutch
|
| 34 |
- **Fine-tuned from model:** [FacebookAI/xlm-roberta-large-finetuned-conll03-english](https://huggingface.co/FacebookAI/xlm-roberta-large-finetuned-conll03-english)
|
| 35 |
|
| 36 |
+
## Model Flaws
|
| 37 |
+
- Struggles to identify the subtitles of book titles accurately.
|
| 38 |
+
- When a book title is mentioned multiple times within the same review, the model tends to mark it only once, missing subsequent occurrences.
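One possible workaround for the repeated-mention flaw is a post-processing step that searches the review for every occurrence of each title the model did find. The sketch below is not part of the model; the helper name `propagate_titles` and the exact-match, case-sensitive strategy are assumptions chosen for illustration.

```python
import re
from typing import List, Tuple


def propagate_titles(text: str, predicted_titles: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, title) for every exact occurrence of each predicted title."""
    spans: List[Tuple[int, int, str]] = []
    # Longest titles first, so a long title wins over a shorter one it contains.
    for title in sorted(set(predicted_titles), key=lambda t: (-len(t), t)):
        if not title.strip():
            continue  # ignore empty or whitespace-only predictions
        # Require that the match is not glued to a neighbouring word character.
        pattern = r"(?<!\w)" + re.escape(title) + r"(?!\w)"
        for match in re.finditer(pattern, text):
            overlaps = any(match.start() < end and start < match.end()
                           for start, end, _ in spans)
            if not overlaps:
                spans.append((match.start(), match.end(), title))
    return sorted(spans)


# Made-up review text; assume the model marked only the first mention.
review = "De avonden is een roman. Ik vond De avonden somber."
for start, end, title in propagate_titles(review, ["De avonden"]):
    print(start, end, review[start:end])
```

Exact matching will miss inflected or abbreviated mentions and can over-match common words used as titles, so the result is best treated as a set of candidates.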
|
| 39 |
+
|
| 40 |
## Uses
|
| 41 |
|
| 42 |
This model is intended for extracting book titles from Dutch texts. It is particularly useful for text analysis applications in the literary domain.
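A minimal usage sketch, assuming the model is loaded through the `transformers` token-classification pipeline. `MODEL_ID` is a hypothetical placeholder (the repository id is not stated here), and the `min_score` threshold of 0.5 is an arbitrary choice for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical placeholder; replace with the actual repository id of this model.
MODEL_ID = "your-namespace/your-book-title-ner-model"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (needs `transformers` and the model weights)."""
    from transformers import pipeline  # imported lazily so the helper below works without it

    # aggregation_strategy="simple" merges sub-word tokens into whole entity spans.
    return pipeline("token-classification", model=model_id, aggregation_strategy="simple")


def extract_titles(ner: Callable[[str], List[Dict]], text: str,
                   min_score: float = 0.5) -> List[str]:
    """Return the text of every predicted entity whose score reaches min_score."""
    return [text[ent["start"]:ent["end"]]
            for ent in ner(text)
            if ent["score"] >= min_score]
```

With the real repository id filled in, `extract_titles(load_ner_pipeline(), review_text)` would return the predicted titles as a list of strings.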
|