Commit 3534230
Parent(s): a40f15e
Update README.md

README.md CHANGED
@@ -12,6 +12,9 @@ tags:
 
 A SetFit model fit on 166 downsampled multilingual IPTC Subject labels (concatenated for the lowest hierarchy level into artificial sentences of keywords) to predict the mid level news categories.
 The purpose of this classifier is to support exploring corpora as a weak labeler, since the representations of these descriptions are only approximations of real documents from those topics.
+The dataset I used to train the model is based on this file:
+https://huggingface.co/datasets/KnutJaegersberg/News_topics_IPTC_codes_long
+
 Accuracy on highest level labels in eval:
 0.9779412
 Accuracy/F1/mcc on mid level labels in eval:
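For readers unfamiliar with the three metrics named in the README, the following is a minimal sketch of how accuracy, F1, and the Matthews correlation coefficient (MCC) can be computed for a multiclass classifier. It is not the author's evaluation code. The README does not say how F1 is averaged, so macro averaging is an assumption here, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from label counts."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in labels)
    cov_tt = n * n - sum(v * v for v in true_counts.values())
    cov_pp = n * n - sum(v * v for v in pred_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    # Undefined when one side has a single class; 0.0 is the usual convention.
    return cov_tp / denom if denom else 0.0


# Toy labels for illustration only; they are not taken from the eval set.
y_true = ["sport", "sport", "politics", "politics"]
y_pred = ["sport", "politics", "politics", "politics"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred), mcc(y_true, y_pred))
```

MCC is worth reporting next to accuracy for a label set of this size because it accounts for how predictions are distributed across all classes, not only for the share of exact matches.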