Updated to Roberta-base from distilbert
README.md
CHANGED
|
@@ -14,29 +14,30 @@ widget:
|
|
| 14 |
|
| 15 |
---
|
| 16 |
Text Multi-Label Sequence Classification model used to decode if passages contain a misfortunate event, a cause for misfortune, and/or an action to mollify or prevent some misfortune.
|
| 17 |
-
8293 passages were used for Training and split into 5 folds (~6634 for the train set, ~1659 for the validation set over 5 folds).
|
| 18 |
|
| 19 |
<br><b>Parameters</b>:
|
| 20 |
-
<br>Transformer:
|
| 21 |
-
<br>Tokenizer:
|
|
|
|
| 22 |
<br>learning rate: 2e-05
|
| 23 |
<br>weight decay: .01
|
| 24 |
<br>Dropout: .1
|
| 25 |
<br>Batch Size: 8
|
| 26 |
<br>Epochs: 15
|
| 27 |
<br>Metric for best model: F1 micro
|
| 28 |
-
<br><br>Using epoch
|
| 29 |
<ul>
|
| 30 |
<li>EVENT: -
|
| 31 |
<ul>
|
| 32 |
<li>
|
| 33 |
-
Illness: .
|
| 34 |
</li>
|
| 35 |
<li>
|
| 36 |
-
Accident: .
|
| 37 |
</li>
|
| 38 |
<li>
|
| 39 |
-
Other: .
|
| 40 |
</li>
|
| 41 |
</ul>
|
| 42 |
</li>
|
|
@@ -46,16 +47,16 @@ Text Multi-Label Sequence Classification model used to decode if passages contai
|
|
| 46 |
Just Happens: -
|
| 47 |
</li>
|
| 48 |
<li>
|
| 49 |
-
Material Physical: .
|
| 50 |
</li>
|
| 51 |
<li>
|
| 52 |
-
Spirits and Gods: .
|
| 53 |
</li>
|
| 54 |
<li>
|
| 55 |
-
Witchcraft and Sorcery: .
|
| 56 |
</li>
|
| 57 |
<li>
|
| 58 |
-
Rule Violation Taboo: .
|
| 59 |
</li>
|
| 60 |
<li>
|
| 61 |
Jealous Evil Eye: -
|
|
@@ -65,19 +66,19 @@ Text Multi-Label Sequence Classification model used to decode if passages contai
|
|
| 65 |
<li>ACTION: -
|
| 66 |
<ul>
|
| 67 |
<li>
|
| 68 |
-
Physical Material: .
|
| 69 |
</li>
|
| 70 |
<li>
|
| 71 |
-
Technical Specialist: .
|
| 72 |
</li>
|
| 73 |
<li>
|
| 74 |
-
Divination: .
|
| 75 |
</li>
|
| 76 |
<li>
|
| 77 |
-
Shaman Medium Healer: .
|
| 78 |
</li>
|
| 79 |
<li>
|
| 80 |
-
Priest High Religion: .
|
| 81 |
</li>
|
| 82 |
<li>
|
| 83 |
Other: -
|
|
|
|
| 14 |
|
| 15 |
---
|
| 16 |
Text Multi-Label Sequence Classification model used to decode if passages contain a misfortunate event, a cause for misfortune, and/or an action to mollify or prevent some misfortune.
|
| 17 |
+
8293 passages were used for training and split into 5 folds (~6634 for the train set and ~1659 for the validation set in each fold). Because earlier comparisons between folds showed no difference in accuracy or overfitting, we trained only on the first fold to save computation.
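
The fold sizes quoted above follow from dividing 8293 passages into five near-equal parts. A minimal sketch of that arithmetic, assuming a plain near-equal split (the actual splitting code, and any shuffling or stratification it uses, is not shown in this README; `fold_sizes` is a hypothetical helper):

```python
def fold_sizes(n_samples: int, n_folds: int) -> list[int]:
    """Return the validation-set size of each fold for a near-equal split."""
    base, extra = divmod(n_samples, n_folds)
    # The first `extra` folds receive one additional sample.
    return [base + 1 if i < extra else base for i in range(n_folds)]


n_passages = 8293  # total passages reported in this README
val_sizes = fold_sizes(n_passages, 5)
# Each fold trains on everything outside its own validation slice.
train_sizes = [n_passages - v for v in val_sizes]

print(val_sizes)    # [1659, 1659, 1659, 1658, 1658]
print(train_sizes)  # [6634, 6634, 6634, 6635, 6635]
```

The first fold therefore has 6634 training and 1659 validation passages, matching the approximate figures given above.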
|
| 18 |
|
| 19 |
<br><b>Parameters</b>:
|
| 20 |
+
<br>Transformer: roberta-base
|
| 21 |
+
<br>Tokenizer: roberta-base
|
| 22 |
+
<br>Head: multi_label_classification
|
| 23 |
<br>learning rate: 2e-05
|
| 24 |
<br>weight decay: .01
|
| 25 |
<br>Dropout: .1
|
| 26 |
<br>Batch Size: 8
|
| 27 |
<br>Epochs: 15
|
| 28 |
<br>Metric for best model: F1 micro
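
As an illustration of the selection metric, the sketch below shows one way F1 micro can be computed for multi-label predictions: logits are passed through a sigmoid, thresholded into independent 0/1 decisions per label, and true positives, false positives, and false negatives are pooled over all labels before F1 is taken. The 0.5 threshold, the three-label toy data, and the helper names `decode` and `f1_micro` are assumptions for illustration only, not the evaluation code used for this model.

```python
import math


def decode(logits: list[float], threshold: float = 0.5) -> list[int]:
    """Turn raw logits into independent 0/1 label decisions via a sigmoid."""
    return [1 if 1.0 / (1.0 + math.exp(-z)) >= threshold else 0 for z in logits]


def f1_micro(y_true: list[list[int]], y_pred: list[list[int]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every (passage, label) pair."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example with three hypothetical labels per passage.
y_true = [[1, 0, 0], [0, 1, 1], [1, 0, 1]]
logits = [[2.0, -1.0, -3.0], [-0.5, 1.5, -0.2], [0.3, 0.8, -1.2]]
y_pred = [decode(row) for row in logits]

score = f1_micro(y_true, y_pred)
print(y_pred)           # [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
print(round(score, 3))  # 0.667
```

Because counts are pooled before averaging, frequent labels weigh more heavily in F1 micro than rare ones, which is worth keeping in mind when comparing it with per-class scores.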
|
| 29 |
+
<br><br>At epoch 15, the F1 micro score on 2074 passages not used for training is .664, an improvement over the otherwise identical distilbert model, which achieved an F1 micro of .637. Individual class F1 scores are shown below. Note that some labels have been excluded because they are not relevant to the final use of the model.
|
| 30 |
<ul>
|
| 31 |
<li>EVENT: -
|
| 32 |
<ul>
|
| 33 |
<li>
|
| 34 |
+
Illness: .876
|
| 35 |
</li>
|
| 36 |
<li>
|
| 37 |
+
Accident: .458
|
| 38 |
</li>
|
| 39 |
<li>
|
| 40 |
+
Other: .588
|
| 41 |
</li>
|
| 42 |
</ul>
|
| 43 |
</li>
|
|
|
|
| 47 |
Just Happens: -
|
| 48 |
</li>
|
| 49 |
<li>
|
| 50 |
+
Material Physical: .476
|
| 51 |
</li>
|
| 52 |
<li>
|
| 53 |
+
Spirits and Gods: .728
|
| 54 |
</li>
|
| 55 |
<li>
|
| 56 |
+
Witchcraft and Sorcery: .651
|
| 57 |
</li>
|
| 58 |
<li>
|
| 59 |
+
Rule Violation Taboo: .517
|
| 60 |
</li>
|
| 61 |
<li>
|
| 62 |
Jealous Evil Eye: -
|
|
|
|
| 66 |
<li>ACTION: -
|
| 67 |
<ul>
|
| 68 |
<li>
|
| 69 |
+
Physical Material: .672
|
| 70 |
</li>
|
| 71 |
<li>
|
| 72 |
+
Technical Specialist: .5
|
| 73 |
</li>
|
| 74 |
<li>
|
| 75 |
+
Divination: .406
|
| 76 |
</li>
|
| 77 |
<li>
|
| 78 |
+
Shaman Medium Healer: .582
|
| 79 |
</li>
|
| 80 |
<li>
|
| 81 |
+
Priest High Religion: .375
|
| 82 |
</li>
|
| 83 |
<li>
|
| 84 |
Other: -
|