st2 / classification_report.txt
Initial upload of st2 BERT-large-NER BIO span extractor
              precision    recall  f1-score   support

           O       0.99      0.95      0.97     75421
      B-SUBJ       0.45      0.42      0.43       445
      I-SUBJ       0.46      0.67      0.55      2120
       B-OBJ       0.45      0.45      0.45       461
       I-OBJ       0.45      0.81      0.58      2321

    accuracy                           0.93     80768
   macro avg       0.56      0.66      0.60     80768
weighted avg       0.95      0.93      0.94     80768

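The two average rows can be cross-checked against the per-class rows. The sketch below is a minimal sanity check, not part of the original evaluation script: it recomputes the macro average as the unweighted mean over the five labels and the weighted average as the support-weighted mean. The `rows` dictionary and the helper names are introduced here for illustration, and because the inputs are the already-rounded values from the report, agreement is only expected to two decimals.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
rows = {
    "O":      (0.99, 0.95, 0.97, 75421),
    "B-SUBJ": (0.45, 0.42, 0.43, 445),
    "I-SUBJ": (0.46, 0.67, 0.55, 2120),
    "B-OBJ":  (0.45, 0.45, 0.45, 461),
    "I-OBJ":  (0.45, 0.81, 0.58, 2321),
}

# Total number of tokens scored; should match the support column of the average rows.
total_support = sum(r[3] for r in rows.values())


def macro_avg(col: int) -> float:
    """Unweighted mean of one metric column across all labels."""
    return sum(r[col] for r in rows.values()) / len(rows)


def weighted_avg(col: int) -> float:
    """Mean of one metric column, weighted by each label's support."""
    return sum(r[col] * r[3] for r in rows.values()) / total_support


# Columns 0, 1, 2 are precision, recall, f1-score.
macro = [round(macro_avg(c), 2) for c in range(3)]
weighted = [round(weighted_avg(c), 2) for c in range(3)]

print("support     ", total_support)
print("macro avg   ", macro)
print("weighted avg", weighted)
```

The gap between the two rows comes from the O label, which accounts for 75421 of the 80768 tokens and therefore dominates the weighted average; the macro average gives the four SUBJ/OBJ labels equal weight.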
# Notes
# Best dev macro F1: 0.7000 (epoch 6)
# Model: dslim/bert-large-NER, 10 epochs, BIO token classification
# Train/Dev/Test rows: 6791 / 627 / 631
# Label scheme: O, B-SUBJ, I-SUBJ, B-OBJ, I-OBJ
# Known span-level pattern: I-class F1 (0.55 I-SUBJ, 0.58 I-OBJ) exceeds B-class F1
# (0.43 B-SUBJ, 0.45 B-OBJ), so predicted spans may be off by one token at their
# boundaries. The resolver should expand spans to token boundaries when matching
# them to coref clusters by char-offset overlap.
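
The resolver step described in the last note could look roughly like the following. This is a hedged sketch, not the project's actual resolver: it assumes spans, tokens and coref mentions are all given as half-open `(start, end)` character offsets, and the names `expand_to_token_boundaries`, `overlap` and `best_cluster` are hypothetical helpers introduced here.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # half-open [start, end) character offsets (assumed convention)


def expand_to_token_boundaries(span: Span, token_offsets: List[Span]) -> Span:
    """Widen a span so it starts and ends on the edges of every token it touches."""
    start, end = span
    touched = [(s, e) for s, e in token_offsets if s < end and e > start]
    if not touched:
        return span  # span falls between tokens; leave it unchanged
    return (min(s for s, _ in touched), max(e for _, e in touched))


def overlap(a: Span, b: Span) -> int:
    """Number of characters shared by two half-open spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def best_cluster(span: Span, clusters: List[List[Span]]) -> Optional[int]:
    """Index of the cluster whose mention overlaps the span most, or None if none overlap."""
    best_idx, best_len = None, 0
    for idx, mentions in enumerate(clusters):
        for mention in mentions:
            shared = overlap(span, mention)
            if shared > best_len:
                best_idx, best_len = idx, shared
    return best_idx


# Toy example: "The quick fox saw her"
token_offsets = [(0, 3), (4, 9), (10, 13), (14, 17), (18, 21)]
clusters = [[(0, 13)], [(18, 21)]]  # hypothetical coref mentions

predicted = (6, 13)  # starts mid-token inside "quick"
expanded = expand_to_token_boundaries(predicted, token_offsets)
print(expanded, best_cluster(expanded, clusters))
```

Matching by overlap rather than exact equality means a span that misses its first token entirely still lands on the right cluster, as long as the remaining tokens overlap one of its mentions. Ties are resolved in favour of the earliest cluster, which is an arbitrary choice in this sketch.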