HavelockAI/bert-token-classifier

@@ -31,8 +31,8 @@ This model performs span-level detection of 72 rhetorical marker types using BIO
 | Base model | `bert-base-uncased` |
 | Task | Token classification (BIO tagging) |
 | Labels | 145 (72 marker types × B/I + O) |
-| Best F1 | **0.4611** (macro, markers only) |
-| Training | 15 epochs, batch 8, lr 2e-5 |
 | Loss | Focal loss (γ=1.0) for class imbalance |
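The label count in the table follows from the BIO scheme: each of the 72 marker types gets a `B-` and an `I-` tag, plus one shared `O` tag. A minimal sketch of that arithmetic is below. The function names are hypothetical, the two marker names are just samples taken from the per-marker report, and the ordering (`O`, then all `B-`, then all `I-`) only mirrors the row order of that report; the model's actual `id2label` mapping is not shown in this diff.

```python
def build_bio_labels(marker_types):
    """Return a BIO label list: "O", then every B- tag, then every I- tag.

    The ordering is an assumption for illustration, not the model's
    confirmed label-to-id mapping.
    """
    names = sorted(marker_types)
    return ["O"] + [f"B-{n}" for n in names] + [f"I-{n}" for n in names]


def label_count(num_types):
    """Number of BIO labels for a given number of marker types."""
    return 2 * num_types + 1


# Two sample marker types from the per-marker report.
example_labels = build_bio_labels(["oral_vocative", "literate_aside"])
print(example_labels)
print(label_count(72))  # 2 * 72 + 1 = 145
```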
 ## Usage
@@ -124,156 +124,159 @@ Per-class F1 on test set:
 <details><summary>Click to show per-marker precision/recall/F1/support</summary>
 ```
-                                   precision    recall  f1-score   support
-                                O      0.721     0.835     0.774      3496
-         B-literate_abstract_noun      0.500     0.286     0.364        14
-       B-literate_additive_formal      0.667     0.667     0.667         3
-         B-literate_agent_demoted      0.800     1.000     0.889         4
-     B-literate_agentless_passive      0.312     0.417     0.357        24
-                 B-literate_aside      0.438     0.778     0.560         9
- B-literate_categorical_statement      0.333     0.500     0.400         4
-          B-literate_causal_chain      0.667     0.667     0.667         3
-       B-literate_causal_explicit      0.538     0.636     0.583        11
-              B-literate_citation      0.000     0.000     0.000        10
-   B-literate_conceptual_metaphor      0.500     0.167     0.250         6
-            B-literate_concessive      1.000     1.000     1.000         2
-  B-literate_concessive_connector      0.800     0.800     0.800         5
-           B-literate_conditional      0.667     0.714     0.690        14
-           B-literate_contrastive      0.333     0.375     0.353         8
-     B-literate_definitional_move      1.000     1.000     1.000         1
-           B-literate_enumeration      0.500     0.667     0.571         3
-       B-literate_epistemic_hedge      0.371     0.542     0.441        24
-            B-literate_evidential      0.000     0.000     0.000        11
-    B-literate_footnote_reference      0.500     0.667     0.571         3
- B-literate_institutional_subject      0.600     1.000     0.750         3
-        B-literate_list_structure      0.000     0.000     0.000         1
-         B-literate_metadiscourse      0.500     0.500     0.500         4
-B-literate_methodological_framing      1.000     0.500     0.667         4
-        B-literate_nested_clauses      0.300     0.545     0.387        22
-        B-literate_nominalization      0.750     0.300     0.429        10
-   B-literate_objectifying_stance      0.800     1.000     0.889         4
-               B-literate_paradox      0.500     0.333     0.400         3
-           B-literate_probability      0.333     0.200     0.250         5
-   B-literate_qualified_assertion      0.000     0.000     0.000         5
-        B-literate_relative_chain      0.327     0.773     0.459        22
-B-literate_technical_abbreviation      0.000     0.000     0.000         2
-        B-literate_technical_term      0.400     0.667     0.500         3
-    B-literate_temporal_embedding      1.000     0.500     0.667         4
-B-literate_third_person_reference      0.250     0.333     0.286         3
-              B-oral_alliteration      0.000     0.000     0.000         3
-                  B-oral_anaphora      0.185     0.333     0.238        15
-                 B-oral_asyndeton      0.000     0.000     0.000         1
-         B-oral_audience_response      1.000     1.000     1.000         4
-       B-oral_binomial_expression      0.333     0.400     0.364         5
-            B-oral_conflict_frame      0.800     0.800     0.800         5
-         B-oral_discourse_formula      0.333     0.500     0.400         6
-            B-oral_dramatic_pause      0.000     0.000     0.000         2
-           B-oral_embodied_action      1.000     0.167     0.286         6
-                B-oral_epistrophe      0.000     0.000     0.000         3
-                   B-oral_epithet      0.333     0.500     0.400         2
-          B-oral_everyday_example      0.750     1.000     0.857         3
-              B-oral_first_person      0.000     0.000     0.000         5
-                B-oral_imperative      0.600     0.643     0.621        14
-              B-oral_inclusive_we      0.486     0.586     0.531        29
-      B-oral_intensifier_doubling      0.667     0.667     0.667         3
-        B-oral_lexical_repetition      0.273     0.300     0.286        10
-          B-oral_named_individual      0.450     0.450     0.450        20
-               B-oral_parallelism      0.000     0.000     0.000         7
-              B-oral_phatic_check      1.000     1.000     1.000         1
-             B-oral_phatic_filler      0.667     0.800     0.727         5
-              B-oral_polysyndeton      0.250     0.100     0.143        10
-                   B-oral_proverb      1.000     0.333     0.500         6
-                   B-oral_refrain      1.000     1.000     1.000         1
-         B-oral_religious_formula      0.000     0.000     0.000         2
-       B-oral_rhetorical_question      0.222     1.000     0.364         2
-                    B-oral_rhythm      0.714     0.833     0.769         6
-             B-oral_second_person      0.533     0.640     0.582        25
-           B-oral_self_correction      0.600     1.000     0.750         3
-            B-oral_sensory_detail      1.000     1.000     1.000         1
-        B-oral_simple_conjunction      0.000     0.000     0.000         3
-            B-oral_specific_place      0.333     0.667     0.444         3
-           B-oral_temporal_anchor      0.000     0.000     0.000         3
-                  B-oral_tricolon      0.200     1.000     0.333         2
-                   B-oral_us_them      0.667     0.667     0.667         3
-                  B-oral_vocative      0.714     0.556     0.625        27
-         I-literate_abstract_noun      0.000     0.000     0.000        12
-       I-literate_additive_formal      0.000     0.000     0.000         6
-         I-literate_agent_demoted      0.500     0.800     0.615        15
-     I-literate_agentless_passive      0.483     0.414     0.446        70
-                 I-literate_aside      0.400     0.235     0.296       102
- I-literate_categorical_statement      0.412     0.304     0.350        23
-          I-literate_causal_chain      0.917     0.440     0.595        25
-       I-literate_causal_explicit      0.444     0.762     0.561        21
-              I-literate_citation      0.444     0.182     0.258        44
-   I-literate_conceptual_metaphor      0.571     0.267     0.364        15
-            I-literate_concessive      0.750     0.429     0.545         7
-  I-literate_concessive_connector      0.400     0.667     0.500         3
-           I-literate_conditional      0.479     0.307     0.374       114
-           I-literate_contrastive      0.600     0.400     0.480        15
-       I-literate_cross_reference      0.000     0.000     0.000         0
-     I-literate_definitional_move      0.833     1.000     0.909         5
-           I-literate_enumeration      0.824     0.718     0.767        39
-       I-literate_epistemic_hedge      0.375     0.341     0.357        44
-            I-literate_evidential      0.333     0.034     0.062        29
-    I-literate_footnote_reference      0.667     0.727     0.696        11
- I-literate_institutional_subject      1.000     1.000     1.000         5
-        I-literate_list_structure      0.000     0.000     0.000         3
-         I-literate_metadiscourse      0.200     0.125     0.154        16
-I-literate_methodological_framing      0.750     0.500     0.600        12
-        I-literate_nested_clauses      0.336     0.127     0.184       379
-        I-literate_nominalization      0.000     0.000     0.000        11
-   I-literate_objectifying_stance      0.917     0.846     0.880        13
-               I-literate_paradox      0.100     0.062     0.077        16
-           I-literate_probability      0.000     0.000     0.000         7
-   I-literate_qualified_assertion      0.000     0.000     0.000        21
-        I-literate_relative_chain      0.402     0.422     0.412       251
-I-literate_technical_abbreviation      0.833     0.455     0.588        11
-        I-literate_technical_term      0.250     0.273     0.261        11
-    I-literate_temporal_embedding      1.000     0.600     0.750        50
-I-literate_third_person_reference      0.556     0.833     0.667         6
-              I-oral_alliteration      0.778     0.778     0.778         9
-                  I-oral_anaphora      0.116     0.080     0.095       100
-                 I-oral_asyndeton      0.000     0.000     0.000         7
-         I-oral_audience_response      0.864     0.905     0.884        21
-       I-oral_binomial_expression      0.533     0.727     0.615        11
-            I-oral_conflict_frame      1.000     0.714     0.833         7
-         I-oral_discourse_formula      0.545     1.000     0.706         6
-            I-oral_dramatic_pause      0.400     0.500     0.444         4
-           I-oral_embodied_action      0.000     0.000     0.000        16
-                I-oral_epistrophe      0.000     0.000     0.000         3
-                   I-oral_epithet      0.400     0.400     0.400         5
-          I-oral_everyday_example      0.947     0.900     0.923        20
-              I-oral_first_person      0.000     0.000     0.000         2
-                I-oral_imperative      0.714     0.370     0.488        27
-              I-oral_inclusive_we      0.754     0.920     0.829        50
-      I-oral_intensifier_doubling      0.800     1.000     0.889         4
-        I-oral_lexical_repetition      0.250     0.317     0.280        41
-          I-oral_named_individual      0.620     0.646     0.633        48
-               I-oral_parallelism      0.485     0.237     0.318       135
-              I-oral_phatic_check      1.000     1.000     1.000         3
-             I-oral_phatic_filler      1.000     0.400     0.571         5
-              I-oral_polysyndeton      0.700     0.171     0.275        82
-                   I-oral_proverb      0.938     0.405     0.566        37
-                   I-oral_refrain      1.000     1.000     1.000         4
-         I-oral_religious_formula      1.000     0.062     0.118        16
-       I-oral_rhetorical_question      0.389     0.467     0.424        15
-                    I-oral_rhythm      0.957     0.584     0.726        77
-             I-oral_second_person      0.250     0.143     0.182         7
-           I-oral_self_correction      0.889     0.800     0.842        20
-            I-oral_sensory_detail      0.833     1.000     0.909         5
-        I-oral_simple_conjunction      0.625     1.000     0.769         5
-            I-oral_specific_place      0.556     0.625     0.588         8
-           I-oral_temporal_anchor      0.056     0.100     0.071        10
-                  I-oral_tricolon      0.329     0.964     0.491        28
-                   I-oral_us_them      0.750     0.333     0.462         9
-                  I-oral_vocative      0.846     0.702     0.767        47
 ```
 </details>
 <details><summary>Click to show split proportions per marker</summary>
 ```
 bio_train.jsonl: 3460 markers across 72 types
 bio_val.jsonl: 514 markers across 70 types
@@ -373,9 +376,10 @@ Markers with <100 examples:  57 (79%)
 </details>
-**Macro F1 (all 145 labels):** 0.4611
-**Weighted F1:** 0.645
-**Accuracy:** 66.5%
 ## Architecture

 | Base model | `bert-base-uncased` |
 | Task | Token classification (BIO tagging) |
 | Labels | 145 (72 marker types × B/I + O) |
+| Best F1 | **0.5003** (macro, markers only) |
+| Training | 20 epochs, batch 8, lr 2e-5 |
 | Loss | Focal loss (γ=1.0) for class imbalance |
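For readers unfamiliar with the loss named in the table, here is a minimal pure-Python sketch of focal loss for a single token, using the standard form `-(1 - p_t)^γ · log(p_t)` with γ = 1.0 as stated above. The training code is not part of this diff, so this is not the repository's implementation: the function names are hypothetical, and details such as per-class α weights or how padding tokens are masked are unknown and left out.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=1.0):
    """Focal loss for one token: -(1 - p_t)**gamma * log(p_t).

    p_t is the predicted probability of the true class. gamma=0
    reduces this to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, expressed as focal loss with gamma=0."""
    return focal_loss(logits, target, gamma=0.0)


# Illustrative logits only: class 0 is strongly predicted.
logits = [4.0, 0.0, 0.0]
easy = focal_loss(logits, target=0)  # true class is the confident one
hard = focal_loss(logits, target=1)  # true class got low probability
print(easy, cross_entropy(logits, 0))
print(hard, cross_entropy(logits, 1))
```

The `(1 - p_t)^γ` factor shrinks the loss on tokens the model already classifies confidently (typically the dominant `O` class) far more than on misclassified tokens, which is why focal loss is a common choice under class imbalance.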
 ## Usage
 <details><summary>Click to show per-marker precision/recall/F1/support</summary>
 ```
+                                   precision    recall  f1-score   support
+                                O      0.733     0.828     0.778      3556
+         B-literate_abstract_noun      0.333     0.286     0.308        14
+       B-literate_additive_formal      1.000     0.667     0.800         3
+         B-literate_agent_demoted      0.800     1.000     0.889         4
+     B-literate_agentless_passive      0.357     0.417     0.385        24
+                 B-literate_aside      0.429     0.667     0.522         9
+ B-literate_categorical_statement      0.500     0.750     0.600         4
+          B-literate_causal_chain      1.000     0.333     0.500         3
+       B-literate_causal_explicit      0.538     0.636     0.583        11
+              B-literate_citation      0.000     0.000     0.000        10
+   B-literate_conceptual_metaphor      0.667     0.333     0.444         6
+            B-literate_concessive      1.000     1.000     1.000         2
+  B-literate_concessive_connector      0.800     0.800     0.800         5
+           B-literate_conditional      0.643     0.643     0.643        14
+           B-literate_contrastive      0.400     0.500     0.444         8
+     B-literate_definitional_move      1.000     1.000     1.000         1
+           B-literate_enumeration      0.500     0.667     0.571         3
+       B-literate_epistemic_hedge      0.387     0.500     0.436        24
+            B-literate_evidential      0.333     0.091     0.143        11
+    B-literate_footnote_reference      0.500     0.667     0.571         3
+ B-literate_institutional_subject      0.750     1.000     0.857         3
+        B-literate_list_structure      0.000     0.000     0.000         1
+         B-literate_metadiscourse      0.500     0.500     0.500         4
+B-literate_methodological_framing      1.000     0.500     0.667         4
+        B-literate_nested_clauses      0.293     0.545     0.381        22
+        B-literate_nominalization      0.750     0.300     0.429        10
+   B-literate_objectifying_stance      0.500     0.500     0.500         4
+               B-literate_paradox      0.500     0.333     0.400         3
+           B-literate_probability      0.333     0.200     0.250         5
+   B-literate_qualified_assertion      0.000     0.000     0.000         5
+        B-literate_relative_chain      0.314     0.727     0.438        22
+B-literate_technical_abbreviation      0.000     0.000     0.000         2
+        B-literate_technical_term      0.333     0.667     0.444         3
+    B-literate_temporal_embedding      1.000     0.500     0.667         4
+B-literate_third_person_reference      0.333     0.333     0.333         3
+              B-oral_alliteration      1.000     0.667     0.800         3
+                  B-oral_anaphora      0.130     0.200     0.158        15
+                 B-oral_asyndeton      0.000     0.000     0.000         1
+         B-oral_audience_response      1.000     1.000     1.000         4
+       B-oral_binomial_expression      0.400     0.400     0.400         5
+            B-oral_conflict_frame      0.800     0.800     0.800         5
+         B-oral_discourse_formula      0.500     0.500     0.500         6
+            B-oral_dramatic_pause      0.000     0.000     0.000         2
+           B-oral_embodied_action      0.333     0.167     0.222         6
+                B-oral_epistrophe      0.000     0.000     0.000         3
+                   B-oral_epithet      0.000     0.000     0.000         2
+          B-oral_everyday_example      1.000     1.000     1.000         3
+              B-oral_first_person      0.000     0.000     0.000         5
+                B-oral_imperative      0.600     0.643     0.621        14
+              B-oral_inclusive_we      0.486     0.586     0.531        29
+      B-oral_intensifier_doubling      1.000     0.667     0.800         3
+        B-oral_lexical_repetition      0.273     0.300     0.286        10
+          B-oral_named_individual      0.600     0.450     0.514        20
+               B-oral_parallelism      0.083     0.143     0.105         7
+              B-oral_phatic_check      1.000     1.000     1.000         1
+             B-oral_phatic_filler      0.429     0.600     0.500         5
+              B-oral_polysyndeton      0.250     0.200     0.222        10
+                   B-oral_proverb      1.000     0.500     0.667         6
+                   B-oral_refrain      1.000     1.000     1.000         1
+         B-oral_religious_formula      1.000     0.500     0.667         2
+       B-oral_rhetorical_question      0.250     1.000     0.400         2
+                    B-oral_rhythm      0.714     0.833     0.769         6
+             B-oral_second_person      0.516     0.640     0.571        25
+           B-oral_self_correction      0.750     1.000     0.857         3
+            B-oral_sensory_detail      1.000     1.000     1.000         1
+        B-oral_simple_conjunction      0.000     0.000     0.000         3
+            B-oral_specific_place      0.400     0.667     0.500         3
+           B-oral_temporal_anchor      0.000     0.000     0.000         3
+                  B-oral_tricolon      0.222     1.000     0.364         2
+                   B-oral_us_them      0.667     0.667     0.667         3
+                  B-oral_vocative      0.941     0.593     0.727        27
+         I-literate_abstract_noun      0.000     0.000     0.000        14
+       I-literate_additive_formal      0.000     0.000     0.000         6
+         I-literate_agent_demoted      0.583     0.933     0.718        15
+     I-literate_agentless_passive      0.420     0.397     0.408        73
+                 I-literate_aside      0.544     0.523     0.533       107
+ I-literate_categorical_statement      0.571     0.348     0.432        23
+          I-literate_causal_chain      0.800     0.640     0.711        25
+       I-literate_causal_explicit      0.576     0.826     0.679        23
+              I-literate_citation      0.706     0.250     0.369        48
+   I-literate_conceptual_metaphor      0.714     0.333     0.455        15
+            I-literate_concessive      0.778     1.000     0.875         7
+  I-literate_concessive_connector      0.200     0.333     0.250         3
+           I-literate_conditional      0.676     0.410     0.511       117
+           I-literate_contrastive      0.286     0.400     0.333        15
+       I-literate_cross_reference      0.000     0.000     0.000         0
+     I-literate_definitional_move      1.000     1.000     1.000         5
+           I-literate_enumeration      1.000     0.375     0.545        40
+       I-literate_epistemic_hedge      0.486     0.370     0.420        46
+            I-literate_evidential      0.250     0.034     0.061        29
+    I-literate_footnote_reference      0.800     0.727     0.762        11
+ I-literate_institutional_subject      0.833     1.000     0.909         5
+        I-literate_list_structure      0.000     0.000     0.000         3
+         I-literate_metadiscourse      0.200     0.125     0.154        16
+I-literate_methodological_framing      0.667     0.500     0.571        12
+        I-literate_nested_clauses      0.489     0.292     0.366       390
+        I-literate_nominalization      0.000     0.000     0.000        14
+   I-literate_objectifying_stance      0.833     0.769     0.800        13
+               I-literate_paradox      0.100     0.062     0.077        16
+           I-literate_probability      0.000     0.000     0.000         7
+   I-literate_qualified_assertion      0.000     0.000     0.000        21
+        I-literate_relative_chain      0.479     0.531     0.504       262
+I-literate_technical_abbreviation      0.667     0.182     0.286        11
+        I-literate_technical_term      0.455     0.357     0.400        14
+    I-literate_temporal_embedding      1.000     0.588     0.741        51
+I-literate_third_person_reference      0.500     0.167     0.250         6
+              I-oral_alliteration      0.857     0.545     0.667        11
+                  I-oral_anaphora      0.208     0.198     0.203       101
+                 I-oral_asyndeton      0.000     0.000     0.000         7
+         I-oral_audience_response      0.905     0.905     0.905        21
+       I-oral_binomial_expression      0.400     0.727     0.516        11
+            I-oral_conflict_frame      1.000     0.714     0.833         7
+         I-oral_discourse_formula      0.667     0.667     0.667         6
+            I-oral_dramatic_pause      0.400     0.500     0.444         4
+           I-oral_embodied_action      0.000     0.000     0.000        16
+                I-oral_epistrophe      0.000     0.000     0.000         3
+                   I-oral_epithet      0.429     0.600     0.500         5
+          I-oral_everyday_example      0.955     1.000     0.977        21
+              I-oral_first_person      0.000     0.000     0.000         2
+                I-oral_imperative      0.615     0.276     0.381        29
+              I-oral_inclusive_we      0.904     0.922     0.913        51
+      I-oral_intensifier_doubling      0.800     1.000     0.889         4
+        I-oral_lexical_repetition      0.196     0.244     0.217        41
+          I-oral_named_individual      0.579     0.589     0.584        56
+               I-oral_parallelism      0.471     0.287     0.357       143
+              I-oral_phatic_check      1.000     1.000     1.000         3
+             I-oral_phatic_filler      0.667     0.400     0.500         5
+              I-oral_polysyndeton      1.000     0.217     0.356        83
+                   I-oral_proverb      1.000     0.568     0.724        37
+                   I-oral_refrain      1.000     1.000     1.000         4
+         I-oral_religious_formula      1.000     0.125     0.222        16
+       I-oral_rhetorical_question      0.429     0.600     0.500        15
+                    I-oral_rhythm      0.957     0.571     0.715        77
+             I-oral_second_person      0.333     0.143     0.200         7
+           I-oral_self_correction      0.842     0.800     0.821        20
+            I-oral_sensory_detail      1.000     0.800     0.889         5
+        I-oral_simple_conjunction      0.667     1.000     0.800         6
+            I-oral_specific_place      0.714     0.625     0.667         8
+           I-oral_temporal_anchor      0.056     0.100     0.071        10
+                  I-oral_tricolon      0.309     0.806     0.446        31
+                   I-oral_us_them      0.571     0.444     0.500         9
+                  I-oral_vocative      0.897     0.745     0.814        47
+                         accuracy                          0.653      6441
+                        macro avg      0.530     0.487     0.481      6441
+                     weighted avg      0.653     0.653     0.637      6441
 ```
 </details>
 <details><summary>Click to show split proportions per marker</summary>
 ```
 bio_train.jsonl: 3460 markers across 72 types
 bio_val.jsonl: 514 markers across 70 types
 </details>
+**Best Val F1 (markers only):** 0.5003
+**Macro F1 (all 145 labels, test):** 0.481
+**Weighted F1 (test):** 0.637
+**Accuracy (test):** 65.3%
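The gap between macro F1 and weighted F1 comes from how per-label scores are averaged: macro gives every label equal weight, while weighted scales each label by its support, so the large `O` class dominates. The sketch below illustrates this on four rows copied from the updated per-marker report. Because it uses only four of the 145 rows, its output will not match the totals above; the helper names are hypothetical.

```python
# (label, f1, support) copied from four rows of the updated test report.
rows = [
    ("O", 0.778, 3556),
    ("I-literate_nested_clauses", 0.366, 390),
    ("B-literate_citation", 0.000, 10),
    ("I-oral_refrain", 1.000, 4),
]


def macro_f1(rows):
    """Unweighted mean of per-label F1: every label counts equally."""
    return sum(f1 for _, f1, _ in rows) / len(rows)


def weighted_f1(rows):
    """Support-weighted mean of per-label F1: frequent labels dominate."""
    total_support = sum(support for _, _, support in rows)
    return sum(f1 * support for _, f1, support in rows) / total_support


macro = macro_f1(rows)
weighted = weighted_f1(rows)
print(f"macro F1 over these rows:    {macro:.3f}")
print(f"weighted F1 over these rows: {weighted:.3f}")
```

Rare labels with an F1 of 0.000 pull the macro average down sharply but barely move the weighted average, which is consistent with the macro score sitting well below the weighted score on this test set.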
 ## Architecture

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:862313c38ca9273d9d4cbb21c001ce9fbf3798c8c3601eddfc009e63303341d7
 size 436035932

 version https://git-lfs.github.com/spec/v1
+oid sha256:843209761c32ebf9e994fe50058c120e9b945d381da0cbec76b14f1fce7fe250
 size 436035932