Upload folder using huggingface_hub

Files changed: README.md (+60, -61), model.safetensors (+1, -1)
README.md CHANGED

@@ -31,7 +31,7 @@ This model performs multi-label span-level detection of 53 rhetorical marker types
 | Base model | `bert-base-uncased` |
 | Task | Multi-label token classification (independent B/I/O per type) |
 | Marker types | 53 (22 oral, 31 literate) |
-| Test macro F1 | **0.
+| Test macro F1 | **0.386** (per-type detection, binary positive = B or I) |
 | Training | 20 epochs, batch 24, lr 3e-5, fp16 |
 | Regularization | Mixout (p=0.1) — stochastic L2 anchor to pretrained weights |
 | Loss | Per-type weighted cross-entropy with inverse-frequency type weights |
@@ -118,61 +118,61 @@ Per-type detection F1 on test set (binary: B or I = positive, O = negative):
 ```
 Type                              Prec     Rec    F1     Sup
 ========================================================================
-literate_abstract_noun 0.
-literate_additive_formal 0.
-literate_agent_demoted 0.
-literate_agentless_passive 0.
-literate_aside 0.
-literate_categorical_statement 0.
-literate_causal_explicit 0.
-literate_citation 0.
-literate_conceptual_metaphor 0.
-literate_concessive 0.
-literate_concessive_connector 0.
-literate_concrete_setting 0.
-literate_conditional 0.
-literate_contrastive 0.
-literate_cross_reference 0.
-literate_definitional_move 0.
-literate_enumeration 0.
-literate_epistemic_hedge 0.
-literate_evidential 0.
-literate_institutional_subject 0.
-literate_list_structure 0.
-literate_metadiscourse 0.
-literate_nested_clauses 0.
-literate_nominalization 0.
-literate_objectifying_stance 0.
-literate_probability 0.
-literate_qualified_assertion 0.
-literate_relative_chain 0.
-literate_technical_abbreviation 0.
-literate_technical_term 0.
-literate_temporal_embedding 0.
-oral_anaphora 0.
-oral_antithesis 0.
-oral_discourse_formula 0.
-oral_embodied_action 0.
-oral_everyday_example 0.
-oral_imperative 0.
-oral_inclusive_we 0.
-oral_intensifier_doubling 0.
-oral_lexical_repetition 0.
-oral_named_individual 0.
-oral_parallelism 0.
-oral_phatic_check 0.
-oral_phatic_filler 0.
-oral_rhetorical_question 0.
-oral_second_person 0.
-oral_self_correction 0.
-oral_sensory_detail 0.
-oral_simple_conjunction 0.
-oral_specific_place 0.
-oral_temporal_anchor 0.
-oral_tricolon 0.
-oral_vocative 0.
+literate_abstract_noun            0.209  0.329  0.255    420
+literate_additive_formal          0.243  0.479  0.322     71
+literate_agent_demoted            0.468  0.664  0.549    414
+literate_agentless_passive        0.555  0.648  0.598   1168
+literate_aside                    0.481  0.469  0.475    469
+literate_categorical_statement    0.084  0.263  0.128    118
+literate_causal_explicit          0.314  0.386  0.347    272
+literate_citation                 0.468  0.431  0.449    255
+literate_conceptual_metaphor      0.370  0.397  0.383    517
+literate_concessive               0.456  0.503  0.478    533
+literate_concessive_connector     0.250  0.603  0.353     63
+literate_concrete_setting         0.186  0.322  0.236    298
+literate_conditional              0.519  0.548  0.533   1514
+literate_contrastive              0.391  0.462  0.424    424
+literate_cross_reference          0.825  0.316  0.457    253
+literate_definitional_move        0.443  0.432  0.438    236
+literate_enumeration              0.147  0.306  0.198    297
+literate_epistemic_hedge          0.236  0.431  0.305    255
+literate_evidential               0.269  0.472  0.342    106
+literate_institutional_subject    0.157  0.450  0.233    111
+literate_list_structure           0.528  0.614  0.567    295
+literate_metadiscourse            0.355  0.407  0.379    447
+literate_nested_clauses           0.143  0.093  0.113   2044
+literate_nominalization           0.433  0.538  0.480   1013
+literate_objectifying_stance      0.451  0.575  0.506    113
+literate_probability              0.439  0.720  0.545     50
+literate_qualified_assertion      0.186  0.077  0.109    142
+literate_relative_chain           0.344  0.606  0.439   1456
+literate_technical_abbreviation   0.500  0.705  0.585    139
+literate_technical_term           0.278  0.423  0.336    825
+literate_temporal_embedding       0.174  0.253  0.206    400
+oral_anaphora                     0.500  0.303  0.377    297
+oral_antithesis                   0.298  0.339  0.317    561
+oral_discourse_formula            0.373  0.461  0.413    492
+oral_embodied_action              0.295  0.368  0.327    454
+oral_everyday_example             0.279  0.307  0.293    420
+oral_imperative                   0.359  0.600  0.449    110
+oral_inclusive_we                 0.579  0.668  0.620    681
+oral_intensifier_doubling         0.429  0.220  0.290     82
+oral_lexical_repetition           0.328  0.382  0.353    275
+oral_named_individual             0.359  0.712  0.478    573
+oral_parallelism                  0.111  0.114  0.112    202
+oral_phatic_check                 0.288  0.436  0.347     39
+oral_phatic_filler                0.389  0.527  0.448    146
+oral_rhetorical_question          0.581  0.892  0.703   1006
+oral_second_person                0.555  0.528  0.541    718
+oral_self_correction              0.293  0.357  0.322    115
+oral_sensory_detail               0.194  0.402  0.262    246
+oral_simple_conjunction           0.174  0.229  0.198    131
+oral_specific_place               0.453  0.751  0.565    406
+oral_temporal_anchor              0.223  0.704  0.339    257
+oral_tricolon                     0.470  0.293  0.361    907
+oral_vocative                     0.386  0.942  0.547     52
 ========================================================================
-Macro avg (types w/ support) 0.
+Macro avg (types w/ support)             0.386
 ```
 
 </details>
@@ -180,10 +180,9 @@ Macro avg (types w/ support) 0.400
 **Missing labels (test set):** 0/53 — all types detected at least once.
 
 Notable patterns:
-- **Strong performers** (F1 > 0.5): rhetorical_question (0.
-- **Weak performers** (F1 < 0.2):
-- **Precision-recall tradeoff**: Most types
-- **Recovered type**: `oral_parallelism` crossed the 150-span threshold and was re-included, though its near-zero recall (0.048) means it is effectively non-functional despite high precision when it does fire.
+- **Strong performers** (F1 > 0.5): rhetorical_question (0.703), inclusive_we (0.620), agentless_passive (0.598), technical_abbreviation (0.585), list_structure (0.567), specific_place (0.565), agent_demoted (0.549), vocative (0.547), probability (0.545), second_person (0.541), conditional (0.533), objectifying_stance (0.506)
+- **Weak performers** (F1 < 0.2): qualified_assertion (0.109), parallelism (0.112), nested_clauses (0.113), categorical_statement (0.128), enumeration (0.198), simple_conjunction (0.198)
+- **Precision-recall tradeoff**: Most types show higher recall than precision, indicating the model over-predicts markers. Notable exceptions include `cross_reference` (0.825 precision / 0.316 recall), `anaphora` (0.500 / 0.303), and `tricolon` (0.470 / 0.293), which remain high-precision but low-recall.
 
 ## Architecture
 
@@ -216,8 +215,8 @@ classifier.bias → randomly initialized
 ## Limitations
 
 - **Recall-dominated errors**: Most types over-predict (recall > precision), producing false positives; downstream applications may need confidence thresholding
-- **Near-zero recall types**: `
-- **Low-precision types**: `
+- **Near-zero recall types**: `qualified_assertion` (0.077 recall), `nested_clauses` (0.093), and `parallelism` (0.114) are rarely detected despite being present in training data
+- **Low-precision types**: `categorical_statement` (0.084), `parallelism` (0.111), and `nested_clauses` (0.143) have precision below 0.15, meaning most predictions for those types are false positives
 - **Context window**: 128 tokens max; longer spans may be truncated
 - **Domain**: Trained primarily on historical/literary texts; may underperform on modern social media
 - **Subjectivity**: Some marker boundaries are inherently ambiguous
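The README's metrics table scores each type as a binary token decision (B or I = positive, O = negative) and macro-averages over types with test support. That scoring can be sketched as follows; this is a minimal illustration, and the function names are not from the repository:

```python
from typing import Dict, List

def detection_f1(gold: List[str], pred: List[str]) -> float:
    """Binary token-level F1 for one marker type: any B or I tag counts
    as positive, O as negative (the scheme stated above the table)."""
    tp = sum(1 for g, p in zip(gold, pred) if g != "O" and p != "O")
    fp = sum(1 for g, p in zip(gold, pred) if g == "O" and p != "O")
    fn = sum(1 for g, p in zip(gold, pred) if g != "O" and p == "O")
    if tp == 0:
        return 0.0
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def macro_f1(per_type_f1: Dict[str, float], support: Dict[str, int]) -> float:
    """Macro average over types with non-zero support, matching the
    'Macro avg (types w/ support)' row."""
    scored = [f1 for t, f1 in per_type_f1.items() if support.get(t, 0) > 0]
    return sum(scored) / len(scored) if scored else 0.0
```

Because the positive class collapses B and I, a span found with the wrong boundary tag still counts as detected; this is why the table measures detection rather than exact span matching.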
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:37d9c74b122fa304421948d1f1bc5ad1d686fb33eab36ae82079c1e8f4a03282
 size 436082548
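The model.safetensors change only swaps the git-lfs pointer: the `oid` is the SHA-256 of the new weight file and `size` its byte length. How such a pointer text is derived can be sketched as follows (illustrative helper, not part of the repository):

```python
import hashlib

def lfs_pointer(data: bytes) -> str:
    """Build a git-lfs pointer for a blob: the oid is the plain SHA-256
    hex digest of the file contents, the size its length in bytes."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )
```

Recomputing the digest of a downloaded checkpoint and comparing it against the pointer's oid is a quick way to confirm the file was not corrupted in transit.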