permutans committed on
Commit ef4ddf1 · verified · 1 Parent(s): 32c2c50

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +60 -34
README.md CHANGED
@@ -2,7 +2,7 @@
2
  license: mit
3
  tags:
4
  - text-classification
5
- - bert
6
  - orality
7
  - linguistics
8
  - rhetorical-analysis
@@ -12,7 +12,7 @@ metrics:
12
  - f1
13
  - accuracy
14
  base_model:
15
- - google-bert/bert-base-uncased
16
  pipeline_tag: text-classification
17
  library_name: transformers
18
  datasets:
@@ -25,7 +25,7 @@ model-index:
25
  name: Marker Type Classification
26
  metrics:
27
  - type: f1
28
- value: 0.583
29
  name: F1 (macro)
30
  - type: accuracy
31
  value: 0.584
@@ -34,7 +34,7 @@ model-index:
34
 
35
  # Havelock Marker Type Classifier
36
 
37
- BERT-based classifier for **18 rhetorical marker types** on the oral–literate spectrum, grounded in Walter Ong's *Orality and Literacy* (1982).
38
 
39
  This is the mid-level of the Havelock span classification hierarchy. Given a text span identified as a rhetorical marker, the model classifies it into one of 18 functional types (e.g., `repetition`, `subordination`, `direct_address`, `hedging_qualification`).
40
 
@@ -42,14 +42,14 @@ This is the mid-level of the Havelock span classification hierarchy. Given a tex
42
 
43
  | Property | Value |
44
  |----------|-------|
45
- | Base model | `bert-base-uncased` |
46
- | Architecture | `BertForSequenceClassification` |
47
  | Task | Multi-class classification (18 classes) |
48
  | Max sequence length | 128 tokens |
49
- | Test F1 (macro) | **0.583** |
50
  | Test Accuracy | **0.584** |
51
  | Missing labels | **0/18** |
52
- | Parameters | ~109M |
53
 
54
  ## Usage
55
  ```python
@@ -91,7 +91,7 @@ The 18 types group fine-grained subtypes into functional families. Prior version
91
 
92
  ### Data
93
 
94
- Span-level annotations from the Havelock corpus. Each span carries a `marker_type` field normalized against a canonical taxonomy at build time. A stratified 80/10/10 train/val/test split was used with swap-based optimization to balance label distributions across splits. The test set contains 2,178 spans.
95
 
96
  ### Hyperparameters
97
 
@@ -109,53 +109,77 @@ Span-level annotations from the Havelock corpus. Each span carries a `marker_typ
109
  | Mixed precision | FP16 |
110
  | Min examples per class | 50 |
111
 
112
  ### Test Set Classification Report
113
 
114
  <details><summary>Click to expand per-class precision/recall/F1/support</summary>
115
  ```
116
  precision recall f1-score support
117
 
118
- abstraction 0.379 0.667 0.483 117
119
- agonistic_framing 0.806 0.781 0.794 32
120
- analytical_distance 0.542 0.483 0.511 120
121
- concrete_situational 0.495 0.385 0.433 143
122
- direct_address 0.693 0.640 0.666 367
123
- formulaic_phrases 0.214 0.471 0.294 51
124
- hedging_qualification 0.512 0.544 0.528 114
125
- literate_feature 0.520 0.803 0.631 66
126
- logical_connectives 0.527 0.548 0.538 124
127
- oral_feature 0.813 0.465 0.592 159
128
- parallelism 0.714 0.789 0.750 19
129
- parataxis 0.647 0.473 0.547 93
130
- passive_agentless 0.643 0.581 0.610 62
131
- performance_markers 0.638 0.481 0.548 77
132
- repetition 0.661 0.724 0.691 156
133
- sound_patterns 0.661 0.536 0.592 69
134
- subordination 0.626 0.639 0.632 296
135
- textual_apparatus 0.711 0.611 0.657 113
136
 
137
  accuracy 0.584 2178
138
- macro avg 0.600 0.590 0.583 2178
139
- weighted avg 0.613 0.584 0.589 2178
140
  ```
141
 
142
  </details>
143
 
144
- **Top performing types (F1 ≥ 0.65):** `agonistic_framing` (0.794), `parallelism` (0.750), `repetition` (0.691), `direct_address` (0.666), `textual_apparatus` (0.657), `literate_feature` (0.631), `subordination` (0.632), `passive_agentless` (0.610).
145
 
146
- **Weakest types (F1 < 0.50):** `formulaic_phrases` (0.294), `concrete_situational` (0.433), `abstraction` (0.483). These are high-frequency classes where confusion with related types is common.
147
 
148
  ## Limitations
149
 
150
- - **Class imbalance**: `direct_address` has 367 test examples while `parallelism` has 19. Weighted F1 (0.589) is close to macro F1 (0.583), indicating reasonably balanced performance, but rare types remain harder.
151
  - **Span-level only**: Requires pre-extracted spans. Does not detect boundaries.
152
  - **128-token context window**: Longer spans are truncated.
153
- - **Abstraction underperforms**: At 0.483 F1 despite being the 3rd largest class (117 test spans), suggesting the type may be too broad or overlapping with `analytical_distance` and `literate_feature`.
154
 
155
  ## Theoretical Background
156
 
157
  The type level captures functional groupings within the oral–literate framework. Oral types reflect Ong's characterization of oral discourse as additive (`parataxis`), aggregative (`formulaic_phrases`), redundant (`repetition`), agonistically toned (`agonistic_framing`), empathetic and participatory (`direct_address`), and close to the human lifeworld (`concrete_situational`). Literate types capture the analytic (`abstraction`, `subordination`), distanced (`analytical_distance`, `passive_agentless`), and self-referential (`textual_apparatus`) qualities of written discourse.
158
 
159
  ## Citation
160
  ```bibtex
161
  @misc{havelock2026type,
@@ -169,7 +193,9 @@ The type level captures functional groupings within the oral–literate framewor
169
  ## References
170
 
171
  - Ong, Walter J. *Orality and Literacy: The Technologizing of the Word*. Routledge, 1982.
172
 
173
  ---
174
 
175
- *Model version: b31f147d · Trained: February 2026*
 
2
  license: mit
3
  tags:
4
  - text-classification
5
+ - modernbert
6
  - orality
7
  - linguistics
8
  - rhetorical-analysis
 
12
  - f1
13
  - accuracy
14
  base_model:
15
+ - answerdotai/ModernBERT-base
16
  pipeline_tag: text-classification
17
  library_name: transformers
18
  datasets:
 
25
  name: Marker Type Classification
26
  metrics:
27
  - type: f1
28
+ value: 0.573
29
  name: F1 (macro)
30
  - type: accuracy
31
  value: 0.584
 
34
 
35
  # Havelock Marker Type Classifier
36
 
37
+ ModernBERT-based classifier for **18 rhetorical marker types** on the oral–literate spectrum, grounded in Walter Ong's *Orality and Literacy* (1982).
38
 
39
  This is the mid-level of the Havelock span classification hierarchy. Given a text span identified as a rhetorical marker, the model classifies it into one of 18 functional types (e.g., `repetition`, `subordination`, `direct_address`, `hedging_qualification`).
40
 
 
42
 
43
  | Property | Value |
44
  |----------|-------|
45
+ | Base model | `answerdotai/ModernBERT-base` |
46
+ | Architecture | `ModernBertForSequenceClassification` |
47
  | Task | Multi-class classification (18 classes) |
48
  | Max sequence length | 128 tokens |
49
+ | Test F1 (macro) | **0.573** |
50
  | Test Accuracy | **0.584** |
51
  | Missing labels | **0/18** |
52
+ | Parameters | ~149M |
53
 
54
  ## Usage
55
  ```python
 
91
 
92
  ### Data
93
 
94
+ 22,367 span-level annotations from the Havelock corpus. Each span carries a `marker_type` field normalized against a canonical taxonomy at build time. A stratified 80/10/10 train/val/test split was used with swap-based optimization to balance label distributions across splits. The test set contains 2,178 spans.
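The stratification step can be illustrated with a minimal per-label sketch (the swap-based balancing pass is omitted, and the toy label counts below are hypothetical):

```python
import random

# Minimal sketch of a stratified 80/10/10 split: shuffle within each label,
# then slice each label group proportionally. Toy data, not the real corpus.
random.seed(0)
spans = [{"id": i, "marker_type": t} for i, t in enumerate(
    ["repetition"] * 50 + ["parataxis"] * 30 + ["direct_address"] * 20)]

by_label = {}
for s in spans:
    by_label.setdefault(s["marker_type"], []).append(s)

train, val, test = [], [], []
for group in by_label.values():
    random.shuffle(group)
    n = len(group)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    train += group[:n_train]
    val += group[n_train:n_train + n_val]
    test += group[n_train + n_val:]

print(len(train), len(val), len(test))  # -> 80 10 10
```

Slicing per label (rather than over the pooled list) keeps each split's label distribution close to the corpus distribution; the card's swap-based optimization then refines that balance further.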
95
 
96
  ### Hyperparameters
97
 
 
109
  | Mixed precision | FP16 |
110
  | Min examples per class | 50 |
111
 
112
+ ### Training Metrics
113
+
114
+ Best checkpoint selected at epoch 15, using missing-label count as the primary criterion and macro F1 as the tiebreaker (0 missing labels, F1 0.590).
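This selection rule can be sketched in a few lines (an illustration only; the `checkpoints` list below is hypothetical, not the actual training log):

```python
# Sketch of the checkpoint-selection rule described above: prefer the fewest
# missing labels, then break ties on the highest macro F1.
# The candidate values here are hypothetical, not the real training history.
checkpoints = [
    {"epoch": 12, "missing_labels": 2, "macro_f1": 0.601},
    {"epoch": 15, "missing_labels": 0, "macro_f1": 0.590},
    {"epoch": 18, "missing_labels": 0, "macro_f1": 0.584},
]

# Sort key: missing-label count ascending, then macro F1 descending.
best = min(checkpoints, key=lambda c: (c["missing_labels"], -c["macro_f1"]))
print(best["epoch"])  # -> 15
```

Note that epoch 12 has the highest F1 overall but is rejected because two classes never appear in its predictions.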
115
+
116
  ### Test Set Classification Report
117
 
118
  <details><summary>Click to expand per-class precision/recall/F1/support</summary>
119
  ```
120
  precision recall f1-score support
121
 
122
+ abstraction 0.368 0.658 0.472 117
123
+ agonistic_framing 0.857 0.750 0.800 32
124
+ analytical_distance 0.504 0.475 0.489 120
125
+ concrete_situational 0.509 0.385 0.438 143
126
+ direct_address 0.671 0.689 0.680 367
127
+ formulaic_phrases 0.205 0.608 0.307 51
128
+ hedging_qualification 0.600 0.500 0.545 114
129
+ literate_feature 0.478 0.833 0.608 66
130
+ logical_connectives 0.621 0.516 0.564 124
131
+ oral_feature 0.784 0.365 0.498 159
132
+ parallelism 0.688 0.579 0.629 19
133
+ parataxis 0.655 0.387 0.486 93
134
+ passive_agentless 0.721 0.500 0.590 62
135
+ performance_markers 0.660 0.403 0.500 77
136
+ repetition 0.738 0.705 0.721 156
137
+ sound_patterns 0.672 0.623 0.647 69
138
+ subordination 0.622 0.689 0.654 296
139
+ textual_apparatus 0.718 0.655 0.685 113
140
 
141
  accuracy 0.584 2178
142
+ macro avg 0.615 0.573 0.573 2178
143
+ weighted avg 0.624 0.584 0.587 2178
144
  ```
145
 
146
  </details>
147
 
148
+ **Top performing types (F1 ≥ 0.60):** `agonistic_framing` (0.800), `repetition` (0.721), `textual_apparatus` (0.685), `direct_address` (0.680), `subordination` (0.654), `sound_patterns` (0.647), `parallelism` (0.629), `literate_feature` (0.608).
149
+
150
+ **Weakest types (F1 < 0.50):** `formulaic_phrases` (0.307), `concrete_situational` (0.438), `abstraction` (0.472), `parataxis` (0.486), `oral_feature` (0.498). `formulaic_phrases` suffers from severe precision collapse (P=0.205) despite reasonable recall, suggesting heavy confusion with other oral types. `oral_feature` shows the inverse pattern (P=0.784, R=0.365) — the model is confident but conservative.
151
 
152
+ ## Class Distribution
153
+
154
+ | Support Range | Classes | Count |
155
+ |---------------|---------|-------|
156
+ | >2500 | `direct_address`, `subordination`, `abstraction` | 3 |
157
+ | 1000–2500 | `repetition`, `formulaic_phrases`, `hedging_qualification`, `analytical_distance`, `concrete_situational`, `logical_connectives`, `textual_apparatus` | 7 |
158
+ | 500–1000 | `sound_patterns`, `passive_agentless`, `performance_markers`, `parataxis`, `literate_feature`, `oral_feature` | 6 |
159
+ | <500 | `agonistic_framing`, `parallelism` | 2 |
160
 
161
  ## Limitations
162
 
163
+ - **Class imbalance**: `direct_address` has 367 test examples while `parallelism` has 19. Weighted F1 (0.587) is close to macro F1 (0.573), indicating reasonably balanced performance, but rare types remain harder.
164
  - **Span-level only**: Requires pre-extracted spans. Does not detect boundaries.
165
  - **128-token context window**: Longer spans are truncated.
166
+ - **Abstraction underperforms**: At 0.472 F1 despite being a large class (117 test spans), suggesting the type may be too broad or overlapping with `analytical_distance` and `literate_feature`.
167
+ - **Precision-recall asymmetry**: Several types show strong precision–recall imbalance (`oral_feature` P=0.784/R=0.365; `formulaic_phrases` P=0.205/R=0.608), indicating the focal loss weighting could be further tuned.
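The macro/weighted gap cited above comes down to how per-class scores are averaged; a minimal sketch using three per-class scores taken from the report in this card:

```python
# Macro F1 averages per-class scores equally; weighted F1 weights each class
# by its support. Per-class (f1, support) pairs below are from the test report.
per_class = {
    "direct_address":    (0.680, 367),
    "parallelism":       (0.629, 19),
    "formulaic_phrases": (0.307, 51),
}

macro = sum(f1 for f1, _ in per_class.values()) / len(per_class)
total = sum(n for _, n in per_class.values())
weighted = sum(f1 * n for f1, n in per_class.values()) / total

# On this subset the high-support class dominates the weighted average,
# while macro treats the rare `parallelism` class equally.
print(round(macro, 3), round(weighted, 3))  # -> 0.539 0.634
```

Over all 18 classes the two averages nearly coincide (0.573 vs 0.587), which is why the card reads this as reasonably balanced performance.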
168
 
169
  ## Theoretical Background
170
 
171
  The type level captures functional groupings within the oral–literate framework. Oral types reflect Ong's characterization of oral discourse as additive (`parataxis`), aggregative (`formulaic_phrases`), redundant (`repetition`), agonistically toned (`agonistic_framing`), empathetic and participatory (`direct_address`), and close to the human lifeworld (`concrete_situational`). Literate types capture the analytic (`abstraction`, `subordination`), distanced (`analytical_distance`, `passive_agentless`), and self-referential (`textual_apparatus`) qualities of written discourse.
172
 
173
+ ## Related Models
174
+
175
+ | Model | Task | Classes | F1 |
176
+ |-------|------|---------|-----|
177
+ | [`HavelockAI/bert-marker-category`](https://huggingface.co/HavelockAI/bert-marker-category) | Binary (oral/literate) | 2 | 0.875 |
178
+ | **This model** | Functional type | 18 | 0.573 |
179
+ | [`HavelockAI/bert-marker-subtype`](https://huggingface.co/HavelockAI/bert-marker-subtype) | Fine-grained subtype | 71 | 0.493 |
180
+ | [`HavelockAI/bert-orality-regressor`](https://huggingface.co/HavelockAI/bert-orality-regressor) | Document-level score | Regression | MAE 0.079 |
181
+ | [`HavelockAI/bert-token-classifier`](https://huggingface.co/HavelockAI/bert-token-classifier) | Span detection (BIO) | 145 | 0.500 |
182
+
183
  ## Citation
184
  ```bibtex
185
  @misc{havelock2026type,
 
193
  ## References
194
 
195
  - Ong, Walter J. *Orality and Literacy: The Technologizing of the Word*. Routledge, 1982.
196
+ - Lee, C. et al. "Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models." ICLR 2020.
197
+ - Warner, B. et al. "Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference." 2024.
198
 
199
  ---
200
 
201
+ *Trained: February 2026*