---
license: mit
tags:
- text-classification
- modernbert
- orality
- linguistics
- rhetorical-analysis
language:
- en
metrics:
- f1
- accuracy
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: text-classification
library_name: transformers
datasets:
- custom
model-index:
- name: bert-marker-type
results:
- task:
type: text-classification
name: Marker Type Classification
metrics:
- type: f1
value: 0.573
name: F1 (macro)
- type: accuracy
value: 0.584
name: Accuracy
---
# Havelock Marker Type Classifier
ModernBERT-based classifier for **18 rhetorical marker types** on the oral–literate spectrum, grounded in Walter Ong's *Orality and Literacy* (1982).
This is the mid-level of the Havelock span classification hierarchy. Given a text span identified as a rhetorical marker, the model classifies it into one of 18 functional types (e.g., `repetition`, `subordination`, `direct_address`, `hedging_qualification`).
## Model Details
| Property | Value |
|----------|-------|
| Base model | `answerdotai/ModernBERT-base` |
| Architecture | `ModernBertForSequenceClassification` |
| Task | Multi-class classification (18 classes) |
| Max sequence length | 128 tokens |
| Test F1 (macro) | **0.573** |
| Test Accuracy | **0.584** |
| Missing labels | **0/18** |
| Parameters | ~149M |
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
model_name = "HavelockAI/bert-marker-type"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
span = "whether or not the underlying assumptions hold true"
inputs = tokenizer(span, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
pred = torch.argmax(logits, dim=1).item()
print(f"Marker type: {model.config.id2label[pred]}")
```
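For downstream filtering it can be more useful to inspect the full probability distribution than the argmax alone. A minimal, framework-free sketch of ranking labels by softmax probability (`rank_labels` is an illustrative helper, not part of this repository; in practice the logits would come from `model(**inputs).logits[0].tolist()` and the mapping from `model.config.id2label`):

```python
import math

def rank_labels(logits, id2label):
    """Softmax over raw logits, then sort (label, probability) pairs descending."""
    m = max(logits)                                # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(enumerate(probs), key=lambda t: -t[1])
    return [(id2label[i], p) for i, p in ranked]
```

The top few entries give a quick confidence check, e.g. whether `hedging_qualification` narrowly beat `analytical_distance` or won outright.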
## Label Taxonomy (18 types)
The 18 types group fine-grained subtypes into functional families. Prior versions carried spurious label variants (e.g., `hedging` alongside `hedging_qualification`, `passive` alongside `passive_agentless`) introduced by inconsistent upstream annotation. These have been resolved via a canonical taxonomy with normalization and validation at build time.
| Oral Types (10) | Literate Types (8) |
|------------|----------------|
| `direct_address` | `subordination` |
| `repetition` | `abstraction` |
| `formulaic_phrases` | `hedging_qualification` |
| `parallelism` | `analytical_distance` |
| `parataxis` | `logical_connectives` |
| `sound_patterns` | `textual_apparatus` |
| `performance_markers` | `literate_feature` |
| `concrete_situational` | `passive_agentless` |
| `agonistic_framing` | |
| `oral_feature` | |
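The build-time normalization described above can be sketched as an alias map over the canonical label set. The two alias pairs below (`hedging`, `passive`) are the examples given in this card; the `ALIASES` table and `normalize_marker_type` function are illustrative names, not the project's actual code:

```python
# Canonical 18-type taxonomy (10 oral + 8 literate).
CANONICAL = {
    "direct_address", "repetition", "formulaic_phrases", "parallelism",
    "parataxis", "sound_patterns", "performance_markers",
    "concrete_situational", "agonistic_framing", "oral_feature",
    "subordination", "abstraction", "hedging_qualification",
    "analytical_distance", "logical_connectives", "textual_apparatus",
    "literate_feature", "passive_agentless",
}

# Spurious variants from inconsistent upstream annotation, mapped to canonical labels.
ALIASES = {
    "hedging": "hedging_qualification",
    "passive": "passive_agentless",
}

def normalize_marker_type(label: str) -> str:
    """Resolve a raw annotation to the canonical taxonomy, or fail loudly."""
    label = ALIASES.get(label, label)
    if label not in CANONICAL:
        raise ValueError(f"unknown marker type: {label!r}")
    return label
```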
## Training
### Data
22,367 span-level annotations from the Havelock corpus. Each span carries a `marker_type` field normalized against a canonical taxonomy at build time. A stratified 80/10/10 train/val/test split was used with swap-based optimization to balance label distributions across splits. The test set contains 2,178 spans.
### Hyperparameters
| Parameter | Value |
|-----------|-------|
| Epochs | 20 |
| Batch size | 16 |
| Learning rate | 3e-5 |
| Optimizer | AdamW (weight decay 0.01) |
| LR schedule | Cosine with 10% warmup |
| Gradient clipping | 1.0 |
| Loss | Focal loss (γ=2.0) + class weights |
| Label smoothing | 0.0 |
| Mixout | 0.1 |
| Mixed precision | FP16 |
| Min examples per class | 50 |
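The focal loss in the table down-weights well-classified examples: for a single example with predicted probability p for the true class and per-class weight w, it is FL = -w · (1 - p)^γ · log p with γ = 2.0. A scalar sketch of the formula (the training code presumably uses a batched PyTorch implementation; this only illustrates the math):

```python
import math

def focal_loss(p_true: float, class_weight: float = 1.0, gamma: float = 2.0) -> float:
    """Focal loss for one example: -w * (1 - p)^gamma * log(p).

    As p_true -> 1, the (1 - p)^gamma factor shrinks the loss toward zero,
    so easy examples contribute little and rare/hard classes dominate.
    """
    return -class_weight * (1.0 - p_true) ** gamma * math.log(p_true)
```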
### Training Metrics
Best checkpoint selected at epoch 15, ranking checkpoints first by fewest missing labels and breaking ties on macro F1 (0 missing labels, F1 0.590).
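The selection rule above (fewest missing labels first, macro F1 as tiebreaker) reduces to a single sort key; `checkpoints` and its field names here are hypothetical per-epoch records, not the project's actual data structure:

```python
def select_checkpoint(checkpoints):
    """Pick the checkpoint with the fewest missing labels; break ties on macro F1."""
    return min(checkpoints, key=lambda c: (c["missing_labels"], -c["f1_macro"]))
```

This prioritizes taxonomy coverage over raw F1: a checkpoint that predicts every class at least once beats a slightly higher-F1 checkpoint that never emits some rare label.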
### Test Set Classification Report
<details><summary>Click to expand per-class precision/recall/F1/support</summary>

```
precision recall f1-score support
abstraction 0.368 0.658 0.472 117
agonistic_framing 0.857 0.750 0.800 32
analytical_distance 0.504 0.475 0.489 120
concrete_situational 0.509 0.385 0.438 143
direct_address 0.671 0.689 0.680 367
formulaic_phrases 0.205 0.608 0.307 51
hedging_qualification 0.600 0.500 0.545 114
literate_feature 0.478 0.833 0.608 66
logical_connectives 0.621 0.516 0.564 124
oral_feature 0.784 0.365 0.498 159
parallelism 0.688 0.579 0.629 19
parataxis 0.655 0.387 0.486 93
passive_agentless 0.721 0.500 0.590 62
performance_markers 0.660 0.403 0.500 77
repetition 0.738 0.705 0.721 156
sound_patterns 0.672 0.623 0.647 69
subordination 0.622 0.689 0.654 296
textual_apparatus 0.718 0.655 0.685 113
accuracy 0.584 2178
macro avg 0.615 0.573 0.573 2178
weighted avg 0.624 0.584 0.587 2178
```
</details>
**Top performing types (F1 ≥ 0.65):** `agonistic_framing` (0.800), `repetition` (0.721), `textual_apparatus` (0.685), `direct_address` (0.680), `subordination` (0.654), `sound_patterns` (0.647), `parallelism` (0.629), `literate_feature` (0.608).
**Weakest types (F1 < 0.50):** `formulaic_phrases` (0.307), `concrete_situational` (0.438), `abstraction` (0.472), `parataxis` (0.486), `oral_feature` (0.498). `formulaic_phrases` suffers from severe precision collapse (P=0.205) despite reasonable recall, suggesting heavy confusion with other oral types. `oral_feature` shows the inverse pattern (P=0.784, R=0.365) — the model is confident but conservative.
## Class Distribution
| Corpus support | Classes | Count |
|----------------|---------|-------|
| >2500 | `direct_address`, `subordination`, `abstraction` | 3 |
| 1000–2500 | `repetition`, `formulaic_phrases`, `hedging_qualification`, `analytical_distance`, `concrete_situational`, `logical_connectives`, `textual_apparatus` | 7 |
| 500–1000 | `sound_patterns`, `passive_agentless`, `performance_markers`, `parataxis`, `literate_feature`, `oral_feature` | 6 |
| <500 | `agonistic_framing`, `parallelism` | 2 |
## Limitations
- **Class imbalance**: `direct_address` has 367 test examples while `parallelism` has 19. Weighted F1 (0.587) is close to macro F1 (0.573), indicating reasonably balanced performance, but rare types remain harder.
- **Span-level only**: Requires pre-extracted spans. Does not detect boundaries.
- **128-token context window**: Longer spans are truncated.
- **`abstraction` underperforms**: 0.472 F1 despite substantial support (117 test spans), suggesting the type may be too broad or may overlap with `analytical_distance` and `literate_feature`.
- **Precision-recall asymmetry**: Several types show strong precision–recall imbalance (`oral_feature` P=0.784/R=0.365; `formulaic_phrases` P=0.205/R=0.608), indicating the focal loss weighting could be further tuned.
## Theoretical Background
The type level captures functional groupings within the oral–literate framework. Oral types reflect Ong's characterization of oral discourse as additive (`parataxis`), aggregative (`formulaic_phrases`), redundant (`repetition`), agonistically toned (`agonistic_framing`), empathetic and participatory (`direct_address`), and close to the human lifeworld (`concrete_situational`). Literate types capture the analytic (`abstraction`, `subordination`), distanced (`analytical_distance`, `passive_agentless`), and self-referential (`textual_apparatus`) qualities of written discourse.
## Related Models
| Model | Task | Classes | F1 |
|-------|------|---------|-----|
| [`HavelockAI/bert-marker-category`](https://huggingface.co/HavelockAI/bert-marker-category) | Binary (oral/literate) | 2 | 0.875 |
| **This model** | Functional type | 18 | 0.573 |
| [`HavelockAI/bert-marker-subtype`](https://huggingface.co/HavelockAI/bert-marker-subtype) | Fine-grained subtype | 71 | 0.493 |
| [`HavelockAI/bert-orality-regressor`](https://huggingface.co/HavelockAI/bert-orality-regressor) | Document-level score | Regression | MAE 0.079 |
| [`HavelockAI/bert-token-classifier`](https://huggingface.co/HavelockAI/bert-token-classifier) | Span detection (BIO) | 145 | 0.500 |
## Citation
```bibtex
@misc{havelock2026type,
  title={Havelock Marker Type Classifier},
  author={Havelock AI},
  year={2026},
  url={https://huggingface.co/HavelockAI/bert-marker-type}
}
```
## References
- Ong, Walter J. *Orality and Literacy: The Technologizing of the Word*. Routledge, 1982.
- Lee, C. et al. "Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models." ICLR 2020.
- Warner, A. et al. "Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference." 2024.
---
*Trained: February 2026*