permutans committed on
Commit
2070386
·
verified ·
1 Parent(s): 5d97200

Upload folder using huggingface_hub

Files changed (5)
  1. README.md +126 -319
  2. config.json +1 -295
  3. head_config.json +5 -0
  4. model.safetensors +2 -2
  5. type_to_idx.json +55 -0
README.md CHANGED
@@ -5,7 +5,7 @@ tags:
5
  - bert
6
  - orality
7
  - linguistics
8
- - ner
9
  language:
10
  - en
11
  metrics:
@@ -22,389 +22,196 @@ datasets:
22
 
23
  BERT-based token classifier for detecting **oral and literate markers** in text, based on Walter Ong's "Orality and Literacy" (1982).
24
 
25
- This model performs span-level detection of 72 rhetorical marker types using BIO tagging (145 labels total).
26
 
27
  ## Model Details
28
 
29
  | Property | Value |
30
  |----------|-------|
31
  | Base model | `bert-base-uncased` |
32
- | Task | Token classification (BIO tagging) |
33
- | Labels | 145 (72 marker types × B/I + O) |
34
- | Best F1 | **0.5003** (macro, markers only) |
35
- | Training | 20 epochs, batch 8, lr 2e-5 |
36
- | Loss | Focal loss (γ=1.0) for class imbalance |
37
 
38
  ## Usage
39
  ```python
40
- from transformers import AutoTokenizer, AutoModelForTokenClassification
41
  import torch
42
 
43
- model_name = "HavelockAI/bert-token-classifier"
44
- tokenizer = AutoTokenizer.from_pretrained(model_name)
45
- model = AutoModelForTokenClassification.from_pretrained(model_name)
46
 
47
  text = "Tell me, O Muse, of that ingenious hero who travelled far and wide"
48
- inputs = tokenizer(text, return_tensors="pt", return_offsets_mapping=True)
49
- offset_mapping = inputs.pop("offset_mapping")
50
 
51
  with torch.no_grad():
52
- outputs = model(**inputs)
53
- predictions = torch.argmax(outputs.logits, dim=-1)
54
 
55
- # Decode predictions
56
  tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
57
- labels = [model.config.id2label[p.item()] for p in predictions[0]]
58
-
59
- for token, label in zip(tokens, labels):
60
- if label != "O":
61
- print(f"{token:15} {label}")
62
- ```
63
-
64
- **Output:**
65
- ```
66
- tell B-oral_imperative
67
- me I-oral_imperative
68
- , I-oral_imperative
69
- o B-oral_vocative
70
- muse I-oral_vocative
71
  ```
72
 
73
  ## Training Data
74
 
75
- - **3,119 examples** with BIO-tagged spans
76
- - **4,474 marker annotations** across 72 types
77
  - Sources: Project Gutenberg, textfiles.com, Reddit, Wikipedia talk pages
78
- - Synthetic examples for rare marker types (30 examples minimum per type)
79
-
80
- ### Class Distribution
81
-
82
- The dataset exhibits extreme class imbalance (72 marker types, long-tail distribution). We use focal loss to down-weight easy examples and focus learning on rare markers.
83
-
84
- | Frequency | Marker types |
85
- |-----------|--------------|
86
- | >100 examples | 15 types (21%) |
87
- | 30-100 examples | 37 types (51%) |
88
- | <30 examples | 20 types (28%) |
89
 
90
- ## Marker Types (72)
91
 
92
- ### Oral Markers (36 types)
93
 
94
  Characteristics of oral tradition and spoken discourse:
95
 
96
  | Category | Markers |
97
  |----------|---------|
98
- | **Repetition & Pattern** | anaphora, epistrophe, parallelism, tricolon, lexical_repetition, refrain |
99
- | **Sound & Rhythm** | alliteration, rhythm, assonance, rhyme |
100
- | **Address & Interaction** | vocative, imperative, second_person, inclusive_we, rhetorical_question, audience_response, phatic_check, phatic_filler |
101
- | **Conjunction** | polysyndeton, asyndeton, simple_conjunction, binomial_expression |
102
- | **Formulas** | discourse_formula, proverb, religious_formula, epithet |
103
  | **Narrative** | named_individual, specific_place, temporal_anchor, sensory_detail, embodied_action, everyday_example |
104
- | **Performance** | dramatic_pause, self_correction, conflict_frame, us_them, first_person, paradox |
105
 
106
- ### Literate Markers (36 types)
107
 
108
  Characteristics of written, analytical discourse:
109
 
110
  | Category | Markers |
111
  |----------|---------|
112
  | **Abstraction** | nominalization, abstract_noun, conceptual_metaphor, categorical_statement |
113
- | **Syntax** | nested_clauses, relative_chain, conditional, concessive, temporal_embedding, causal_chain |
114
  | **Hedging** | epistemic_hedge, probability, evidential, qualified_assertion, concessive_connector |
115
- | **Impersonality** | agentless_passive, agent_demoted, institutional_subject, objectifying_stance, third_person_reference |
116
- | **Scholarly apparatus** | citation, footnote_reference, cross_reference, metadiscourse, methodological_framing |
117
- | **Technical** | technical_term, technical_abbreviation, enumeration, list_structure, definitional_move |
118
- | **Connectives** | contrastive, causal_explicit, additive_formal, paradox |
 
119
 
120
  ## Evaluation
121
 
122
- Per-class F1 on test set:
123
 
124
  <details><summary>Click to show per-marker precision/recall/F1/support</summary>
125
-
126
  ```
127
- precision recall f1-score support
128
-
129
- O 0.733 0.828 0.778 3556
130
- B-literate_abstract_noun 0.333 0.286 0.308 14
131
- B-literate_additive_formal 1.000 0.667 0.800 3
132
- B-literate_agent_demoted 0.800 1.000 0.889 4
133
- B-literate_agentless_passive 0.357 0.417 0.385 24
134
- B-literate_aside 0.429 0.667 0.522 9
135
- B-literate_categorical_statement 0.500 0.750 0.600 4
136
- B-literate_causal_chain 1.000 0.333 0.500 3
137
- B-literate_causal_explicit 0.538 0.636 0.583 11
138
- B-literate_citation 0.000 0.000 0.000 10
139
- B-literate_conceptual_metaphor 0.667 0.333 0.444 6
140
- B-literate_concessive 1.000 1.000 1.000 2
141
- B-literate_concessive_connector 0.800 0.800 0.800 5
142
- B-literate_conditional 0.643 0.643 0.643 14
143
- B-literate_contrastive 0.400 0.500 0.444 8
144
- B-literate_definitional_move 1.000 1.000 1.000 1
145
- B-literate_enumeration 0.500 0.667 0.571 3
146
- B-literate_epistemic_hedge 0.387 0.500 0.436 24
147
- B-literate_evidential 0.333 0.091 0.143 11
148
- B-literate_footnote_reference 0.500 0.667 0.571 3
149
- B-literate_institutional_subject 0.750 1.000 0.857 3
150
- B-literate_list_structure 0.000 0.000 0.000 1
151
- B-literate_metadiscourse 0.500 0.500 0.500 4
152
- B-literate_methodological_framing 1.000 0.500 0.667 4
153
- B-literate_nested_clauses 0.293 0.545 0.381 22
154
- B-literate_nominalization 0.750 0.300 0.429 10
155
- B-literate_objectifying_stance 0.500 0.500 0.500 4
156
- B-literate_paradox 0.500 0.333 0.400 3
157
- B-literate_probability 0.333 0.200 0.250 5
158
- B-literate_qualified_assertion 0.000 0.000 0.000 5
159
- B-literate_relative_chain 0.314 0.727 0.438 22
160
- B-literate_technical_abbreviation 0.000 0.000 0.000 2
161
- B-literate_technical_term 0.333 0.667 0.444 3
162
- B-literate_temporal_embedding 1.000 0.500 0.667 4
163
- B-literate_third_person_reference 0.333 0.333 0.333 3
164
- B-oral_alliteration 1.000 0.667 0.800 3
165
- B-oral_anaphora 0.130 0.200 0.158 15
166
- B-oral_asyndeton 0.000 0.000 0.000 1
167
- B-oral_audience_response 1.000 1.000 1.000 4
168
- B-oral_binomial_expression 0.400 0.400 0.400 5
169
- B-oral_conflict_frame 0.800 0.800 0.800 5
170
- B-oral_discourse_formula 0.500 0.500 0.500 6
171
- B-oral_dramatic_pause 0.000 0.000 0.000 2
172
- B-oral_embodied_action 0.333 0.167 0.222 6
173
- B-oral_epistrophe 0.000 0.000 0.000 3
174
- B-oral_epithet 0.000 0.000 0.000 2
175
- B-oral_everyday_example 1.000 1.000 1.000 3
176
- B-oral_first_person 0.000 0.000 0.000 5
177
- B-oral_imperative 0.600 0.643 0.621 14
178
- B-oral_inclusive_we 0.486 0.586 0.531 29
179
- B-oral_intensifier_doubling 1.000 0.667 0.800 3
180
- B-oral_lexical_repetition 0.273 0.300 0.286 10
181
- B-oral_named_individual 0.600 0.450 0.514 20
182
- B-oral_parallelism 0.083 0.143 0.105 7
183
- B-oral_phatic_check 1.000 1.000 1.000 1
184
- B-oral_phatic_filler 0.429 0.600 0.500 5
185
- B-oral_polysyndeton 0.250 0.200 0.222 10
186
- B-oral_proverb 1.000 0.500 0.667 6
187
- B-oral_refrain 1.000 1.000 1.000 1
188
- B-oral_religious_formula 1.000 0.500 0.667 2
189
- B-oral_rhetorical_question 0.250 1.000 0.400 2
190
- B-oral_rhythm 0.714 0.833 0.769 6
191
- B-oral_second_person 0.516 0.640 0.571 25
192
- B-oral_self_correction 0.750 1.000 0.857 3
193
- B-oral_sensory_detail 1.000 1.000 1.000 1
194
- B-oral_simple_conjunction 0.000 0.000 0.000 3
195
- B-oral_specific_place 0.400 0.667 0.500 3
196
- B-oral_temporal_anchor 0.000 0.000 0.000 3
197
- B-oral_tricolon 0.222 1.000 0.364 2
198
- B-oral_us_them 0.667 0.667 0.667 3
199
- B-oral_vocative 0.941 0.593 0.727 27
200
- I-literate_abstract_noun 0.000 0.000 0.000 14
201
- I-literate_additive_formal 0.000 0.000 0.000 6
202
- I-literate_agent_demoted 0.583 0.933 0.718 15
203
- I-literate_agentless_passive 0.420 0.397 0.408 73
204
- I-literate_aside 0.544 0.523 0.533 107
205
- I-literate_categorical_statement 0.571 0.348 0.432 23
206
- I-literate_causal_chain 0.800 0.640 0.711 25
207
- I-literate_causal_explicit 0.576 0.826 0.679 23
208
- I-literate_citation 0.706 0.250 0.369 48
209
- I-literate_conceptual_metaphor 0.714 0.333 0.455 15
210
- I-literate_concessive 0.778 1.000 0.875 7
211
- I-literate_concessive_connector 0.200 0.333 0.250 3
212
- I-literate_conditional 0.676 0.410 0.511 117
213
- I-literate_contrastive 0.286 0.400 0.333 15
214
- I-literate_cross_reference 0.000 0.000 0.000 0
215
- I-literate_definitional_move 1.000 1.000 1.000 5
216
- I-literate_enumeration 1.000 0.375 0.545 40
217
- I-literate_epistemic_hedge 0.486 0.370 0.420 46
218
- I-literate_evidential 0.250 0.034 0.061 29
219
- I-literate_footnote_reference 0.800 0.727 0.762 11
220
- I-literate_institutional_subject 0.833 1.000 0.909 5
221
- I-literate_list_structure 0.000 0.000 0.000 3
222
- I-literate_metadiscourse 0.200 0.125 0.154 16
223
- I-literate_methodological_framing 0.667 0.500 0.571 12
224
- I-literate_nested_clauses 0.489 0.292 0.366 390
225
- I-literate_nominalization 0.000 0.000 0.000 14
226
- I-literate_objectifying_stance 0.833 0.769 0.800 13
227
- I-literate_paradox 0.100 0.062 0.077 16
228
- I-literate_probability 0.000 0.000 0.000 7
229
- I-literate_qualified_assertion 0.000 0.000 0.000 21
230
- I-literate_relative_chain 0.479 0.531 0.504 262
231
- I-literate_technical_abbreviation 0.667 0.182 0.286 11
232
- I-literate_technical_term 0.455 0.357 0.400 14
233
- I-literate_temporal_embedding 1.000 0.588 0.741 51
234
- I-literate_third_person_reference 0.500 0.167 0.250 6
235
- I-oral_alliteration 0.857 0.545 0.667 11
236
- I-oral_anaphora 0.208 0.198 0.203 101
237
- I-oral_asyndeton 0.000 0.000 0.000 7
238
- I-oral_audience_response 0.905 0.905 0.905 21
239
- I-oral_binomial_expression 0.400 0.727 0.516 11
240
- I-oral_conflict_frame 1.000 0.714 0.833 7
241
- I-oral_discourse_formula 0.667 0.667 0.667 6
242
- I-oral_dramatic_pause 0.400 0.500 0.444 4
243
- I-oral_embodied_action 0.000 0.000 0.000 16
244
- I-oral_epistrophe 0.000 0.000 0.000 3
245
- I-oral_epithet 0.429 0.600 0.500 5
246
- I-oral_everyday_example 0.955 1.000 0.977 21
247
- I-oral_first_person 0.000 0.000 0.000 2
248
- I-oral_imperative 0.615 0.276 0.381 29
249
- I-oral_inclusive_we 0.904 0.922 0.913 51
250
- I-oral_intensifier_doubling 0.800 1.000 0.889 4
251
- I-oral_lexical_repetition 0.196 0.244 0.217 41
252
- I-oral_named_individual 0.579 0.589 0.584 56
253
- I-oral_parallelism 0.471 0.287 0.357 143
254
- I-oral_phatic_check 1.000 1.000 1.000 3
255
- I-oral_phatic_filler 0.667 0.400 0.500 5
256
- I-oral_polysyndeton 1.000 0.217 0.356 83
257
- I-oral_proverb 1.000 0.568 0.724 37
258
- I-oral_refrain 1.000 1.000 1.000 4
259
- I-oral_religious_formula 1.000 0.125 0.222 16
260
- I-oral_rhetorical_question 0.429 0.600 0.500 15
261
- I-oral_rhythm 0.957 0.571 0.715 77
262
- I-oral_second_person 0.333 0.143 0.200 7
263
- I-oral_self_correction 0.842 0.800 0.821 20
264
- I-oral_sensory_detail 1.000 0.800 0.889 5
265
- I-oral_simple_conjunction 0.667 1.000 0.800 6
266
- I-oral_specific_place 0.714 0.625 0.667 8
267
- I-oral_temporal_anchor 0.056 0.100 0.071 10
268
- I-oral_tricolon 0.309 0.806 0.446 31
269
- I-oral_us_them 0.571 0.444 0.500 9
270
- I-oral_vocative 0.897 0.745 0.814 47
271
-
272
- accuracy 0.653 6441
273
- macro avg 0.530 0.487 0.481 6441
274
- weighted avg 0.653 0.653 0.637 6441
275
  ```
276
 
277
  </details>
278
 
279
- <details><summary>Click to show split proportions per marker</summary>
280
- ```
281
- bio_train.jsonl: 3460 markers across 72 types
282
- bio_val.jsonl: 514 markers across 70 types
283
- bio_test.jsonl: 500 markers across 70 types
284
-
285
- ======================================================================
286
- Marker Train Val Test Total
287
- ======================================================================
288
- oral_inclusive_we 207 26 29 262
289
- oral_second_person 160 25 25 210
290
- literate_agentless_passive 158 22 24 204
291
- oral_named_individual 157 26 20 203
292
- literate_relative_chain 146 8 22 176
293
- literate_epistemic_hedge 125 23 24 172
294
- oral_vocative 118 17 27 162
295
- oral_rhetorical_question 132 16 2 150
296
- oral_anaphora 115 10 15 140
297
- oral_imperative 104 16 14 134
298
- literate_nested_clauses 103 4 22 129
299
- literate_abstract_noun 95 20 14 129
300
- oral_discourse_formula 93 15 6 114
301
- literate_conditional 85 10 14 109
302
- oral_specific_place 81 22 3 106
303
- literate_contrastive 65 11 8 84
304
- literate_causal_explicit 69 3 11 83
305
- oral_temporal_anchor 66 14 3 83
306
- oral_parallelism 66 10 7 83
307
- oral_lexical_repetition 48 12 10 70
308
- literate_technical_term 56 8 3 67
309
- literate_aside 51 6 9 66
310
- literate_nominalization 44 3 10 57
311
- oral_tricolon 43 8 2 53
312
- literate_concessive 37 6 2 45
313
- oral_epithet 36 5 2 43
314
- literate_additive_formal 29 4 3 36
315
- oral_polysyndeton 15 10 10 35
316
- literate_list_structure 28 5 1 34
317
- oral_embodied_action 19 6 6 31
318
- literate_metadiscourse 22 5 4 31
319
- oral_binomial_expression 23 3 5 31
320
- oral_alliteration 23 5 3 31
321
- literate_causal_chain 22 5 3 30
322
- oral_epistrophe 23 4 3 30
323
- oral_refrain 25 4 1 30
324
- oral_audience_response 25 1 4 30
325
- oral_self_correction 23 4 3 30
326
- literate_methodological_framing 21 5 4 30
327
- oral_rhythm 21 3 6 30
328
- oral_conflict_frame 24 1 5 30
329
- literate_footnote_reference 25 2 3 30
330
- literate_definitional_move 25 4 1 30
331
- literate_evidential 13 6 11 30
332
- oral_phatic_filler 24 1 5 30
333
- oral_phatic_check 25 4 1 30
334
- literate_agent_demoted 21 5 4 30
335
- literate_enumeration 24 3 3 30
336
- literate_conceptual_metaphor 21 3 6 30
337
- oral_everyday_example 22 5 3 30
338
- oral_us_them 24 3 3 30
339
- oral_intensifier_doubling 25 2 3 30
340
- literate_institutional_subject 22 4 3 29
341
- literate_temporal_embedding 23 2 4 29
342
- literate_concessive_connector 22 2 5 29
343
- literate_third_person_reference 21 5 3 29
344
- literate_probability 21 3 5 29
345
- literate_citation 12 7 10 29
346
- oral_religious_formula 24 3 2 29
347
- literate_technical_abbreviation 24 3 2 29
348
- literate_qualified_assertion 23 1 5 29
349
- literate_categorical_statement 24 1 4 29
350
- oral_first_person 22 2 5 29
351
- oral_simple_conjunction 21 5 3 29
352
- literate_paradox 18 7 3 28
353
- oral_proverb 22 0 6 28 ⚠️
354
- literate_objectifying_stance 21 3 4 28
355
- oral_asyndeton 24 3 1 28
356
- oral_sensory_detail 21 5 1 27
357
- oral_dramatic_pause 20 4 2 26
358
- literate_cross_reference 21 5 0 26 ⚠️
359
- oral_paradox 2 0 0 2 ⚠️
360
- ======================================================================
361
- TOTAL 3460 514 500 4474
362
-
363
- --- Long Tail Summary ---
364
-
365
- Markers with < 10 examples: 1 (1%)
366
- Markers with < 20 examples: 1 (1%)
367
- Markers with < 30 examples: 20 (28%)
368
- Markers with < 50 examples: 48 (67%)
369
- Markers with <100 examples: 57 (79%)
370
- ```
371
-
372
- - **Note**: ⚠️ indicates a 0 sized split
373
- - `oral_proverb`: 0 val split
374
- - `literate_cross_reference`: 0 test split
375
- - `oral_paradox`: 0 val/test splits
376
 
377
- </details>
378
-
379
- **Best Val F1 (markers only):** 0.5003
380
- **Macro F1 (all 145 labels, test):** 0.481
381
- **Weighted F1 (test):** 0.637
382
- **Accuracy (test):** 65.3%
383
 
384
  ## Architecture
385
 
386
- Custom `BertTokenClassifier` with focal loss:
387
  ```
388
  BertModel (bert-base-uncased)
389
  └── Dropout (p=0.1)
390
- └── Linear (768 → 145)
391
- └── FocalLoss (α=1.0, γ=1.0)
392
  ```
393
 
394
- Focal loss addresses class imbalance by down-weighting well-classified tokens (mostly "O") and focusing on hard examples (rare markers).
395
 
396
  ### Initialization
397
 
398
- Fine-tuned from `bert-base-uncased`. The classification head (`classifier.weight`, `classifier.bias`) is randomly initialized:
399
  ```
400
- bert.* layers → loaded from checkpoint
401
  classifier.weight → randomly initialized
402
  classifier.bias → randomly initialized
403
  ```
404
 
405
  ## Limitations
406
 
407
- - **Rare markers**: Types with <10 training examples (e.g., `oral_paradox`, `oral_dramatic_pause`) have poor recall
408
  - **Context window**: 128 tokens max; longer spans may be truncated
409
  - **Domain**: Trained primarily on historical/literary texts; may underperform on modern social media
410
  - **Subjectivity**: Some marker boundaries are inherently ambiguous
@@ -422,8 +229,8 @@ classifier.bias → randomly initialized
422
  ## References
423
 
424
  - Ong, Walter J. *Orality and Literacy: The Technologizing of the Word*. Routledge, 1982.
425
- - Lin, T.-Y. et al. "Focal Loss for Dense Object Detection." ICCV 2017.
426
 
427
  ---
428
 
429
- *Model version: 668564aa • Trained: February 2026*
 
5
  - bert
6
  - orality
7
  - linguistics
8
+ - multi-label
9
  language:
10
  - en
11
  metrics:
 
22
 
23
  BERT-based token classifier for detecting **oral and literate markers** in text, based on Walter Ong's "Orality and Literacy" (1982).
24
 
25
+ This model performs multi-label span-level detection of 53 rhetorical marker types, where each token independently carries B/I/O labels per type, allowing overlapping spans (e.g. a token that is simultaneously part of a concessive and a nested clause).
26
 
27
  ## Model Details
28
 
29
  | Property | Value |
30
  |----------|-------|
31
  | Base model | `bert-base-uncased` |
32
+ | Task | Multi-label token classification (independent B/I/O per type) |
33
+ | Marker types | 53 (22 oral, 31 literate) |
34
+ | Test macro F1 | **0.388** (per-type detection, binary positive = B or I) |
35
+ | Training | 20 epochs, batch 24, lr 3e-5, fp16 |
36
+ | Regularization | Mixout (p=0.1) stochastic L2 anchor to pretrained weights |
37
+ | Loss | Per-type weighted cross-entropy with inverse-frequency type weights |
38
+ | Min examples | 150 (types below this threshold excluded) |
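The inverse-frequency type weighting named in the loss row can be sketched as below. The exact normalization used in training is not stated in this card, so the mean-1 rescaling and the function name are assumptions for illustration.

```python
import torch

def inverse_frequency_weights(span_counts):
    # Hypothetical sketch: weight each marker type by the inverse of its
    # annotated span count, rescaled to mean 1 so the overall loss scale
    # stays comparable to unweighted cross-entropy.
    counts = torch.tensor(span_counts, dtype=torch.float)
    w = 1.0 / counts
    return w * (len(counts) / w.sum())

# Rarer types (smaller counts) receive proportionally larger weights.
weights = inverse_frequency_weights([1609, 555, 76])
```

The resulting vector would be passed per type when accumulating the per-type cross-entropy terms.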
39
 
40
  ## Usage
41
  ```python
42
+ import json
+ from pathlib import Path
43
  import torch
44
+ from transformers import AutoTokenizer
45
+ from estimators.tokens.model import MultiLabelTokenClassifier
46
 
47
+ model_path = Path("models/bert_token_classifier")
48
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
49
+ model = MultiLabelTokenClassifier.load(model_path, device="cpu")
50
+ model.eval()
51
+
52
+ type_to_idx = json.loads((model_path / "type_to_idx.json").read_text())
53
+ idx_to_type = {v: k for k, v in type_to_idx.items()}
54
 
55
  text = "Tell me, O Muse, of that ingenious hero who travelled far and wide"
56
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
 
57
 
58
  with torch.no_grad():
59
+ logits = model(inputs["input_ids"], inputs["attention_mask"])
60
+ preds = logits.argmax(dim=-1) # (1, seq, num_types)
61
 
 
62
  tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
63
+ for i, token in enumerate(tokens):
64
+ active = [
65
+ f"{idx_to_type[t]}={'OBI'[v]}"
66
+ for t, v in enumerate(preds[0, i].tolist())
67
+ if v > 0
68
+ ]
69
+ if active:
70
+ print(f"{token:15} {', '.join(active)}")
71
  ```
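The token-level predictions printed above can be merged into contiguous spans per type. The card does not specify the decoding rule, so the greedy decode below (open a span at `B`, tolerate a stray `I`, extend through following `I` tags) is an assumption:

```python
def bio_to_spans(tags):
    # Merge one type's O/B/I tag sequence into (start, end) token spans,
    # end-exclusive. A span opens at B (or a stray I with no open span)
    # and extends through subsequent I tags.
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans
```

Running this per marker type over `preds[0, :, t]` yields overlapping spans across types.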
72
 
73
  ## Training Data
74
 
75
  - Sources: Project Gutenberg, textfiles.com, Reddit, Wikipedia talk pages
76
+ - Types with fewer than 150 annotated spans are excluded from training
77
+ - Multi-label BIO annotation: tokens can carry labels for multiple overlapping marker types simultaneously
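The multi-label BIO scheme in the bullets above can be illustrated with a hypothetical annotation (tokens and tag sequences invented for illustration):

```python
# Every token carries one O/B/I tag per marker type, so spans for
# different types may overlap freely.
tokens = ["Although", "it", "rained", ",", "we", "left"]
tags = {
    "literate_concessive":     ["B", "I", "I", "O", "O", "O"],
    "literate_nested_clauses": ["B", "I", "I", "I", "I", "I"],
}

# "Although" opens a span for both types at once -- something a single
# flat BIO tag set cannot represent.
overlap = [tok for i, tok in enumerate(tokens)
           if all(seq[i] != "O" for seq in tags.values())]
```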
78
 
79
+ ## Marker Types (53)
80
 
81
+ ### Oral Markers (22 types)
82
 
83
  Characteristics of oral tradition and spoken discourse:
84
 
85
  | Category | Markers |
86
  |----------|---------|
87
+ | **Address & Interaction** | vocative, imperative, second_person, inclusive_we, rhetorical_question, phatic_check, phatic_filler |
88
+ | **Repetition & Pattern** | anaphora, parallelism, tricolon, lexical_repetition, antithesis |
89
+ | **Conjunction** | simple_conjunction |
90
+ | **Formulas** | discourse_formula, intensifier_doubling |
91
  | **Narrative** | named_individual, specific_place, temporal_anchor, sensory_detail, embodied_action, everyday_example |
92
+ | **Performance** | self_correction |
93
 
94
+ ### Literate Markers (31 types)
95
 
96
  Characteristics of written, analytical discourse:
97
 
98
  | Category | Markers |
99
  |----------|---------|
100
  | **Abstraction** | nominalization, abstract_noun, conceptual_metaphor, categorical_statement |
101
+ | **Syntax** | nested_clauses, relative_chain, conditional, concessive, temporal_embedding, causal_explicit |
102
  | **Hedging** | epistemic_hedge, probability, evidential, qualified_assertion, concessive_connector |
103
+ | **Impersonality** | agentless_passive, agent_demoted, institutional_subject, objectifying_stance |
104
+ | **Scholarly apparatus** | citation, cross_reference, metadiscourse, definitional_move |
105
+ | **Technical** | technical_term, technical_abbreviation, enumeration, list_structure |
106
+ | **Connectives** | contrastive, additive_formal |
107
+ | **Setting** | concrete_setting, aside |
108
 
109
  ## Evaluation
110
 
111
+ Per-type detection F1 on test set (binary: B or I = positive, O = negative):
112
 
113
  <details><summary>Click to show per-marker precision/recall/F1/support</summary>
114
  ```
115
+ Type Prec Rec F1 Sup
116
+ ========================================================================
117
+ literate_abstract_noun 0.119 0.114 0.116 466
118
+ literate_additive_formal 0.225 0.576 0.323 85
119
+ literate_agent_demoted 0.345 0.670 0.455 288
120
+ literate_agentless_passive 0.399 0.750 0.521 1286
121
+ literate_aside 0.399 0.599 0.479 461
122
+ literate_categorical_statement 0.191 0.277 0.226 393
123
+ literate_causal_explicit 0.285 0.370 0.322 376
124
+ literate_citation 0.515 0.671 0.582 237
125
+ literate_conceptual_metaphor 0.172 0.387 0.238 222
126
+ literate_concessive 0.475 0.596 0.529 740
127
+ literate_concessive_connector 0.107 0.514 0.178 37
128
+ literate_concrete_setting 0.189 0.462 0.269 292
129
+ literate_conditional 0.511 0.823 0.631 1609
130
+ literate_contrastive 0.310 0.460 0.370 383
131
+ literate_cross_reference 0.390 0.366 0.377 82
132
+ literate_definitional_move 0.288 0.515 0.370 66
133
+ literate_enumeration 0.285 0.743 0.412 855
134
+ literate_epistemic_hedge 0.339 0.564 0.424 541
135
+ literate_evidential 0.323 0.630 0.427 162
136
+ literate_institutional_subject 0.237 0.532 0.328 250
137
+ literate_list_structure 0.795 0.529 0.635 652
138
+ literate_metadiscourse 0.243 0.446 0.314 361
139
+ literate_nested_clauses 0.148 0.398 0.216 1271
140
+ literate_nominalization 0.241 0.490 0.323 1140
141
+ literate_objectifying_stance 0.474 0.469 0.471 192
142
+ literate_probability 0.572 0.728 0.641 114
143
+ literate_qualified_assertion 0.132 0.163 0.146 123
144
+ literate_relative_chain 0.282 0.572 0.378 1753
145
+ literate_technical_abbreviation 0.381 0.773 0.510 132
146
+ literate_technical_term 0.264 0.481 0.341 908
147
+ literate_temporal_embedding 0.187 0.318 0.235 550
148
+ oral_anaphora 0.120 0.348 0.179 141
149
+ oral_antithesis 0.213 0.249 0.230 453
150
+ oral_discourse_formula 0.287 0.432 0.345 570
151
+ oral_embodied_action 0.247 0.430 0.314 465
152
+ oral_everyday_example 0.263 0.411 0.320 358
153
+ oral_imperative 0.402 0.787 0.532 211
154
+ oral_inclusive_we 0.485 0.819 0.609 747
155
+ oral_intensifier_doubling 0.291 0.316 0.303 79
156
+ oral_lexical_repetition 0.331 0.550 0.414 218
157
+ oral_named_individual 0.386 0.708 0.500 818
158
+ oral_parallelism 0.674 0.041 0.077 710
159
+ oral_phatic_check 0.432 0.829 0.568 76
160
+ oral_phatic_filler 0.340 0.630 0.442 184
161
+ oral_rhetorical_question 0.587 0.899 0.710 1276
162
+ oral_second_person 0.421 0.610 0.498 839
163
+ oral_self_correction 0.479 0.372 0.419 156
164
+ oral_sensory_detail 0.249 0.452 0.321 367
165
+ oral_simple_conjunction 0.096 0.343 0.150 70
166
+ oral_specific_place 0.396 0.717 0.510 367
167
+ oral_temporal_anchor 0.347 0.831 0.490 555
168
+ oral_tricolon 0.217 0.220 0.218 560
169
+ oral_vocative 0.505 0.759 0.607 133
170
+ ========================================================================
171
+ Macro avg (types w/ support) 0.388
 
172
  ```
173
 
174
  </details>
175
 
176
+ **Missing labels (test set):** 0/53; every type was detected at least once.
177
 
178
+ Notable patterns:
179
+ - **Strong performers** (F1 > 0.5): rhetorical_question (0.710), probability (0.641), list_structure (0.635), conditional (0.631), inclusive_we (0.609), vocative (0.607), citation (0.582), phatic_check (0.568)
180
+ - **Weak performers** (F1 < 0.2): parallelism (0.077), abstract_noun (0.116), qualified_assertion (0.146), simple_conjunction (0.150), concessive_connector (0.178), anaphora (0.179)
181
+ - **Precision-recall tradeoff**: Most types show higher recall than precision, indicating the model over-predicts rather than under-predicts markers
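The detection metric behind the table above (B or I = positive, O = negative) can be sketched per type as:

```python
def detection_scores(gold, pred):
    # Token-level binary detection for one marker type: a token counts
    # as positive when its tag is B or I, negative when O.
    tp = sum(g != "O" and p != "O" for g, p in zip(gold, pred))
    fp = sum(g == "O" and p != "O" for g, p in zip(gold, pred))
    fn = sum(g != "O" and p == "O" for g, p in zip(gold, pred))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1
```

The reported macro F1 is then the mean of this per-type F1 over types with nonzero support.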
182
 
183
  ## Architecture
184
 
185
+ Custom `MultiLabelTokenClassifier` with independent B/I/O heads per marker type:
186
  ```
187
  BertModel (bert-base-uncased)
188
  └── Dropout (p=0.1)
189
+ └── Linear (768 → num_types × 3)
190
+ └── Reshape to (batch, seq, num_types, 3)
191
  ```
192
 
193
+ Each marker type gets an independent 3-way O/B/I classification, so a token can simultaneously carry labels for multiple overlapping marker types. Types share the full backbone representation but make independent predictions.
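A minimal sketch of the head shape described above (class and variable names here are illustrative, not the repository's actual code):

```python
import torch
import torch.nn as nn

class MultiTypeHead(nn.Module):
    # One shared linear layer emits 3 logits (O/B/I) for each of the
    # num_types markers; reshaping gives independent per-type predictions.
    def __init__(self, hidden_size=768, num_types=53):
        super().__init__()
        self.num_types = num_types
        self.dropout = nn.Dropout(p=0.1)
        self.classifier = nn.Linear(hidden_size, num_types * 3)

    def forward(self, hidden_states):  # (batch, seq, hidden)
        logits = self.classifier(self.dropout(hidden_states))
        return logits.view(*logits.shape[:2], self.num_types, 3)

head = MultiTypeHead()
out = head(torch.randn(2, 5, 768))  # -> (batch=2, seq=5, 53 types, 3 tags)
preds = out.argmax(dim=-1)          # per-type O/B/I index per token
```

Argmax over the last dimension yields one O/B/I decision per type per token, which is what allows overlapping spans.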
194
+
195
+ ### Regularization
196
+
197
+ - **Mixout** (p=0.1): During training, each backbone weight element has a 10% chance per forward pass of being replaced by its pretrained value, acting as a stochastic L2 anchor that prevents representation drift (Lee et al., 2020)
198
+ - **Inverse-frequency type weights**: Rare marker types receive higher loss weighting
199
+ - **Inverse-frequency OBI weights**: B and I classes upweighted relative to dominant O class
200
+ - **Weighted random sampling**: Examples containing rarer markers sampled more frequently
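The Mixout step above can be sketched as a masked swap with dropout-style rescaling; this is a functional sketch under the assumption of element-wise Bernoulli masks, not the wrapper used in training:

```python
import torch

def mixout(weight, pretrained, p=0.1, training=True):
    # Swap each weight element back to its pretrained value with
    # probability p, then rescale so the expected output equals `weight`.
    if not training or p == 0.0:
        return weight
    mask = torch.bernoulli(torch.full_like(weight, p)).bool()  # 1 = swap
    mixed = torch.where(mask, pretrained, weight)
    return (mixed - p * pretrained) / (1.0 - p)
```

With `training=False` the weights pass through unchanged, matching inference behavior.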
201
 
202
  ### Initialization
203
 
204
+ Fine-tuned from `bert-base-uncased`. Backbone linear layers are wrapped with Mixout during training (a frozen pretrained copy serves as the anchor). The classification head is randomly initialized:
205
  ```
206
+ backbone.* layers → loaded from pretrained, anchored via Mixout
207
  classifier.weight → randomly initialized
208
  classifier.bias → randomly initialized
209
  ```
210
 
211
  ## Limitations
212
 
213
+ - **Low-precision types**: Several types show precision below 0.2, meaning most predictions for those types are false positives
214
+ - **Parallelism collapse**: `oral_parallelism` has high precision (0.674) but near-zero recall (0.041), suggesting the model learned a very narrow pattern
215
  - **Context window**: 128 tokens max; longer spans may be truncated
216
  - **Domain**: Trained primarily on historical/literary texts; may underperform on modern social media
217
  - **Subjectivity**: Some marker boundaries are inherently ambiguous
 
229
  ## References
230
 
231
  - Ong, Walter J. *Orality and Literacy: The Technologizing of the Word*. Routledge, 1982.
232
+ - Lee, C. et al. "Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models." ICLR 2020.
233
 
234
  ---
235
 
236
+ *Trained: February 2026*
config.json CHANGED
@@ -1,7 +1,7 @@
1
  {
2
  "add_cross_attention": false,
3
  "architectures": [
4
- "BertForTokenClassification"
5
  ],
6
  "attention_probs_dropout_prob": 0.1,
7
  "bos_token_id": null,
@@ -12,303 +12,9 @@
12
  "hidden_act": "gelu",
13
  "hidden_dropout_prob": 0.1,
14
  "hidden_size": 768,
15
- "id2label": {
16
- "0": "O",
17
- "1": "B-literate_abstract_noun",
18
- "2": "B-literate_additive_formal",
19
- "3": "B-literate_agent_demoted",
20
- "4": "B-literate_agentless_passive",
21
- "5": "B-literate_aside",
22
- "6": "B-literate_categorical_statement",
23
- "7": "B-literate_causal_chain",
24
- "8": "B-literate_causal_explicit",
25
- "9": "B-literate_citation",
26
- "10": "B-literate_conceptual_metaphor",
27
- "11": "B-literate_concessive",
28
- "12": "B-literate_concessive_connector",
29
- "13": "B-literate_conditional",
30
- "14": "B-literate_contrastive",
31
- "15": "B-literate_cross_reference",
32
- "16": "B-literate_definitional_move",
33
- "17": "B-literate_enumeration",
34
- "18": "B-literate_epistemic_hedge",
35
- "19": "B-literate_evidential",
36
- "20": "B-literate_footnote_reference",
37
- "21": "B-literate_institutional_subject",
38
- "22": "B-literate_list_structure",
39
- "23": "B-literate_metadiscourse",
40
- "24": "B-literate_methodological_framing",
41
- "25": "B-literate_nested_clauses",
42
- "26": "B-literate_nominalization",
43
- "27": "B-literate_objectifying_stance",
44
- "28": "B-literate_paradox",
45
- "29": "B-literate_probability",
46
- "30": "B-literate_qualified_assertion",
47
- "31": "B-literate_relative_chain",
48
- "32": "B-literate_technical_abbreviation",
49
- "33": "B-literate_technical_term",
50
- "34": "B-literate_temporal_embedding",
51
- "35": "B-literate_third_person_reference",
52
- "36": "B-oral_alliteration",
53
- "37": "B-oral_anaphora",
54
- "38": "B-oral_asyndeton",
55
- "39": "B-oral_audience_response",
56
- "40": "B-oral_binomial_expression",
57
- "41": "B-oral_conflict_frame",
58
- "42": "B-oral_discourse_formula",
59
- "43": "B-oral_dramatic_pause",
60
- "44": "B-oral_embodied_action",
61
- "45": "B-oral_epistrophe",
62
- "46": "B-oral_epithet",
63
- "47": "B-oral_everyday_example",
64
- "48": "B-oral_first_person",
65
- "49": "B-oral_imperative",
66
- "50": "B-oral_inclusive_we",
67
- "51": "B-oral_intensifier_doubling",
68
- "52": "B-oral_lexical_repetition",
69
- "53": "B-oral_named_individual",
70
- "54": "B-oral_paradox",
71
- "55": "B-oral_parallelism",
72
- "56": "B-oral_phatic_check",
73
- "57": "B-oral_phatic_filler",
74
- "58": "B-oral_polysyndeton",
75
- "59": "B-oral_proverb",
76
- "60": "B-oral_refrain",
77
- "61": "B-oral_religious_formula",
78
- "62": "B-oral_rhetorical_question",
79
- "63": "B-oral_rhythm",
80
- "64": "B-oral_second_person",
81
- "65": "B-oral_self_correction",
82
- "66": "B-oral_sensory_detail",
83
- "67": "B-oral_simple_conjunction",
84
- "68": "B-oral_specific_place",
85
- "69": "B-oral_temporal_anchor",
86
- "70": "B-oral_tricolon",
87
- "71": "B-oral_us_them",
88
- "72": "B-oral_vocative",
89
- "73": "I-literate_abstract_noun",
90
- "74": "I-literate_additive_formal",
91
- "75": "I-literate_agent_demoted",
92
- "76": "I-literate_agentless_passive",
93
- "77": "I-literate_aside",
94
- "78": "I-literate_categorical_statement",
95
- "79": "I-literate_causal_chain",
96
- "80": "I-literate_causal_explicit",
97
- "81": "I-literate_citation",
98
- "82": "I-literate_conceptual_metaphor",
99
- "83": "I-literate_concessive",
100
- "84": "I-literate_concessive_connector",
101
- "85": "I-literate_conditional",
102
- "86": "I-literate_contrastive",
103
- "87": "I-literate_cross_reference",
104
- "88": "I-literate_definitional_move",
105
- "89": "I-literate_enumeration",
106
- "90": "I-literate_epistemic_hedge",
107
- "91": "I-literate_evidential",
108
- "92": "I-literate_footnote_reference",
109
- "93": "I-literate_institutional_subject",
110
- "94": "I-literate_list_structure",
111
- "95": "I-literate_metadiscourse",
112
- "96": "I-literate_methodological_framing",
113
- "97": "I-literate_nested_clauses",
114
- "98": "I-literate_nominalization",
115
- "99": "I-literate_objectifying_stance",
116
- "100": "I-literate_paradox",
117
- "101": "I-literate_probability",
118
- "102": "I-literate_qualified_assertion",
119
- "103": "I-literate_relative_chain",
120
- "104": "I-literate_technical_abbreviation",
121
- "105": "I-literate_technical_term",
122
- "106": "I-literate_temporal_embedding",
123
- "107": "I-literate_third_person_reference",
124
- "108": "I-oral_alliteration",
125
- "109": "I-oral_anaphora",
126
- "110": "I-oral_asyndeton",
127
- "111": "I-oral_audience_response",
128
- "112": "I-oral_binomial_expression",
129
- "113": "I-oral_conflict_frame",
130
- "114": "I-oral_discourse_formula",
131
- "115": "I-oral_dramatic_pause",
132
- "116": "I-oral_embodied_action",
133
- "117": "I-oral_epistrophe",
134
- "118": "I-oral_epithet",
135
- "119": "I-oral_everyday_example",
136
- "120": "I-oral_first_person",
137
- "121": "I-oral_imperative",
138
- "122": "I-oral_inclusive_we",
139
- "123": "I-oral_intensifier_doubling",
140
- "124": "I-oral_lexical_repetition",
141
- "125": "I-oral_named_individual",
142
- "126": "I-oral_paradox",
143
- "127": "I-oral_parallelism",
144
- "128": "I-oral_phatic_check",
145
- "129": "I-oral_phatic_filler",
146
- "130": "I-oral_polysyndeton",
147
- "131": "I-oral_proverb",
148
- "132": "I-oral_refrain",
149
- "133": "I-oral_religious_formula",
150
- "134": "I-oral_rhetorical_question",
151
- "135": "I-oral_rhythm",
152
- "136": "I-oral_second_person",
153
- "137": "I-oral_self_correction",
154
- "138": "I-oral_sensory_detail",
155
- "139": "I-oral_simple_conjunction",
156
- "140": "I-oral_specific_place",
157
- "141": "I-oral_temporal_anchor",
158
- "142": "I-oral_tricolon",
159
- "143": "I-oral_us_them",
160
- "144": "I-oral_vocative"
161
- },
162
  "initializer_range": 0.02,
163
  "intermediate_size": 3072,
164
  "is_decoder": false,
165
- "label2id": {
166
- "B-literate_abstract_noun": 1,
167
- "B-literate_additive_formal": 2,
168
- "B-literate_agent_demoted": 3,
169
- "B-literate_agentless_passive": 4,
170
- "B-literate_aside": 5,
171
- "B-literate_categorical_statement": 6,
172
- "B-literate_causal_chain": 7,
173
- "B-literate_causal_explicit": 8,
174
- "B-literate_citation": 9,
175
- "B-literate_conceptual_metaphor": 10,
176
- "B-literate_concessive": 11,
177
- "B-literate_concessive_connector": 12,
178
- "B-literate_conditional": 13,
179
- "B-literate_contrastive": 14,
180
- "B-literate_cross_reference": 15,
181
- "B-literate_definitional_move": 16,
182
- "B-literate_enumeration": 17,
183
- "B-literate_epistemic_hedge": 18,
184
- "B-literate_evidential": 19,
185
- "B-literate_footnote_reference": 20,
186
- "B-literate_institutional_subject": 21,
187
- "B-literate_list_structure": 22,
188
- "B-literate_metadiscourse": 23,
189
- "B-literate_methodological_framing": 24,
190
- "B-literate_nested_clauses": 25,
191
- "B-literate_nominalization": 26,
192
- "B-literate_objectifying_stance": 27,
193
- "B-literate_paradox": 28,
194
- "B-literate_probability": 29,
195
- "B-literate_qualified_assertion": 30,
196
- "B-literate_relative_chain": 31,
197
- "B-literate_technical_abbreviation": 32,
198
- "B-literate_technical_term": 33,
199
- "B-literate_temporal_embedding": 34,
200
- "B-literate_third_person_reference": 35,
201
- "B-oral_alliteration": 36,
202
- "B-oral_anaphora": 37,
203
- "B-oral_asyndeton": 38,
204
- "B-oral_audience_response": 39,
205
- "B-oral_binomial_expression": 40,
206
- "B-oral_conflict_frame": 41,
207
- "B-oral_discourse_formula": 42,
208
- "B-oral_dramatic_pause": 43,
209
- "B-oral_embodied_action": 44,
210
- "B-oral_epistrophe": 45,
211
- "B-oral_epithet": 46,
212
- "B-oral_everyday_example": 47,
213
- "B-oral_first_person": 48,
214
- "B-oral_imperative": 49,
215
- "B-oral_inclusive_we": 50,
216
- "B-oral_intensifier_doubling": 51,
217
- "B-oral_lexical_repetition": 52,
218
- "B-oral_named_individual": 53,
219
- "B-oral_paradox": 54,
220
- "B-oral_parallelism": 55,
221
- "B-oral_phatic_check": 56,
222
- "B-oral_phatic_filler": 57,
223
- "B-oral_polysyndeton": 58,
224
- "B-oral_proverb": 59,
225
- "B-oral_refrain": 60,
226
- "B-oral_religious_formula": 61,
227
- "B-oral_rhetorical_question": 62,
228
- "B-oral_rhythm": 63,
229
- "B-oral_second_person": 64,
230
- "B-oral_self_correction": 65,
231
- "B-oral_sensory_detail": 66,
232
- "B-oral_simple_conjunction": 67,
233
- "B-oral_specific_place": 68,
234
- "B-oral_temporal_anchor": 69,
235
- "B-oral_tricolon": 70,
236
- "B-oral_us_them": 71,
237
- "B-oral_vocative": 72,
238
- "I-literate_abstract_noun": 73,
239
- "I-literate_additive_formal": 74,
240
- "I-literate_agent_demoted": 75,
241
- "I-literate_agentless_passive": 76,
242
- "I-literate_aside": 77,
243
- "I-literate_categorical_statement": 78,
244
- "I-literate_causal_chain": 79,
245
- "I-literate_causal_explicit": 80,
246
- "I-literate_citation": 81,
247
- "I-literate_conceptual_metaphor": 82,
248
- "I-literate_concessive": 83,
249
- "I-literate_concessive_connector": 84,
250
- "I-literate_conditional": 85,
251
- "I-literate_contrastive": 86,
252
- "I-literate_cross_reference": 87,
253
- "I-literate_definitional_move": 88,
254
- "I-literate_enumeration": 89,
255
- "I-literate_epistemic_hedge": 90,
256
- "I-literate_evidential": 91,
257
- "I-literate_footnote_reference": 92,
258
- "I-literate_institutional_subject": 93,
259
- "I-literate_list_structure": 94,
260
- "I-literate_metadiscourse": 95,
261
- "I-literate_methodological_framing": 96,
262
- "I-literate_nested_clauses": 97,
263
- "I-literate_nominalization": 98,
264
- "I-literate_objectifying_stance": 99,
265
- "I-literate_paradox": 100,
266
- "I-literate_probability": 101,
267
- "I-literate_qualified_assertion": 102,
268
- "I-literate_relative_chain": 103,
269
- "I-literate_technical_abbreviation": 104,
270
- "I-literate_technical_term": 105,
271
- "I-literate_temporal_embedding": 106,
272
- "I-literate_third_person_reference": 107,
273
- "I-oral_alliteration": 108,
274
- "I-oral_anaphora": 109,
275
- "I-oral_asyndeton": 110,
276
- "I-oral_audience_response": 111,
277
- "I-oral_binomial_expression": 112,
278
- "I-oral_conflict_frame": 113,
279
- "I-oral_discourse_formula": 114,
280
- "I-oral_dramatic_pause": 115,
281
- "I-oral_embodied_action": 116,
282
- "I-oral_epistrophe": 117,
283
- "I-oral_epithet": 118,
284
- "I-oral_everyday_example": 119,
285
- "I-oral_first_person": 120,
286
- "I-oral_imperative": 121,
287
- "I-oral_inclusive_we": 122,
288
- "I-oral_intensifier_doubling": 123,
289
- "I-oral_lexical_repetition": 124,
290
- "I-oral_named_individual": 125,
291
- "I-oral_paradox": 126,
292
- "I-oral_parallelism": 127,
293
- "I-oral_phatic_check": 128,
294
- "I-oral_phatic_filler": 129,
295
- "I-oral_polysyndeton": 130,
296
- "I-oral_proverb": 131,
297
- "I-oral_refrain": 132,
298
- "I-oral_religious_formula": 133,
299
- "I-oral_rhetorical_question": 134,
300
- "I-oral_rhythm": 135,
301
- "I-oral_second_person": 136,
302
- "I-oral_self_correction": 137,
303
- "I-oral_sensory_detail": 138,
304
- "I-oral_simple_conjunction": 139,
305
- "I-oral_specific_place": 140,
306
- "I-oral_temporal_anchor": 141,
307
- "I-oral_tricolon": 142,
308
- "I-oral_us_them": 143,
309
- "I-oral_vocative": 144,
310
- "O": 0
311
- },
312
  "layer_norm_eps": 1e-12,
313
  "max_position_embeddings": 512,
314
  "model_type": "bert",
 
  {
  "add_cross_attention": false,
  "architectures": [
+ "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,

  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
 
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
 
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
head_config.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "model_name": "bert-base-uncased",
+ "num_types": 53,
+ "hidden_size": 768
+ }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:843209761c32ebf9e994fe50058c120e9b945d381da0cbec76b14f1fce7fe250
- size 436035932
+ oid sha256:39a2573fe1ef3b14efb3f578f96bb56543fed59684832e04eee92a232654a65a
+ size 438442348
type_to_idx.json ADDED
@@ -0,0 +1,55 @@
+ {
+ "literate_abstract_noun": 0,
+ "literate_additive_formal": 1,
+ "literate_agent_demoted": 2,
+ "literate_agentless_passive": 3,
+ "literate_aside": 4,
+ "literate_categorical_statement": 5,
+ "literate_causal_explicit": 6,
+ "literate_citation": 7,
+ "literate_conceptual_metaphor": 8,
+ "literate_concessive": 9,
+ "literate_concessive_connector": 10,
+ "literate_concrete_setting": 11,
+ "literate_conditional": 12,
+ "literate_contrastive": 13,
+ "literate_cross_reference": 14,
+ "literate_definitional_move": 15,
+ "literate_enumeration": 16,
+ "literate_epistemic_hedge": 17,
+ "literate_evidential": 18,
+ "literate_institutional_subject": 19,
+ "literate_list_structure": 20,
+ "literate_metadiscourse": 21,
+ "literate_nested_clauses": 22,
+ "literate_nominalization": 23,
+ "literate_objectifying_stance": 24,
+ "literate_probability": 25,
+ "literate_qualified_assertion": 26,
+ "literate_relative_chain": 27,
+ "literate_technical_abbreviation": 28,
+ "literate_technical_term": 29,
+ "literate_temporal_embedding": 30,
+ "oral_anaphora": 31,
+ "oral_antithesis": 32,
+ "oral_discourse_formula": 33,
+ "oral_embodied_action": 34,
+ "oral_everyday_example": 35,
+ "oral_imperative": 36,
+ "oral_inclusive_we": 37,
+ "oral_intensifier_doubling": 38,
+ "oral_lexical_repetition": 39,
+ "oral_named_individual": 40,
+ "oral_parallelism": 41,
+ "oral_phatic_check": 42,
+ "oral_phatic_filler": 43,
+ "oral_rhetorical_question": 44,
+ "oral_second_person": 45,
+ "oral_self_correction": 46,
+ "oral_sensory_detail": 47,
+ "oral_simple_conjunction": 48,
+ "oral_specific_place": 49,
+ "oral_temporal_anchor": 50,
+ "oral_tricolon": 51,
+ "oral_vocative": 52
+ }
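This commit swaps the monolithic `BertForTokenClassification` config (145 BIO labels in `id2label`/`label2id`) for a plain `BertModel` backbone plus two sidecar files: `head_config.json` sizes a separate classification head, and `type_to_idx.json` maps each of the 53 marker types to a logit index. The commit does not ship head-loading code, so the following is only a sketch of how these files presumably fit together; the helper `build_idx_to_type` and the abridged mapping are illustrative, not part of the repo.

```python
# Sketch: decode head logits back to marker-type names using the files
# added in this commit. The mapping below is abridged; the real
# type_to_idx.json holds all 53 entries with these exact indices.
head_config = {
    "model_name": "bert-base-uncased",
    "num_types": 53,
    "hidden_size": 768,
}

type_to_idx = {
    "literate_abstract_noun": 0,
    "oral_anaphora": 31,
    "oral_vocative": 52,
}

def build_idx_to_type(mapping):
    """Invert the name -> index mapping so a head logit index
    can be decoded back to its marker-type name."""
    return {idx: name for name, idx in mapping.items()}

idx_to_type = build_idx_to_type(type_to_idx)
# The highest index should be num_types - 1 (indices are 0-based).
print(idx_to_type[head_config["num_types"] - 1])
```

Note that a head built this way would emit `num_types` (53) logits per token rather than the 145 BIO logits of the previous `config.json`, which is consistent with the safetensors size change above.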