Commit 3fa5132 (verified) by StellaVerkijk · Parent: fa65a7f

Create README.md (README.md added, +26 lines)
This is an encoder language model pre-trained from scratch on transcriptions of the archives of the Dutch East India Company. It is therefore specialized in Early Modern Dutch as used in the archive (1602–1800).
The model follows the RoBERTa architecture and can be fine-tuned on any NLP task.

Of the four GloBERTise models I pre-trained (August 2025), this version performs best when tested on binary event detection.

Comparison to other models: the settings for `num_training_steps` and `num_warmup_steps` were adapted relative to GloBERTise-v01 and GloBERTise-v01-rerun, but the settings are otherwise the same; compared to GloBERTise-rerun, a different seed was used with identical parameter settings.

See my GitHub repositories:
- for pre-training: https://github.com/globalise-huygens/GloBERTise
- for evaluation: https://github.com/globalise-huygens/GloBERTise-eval

And a short presentation: https://docs.google.com/presentation/d/1gkg5hChWAMXA6mxfgFkkvIieWdj_17yKitwBkBNcJBo/edit?usp=sharing

Most important parameter settings:

| parameter                   | value        |
|-----------------------------|--------------|
| learning rate               | 0.0003       |
| betas                       | [0.9, 0.98]  |
| weight_decay                | 0.01         |
| num_train_epochs            | 2            |
| per_device_train_batch_size | 40           |
| gradient_accumulation_steps | 10           |
| fp16                        | true         |
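With gradient accumulation, the effective batch size per device is the per-device batch size times the number of accumulation steps. A minimal sketch of the settings above as a plain Python dict (for illustration only; the key names mirror the table, not necessarily the exact argument names used in the actual training script):

```python
# Pre-training hyperparameters, copied from the table above.
# This is a plain dict for illustration; see the GloBERTise
# pre-training repository for the actual training configuration.
hparams = {
    "learning_rate": 3e-4,
    "betas": (0.9, 0.98),  # Adam beta1 / beta2
    "weight_decay": 0.01,
    "num_train_epochs": 2,
    "per_device_train_batch_size": 40,
    "gradient_accumulation_steps": 10,
    "fp16": True,
}

# Effective batch size per device = batch size * accumulation steps.
effective_batch_size = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 400
```

So each optimizer step sees 400 examples per device, even though only 40 fit in a single forward pass.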