Commit 3fa5132 (verified) by StellaVerkijk · Parent: fa65a7f

Create README.md (README.md added, +26 lines)
This is an encoder language model pre-trained from scratch on transcriptions of the archives of the Dutch East India Company. It is therefore specialized in Early Modern Dutch as used in the archive (1602–1800).
The model follows the RoBERTa architecture and can be fine-tuned on any NLP task.

Of the four GloBERTise models I pre-trained (August 2025), this version performs best when tested on binary event detection.

Comparison to other models: the settings for `num_training_steps` and `num_warmup_steps` were adapted relative to GloBERTise-v01 and GloBERTise-v01-rerun, but the settings are otherwise the same; compared to GloBERTise-rerun, a different seed was used with identical parameter settings.

See my GitHub repositories:
- for pre-training: https://github.com/globalise-huygens/GloBERTise
- for evaluation: https://github.com/globalise-huygens/GloBERTise-eval

And a short presentation: https://docs.google.com/presentation/d/1gkg5hChWAMXA6mxfgFkkvIieWdj_17yKitwBkBNcJBo/edit?usp=sharing

Most important parameter settings:

| parameter                   | value        |
|-----------------------------|--------------|
| learning rate               | 0.0003       |
| betas                       | [0.9, 0.98]  |
| weight_decay                | 0.01         |
| num_train_epochs            | 2            |
| per_device_train_batch_size | 40           |
| gradient_accumulation_steps | 10           |
| fp16                        | true         |
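With gradient accumulation, the effective batch size per device is the per-device batch size times the number of accumulation steps. A minimal sketch of the settings above as a plain Python dict (for illustration only; the key names mirror the table, not necessarily the exact argument names used in the actual training script):

```python
# Pre-training hyperparameters, copied from the table above.
# This is a plain dict for illustration; see the GloBERTise
# pre-training repository for the actual training configuration.
hparams = {
    "learning_rate": 3e-4,
    "betas": (0.9, 0.98),  # Adam beta1 / beta2
    "weight_decay": 0.01,
    "num_train_epochs": 2,
    "per_device_train_batch_size": 40,
    "gradient_accumulation_steps": 10,
    "fp16": True,
}

# Effective batch size per device = batch size * accumulation steps.
effective_batch_size = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 400
```

So each optimizer step sees 400 examples per device, even though only 40 fit in a single forward pass.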