This is an encoder language model pre-trained from scratch on transcriptions of the archives of the Dutch East India Company. It is therefore specialized in Early Modern Dutch as used in the archive (1602–1800).
The model follows the RoBERTa architecture and can be fine-tuned on any downstream NLP task.
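
As a minimal usage sketch, the checkpoint can be loaded with the Hugging Face `transformers` library. The repository id below is a placeholder, not the confirmed model id; substitute the actual published checkpoint.

```python
# Minimal usage sketch with Hugging Face transformers.
# NOTE: the repository id is a placeholder; replace it with the actual model id.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "globalise/GloBERTise"  # placeholder id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Masked-token prediction on an Early Modern Dutch sentence
# (RoBERTa-style models use the <mask> token).
fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill("De schepen zijn <mask> naar Batavia vertrokken."))
```

For fine-tuning, the same checkpoint can be loaded with a task-specific head (e.g. `AutoModelForTokenClassification` or `AutoModelForSequenceClassification`), as with any RoBERTa-style encoder.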
Of the four GloBERTise models I pre-trained (as of August 2025), this version performs best when tested on binary event detection.
Comparison to other models: the `num_training_steps` and `num_warmup_steps` settings were adapted relative to GloBERTise-v01 and GloBERTise-v01-rerun; everything else is identical. Relative to GloBERTise-rerun, only the random seed differs; the parameter settings are the same.
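
For reference, these two settings control the learning-rate schedule. The sketch below shows where they plug in, assuming a linear warmup/decay schedule; the step counts and the stand-in module are illustrative placeholders, not the values actually used for this model.

```python
# Illustrative sketch only: how num_warmup_steps and num_training_steps
# enter the learning-rate schedule. Values below are placeholders.
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(8, 8)  # stand-in module for the actual model

optimizer = torch.optim.AdamW(
    model.parameters(), lr=3e-4, betas=(0.9, 0.98), weight_decay=0.01
)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=1_000,     # placeholder: LR ramps up linearly over these steps
    num_training_steps=50_000,  # placeholder: then decays linearly to zero here
)
```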
See my GitHub repos:
- for pre-training: https://github.com/globalise-huygens/GloBERTise
- for evaluation: https://github.com/globalise-huygens/GloBERTise-eval

And a small presentation: https://docs.google.com/presentation/d/1gkg5hChWAMXA6mxfgFkkvIieWdj_17yKitwBkBNcJBo/edit?usp=sharing
Most important parameter settings:

| Parameter | Value |
|-----------|-------|
| learning_rate | 0.0003 |
| betas | [0.9, 0.98] |
| weight_decay | 0.01 |
| num_train_epochs | 2 |
| per_device_train_batch_size | 40 |
| gradient_accumulation_steps | 10 |
| fp16 | true |
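
These settings map onto Hugging Face `TrainingArguments` roughly as sketched below, assuming the standard `Trainer` setup; the output path is a placeholder, and `betas` correspond to `adam_beta1`/`adam_beta2`.

```python
# Sketch: the table above expressed as Hugging Face TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="globertise-pretraining",  # placeholder output path
    learning_rate=3e-4,
    adam_beta1=0.9,
    adam_beta2=0.98,
    weight_decay=0.01,
    num_train_epochs=2,
    per_device_train_batch_size=40,
    gradient_accumulation_steps=10,
    fp16=True,
)
```

With gradient accumulation, the effective batch size per device is 40 × 10 = 400 examples per optimizer step.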