Update README.md

README.md
CHANGED

@@ -13,23 +13,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 # nl_electra
 
-This model is a
+This model is a pretrained version of [ELECTRA](https://huggingface.co/docs/transformers/model_doc/electra) on the Dutch subset of the [CC100](https://huggingface.co/datasets/cc100) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 2.4650
 - Accuracy: 0.5392
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
 ## Training procedure
 
 ### Training hyperparameters
@@ -652,3 +640,43 @@ The following hyperparameters were used during training:
 - Pytorch 1.12.0+cu102
 - Datasets 2.3.2
 - Tokenizers 0.12.1
+
+### Additional configurations
+
+```
+data:
+  dataset_name: cc100
+  lang: nl
+  overwrite_cache: False
+  validation_split_percentage: 5
+  max_seq_length: 512
+  preprocessing_num_workers: 8
+  mlm_probability: 0.15
+  line_by_line: False
+  pad_to_max_length: False
+  max_train_samples: -1
+  max_eval_samples: -1
+
+training:
+  do_train: True
+  do_eval: True
+  do_predict: True
+  resume_from_checkpoint: False
+  evaluation_strategy: steps
+  eval_steps: 500
+  per_device_train_batch_size: 16
+  per_device_eval_batch_size: 16
+  gradient_accumulation_steps: 32
+  eval_accumulation_steps: 1
+  learning_rate: 5e-5
+  weight_decay: 0.0
+  adam_beta1: 0.9
+  adam_beta2: 0.999
+  adam_epsilon: 1e-8
+  max_grad_norm: 1.0
+  num_train_epochs: 400.0
+  lr_scheduler_type: linear
+  fp16: False
+  warmup_steps: 8000
+  seed: 703
+```
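The updated description ties the checkpoint to ELECTRA pretrained on Dutch CC100. As a quick orientation for readers of the card, here is a minimal loading sketch. The Hub id `username/nl_electra` is a placeholder (the diff does not name the published repository), and the fill-mask head is an assumption inferred from the `mlm_probability` setting in the config above, not something the diff confirms.

```python
# Minimal usage sketch, NOT from the repository. Assumptions: the checkpoint
# is published under the placeholder id "username/nl_electra" and carries a
# masked-LM head (inferred from mlm_probability in the training config).
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "username/nl_electra"  # placeholder, not the real repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill in the BERT-style [MASK] token in a Dutch sentence.
fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill("Amsterdam is de hoofdstad van [MASK].")[0]["token_str"])
```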
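The `data` block pins the corpus to Dutch CC100 with a 5% validation split. A sketch of how those options could be realized with the `datasets` library follows; the percent-slicing mirrors how `validation_split_percentage` is handled in Hugging Face's example pretraining scripts, but the repository's actual data-loading code is not shown in the diff.

```python
# Data setup sketch matching the "data" block above; illustrative only.
from datasets import load_dataset

# validation_split_percentage: 5 -> carve the first 5% off the train split,
# the convention used by Hugging Face's run_mlm-style example scripts.
eval_ds = load_dataset("cc100", lang="nl", split="train[:5%]")
train_ds = load_dataset("cc100", lang="nl", split="train[5%:]")
print(len(train_ds), len(eval_ds))
```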
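The `training` block maps almost one-to-one onto `transformers.TrainingArguments`. The sketch below shows that translation; it is not the repository's training script, `config.yaml` is a hypothetical path, and `output_dir` is invented. One real pitfall worth noting: PyYAML parses `5e-5` and `1e-8` as strings because they lack a decimal point, so they need explicit float casts. Also note that with `per_device_train_batch_size: 16` and `gradient_accumulation_steps: 32`, each optimizer step sees an effective batch of 16 × 32 = 512 sequences per device.

```python
# Illustrative mapping of the "training" block to TrainingArguments; this is
# not the repository's training script. config.yaml is a hypothetical path.
import yaml
from transformers import TrainingArguments

with open("config.yaml") as f:
    t = yaml.safe_load(f)["training"]

args = TrainingArguments(
    output_dir="nl_electra",  # assumption: output dir is not in the config
    do_train=t["do_train"],
    do_eval=t["do_eval"],
    evaluation_strategy=t["evaluation_strategy"],
    eval_steps=t["eval_steps"],
    per_device_train_batch_size=t["per_device_train_batch_size"],
    per_device_eval_batch_size=t["per_device_eval_batch_size"],
    gradient_accumulation_steps=t["gradient_accumulation_steps"],
    eval_accumulation_steps=t["eval_accumulation_steps"],
    # PyYAML reads "5e-5" as a string (no decimal point), so cast explicitly.
    learning_rate=float(t["learning_rate"]),
    weight_decay=t["weight_decay"],
    adam_beta1=t["adam_beta1"],
    adam_beta2=t["adam_beta2"],
    adam_epsilon=float(t["adam_epsilon"]),  # "1e-8" has the same quirk
    max_grad_norm=t["max_grad_norm"],
    num_train_epochs=t["num_train_epochs"],
    lr_scheduler_type=t["lr_scheduler_type"],
    fp16=t["fp16"],
    warmup_steps=t["warmup_steps"],
    seed=t["seed"],
)
```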