MLM 3.0
This model is a fine-tuned version of google-bert/bert-large-cased on the wikitext dataset. It achieves the following results on the evaluation set:
• Eval loss: 1.4842
• Perplexity (PPL): 4.38
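For a masked language model, perplexity is the exponential of the mean evaluation cross-entropy loss. A minimal sketch of that conversion (exp of the eval loss lands near, but not exactly on, the reported 4.38; the small gap can come from how the loss is averaged, e.g. per batch versus per token):

```python
import math

# Perplexity is the exponential of the mean cross-entropy eval loss.
eval_loss = 1.4842

perplexity = math.exp(eval_loss)
print(f"PPL ≈ {perplexity:.2f}")
```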
Training hyperparameters
The following hyperparameters were used during training:
• learning_rate: 1e-4
• train_batch_size: 8
• eval_batch_size: 8
• gradient_accumulation_steps: 4
• seed: 42
• weight_decay: 0.01
• lr_scheduler_type: linear
• warmup_ratio: 0.03
• num_epochs: 5
• fp16: True
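A sketch of how these hyperparameters map onto a Hugging Face `TrainingArguments` configuration. The field names are the standard `transformers` ones; the `output_dir` value is an illustrative placeholder, not the directory from the original run:

```python
from transformers import TrainingArguments

# Illustrative mapping of the listed hyperparameters to TrainingArguments.
# output_dir is an assumed placeholder, not taken from the original run.
training_args = TrainingArguments(
    output_dir="bert-large-cased-wikitext-mlm",  # assumed name
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # effective train batch size: 8 * 4 = 32
    seed=42,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    num_train_epochs=5,
    fp16=True,  # mixed-precision training; requires a CUDA GPU
)
```

With gradient accumulation, the effective training batch size is `train_batch_size * gradient_accumulation_steps = 32` per device.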
Model tree for Keyurjotaniya007/bert-large-cased-wikitext-mlm-3.0
Base model
google-bert/bert-large-cased