# MLM 3.0

This model is a fine-tuned version of google-bert/bert-large-cased on the WikiText dataset, trained with a masked language modeling (MLM) objective. It achieves the following results on the evaluation set:

- Eval loss: 1.4842
- Perplexity (PPL): 4.38
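For a masked language model, perplexity is conventionally the exponential of the mean evaluation cross-entropy loss. A quick sanity check against the numbers above (the small gap from the reported 4.38 is plausibly due to how the loss is averaged over tokens or batches; this is an assumption, as the card does not show the evaluation script):

```python
import math

eval_loss = 1.4842  # mean cross-entropy loss on the evaluation set (from above)

# PPL = exp(mean cross-entropy loss)
perplexity = math.exp(eval_loss)
print(f"{perplexity:.2f}")  # ~4.41, close to the reported 4.38
```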

## Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-4
- train_batch_size: 8
- eval_batch_size: 8
- gradient_accumulation_steps: 4
- seed: 42
- weight_decay: 0.01
- lr_scheduler_type: linear
- warmup_ratio: 0.03
- num_epochs: 5
- fp16: True
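As a sketch, these settings map naturally onto the standard `transformers.TrainingArguments` field names (the card does not include the training script, so the exact mapping is an assumption). Note that with gradient accumulation, the effective train batch size per optimizer step is train_batch_size × gradient_accumulation_steps = 32:

```python
# Hyperparameters from the card, expressed as a plain dict whose keys
# mirror transformers.TrainingArguments field names (a sketch; the
# original training script is not shown in the card).
training_args = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "seed": 42,
    "weight_decay": 0.01,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.03,
    "num_train_epochs": 5,
    "fp16": True,
}

# Effective train batch size per optimizer step (single device assumed):
effective_batch = (
    training_args["per_device_train_batch_size"]
    * training_args["gradient_accumulation_steps"]
)
print(effective_batch)  # 32
```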

## Model details

- Model ID: Keyurjotaniya007/bert-large-cased-wikitext-mlm-3.0
- Base model: google-bert/bert-large-cased
- Size: ~0.3B parameters (F32, Safetensors)