MLM Loss
#48
by ViacheslavBG - opened
Good day
I've noticed that the MLM loss of BERT (bert-base-uncased) is approximately 2.5 on the Wikipedia dataset it was trained on. However, the original paper reported a perplexity of ~4, i.e. a loss of ~1.38 (ln 4 ≈ 1.386).
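For reference, here is a minimal sketch of the conversion used above. It assumes the reported loss is a mean cross-entropy in nats (natural log), so that perplexity = exp(loss); the helper names are hypothetical and only there for illustration.

```python
import math


def loss_from_perplexity(perplexity: float) -> float:
    """Mean cross-entropy (in nats) implied by a given perplexity."""
    return math.log(perplexity)


def perplexity_from_loss(loss: float) -> float:
    """Perplexity implied by a mean cross-entropy given in nats."""
    return math.exp(loss)


# Perplexity ~4 quoted from the paper, expressed as a loss.
print(round(loss_from_perplexity(4.0), 2))  # 1.39

# Losses observed in this thread, expressed as perplexities.
print(round(perplexity_from_loss(2.5), 1))  # 12.2
print(round(perplexity_from_loss(1.8), 1))  # 6.0
```

Under that assumption, a loss of 2.5 corresponds to a perplexity of about 12, roughly three times the figure quoted from the paper.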
I continued training the checkpoint with the run_mlm.py script, and the MLM loss decreased to 1.8 after 10,000 steps.
Can anybody explain why this checkpoint has such a high loss?