tags:
  - generated_from_trainer
datasets:
  - generator
model-index:
  - name: bert-dp-4
    results: []

bert-dp-4

This model is a fine-tuned version of a base model (not named in this card), trained on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 6.7082
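
To put the loss figure in more familiar terms, it can be converted to perplexity. The sketch below assumes the reported value is a mean per-token cross-entropy in nats, which is typical for language-modeling runs but is not stated in this card; the function name `perplexity` is illustrative only.

```python
import math

# Evaluation loss as reported in this card.
EVAL_LOSS = 6.7082


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (ASSUMED to be in nats) to perplexity."""
    return math.exp(loss_nats)


if __name__ == "__main__":
    print(f"perplexity ~ {perplexity(EVAL_LOSS):.1f}")
```

Under that assumption the evaluation loss corresponds to a perplexity of roughly 820.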

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 40
  • mixed_precision_training: Native AMP
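
The learning-rate settings above (peak 0.0005, cosine schedule, 1000 warmup steps) can be visualized with a small stand-alone sketch. It assumes the common shape of linear warmup followed by a half-cosine decay to zero. The total step count is not given in the hyperparameters; `TOTAL_STEPS` is an estimate derived from the Training results section (about 529 steps per epoch over 40 epochs), and `lr_at` is a hypothetical helper, not part of any library.

```python
import math

PEAK_LR = 5e-4        # learning_rate from the card
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the card
# ASSUMPTION: ~529 steps/epoch * 40 epochs, estimated from the results table.
TOTAL_STEPS = 21160


def lr_at(step: int,
          peak_lr: float = PEAK_LR,
          warmup: int = WARMUP_STEPS,
          total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to peak_lr, then half-cosine decay to 0.
    """
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 500, 1000, 2000, 3000, 10000, 21000):
        print(step, f"{lr_at(step):.6f}")
```

Under these assumptions the learning rate reaches its peak at step 1000 and is still close to the peak at steps 2000 and 3000, which is where the logged losses change most.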

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|---------------|-------|-------|-----------------|
| 6.3445        | 1.89  | 1000  | 5.9292          |
| 5.8327        | 3.78  | 2000  | 5.8495          |
| 6.5089        | 5.67  | 3000  | 6.7228          |
| 6.7256        | 7.56  | 4000  | 6.7197          |
| 6.7194        | 9.45  | 5000  | 6.7135          |
| 6.7174        | 11.34 | 6000  | 6.7132          |
| 6.7114        | 13.23 | 7000  | 6.7119          |
| 6.7166        | 15.12 | 8000  | 6.7129          |
| 6.7109        | 17.01 | 9000  | 6.7107          |
| 6.7112        | 18.9  | 10000 | 6.7134          |
| 6.7125        | 20.79 | 11000 | 6.7090          |
| 6.7099        | 22.68 | 12000 | 6.7085          |
| 6.7084        | 24.57 | 13000 | 6.7069          |
| 6.7066        | 26.47 | 14000 | 6.7063          |
| 6.7083        | 28.36 | 15000 | 6.7037          |
| 6.7062        | 30.25 | 16000 | 6.7044          |
| 6.705         | 32.14 | 17000 | 6.7022          |
| 6.7041        | 34.03 | 18000 | 6.7058          |
| 6.7031        | 35.92 | 19000 | 6.7055          |
| 6.7031        | 37.81 | 20000 | 6.7067          |
| 6.7039        | 39.7  | 21000 | 6.7082          |
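
The logged values show the validation loss reaching its minimum early, then rising to about 6.7 and staying there for the rest of training. The sketch below makes that reading explicit by locating the best checkpoint and estimating the number of steps per epoch. The rows are copied from the table above; the variable names are illustrative, and the card itself does not say why the loss plateaus.

```python
# (epoch, step, training_loss, validation_loss), copied from the results table.
ROWS = [
    (1.89, 1000, 6.3445, 5.9292),
    (3.78, 2000, 5.8327, 5.8495),
    (5.67, 3000, 6.5089, 6.7228),
    (7.56, 4000, 6.7256, 6.7197),
    (9.45, 5000, 6.7194, 6.7135),
    (11.34, 6000, 6.7174, 6.7132),
    (13.23, 7000, 6.7114, 6.7119),
    (15.12, 8000, 6.7166, 6.7129),
    (17.01, 9000, 6.7109, 6.7107),
    (18.9, 10000, 6.7112, 6.7134),
    (20.79, 11000, 6.7125, 6.7090),
    (22.68, 12000, 6.7099, 6.7085),
    (24.57, 13000, 6.7084, 6.7069),
    (26.47, 14000, 6.7066, 6.7063),
    (28.36, 15000, 6.7083, 6.7037),
    (30.25, 16000, 6.7062, 6.7044),
    (32.14, 17000, 6.705, 6.7022),
    (34.03, 18000, 6.7041, 6.7058),
    (35.92, 19000, 6.7031, 6.7055),
    (37.81, 20000, 6.7031, 6.7067),
    (39.7, 21000, 6.7039, 6.7082),
]

# Row with the lowest validation loss.
best = min(ROWS, key=lambda row: row[3])

# Final logged row and its gap to the best validation loss.
final = ROWS[-1]
gap = final[3] - best[3]

# Rough steps-per-epoch estimate from the last logged row.
steps_per_epoch = final[1] / final[0]

if __name__ == "__main__":
    print(f"best validation loss {best[3]} at step {best[1]} (epoch {best[0]})")
    print(f"final validation loss {final[3]}, gap to best {gap:.4f}")
    print(f"estimated steps per epoch: {steps_per_epoch:.0f}")
```

The lowest validation loss (5.8495) is logged at step 2000, while the headline evaluation loss of 6.7082 comes from the final step. If intermediate checkpoints were saved, the one from step 2000 may be the more useful of the two.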

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.11.0+cu113
  • Datasets 2.13.0
  • Tokenizers 0.13.3