distilbert_add_GLUE_Experiment_stsb_96

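This model is a fine-tuned version of distilbert-base-uncased on the GLUE STSB dataset. It achieves the following results on the evaluation set (the loss matches the epoch 17 row of the training results, which is the lowest validation loss recorded):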

  • Loss: 2.2529
  • Pearson: nan
  • Spearmanr: nan
  • Combined Score: nan
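
The nan entries are consistent with a model that predicts (nearly) the same score for every sentence pair, since a correlation is undefined when one series has zero variance. That reading is an interpretation, not something the card states. Below is a minimal pure-Python sketch of how the three metrics relate. The helper names and toy data are hypothetical, and the combined score is assumed to be the mean of Pearson and Spearman, which agrees with the non-nan rows in the training results.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns nan when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        # Zero variance: the correlation is undefined.
        return float("nan")
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def combined_score(p, s):
    """Assumed definition: mean of Pearson and Spearman."""
    return (p + s) / 2

# Toy data (hypothetical, not taken from STSB).
labels = [0.0, 1.5, 3.0, 4.5, 5.0]
constant_preds = [2.5] * 5            # a collapsed model
varied_preds = [0.2, 1.0, 3.5, 4.0, 4.9]

collapsed_pearson = pearson(constant_preds, labels)
collapsed_spearman = spearman(constant_preds, labels)
collapsed_combined = combined_score(collapsed_pearson, collapsed_spearman)
monotone_spearman = spearman(varied_preds, labels)
```

With the constant predictions all three values come out as nan, while the varied predictions give a well-defined correlation.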

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 10
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
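
As a rough illustration of what the linear scheduler implies for the learning rate, here is a small sketch. It assumes zero warmup steps (the card does not list any) and a schedule planned over all 50 epochs at 23 steps per epoch, as in the Step column of the training results. The function `linear_lr` is a hypothetical helper that mirrors the usual linear-decay-to-zero shape, not the exact code used for this run.

```python
def linear_lr(step, base_lr=5e-05, total_steps=1150, warmup_steps=0):
    """Linear warmup (if any), then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

steps_per_epoch = 23          # from the Step column of the results table
planned_epochs = 50           # num_epochs in the hyperparameter list
total_steps = steps_per_epoch * planned_epochs   # 1150

lr_at_start = linear_lr(0, total_steps=total_steps)
lr_at_last_logged_step = linear_lr(506, total_steps=total_steps)
lr_at_end = linear_lr(total_steps, total_steps=total_steps)
```

Under these assumptions the learning rate at step 506, the last logged step, is 56% of its initial value.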

Training results

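| Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
|---------------|-------|------|-----------------|---------|-----------|----------------|
| 8.7243 | 1.0 | 23 | 6.6928 | nan | nan | nan |
| 7.9215 | 2.0 | 46 | 6.2710 | nan | nan | nan |
| 7.4296 | 3.0 | 69 | 5.8601 | nan | nan | nan |
| 6.9483 | 4.0 | 92 | 5.4460 | nan | nan | nan |
| 6.4768 | 5.0 | 115 | 5.0440 | nan | nan | nan |
| 5.9658 | 6.0 | 138 | 4.6523 | nan | nan | nan |
| 5.5067 | 7.0 | 161 | 4.2735 | nan | nan | nan |
| 5.0622 | 8.0 | 184 | 3.9107 | nan | nan | nan |
| 4.6133 | 9.0 | 207 | 3.5725 | nan | nan | nan |
| 4.2011 | 10.0 | 230 | 3.2630 | nan | nan | nan |
| 3.7839 | 11.0 | 253 | 2.9896 | nan | nan | nan |
| 3.4525 | 12.0 | 276 | 2.7549 | 0.0063 | 0.0066 | 0.0064 |
| 3.1246 | 13.0 | 299 | 2.5637 | -0.0161 | -0.0155 | -0.0158 |
| 2.8674 | 14.0 | 322 | 2.4155 | nan | nan | nan |
| 2.6317 | 15.0 | 345 | 2.3138 | nan | nan | nan |
| 2.4623 | 16.0 | 368 | 2.2596 | nan | nan | nan |
| 2.3397 | 17.0 | 391 | 2.2529 | nan | nan | nan |
| 2.2455 | 18.0 | 414 | 2.2910 | nan | nan | nan |
| 2.1984 | 19.0 | 437 | 2.3424 | nan | nan | nan |
| 2.1869 | 20.0 | 460 | 2.3424 | nan | nan | nan |
| 2.1982 | 21.0 | 483 | 2.3460 | nan | nan | nan |
| 2.195 | 22.0 | 506 | 2.3664 | -0.0023 | 0.0002 | -0.0011 |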
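
The reported evaluation loss of 2.2529 can be traced back to the rows above by picking the epoch with the lowest validation loss. The sketch below does that with values copied from the training results. Treating the lowest-loss epoch as the selected checkpoint is an assumption, since the card does not say how the final model was chosen.

```python
# (epoch, step, validation_loss) copied from the training results.
rows = [
    (1, 23, 6.6928), (2, 46, 6.2710), (3, 69, 5.8601), (4, 92, 5.4460),
    (5, 115, 5.0440), (6, 138, 4.6523), (7, 161, 4.2735), (8, 184, 3.9107),
    (9, 207, 3.5725), (10, 230, 3.2630), (11, 253, 2.9896), (12, 276, 2.7549),
    (13, 299, 2.5637), (14, 322, 2.4155), (15, 345, 2.3138), (16, 368, 2.2596),
    (17, 391, 2.2529), (18, 414, 2.2910), (19, 437, 2.3424), (20, 460, 2.3424),
    (21, 483, 2.3460), (22, 506, 2.3664),
]

# Select the row with the smallest validation loss.
best_epoch, best_step, best_loss = min(rows, key=lambda row: row[2])

# Number of logged epochs after the best one.
epochs_after_best = rows[-1][0] - best_epoch

print(f"best epoch={best_epoch} step={best_step} loss={best_loss}")
```

The minimum falls at epoch 17, and logging ends five epochs later. That would be consistent with early stopping, though the card does not state that it was used.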

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.14.0a0+410ce96
  • Datasets 2.8.0
  • Tokenizers 0.13.2