---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: bert-base-uncased-issues-128
  results: []
---

bert-base-uncased-issues-128

This model is a fine-tuned version of bert-base-uncased; the auto-generated card does not record which dataset it was fine-tuned on (it is listed as "None"). It achieves the following result on the evaluation set (a hedged loading sketch follows):

  • Loss: 1.2512
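
To try the checkpoint directly, a minimal loading sketch is below. It assumes the model was fine-tuned with a masked-language-modeling objective (the card does not state the objective), so it is loaded through the fill-mask pipeline; the example sentence is illustrative only.

```python
# Minimal sketch, assuming an MLM objective (not confirmed by the card):
# load the checkpoint through the fill-mask pipeline and inspect predictions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="jmassot/bert-base-uncased-issues-128")

# Illustrative sentence only; [MASK] is BERT's mask token.
for pred in fill_mask("This issue is related to the [MASK] module."):
    print(f"{pred['token_str']:>12}  {pred['score']:.4f}")
```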

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent Trainer configuration follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 16
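
For reference, here is a hedged sketch of how these values map onto the transformers TrainingArguments API; the output directory is a placeholder, and the original training script is not part of this card.

```python
# Sketch only: the listed hyperparameters expressed as TrainingArguments.
# The output_dir is hypothetical; the original script is not reproduced here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-base-uncased-issues-128",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=32,  # "train_batch_size" above
    per_device_eval_batch_size=8,    # "eval_batch_size" above
    seed=42,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=16,
)
```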

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.1019 | 1.0 | 291 | 1.7019 |
| 1.6412 | 2.0 | 582 | 1.4273 |
| 1.4844 | 3.0 | 873 | 1.3947 |
| 1.4006 | 4.0 | 1164 | 1.3698 |
| 1.3382 | 5.0 | 1455 | 1.1941 |
| 1.2822 | 6.0 | 1746 | 1.2781 |
| 1.2393 | 7.0 | 2037 | 1.2650 |
| 1.2009 | 8.0 | 2328 | 1.2082 |
| 1.1657 | 9.0 | 2619 | 1.1776 |
| 1.1394 | 10.0 | 2910 | 1.2050 |
| 1.1276 | 11.0 | 3201 | 1.2067 |
| 1.1051 | 12.0 | 3492 | 1.1630 |
| 1.0814 | 13.0 | 3783 | 1.2529 |
| 1.0757 | 14.0 | 4074 | 1.1699 |
| 1.0630 | 15.0 | 4365 | 1.1113 |
| 1.0637 | 16.0 | 4656 | 1.2512 |
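
Note that the lowest validation loss (1.1113) occurs at epoch 15, not at the final epoch. If the reported loss is a cross-entropy language-modeling loss (an assumption; the card does not name the objective), it converts to perplexity as exp(loss):

```python
# Assumption: the validation loss is cross-entropy from a language-modeling
# objective, so perplexity = exp(loss). The card does not state this.
import math

final_loss = 1.2512  # epoch 16, reported above
best_loss = 1.1113   # epoch 15, lowest validation loss in the table

print(f"final perplexity: {math.exp(final_loss):.2f}")  # ~3.49
print(f"best perplexity:  {math.exp(best_loss):.2f}")   # ~3.04
```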

Framework versions

  • Transformers 4.11.3
  • Pytorch 1.11.0+cu113
  • Datasets 1.16.1
  • Tokenizers 0.10.1

Model Recycling

Evaluation on 36 datasets using jmassot/bert-base-uncased-issues-128 as a base model yields an average score of 73.70, compared to 72.20 for bert-base-uncased.

The model is ranked 3rd among all tested models for the bert-base-uncased architecture as of 21/12/2022.

Results:

| Dataset | Score |
|:--------|------:|
| 20_newsgroup | 84.2007 |
| ag_news | 89.7333 |
| amazon_reviews_multi | 65.86 |
| anli | 47.75 |
| boolq | 71.4679 |
| cb | 71.4286 |
| cola | 82.6462 |
| copa | 59 |
| dbpedia | 78.6 |
| esnli | 90.34 |
| financial_phrasebank | 79.5 |
| imdb | 91.432 |
| isear | 69.0352 |
| mnli | 83.5639 |
| mrpc | 83.3333 |
| multirc | 61.2005 |
| poem_sentiment | 67.3077 |
| qnli | 90.4082 |
| qqp | 89.7353 |
| rotten_tomatoes | 85.272 |
| rte | 64.6209 |
| sst2 | 91.9725 |
| sst_5bins | 52.8054 |
| stsb | 86.4351 |
| trec_coarse | 96.6 |
| trec_fine | 77 |
| tweet_ev_emoji | 36.41 |
| tweet_ev_emotion | 80.3659 |
| tweet_ev_hate | 53.0303 |
| tweet_ev_irony | 66.9643 |
| tweet_ev_offensive | 85.1163 |
| tweet_ev_sentiment | 70.0179 |
| wic | 62.3824 |
| wnli | 52.1127 |
| wsc | 63.4615 |
| yahoo_answers | 72 |
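
These scores come from fine-tuning the checkpoint on downstream tasks. A minimal sketch of reusing it as a base model for sequence classification is below; the label count and task choice are placeholders, not the Model Recycling evaluation protocol.

```python
# Sketch of reusing the checkpoint as a base model for classification.
# num_labels and the task are illustrative; this is not the exact
# Model Recycling evaluation setup.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "jmassot/bert-base-uncased-issues-128"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=2,  # e.g. a binary task such as rte; adjust per task
)
# From here, fine-tune with the Trainer API or a custom training loop.
```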

For more information, see the Model Recycling page.