---
license: mit
tags:
  - generated_from_trainer
model-index:
  - name: roberta-base-academic3
    results: []
---

# roberta-base-academic3

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.4206
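
Given the `generated_from_trainer` tag and the single loss metric, this is most likely a masked-language-modeling fine-tune of RoBERTa. A minimal usage sketch under that assumption (the model id below is a placeholder and the example sentence is illustrative, not from the original card):

```python
from transformers import pipeline

# Assumes the checkpoint keeps roberta-base's masked-LM head;
# <mask> is RoBERTa's mask token. Replace the model id with the
# full Hub repo id or a local checkpoint path.
fill = pipeline("fill-mask", model="roberta-base-academic3")
print(fill("The results <mask> that the effect is robust."))
```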

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `TrainingArguments` follows the list):

- learning_rate: 7e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 512
- optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 20
- mixed_precision_training: Native AMP
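
A minimal sketch reconstructing these values as `transformers.TrainingArguments` (the output directory is a placeholder; the card reports "Adam" with these betas, which matches the `Trainer`'s default AdamW family, so no explicit optimizer is set):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-base-academic3",  # placeholder, not stated in the card
    learning_rate=7e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=64,       # 8 * 64 = 512 effective train batch size
    adam_beta1=0.9,
    adam_beta2=0.99,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=20,
    fp16=True,                            # "Native AMP" mixed-precision training
)
```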

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.6943        | 0.99  | 82   | 1.5540          |
| 1.6494        | 1.99  | 164  | 1.5268          |
| 1.63          | 2.99  | 246  | 1.5209          |
| 1.6152        | 3.99  | 328  | 1.5049          |
| 1.5985        | 4.99  | 410  | 1.4891          |
| 1.5826        | 5.99  | 492  | 1.4876          |
| 1.5643        | 6.99  | 574  | 1.4769          |
| 1.5506        | 7.99  | 656  | 1.4638          |
| 1.5383        | 8.99  | 738  | 1.4548          |
| 1.5309        | 9.99  | 820  | 1.4511          |
| 1.5225        | 10.99 | 902  | 1.4492          |
| 1.5124        | 11.99 | 984  | 1.4419          |
| 1.507         | 12.99 | 1066 | 1.4323          |
| 1.4985        | 13.99 | 1148 | 1.4294          |
| 1.4921        | 14.99 | 1230 | 1.4296          |
| 1.4859        | 15.99 | 1312 | 1.4256          |
| 1.4827        | 16.99 | 1394 | 1.4194          |
| 1.4756        | 17.99 | 1476 | 1.4184          |
| 1.474         | 18.99 | 1558 | 1.4156          |
| 1.4737        | 19.99 | 1640 | 1.4165          |
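
Two back-of-the-envelope readings of this table (inferences, not stated in the card): the headline evaluation loss of 1.4206 corresponds to a perplexity of exp(1.4206) ≈ 4.14 if the loss is standard per-token cross-entropy, and 82 optimizer steps per epoch at an effective batch size of 512 implies on the order of 82 × 512 ≈ 42,000 training examples.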

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.1+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2
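
To approximate this environment, the matching pins would be `pip install transformers==4.25.1 datasets==2.8.0 tokenizers==0.13.2`, plus a CUDA 11.6 build of PyTorch 1.13.1 (e.g. `pip install torch==1.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116`); adjust the Torch wheel for your platform.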