---
license: mit
base_model: roberta-base
tags:
- generated_from_trainer
model-index:
- name: tapt_helpfulness_seq_bn_pretraining_model_full_train
  results: []
---

tapt_helpfulness_seq_bn_pretraining_model_full_train

This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4917
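
If this loss is the mean token-level cross-entropy of a masked-language-modeling objective (which the "tapt ... pretraining" naming suggests, though the card does not state it), it corresponds to a perplexity of roughly exp(1.4917) ≈ 4.44. A quick check under that assumption:

```python
import math

# Final evaluation loss reported above.
eval_loss = 1.4917

# Assuming the loss is a mean token-level cross-entropy in nats,
# perplexity is simply its exponential.
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 4.44
```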

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 21
  • eval_batch_size: 21
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 42
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
  • lr_scheduler_type: linear
  • num_epochs: 20
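
The effective batch size and the learning-rate schedule follow from these values. A minimal sketch (the 21360 total optimizer steps is taken from the final row of the training log below; zero warmup is an assumption, since no warmup steps are listed):

```python
train_batch_size = 21
gradient_accumulation_steps = 2

# Effective (total) train batch size, matching the value reported above.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 42

def linear_lr(step, total_steps=21360, base_lr=1e-4):
    """Linear decay from base_lr to 0, with no warmup (assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))      # 0.0001 at the first step
print(linear_lr(21360))  # 0.0 at the final step
```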

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 2.746         | 1.0   | 1068  | 1.8309          |
| 1.8701        | 2.0   | 2137  | 1.6877          |
| 1.7711        | 3.0   | 3205  | 1.6275          |
| 1.7178        | 4.0   | 4274  | 1.5909          |
| 1.6876        | 5.0   | 5342  | 1.5788          |
| 1.6638        | 6.0   | 6411  | 1.5636          |
| 1.6526        | 7.0   | 7479  | 1.5344          |
| 1.6357        | 8.0   | 8548  | 1.5402          |
| 1.626         | 9.0   | 9616  | 1.5097          |
| 1.6144        | 10.0  | 10685 | 1.5111          |
| 1.611         | 11.0  | 11753 | 1.5248          |
| 1.603         | 12.0  | 12822 | 1.4989          |
| 1.6003        | 13.0  | 13890 | 1.5071          |
| 1.5915        | 14.0  | 14959 | 1.4807          |
| 1.5893        | 15.0  | 16027 | 1.4892          |
| 1.5857        | 16.0  | 17096 | 1.4794          |
| 1.5839        | 17.0  | 18164 | 1.4893          |
| 1.5806        | 18.0  | 19233 | 1.4787          |
| 1.5808        | 19.0  | 20301 | 1.4872          |
| 1.5781        | 19.99 | 21360 | 1.4917          |

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.15.2