QA_BERT

This model was trained from scratch on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.7595

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 320
  • eval_batch_size: 40
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 25
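
With a `linear` scheduler and no warmup listed, the learning rate under the usual Hugging Face convention (as in `get_linear_schedule_with_warmup` with zero warmup steps) decays linearly from the peak rate to zero over the total number of optimization steps; the results table below shows one step per epoch, so 25 steps in total. A minimal sketch, with the function name chosen here for illustration:

```python
def linear_lr(step, total_steps=25, peak_lr=2e-05):
    """Linearly decay the learning rate from peak_lr at step 0 to 0 at total_steps."""
    remaining = max(0, total_steps - step)
    return peak_lr * (remaining / total_steps)

print(linear_lr(0))   # 2e-05 (peak rate at the first step)
print(linear_lr(25))  # 0.0 (fully decayed at the final step)
```

This is a sketch of the schedule's shape only, not code from the original training run.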

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 1    | 5.5075          |
| No log        | 2.0   | 2    | 4.8600          |
| No log        | 3.0   | 3    | 4.5288          |
| No log        | 4.0   | 4    | 4.2908          |
| No log        | 5.0   | 5    | 4.1877          |
| No log        | 6.0   | 6    | 4.1192          |
| No log        | 7.0   | 7    | 4.0090          |
| No log        | 8.0   | 8    | 3.8800          |
| No log        | 9.0   | 9    | 3.7745          |
| No log        | 10.0  | 10   | 3.7043          |
| No log        | 11.0  | 11   | 3.6591          |
| No log        | 12.0  | 12   | 3.6300          |
| No log        | 13.0  | 13   | 3.6148          |
| No log        | 14.0  | 14   | 3.6134          |
| No log        | 15.0  | 15   | 3.6236          |
| No log        | 16.0  | 16   | 3.6432          |
| No log        | 17.0  | 17   | 3.6669          |
| No log        | 18.0  | 18   | 3.6900          |
| No log        | 19.0  | 19   | 3.7081          |
| No log        | 20.0  | 20   | 3.7238          |
| No log        | 21.0  | 21   | 3.7352          |
| No log        | 22.0  | 22   | 3.7451          |
| No log        | 23.0  | 23   | 3.7520          |
| No log        | 24.0  | 24   | 3.7595          |
| No log        | 25.0  | 25   | 3.7595          |
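
Validation loss bottoms out at epoch 14 (3.6134) and rises steadily afterwards, so the final checkpoint (3.7595) is not the best one; with one step per epoch (consistent with a train batch size of 320 covering the whole training set), early stopping around epoch 14 would likely help. The best epoch can be read off the table programmatically (the losses below are copied verbatim from the table; the selection logic is an illustration, not part of the original run):

```python
# Validation losses for epochs 1..25, taken from the training results table.
val_losses = [5.5075, 4.8600, 4.5288, 4.2908, 4.1877, 4.1192, 4.0090,
              3.8800, 3.7745, 3.7043, 3.6591, 3.6300, 3.6148, 3.6134,
              3.6236, 3.6432, 3.6669, 3.6900, 3.7081, 3.7238, 3.7352,
              3.7451, 3.7520, 3.7569, 3.7595]

# Epochs are 1-indexed; min() over (loss, epoch) pairs picks the best checkpoint.
best_loss, best_epoch = min((loss, epoch) for epoch, loss in enumerate(val_losses, start=1))
print(best_epoch, best_loss)  # 14 3.6134
```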

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0
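
To reproduce this environment, the listed versions can be pinned at install time. This is an environment sketch, not a command from the original card; the `+cu121` PyTorch build normally comes from the PyTorch wheel index rather than plain PyPI:

```shell
pip install "transformers==4.36.2" "datasets==2.14.7" "tokenizers==0.15.0"
pip install "torch==2.1.2" --index-url https://download.pytorch.org/whl/cu121
```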