baseline_0.2

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4983
  • Exact Match: 0.451
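
The card does not specify how exact match is computed; a minimal sketch of the usual definition (the fraction of predictions that match their references verbatim), for illustration only:

```python
def exact_match(predictions, references):
    """Assumed metric: share of predictions identical to their references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)

# Example: one of two predictions matches, so the score is 0.5.
print(exact_match(["the cat sat", "on a mat"], ["the cat sat", "on the mat"]))
```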

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 400
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
  • lr_scheduler_type: inverse_sqrt
  • lr_scheduler_warmup_steps: 4000
  • training_steps: 20000
  • label_smoothing_factor: 0.1
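
A minimal sketch of how these values map onto transformers.TrainingArguments, assuming single-device training with the standard Trainer API (the actual training script is not included in this card, and train_batch_size is taken to mean the per-device batch size):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction; "baseline_0.2" as output_dir is an assumption.
training_args = TrainingArguments(
    output_dir="baseline_0.2",
    learning_rate=1e-3,
    per_device_train_batch_size=400,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="inverse_sqrt",
    warmup_steps=4000,
    max_steps=20000,
    label_smoothing_factor=0.1,
)
```

For reference, an inverse-sqrt schedule combines a linear warmup with a learning rate that then decays proportionally to 1/√step. A sketch of the common formulation (the exact transformers implementation may differ in small details):

```python
import math

def inverse_sqrt_lr(step, base_lr=1e-3, warmup_steps=4000):
    """Common inverse-sqrt schedule: linear warmup, then lr ~ 1/sqrt(step)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * math.sqrt(warmup_steps / step)

# The peak learning rate (1e-3) is reached at step 4000 and then decays;
# by the final step (20000) it has fallen to 1e-3 * sqrt(4000/20000) = ~4.5e-4.
```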

Training results

| Training Loss | Epoch     | Step  | Validation Loss | Exact Match |
|:-------------:|:---------:|:-----:|:---------------:|:-----------:|
| 1.0167        | 30.7692   | 400   | 1.6942          | 0.081       |
| 1.01          | 61.5385   | 800   | 1.6964          | 0.071       |
| 1.0007        | 92.3077   | 1200  | 1.6578          | 0.109       |
| 0.9878        | 123.0769  | 1600  | 1.6108          | 0.157       |
| 0.9735        | 153.8462  | 2000  | 1.5436          | 0.201       |
| 0.9548        | 184.6154  | 2400  | 1.5288          | 0.26        |
| 0.9366        | 215.3846  | 2800  | 1.4851          | 0.275       |
| 0.9174        | 246.1538  | 3200  | 1.5143          | 0.28        |
| 0.8983        | 276.9231  | 3600  | 1.4985          | 0.274       |
| 0.8818        | 307.6923  | 4000  | 1.4550          | 0.323       |
| 0.8614        | 338.4615  | 4400  | 1.4834          | 0.332       |
| 0.8408        | 369.2308  | 4800  | 1.4253          | 0.394       |
| 0.8247        | 400.0     | 5200  | 1.4800          | 0.371       |
| 0.8119        | 430.7692  | 5600  | 1.4821          | 0.394       |
| 0.8006        | 461.5385  | 6000  | 1.4741          | 0.426       |
| 0.7919        | 492.3077  | 6400  | 1.4651          | 0.434       |
| 0.7856        | 523.0769  | 6800  | 1.5023          | 0.407       |
| 0.7797        | 553.8462  | 7200  | 1.4724          | 0.435       |
| 0.7748        | 584.6154  | 7600  | 1.5038          | 0.442       |
| 0.7707        | 615.3846  | 8000  | 1.5089          | 0.424       |
| 0.7675        | 646.1538  | 8400  | 1.5079          | 0.447       |
| 0.7645        | 676.9231  | 8800  | 1.5561          | 0.415       |
| 0.7612        | 707.6923  | 9200  | 1.5001          | 0.448       |
| 0.7592        | 738.4615  | 9600  | 1.5018          | 0.42        |
| 0.757         | 769.2308  | 10000 | 1.4909          | 0.45        |
| 0.7554        | 800.0     | 10400 | 1.5328          | 0.442       |
| 0.7532        | 830.7692  | 10800 | 1.4890          | 0.435       |
| 0.752         | 861.5385  | 11200 | 1.5386          | 0.425       |
| 0.7501        | 892.3077  | 11600 | 1.4787          | 0.442       |
| 0.7491        | 923.0769  | 12000 | 1.5313          | 0.43        |
| 0.7482        | 953.8462  | 12400 | 1.5069          | 0.431       |
| 0.7467        | 984.6154  | 12800 | 1.4891          | 0.457       |
| 0.7459        | 1015.3846 | 13200 | 1.4972          | 0.433       |
| 0.7449        | 1046.1538 | 13600 | 1.5395          | 0.42        |
| 0.7442        | 1076.9231 | 14000 | 1.5231          | 0.444       |
| 0.7435        | 1107.6923 | 14400 | 1.5112          | 0.425       |
| 0.7426        | 1138.4615 | 14800 | 1.5193          | 0.434       |
| 0.742         | 1169.2308 | 15200 | 1.5144          | 0.448       |
| 0.7411        | 1200.0    | 15600 | 1.5226          | 0.421       |
| 0.7407        | 1230.7692 | 16000 | 1.5013          | 0.461       |
| 0.7398        | 1261.5385 | 16400 | 1.5162          | 0.442       |
| 0.7394        | 1292.3077 | 16800 | 1.5417          | 0.418       |
| 0.7391        | 1323.0769 | 17200 | 1.5341          | 0.44        |
| 0.7386        | 1353.8462 | 17600 | 1.5455          | 0.432       |
| 0.7382        | 1384.6154 | 18000 | 1.5646          | 0.436       |
| 0.7374        | 1415.3846 | 18400 | 1.5468          | 0.43        |
| 0.7372        | 1446.1538 | 18800 | 1.5248          | 0.446       |
| 0.7367        | 1476.9231 | 19200 | 1.5088          | 0.461       |
| 0.7363        | 1507.6923 | 19600 | 1.5517          | 0.422       |
| 0.7359        | 1538.4615 | 20000 | 1.5061          | 0.444       |

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 7.36M params (F32, Safetensors)
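
A minimal sketch of loading the checkpoint with the pinned library versions above; the repository id and the auto class are assumptions, since the base architecture is not stated in this card:

```python
from transformers import AutoModel, AutoTokenizer

repo_id = "baseline_0.2"  # hypothetical hub path; replace with the actual repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)

# Sanity check against the card: the checkpoint holds ~7.36M F32 parameters.
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```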