metadata
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: exceptions_exp2_last_to_hit_frequency_2128
    results: []

exceptions_exp2_last_to_hit_frequency_2128

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5762
  • Accuracy: 0.3669
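
If the reported loss is a mean per-token cross-entropy measured in nats, it can be converted to perplexity for easier interpretation. This is an assumption: the card does not state the task or how the loss is defined. The helper name `perplexity_from_loss` is illustrative and is not part of any library.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity.

    Assumes the loss is an average negative log-likelihood per token.
    """
    return math.exp(mean_cross_entropy)


eval_loss = 3.5762  # evaluation loss reported in this card
print(f"perplexity (under the stated assumption): {perplexity_from_loss(eval_loss):.2f}")
```

If the loss is defined differently (for example, summed rather than averaged), this conversion does not apply.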

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 2128
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
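
The relationship between the batch-size settings and the shape of a linear schedule with warmup can be sketched in plain Python. This is a minimal illustration, not the training code. It assumes a single device (consistent with 16 × 5 = 80), and it estimates the total step count from the last row of the results table (step 100000 at epoch 29.1293) scaled to 50 epochs. The function names are hypothetical.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update."""
    return per_device * grad_accum * num_devices


def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values from the hyperparameter list above.
BASE_LR = 0.0006
WARMUP_STEPS = 100

# ASSUMPTION: total steps estimated as 50 epochs at the step/epoch ratio
# implied by the results table; the card does not state this number.
TOTAL_STEPS_ESTIMATE = round(50 * 100000 / 29.1293)

print("effective batch size:", effective_batch_size(16, 5))
for step in (0, 50, 100, 50000, 100000):
    lr = linear_warmup_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS_ESTIMATE)
    print(f"step {step:>6}: lr = {lr:.6f}")
```

With only 100 warmup steps, the learning rate reaches its peak almost immediately and spends the rest of the run decaying.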

Training results

Training Loss Epoch Step Accuracy Validation Loss
4.8229 0.2913 1000 0.2537 4.7555
4.3272 0.5826 2000 0.2996 4.2827
4.142 0.8739 3000 0.3152 4.0961
3.9925 1.1652 4000 0.3253 3.9905
3.9195 1.4565 5000 0.3317 3.9143
3.8842 1.7478 6000 0.3369 3.8600
3.7443 2.0390 7000 0.3412 3.8168
3.7503 2.3303 8000 0.3443 3.7862
3.7376 2.6216 9000 0.3468 3.7569
3.7244 2.9130 10000 0.3495 3.7275
3.6353 3.2042 11000 0.3514 3.7176
3.6428 3.4955 12000 0.3530 3.6977
3.646 3.7868 13000 0.3546 3.6797
3.5479 4.0781 14000 0.3563 3.6728
3.5782 4.3694 15000 0.3572 3.6620
3.5819 4.6607 16000 0.3583 3.6463
3.5792 4.9520 17000 0.3595 3.6341
3.512 5.2432 18000 0.3602 3.6385
3.5208 5.5345 19000 0.3610 3.6262
3.5273 5.8259 20000 0.3621 3.6148
3.4475 6.1171 21000 0.3623 3.6193
3.4806 6.4084 22000 0.3629 3.6127
3.4748 6.6997 23000 0.3639 3.6005
3.4908 6.9910 24000 0.3644 3.5943
3.4153 7.2823 25000 0.3642 3.6025
3.4552 7.5736 26000 0.3650 3.5934
3.4603 7.8649 27000 0.3657 3.5852
3.3745 8.1561 28000 0.3660 3.5920
3.406 8.4474 29000 0.3660 3.5856
3.429 8.7388 30000 0.3669 3.5762
3.317 9.0300 31000 0.3673 3.5818
3.3833 9.3213 32000 0.3672 3.5804
3.3999 9.6126 33000 0.3681 3.5710
3.4035 9.9039 34000 0.3680 3.5646
3.3507 10.1952 35000 0.3683 3.5742
3.3689 10.4865 36000 0.3684 3.5697
3.379 10.7778 37000 0.3692 3.5598
3.2905 11.0690 38000 0.3688 3.5722
3.3332 11.3603 39000 0.3691 3.5677
3.3602 11.6517 40000 0.3691 3.5606
3.3818 11.9430 41000 0.3703 3.5513
3.2924 12.2342 42000 0.3695 3.5651
3.3339 12.5255 43000 0.3701 3.5587
3.352 12.8168 44000 0.3709 3.5511
3.2587 13.1081 45000 0.3704 3.5614
3.3059 13.3994 46000 0.3704 3.5590
3.3337 13.6907 47000 0.3709 3.5515
3.3343 13.9820 48000 0.3714 3.5442
3.2804 14.2732 49000 0.3708 3.5584
3.293 14.5646 50000 0.3713 3.5531
3.3208 14.8559 51000 0.3716 3.5444
3.2449 15.1471 52000 0.3711 3.5573
3.2765 15.4384 53000 0.3717 3.5522
3.2996 15.7297 54000 0.3719 3.5449
3.195 16.0210 55000 0.3716 3.5555
3.2685 16.3123 56000 0.3718 3.5511
3.2776 16.6036 57000 0.3719 3.5490
3.3034 16.8949 58000 0.3726 3.5382
3.2318 17.1861 59000 0.3719 3.5549
3.2693 17.4775 60000 0.3726 3.5509
3.2845 17.7688 61000 0.3728 3.5412
3.2022 18.0600 62000 0.3724 3.5512
3.2434 18.3513 63000 0.3726 3.5515
3.2583 18.6426 64000 0.3728 3.5427
3.2754 18.9339 65000 0.3732 3.5360
3.1981 19.2252 66000 0.3726 3.5563
3.2406 19.5165 67000 0.3731 3.5442
3.2603 19.8078 68000 0.3735 3.5391
3.1759 20.0990 69000 0.3731 3.5492
3.2213 20.3904 70000 0.3732 3.5476
3.2306 20.6817 71000 0.3734 3.5419
3.259 20.9730 72000 0.3741 3.5322
3.2027 21.2642 73000 0.3728 3.5526
3.2301 21.5555 74000 0.3735 3.5442
3.2353 21.8468 75000 0.3741 3.5360
3.1793 22.1381 76000 0.3731 3.5540
3.2118 22.4294 77000 0.3734 3.5473
3.2325 22.7207 78000 0.3739 3.5397
3.1549 23.0119 79000 0.3738 3.5458
3.1875 23.3033 80000 0.3732 3.5524
3.1815 23.5946 81000 0.3736 3.5515
3.2017 23.8859 82000 0.3737 3.5459
3.1608 24.1774 83000 0.3733 3.5538
3.1977 24.4687 84000 0.3740 3.5465
3.2052 24.7600 85000 0.3745 3.5396
3.122 25.0513 86000 0.3738 3.5503
3.1656 25.3426 87000 0.3739 3.5481
3.1894 25.6339 88000 0.3742 3.5432
3.2088 25.9252 89000 0.3746 3.5323
3.1508 26.2164 90000 0.3737 3.5524
3.1713 26.5077 91000 0.3743 3.5424
3.1843 26.7991 92000 0.3747 3.5353
3.1242 27.0903 93000 0.3741 3.5492
3.1564 27.3816 94000 0.3745 3.5455
3.1625 27.6729 95000 0.3750 3.5386
3.1875 27.9642 96000 0.3748 3.5359
3.1266 28.2555 97000 0.3743 3.5481
3.1698 28.5468 98000 0.3747 3.5439
3.1624 28.8381 99000 0.3750 3.5402
3.1128 29.1293 100000 0.3746 3.5496
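
To compare checkpoints, rows like those above can be parsed and ranked by validation loss. The sketch below embeds only a handful of rows copied from the table, with columns in the header's order (Training Loss, Epoch, Step, Accuracy, Validation Loss). The `EvalRow` type and `parse_rows` helper are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    accuracy: float
    val_loss: float


# A small sample of rows copied from the table; not the full log.
SAMPLE_ROWS = """\
4.8229 0.2913 1000 0.2537 4.7555
3.429 8.7388 30000 0.3669 3.5762
3.3034 16.8949 58000 0.3726 3.5382
3.259 20.9730 72000 0.3741 3.5322
3.1875 23.3033 80000 0.3732 3.5524
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows; skip lines without exactly 5 fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        rows.append(
            EvalRow(
                train_loss=float(parts[0]),
                epoch=float(parts[1]),
                step=int(parts[2]),
                accuracy=float(parts[3]),
                val_loss=float(parts[4]),
            )
        )
    return rows


rows = parse_rows(SAMPLE_ROWS)
best = min(rows, key=lambda r: r.val_loss)
print(f"lowest validation loss in sample: step {best.step}, loss {best.val_loss}")
```

Ranking by validation loss rather than training loss matters here: training loss keeps falling throughout the table, while validation loss flattens out around 3.53 to 3.56.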

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
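
To check whether a local environment matches the versions listed above, the installed package metadata can be compared against them using only the standard library. This is a sketch: the `compare_versions` helper is a hypothetical name, and it assumes PyTorch is installed under the distribution name `torch`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Versions as listed in this card.
PINNED = {
    "transformers": "4.55.2",
    "torch": "2.8.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}


def compare_versions(pinned: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Map each package to (wanted, installed or None, exact match?)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions(PINNED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```

An exact string comparison is deliberately strict; a build with a different CUDA suffix will be reported as differing even if it behaves the same.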