metadata
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: exceptions_exp2_last_to_push_frequency_5039
    results: []

exceptions_exp2_last_to_push_frequency_5039

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (the values match the step 40000 row of the training results):

  • Loss: 3.5582
  • Accuracy: 0.3697
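
If the reported loss is a mean per-token cross-entropy measured in nats (an assumption; the card does not state the task or the loss function), it can be converted to a perplexity for easier comparison with other models. A minimal sketch:

```python
import math

# Evaluation loss reported in this card.
eval_loss = 3.5582

# Assumption: the loss is a mean cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~= {perplexity:.1f}")
```

If the loss is defined differently (for example, summed rather than averaged), this conversion does not apply.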

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 5039
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
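
The relationship between these values can be sketched in a few lines. The snippet below shows how the total batch size follows from the per-device batch size and gradient accumulation, and how a linear schedule with warmup would evolve. It assumes a single training device and about 3434 optimizer steps per epoch (estimated from the training results, where step 1000 is logged at epoch 0.2912). Neither figure is stated in the card, and `linear_lr` is a hypothetical helper, not the Trainer's own code.

```python
# Values copied from the hyperparameter list in this card.
train_batch_size = 16
gradient_accumulation_steps = 5
learning_rate = 6e-4
warmup_steps = 100
num_epochs = 50

# Assumption: one training device, which is consistent with
# 16 * 5 = 80 matching the reported total_train_batch_size.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: steps per epoch estimated from the results table
# (step 1000 is logged at epoch 0.2912).
steps_per_epoch = round(1000 / 0.2912)
total_steps = steps_per_epoch * num_epochs


def linear_lr(step: int) -> float:
    """Linear warmup to the peak rate, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
print("estimated total steps:", total_steps)
for step in (0, 50, 100, 50_000, 109_000):
    print(step, f"{linear_lr(step):.6f}")
```

The estimated total step count only holds if training runs for all 50 epochs; the logged results stop at step 109000 (epoch 31.7417), so the learning rate at that point would still be well above zero under this schedule.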

Training results

Training Loss Epoch Step Accuracy Validation Loss
4.8399 0.2912 1000 0.2546 4.7571
4.3272 0.5824 2000 0.3003 4.2753
4.1337 0.8737 3000 0.3159 4.0920
3.9915 1.1648 4000 0.3256 3.9854
3.91 1.4561 5000 0.3322 3.9122
3.8822 1.7473 6000 0.3370 3.8543
3.7549 2.0384 7000 0.3418 3.8109
3.751 2.3297 8000 0.3446 3.7817
3.7396 2.6209 9000 0.3475 3.7505
3.7105 2.9121 10000 0.3496 3.7247
3.6418 3.2033 11000 0.3519 3.7117
3.6293 3.4945 12000 0.3533 3.6937
3.642 3.7857 13000 0.3556 3.6763
3.543 4.0769 14000 0.3560 3.6722
3.5577 4.3681 15000 0.3575 3.6585
3.5824 4.6593 16000 0.3588 3.6462
3.5754 4.9506 17000 0.3599 3.6312
3.5 5.2417 18000 0.3602 3.6348
3.5244 5.5329 19000 0.3612 3.6237
3.5255 5.8242 20000 0.3623 3.6134
3.4384 6.1153 21000 0.3628 3.6150
3.4611 6.4065 22000 0.3634 3.6078
3.5017 6.6978 23000 0.3638 3.6003
3.4861 6.9890 24000 0.3650 3.5895
3.4213 7.2802 25000 0.3646 3.5996
3.4574 7.5714 26000 0.3649 3.5899
3.4488 7.8626 27000 0.3660 3.5817
3.3653 8.1538 28000 0.3656 3.5892
3.4157 8.4450 29000 0.3666 3.5810
3.4305 8.7362 30000 0.3670 3.5749
3.3188 9.0274 31000 0.3675 3.5799
3.367 9.3186 32000 0.3672 3.5794
3.4041 9.6098 33000 0.3676 3.5733
3.4055 9.9010 34000 0.3685 3.5640
3.3366 10.1922 35000 0.3681 3.5721
3.3638 10.4834 36000 0.3686 3.5680
3.3768 10.7747 37000 0.3688 3.5606
3.2775 11.0658 38000 0.3689 3.5706
3.331 11.3570 39000 0.3696 3.5672
3.3565 11.6483 40000 0.3697 3.5582
3.3535 11.9395 41000 0.3703 3.5503
3.3005 12.2306 42000 0.3701 3.5637
3.3365 12.5219 43000 0.3700 3.5570
3.3409 12.8131 44000 0.3706 3.5486
3.2575 13.1043 45000 0.3700 3.5608
3.2947 13.3955 46000 0.3707 3.5595
3.316 13.6867 47000 0.3710 3.5519
3.3273 13.9779 48000 0.3715 3.5409
3.2788 14.2691 49000 0.3708 3.5570
3.3039 14.5603 50000 0.3712 3.5518
3.3262 14.8515 51000 0.3718 3.5432
3.2326 15.1427 52000 0.3708 3.5588
3.2782 15.4339 53000 0.3716 3.5535
3.3013 15.7251 54000 0.3721 3.5432
3.2179 16.0163 55000 0.3715 3.5536
3.2511 16.3075 56000 0.3715 3.5548
3.2776 16.5988 57000 0.3724 3.5451
3.3012 16.8900 58000 0.3724 3.5374
3.2162 17.1811 59000 0.3720 3.5568
3.2594 17.4724 60000 0.3723 3.5469
3.262 17.7636 61000 0.3729 3.5397
3.1853 18.0547 62000 0.3727 3.5497
3.239 18.3460 63000 0.3726 3.5486
3.2576 18.6372 64000 0.3728 3.5436
3.2736 18.9284 65000 0.3734 3.5318
3.1832 19.2196 66000 0.3727 3.5538
3.2362 19.5108 67000 0.3728 3.5470
3.2708 19.8020 68000 0.3732 3.5365
3.1756 20.0932 69000 0.3731 3.5479
3.2256 20.3844 70000 0.3727 3.5475
3.2426 20.6756 71000 0.3733 3.5405
3.2544 20.9669 72000 0.3740 3.5309
3.1981 21.2580 73000 0.3729 3.5506
3.2153 21.5492 74000 0.3736 3.5443
3.2287 21.8405 75000 0.3742 3.5345
3.1662 22.1316 76000 0.3728 3.5559
3.1907 22.4229 77000 0.3737 3.5441
3.2248 22.7141 78000 0.3740 3.5370
3.1964 23.0052 79000 0.3738 3.5501
3.1882 23.2965 80000 0.3736 3.5463
3.1855 23.5877 81000 0.3735 3.5508
3.1989 23.8789 82000 0.3738 3.5459
3.1544 24.1704 83000 0.3731 3.5584
3.1822 24.4616 84000 0.3738 3.5505
3.2122 24.7528 85000 0.3741 3.5393
3.1227 25.0440 86000 0.3736 3.5507
3.1596 25.3352 87000 0.3737 3.5509
3.1876 25.6264 88000 0.3743 3.5418
3.1966 25.9176 89000 0.3750 3.5306
3.1457 26.2088 90000 0.3740 3.5505
3.1694 26.5000 91000 0.3742 3.5417
3.1857 26.7913 92000 0.3748 3.5379
3.1146 27.0824 93000 0.3737 3.5564
3.1478 27.3736 94000 0.3741 3.5478
3.1677 27.6649 95000 0.3749 3.5378
3.1917 27.9561 96000 0.3749 3.5335
3.1284 28.2472 97000 0.3742 3.5506
3.1637 28.5385 98000 0.3749 3.5436
3.1684 28.8297 99000 0.3746 3.5397
3.1085 29.1209 100000 0.3739 3.5619
3.1363 29.4121 101000 0.3745 3.5465
3.1494 29.7033 102000 0.3749 3.5423
3.1697 29.9945 103000 0.3751 3.5358
3.1182 30.2857 104000 0.3742 3.5521
3.123 30.5769 105000 0.3749 3.5467
3.1642 30.8681 106000 0.3753 3.5389
3.1068 31.1593 107000 0.3747 3.5552
3.1252 31.4505 108000 0.3746 3.5485
3.1427 31.7417 109000 0.3751 3.5421
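
One way to read these results is to compare training and validation loss over time. The sketch below uses five rows copied from the table, written explicitly as (step, training loss, validation loss, accuracy); it is an illustration on a subset, not an analysis of the full log. Among all logged rows, step 89000 has the lowest validation loss (3.5306).

```python
# A few rows copied from the training results, as
# (step, training_loss, validation_loss, accuracy).
rows = [
    (1000, 4.8399, 4.7571, 0.2546),
    (10000, 3.7105, 3.7247, 0.3496),
    (40000, 3.3565, 3.5582, 0.3697),
    (89000, 3.1966, 3.5306, 0.3750),
    (109000, 3.1427, 3.5421, 0.3751),
]

# Row with the lowest validation loss among the sampled rows.
best = min(rows, key=lambda r: r[2])

# Gap between validation and training loss at each sampled step.
# A widening gap suggests the model keeps fitting the training set
# while gaining little on held-out data.
gaps = {step: round(val - train, 4) for step, train, val, _ in rows}

print("lowest validation loss:", best[2], "at step", best[0])
for step, gap in gaps.items():
    print(f"step {step}: validation - training = {gap:+.4f}")
```

In these sampled rows the gap grows steadily while validation loss changes by less than 0.03 after step 40000, which is the kind of pattern that often motivates early stopping or selecting an intermediate checkpoint.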

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
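
To check whether a local environment matches these versions, something like the following could be used. The distribution names are assumptions (PyTorch is looked up as `torch`), and `compare_versions` is a hypothetical helper written for this example.

```python
from importlib import metadata

# Versions listed in this card. The distribution names are an
# assumption; PyTorch is looked up under the name "torch".
expected = {
    "transformers": "4.55.2",
    "torch": "2.8.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}


def compare_versions(expected_versions):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected_versions.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(expected).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags differences from the environment recorded in the card.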