metadata
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: exceptions_exp2_last_to_drop_frequency_40817
    results: []


exceptions_exp2_last_to_drop_frequency_40817

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5595
  • Accuracy: 0.3693
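To put the evaluation loss in more familiar terms, it can be converted to a perplexity. This is a minimal sketch that assumes the reported loss is a mean token-level cross-entropy in nats, which is typical for language-model fine-tuning with the Trainer but is not stated in this card.

```python
import math

# Evaluation loss as reported above.
eval_loss = 3.5595

# Assumption: the loss is a mean cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.1f}")
```

Under that assumption the loss corresponds to a perplexity of roughly 35.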

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 40817
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
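The hyperparameters above can be sketched as keyword arguments for `transformers.TrainingArguments`, together with the arithmetic behind the total batch size and the linear schedule. Several things here are assumptions, not facts from this card: `fp16` as the meaning of "Native AMP" (it could equally be `bf16`), a single training device (which is what 16 × 5 = 80 implies), and `TOTAL_STEPS`, which is estimated from the results table (1000 steps ≈ 0.2912 epochs) times 50 epochs. The helper names `effective_batch_size` and `linear_lr` are illustrative only.

```python
# Values copied from the hyperparameter list; key names follow
# transformers.TrainingArguments.
hparams = {
    "learning_rate": 6e-4,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 40817,
    "gradient_accumulation_steps": 5,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.98,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 50.0,
    "fp16": True,  # assumption: "Native AMP" may instead mean bf16=True
}


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


# Assumption: about 3434 optimizer steps per epoch, times 50 epochs.
TOTAL_STEPS = 171_700


def linear_lr(step: int,
              base_lr: float = hparams["learning_rate"],
              warmup_steps: int = hparams["warmup_steps"],
              total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


total_batch = effective_batch_size(
    hparams["per_device_train_batch_size"],
    hparams["gradient_accumulation_steps"],
)
print(total_batch, linear_lr(50), linear_lr(100))
```

With the Transformers library installed, something like `TrainingArguments(output_dir="out", **hparams)` should accept these values, where `"out"` is a placeholder path; note that `fp16=True` generally requires a GPU.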

Training results

Training Loss Epoch Step Accuracy Validation Loss
4.849 0.2912 1000 0.2538 4.7611
4.3402 0.5824 2000 0.2993 4.2887
4.1563 0.8736 3000 0.3152 4.0992
3.9798 1.1645 4000 0.3253 3.9908
3.9189 1.4557 5000 0.3321 3.9140
3.8836 1.7469 6000 0.3368 3.8585
3.7526 2.0379 7000 0.3411 3.8162
3.7682 2.3290 8000 0.3440 3.7890
3.7477 2.6202 9000 0.3469 3.7572
3.73 2.9114 10000 0.3494 3.7300
3.6397 3.2024 11000 0.3513 3.7153
3.6405 3.4936 12000 0.3531 3.6972
3.6412 3.7848 13000 0.3545 3.6827
3.5371 4.0757 14000 0.3560 3.6699
3.5768 4.3669 15000 0.3568 3.6631
3.577 4.6581 16000 0.3581 3.6506
3.5711 4.9493 17000 0.3594 3.6346
3.5066 5.2402 18000 0.3600 3.6377
3.5294 5.5314 19000 0.3611 3.6271
3.538 5.8226 20000 0.3619 3.6166
3.4415 6.1136 21000 0.3621 3.6199
3.475 6.4048 22000 0.3627 3.6099
3.4859 6.6959 23000 0.3636 3.6046
3.4951 6.9871 24000 0.3645 3.5929
3.4339 7.2781 25000 0.3643 3.5998
3.45 7.5693 26000 0.3647 3.5947
3.4628 7.8605 27000 0.3658 3.5823
3.3902 8.1514 28000 0.3655 3.5931
3.4052 8.4426 29000 0.3662 3.5856
3.4335 8.7338 30000 0.3667 3.5787
3.3189 9.0248 31000 0.3672 3.5799
3.3782 9.3159 32000 0.3669 3.5816
3.4026 9.6071 33000 0.3675 3.5727
3.4082 9.8983 34000 0.3684 3.5641
3.3356 10.1893 35000 0.3677 3.5779
3.3828 10.4805 36000 0.3682 3.5688
3.3911 10.7716 37000 0.3689 3.5626
3.2861 11.0626 38000 0.3686 3.5719
3.3361 11.3538 39000 0.3688 3.5682
3.361 11.6450 40000 0.3693 3.5595
3.382 11.9362 41000 0.3700 3.5506
3.3023 12.2271 42000 0.3694 3.5665
3.3451 12.5183 43000 0.3702 3.5592
3.3421 12.8095 44000 0.3703 3.5502
3.278 13.1005 45000 0.3697 3.5647
3.3183 13.3916 46000 0.3703 3.5587
3.3367 13.6828 47000 0.3709 3.5523
3.3425 13.9740 48000 0.3712 3.5452
3.2799 14.2650 49000 0.3707 3.5586
3.3193 14.5562 50000 0.3709 3.5554
3.3317 14.8474 51000 0.3714 3.5418
3.2427 15.1383 52000 0.3710 3.5579
3.2801 15.4295 53000 0.3715 3.5520
3.3063 15.7207 54000 0.3717 3.5437
3.2192 16.0116 55000 0.3716 3.5548
3.2589 16.3028 56000 0.3718 3.5538
3.2819 16.5940 57000 0.3720 3.5462
3.2991 16.8852 58000 0.3723 3.5384
3.2153 17.1762 59000 0.3717 3.5556
3.2642 17.4674 60000 0.3723 3.5492
3.2743 17.7585 61000 0.3721 3.5407
3.1953 18.0495 62000 0.3721 3.5563
3.2446 18.3407 63000 0.3721 3.5541
3.263 18.6319 64000 0.3727 3.5428
3.2807 18.9231 65000 0.3732 3.5372
3.2138 19.2140 66000 0.3722 3.5523
3.242 19.5052 67000 0.3728 3.5471
3.253 19.7964 68000 0.3732 3.5371
3.17 20.0874 69000 0.3724 3.5569
3.2225 20.3785 70000 0.3731 3.5482
3.237 20.6697 71000 0.3737 3.5437
3.263 20.9609 72000 0.3735 3.5322
3.2052 21.2519 73000 0.3727 3.5532
3.2214 21.5431 74000 0.3735 3.5449
3.2313 21.8343 75000 0.3740 3.5370
3.1508 22.1252 76000 0.3730 3.5523
3.2084 22.4164 77000 0.3734 3.5473
3.2192 22.7076 78000 0.3736 3.5417
3.2454 22.9988 79000 0.3740 3.5337
3.1837 23.2897 80000 0.3735 3.5482
3.1719 23.5809 81000 0.3729 3.5576
3.2133 23.8721 82000 0.3735 3.5455
3.1573 24.1634 83000 0.3733 3.5548
3.1941 24.4545 84000 0.3734 3.5522
3.2039 24.7457 85000 0.3740 3.5436
3.1082 25.0367 86000 0.3735 3.5517
3.1654 25.3279 87000 0.3736 3.5514
3.1816 25.6191 88000 0.3740 3.5421
3.2067 25.9103 89000 0.3746 3.5390
3.1412 26.2012 90000 0.3737 3.5527
3.1852 26.4924 91000 0.3741 3.5452
3.1961 26.7836 92000 0.3746 3.5404
3.1188 27.0745 93000 0.3738 3.5554
3.1557 27.3657 94000 0.3741 3.5500
3.1792 27.6569 95000 0.3746 3.5387
3.1917 27.9481 96000 0.3747 3.5353
3.1361 28.2391 97000 0.3741 3.5542
3.1525 28.5303 98000 0.3744 3.5451
3.1863 28.8214 99000 0.3748 3.5396
3.0999 29.1124 100000 0.3739 3.5525
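
The table can be read programmatically to compare training and validation loss over time. The following is a small sketch, not part of the original training code: it parses a handful of rows copied from the table, written in the header's column order (Training Loss, Epoch, Step, Accuracy, Validation Loss). The names `EvalRow`, `SAMPLE`, and `parse_rows` are hypothetical. Keep in mind that the Trainer's logged training loss is typically an average over the logging interval, so the gap is only a rough indicator.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    accuracy: float
    val_loss: float


# A few rows from the table above (a sample, not the full log).
SAMPLE = """\
4.849 0.2912 1000 0.2538 4.7611
3.73 2.9114 10000 0.3494 3.7300
3.361 11.6450 40000 0.3693 3.5595
3.263 20.9609 72000 0.3735 3.5322
3.0999 29.1124 100000 0.3739 3.5525
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows, skipping lines that lack 5 fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        train_loss, epoch, step, accuracy, val_loss = parts
        rows.append(EvalRow(float(train_loss), float(epoch), int(step),
                            float(accuracy), float(val_loss)))
    return rows


rows = parse_rows(SAMPLE)

# Row with the lowest validation loss among the sampled rows.
best = min(rows, key=lambda r: r.val_loss)

# Validation loss minus training loss at each sampled step.
gaps = [(r.step, round(r.val_loss - r.train_loss, 4)) for r in rows]

print(best.step, best.val_loss)
for step, gap in gaps:
    print(step, gap)
```

In this sample the gap between validation and training loss widens steadily after the first few epochs while validation loss stays near 3.55, which may indicate that later epochs mostly fit the training set.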

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4