metadata
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: exceptions_exp2_last_to_carry_frequency_2128
    results: []

exceptions_exp2_last_to_carry_frequency_2128

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (these values match the step 40000 row of the training results table):

  • Loss: 3.5554
  • Accuracy: 0.3699
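
The card does not state the training objective. If the model is a language model and the reported loss is the mean per-token cross-entropy in nats (the usual case for Trainer-based language modeling, but an assumption here), the loss can be converted to perplexity as exp(loss). A minimal sketch:

```python
import math

# Values reported on the evaluation set in this card.
eval_loss = 3.5554
eval_accuracy = 0.3699

# Assumption: eval_loss is the mean cross-entropy per token, measured in nats.
# Under that assumption, perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss} -> perplexity of about {perplexity:.1f}")
```

Under the same assumption, the accuracy figure would be the fraction of evaluation tokens whose top prediction is correct, roughly 37%.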

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 2128
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
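
The list above can be sketched as keyword arguments for `transformers.TrainingArguments`, together with the two quantities it implies: the effective batch size and the linear learning-rate schedule. This is a reconstruction under assumptions, not the author's training script. The mapping of card fields to argument names, the single-device setup, `fp16` as the meaning of "Native AMP", and the `ASSUMED_TOTAL_STEPS` value are all guesses that the card does not confirm.

```python
# Hyperparameters copied from the card, keyed by the TrainingArguments names
# they most likely correspond to (an assumption, not the author's script).
training_kwargs = {
    "learning_rate": 6e-4,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 2128,
    "gradient_accumulation_steps": 5,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.98,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 50.0,
    "fp16": True,  # assumption: the card only says "Native AMP"
}

# Assumption: the card does not state the total number of optimizer steps.
# About 50 epochs at roughly 3,433 steps per epoch (implied by the Epoch and
# Step columns of the results table) gives this hypothetical value.
ASSUMED_TOTAL_STEPS = 171_650


def effective_batch_size(kwargs, num_devices=1):
    """Sequences consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, peak_lr=6e-4, warmup_steps=100):
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return peak_lr * remaining


print(effective_batch_size(training_kwargs))  # matches total_train_batch_size
print(linear_lr(50, ASSUMED_TOTAL_STEPS))     # halfway through warmup
print(linear_lr(100, ASSUMED_TOTAL_STEPS))    # peak learning rate
```

With `transformers` installed, the dictionary could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`. The output directory and every other setting not listed in the card would have to be chosen by the reader.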

Training results

Training Loss Epoch Step Accuracy Validation Loss
4.8401 0.2913 1000 0.2522 4.7710
4.3331 0.5826 2000 0.2995 4.2837
4.1526 0.8739 3000 0.3154 4.0941
4.0002 1.1652 4000 0.3249 3.9890
3.9325 1.4565 5000 0.3318 3.9133
3.8836 1.7477 6000 0.3370 3.8557
3.7659 2.0390 7000 0.3412 3.8152
3.7532 2.3303 8000 0.3440 3.7830
3.7549 2.6216 9000 0.3470 3.7537
3.7257 2.9129 10000 0.3496 3.7279
3.631 3.2042 11000 0.3518 3.7140
3.6437 3.4955 12000 0.3532 3.6977
3.6449 3.7868 13000 0.3550 3.6775
3.5368 4.0781 14000 0.3562 3.6726
3.5699 4.3694 15000 0.3573 3.6574
3.5808 4.6606 16000 0.3580 3.6481
3.5669 4.9519 17000 0.3597 3.6345
3.4856 5.2432 18000 0.3601 3.6362
3.5372 5.5345 19000 0.3611 3.6238
3.5283 5.8258 20000 0.3621 3.6126
3.4499 6.1171 21000 0.3623 3.6183
3.478 6.4084 22000 0.3632 3.6113
3.4924 6.6997 23000 0.3641 3.5981
3.4865 6.9910 24000 0.3646 3.5887
3.4371 7.2823 25000 0.3647 3.5988
3.4552 7.5736 26000 0.3653 3.5886
3.4686 7.8648 27000 0.3659 3.5796
3.3643 8.1561 28000 0.3662 3.5864
3.4177 8.4474 29000 0.3664 3.5810
3.4214 8.7387 30000 0.3669 3.5738
3.3159 9.0300 31000 0.3674 3.5793
3.3668 9.3213 32000 0.3676 3.5774
3.3972 9.6126 33000 0.3682 3.5689
3.416 9.9039 34000 0.3687 3.5606
3.3313 10.1952 35000 0.3680 3.5746
3.3625 10.4865 36000 0.3689 3.5653
3.3663 10.7777 37000 0.3696 3.5588
3.2875 11.0690 38000 0.3689 3.5697
3.3281 11.3603 39000 0.3695 3.5642
3.3652 11.6516 40000 0.3699 3.5554
3.3736 11.9429 41000 0.3704 3.5485
3.2965 12.2342 42000 0.3702 3.5624
3.3183 12.5255 43000 0.3701 3.5566
3.3537 12.8168 44000 0.3706 3.5467
3.2595 13.1081 45000 0.3703 3.5585
3.3001 13.3994 46000 0.3707 3.5545
3.3331 13.6906 47000 0.3711 3.5460
3.3397 13.9819 48000 0.3718 3.5406
3.277 14.2732 49000 0.3712 3.5563
3.3188 14.5645 50000 0.3718 3.5446
3.3368 14.8558 51000 0.3720 3.5390
3.2346 15.1471 52000 0.3716 3.5561
3.2882 15.4384 53000 0.3719 3.5504
3.2962 15.7297 54000 0.3723 3.5404
3.1914 16.0210 55000 0.3723 3.5491
3.2381 16.3123 56000 0.3721 3.5517
3.2719 16.6036 57000 0.3724 3.5453
3.3051 16.8948 58000 0.3731 3.5334
3.2186 17.1861 59000 0.3722 3.5521
3.2536 17.4774 60000 0.3724 3.5459
3.2864 17.7687 61000 0.3732 3.5370
3.1842 18.0600 62000 0.3726 3.5499
3.2292 18.3513 63000 0.3727 3.5450
3.2734 18.6426 64000 0.3732 3.5386
3.2742 18.9339 65000 0.3739 3.5325
3.2181 19.2252 66000 0.3726 3.5493
3.2331 19.5165 67000 0.3730 3.5432
3.2519 19.8077 68000 0.3737 3.5357
3.1681 20.0990 69000 0.3733 3.5499
3.2174 20.3903 70000 0.3733 3.5480
3.2276 20.6816 71000 0.3740 3.5363
3.2571 20.9729 72000 0.3742 3.5296
3.1985 21.2642 73000 0.3734 3.5459
3.2095 21.5555 74000 0.3739 3.5399
3.2244 21.8468 75000 0.3743 3.5314
3.1653 22.1381 76000 0.3734 3.5493
3.1924 22.4294 77000 0.3739 3.5421
3.2211 22.7207 78000 0.3746 3.5349
3.148 23.0119 79000 0.3739 3.5430
3.1766 23.3032 80000 0.3740 3.5468
3.1807 23.5945 81000 0.3738 3.5498
3.2014 23.8858 82000 0.3741 3.5405
3.159 24.1771 83000 0.3738 3.5526
3.1787 24.4684 84000 0.3739 3.5454
3.2001 24.7597 85000 0.3749 3.5354
3.1291 25.0510 86000 0.3742 3.5472
3.1614 25.3423 87000 0.3742 3.5451
3.1755 25.6336 88000 0.3748 3.5363
3.2004 25.9248 89000 0.3750 3.5326
3.1347 26.2161 90000 0.3740 3.5486
3.1675 26.5074 91000 0.3747 3.5394
3.192 26.7987 92000 0.3753 3.5304
3.1155 27.0900 93000 0.3744 3.5486
3.1508 27.3813 94000 0.3747 3.5469
3.1786 27.6726 95000 0.3750 3.5400
3.1759 27.9639 96000 0.3753 3.5315
3.1214 28.2552 97000 0.3744 3.5517
3.1545 28.5465 98000 0.3752 3.5401
3.1694 28.8378 99000 0.3756 3.5334
3.095 29.1290 100000 0.3744 3.5513
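
The Epoch and Step columns make it possible to estimate how many optimizer steps make up one epoch and, from the total train batch size of 80, roughly how many training sequences the dataset contains. The sketch below assumes that Step counts optimizer updates and that every update consumes exactly 80 sequences. Neither is stated in the card, so the dataset size is an estimate only.

```python
# (epoch, step) pairs copied from the first and last rows of the table.
first_epoch, first_step = 0.2913, 1000
last_epoch, last_step = 29.1290, 100000

# Values from the hyperparameter list in this card.
total_train_batch_size = 80
num_epochs = 50.0

# Optimizer steps per epoch, estimated from the last logged row.
steps_per_epoch = last_step / last_epoch

# Assumption: each step consumes exactly total_train_batch_size sequences,
# so this is only an approximate training-set size.
approx_sequences_per_epoch = round(steps_per_epoch) * total_train_batch_size

# Share of the configured 50 epochs covered by the logged rows.
fraction_of_schedule = last_epoch / num_epochs

print(f"steps per epoch: about {steps_per_epoch:.0f}")
print(f"training sequences per epoch: about {approx_sequences_per_epoch}")
print(f"logged share of configured epochs: {fraction_of_schedule:.1%}")
```

The first and last rows agree on about 3,433 steps per epoch. The logged rows stop near epoch 29 of the configured 50, and the card does not say why.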

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4