---
library_name: transformers
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: exceptions_exp2_cost_to_drop_frequency_5039
  results: []
---


# exceptions_exp2_cost_to_drop_frequency_5039

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a perplexity estimate derived from the loss is sketched after the list):

- Loss: 3.5407
- Accuracy: 0.3720
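
For reference, if the reported loss is a mean token-level cross-entropy in nats (an assumption; the task and dataset are not documented in this card), the corresponding perplexity is roughly exp(3.5407) ≈ 34.5:

```python
import math

# Back-of-the-envelope conversion: only valid if the evaluation loss above
# is a mean cross-entropy in nats (not confirmed by this card).
eval_loss = 3.5407
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 34.5
```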

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch reflecting these values follows the list):

- learning_rate: 0.0006
- train_batch_size: 16
- eval_batch_size: 16
- seed: 5039
- gradient_accumulation_steps: 5
- total_train_batch_size: 80
- optimizer: adamw_torch_fused with betas=(0.9, 0.98), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 20.0
- mixed_precision_training: Native AMP
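
A minimal `TrainingArguments` sketch mirroring the values above, assuming the stock 🤗 `Trainer` API (consistent with the `generated_from_trainer` tag); the output directory and the commented-out model/dataset names are placeholders, not the actual training setup:

```python
from transformers import Trainer, TrainingArguments

# Sketch of the hyperparameters listed above; not the original training script.
training_args = TrainingArguments(
    output_dir="exceptions_exp2_cost_to_drop_frequency_5039",
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=5,   # 16 * 5 = total train batch size of 80
    seed=5039,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=20.0,
    fp16=True,                       # "Native AMP"; bf16 is also possible here
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_dataset, eval_dataset=eval_dataset)
# trainer.train()
```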

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 4.8271        | 0.2912  | 1000  | 4.7385          | 0.2568   |
| 4.3193        | 0.5824  | 2000  | 4.2775          | 0.3003   |
| 4.1342        | 0.8736  | 3000  | 4.0978          | 0.3156   |
| 3.9849        | 1.1645  | 4000  | 3.9868          | 0.3255   |
| 3.9394        | 1.4557  | 5000  | 3.9117          | 0.3322   |
| 3.8827        | 1.7469  | 6000  | 3.8548          | 0.3371   |
| 3.7475        | 2.0379  | 7000  | 3.8121          | 0.3419   |
| 3.7642        | 2.3290  | 8000  | 3.7793          | 0.3451   |
| 3.7393        | 2.6202  | 9000  | 3.7499          | 0.3475   |
| 3.711         | 2.9114  | 10000 | 3.7228          | 0.3500   |
| 3.6274        | 3.2024  | 11000 | 3.7098          | 0.3521   |
| 3.6332        | 3.4936  | 12000 | 3.6899          | 0.3539   |
| 3.6231        | 3.7848  | 13000 | 3.6698          | 0.3558   |
| 3.5326        | 4.0757  | 14000 | 3.6634          | 0.3571   |
| 3.55          | 4.3669  | 15000 | 3.6499          | 0.3585   |
| 3.5673        | 4.6581  | 16000 | 3.6389          | 0.3594   |
| 3.5487        | 4.9493  | 17000 | 3.6259          | 0.3609   |
| 3.4846        | 5.2402  | 18000 | 3.6268          | 0.3616   |
| 3.4964        | 5.5314  | 19000 | 3.6121          | 0.3623   |
| 3.5089        | 5.8226  | 20000 | 3.6031          | 0.3633   |
| 3.4157        | 6.1136  | 21000 | 3.6087          | 0.3640   |
| 3.4494        | 6.4048  | 22000 | 3.5989          | 0.3645   |
| 3.4584        | 6.6959  | 23000 | 3.5899          | 0.3652   |
| 3.4745        | 6.9871  | 24000 | 3.5784          | 0.3659   |
| 3.4072        | 7.2781  | 25000 | 3.5910          | 0.3660   |
| 3.4202        | 7.5693  | 26000 | 3.5785          | 0.3667   |
| 3.434         | 7.8605  | 27000 | 3.5679          | 0.3675   |
| 3.3574        | 8.1514  | 28000 | 3.5767          | 0.3677   |
| 3.3861        | 8.4426  | 29000 | 3.5711          | 0.3682   |
| 3.3897        | 8.7338  | 30000 | 3.5635          | 0.3687   |
| 3.2961        | 9.0248  | 31000 | 3.5644          | 0.3690   |
| 3.3332        | 9.3159  | 32000 | 3.5656          | 0.3693   |
| 3.3569        | 9.6071  | 33000 | 3.5570          | 0.3699   |
| 3.3651        | 9.8983  | 34000 | 3.5488          | 0.3702   |
| 3.2898        | 10.1893 | 35000 | 3.5583          | 0.3700   |
| 3.3241        | 10.4805 | 36000 | 3.5514          | 0.3707   |
| 3.3291        | 10.7716 | 37000 | 3.5466          | 0.3712   |
| 3.2459        | 11.0626 | 38000 | 3.5516          | 0.3711   |
| 3.2975        | 11.3538 | 39000 | 3.5510          | 0.3713   |
| 3.2958        | 11.6450 | 40000 | 3.5407          | 0.3720   |
| 3.3187        | 11.9362 | 41000 | 3.5353          | 0.3726   |
| 3.2515        | 12.2271 | 42000 | 3.5470          | 0.3723   |
| 3.2747        | 12.5183 | 43000 | 3.5423          | 0.3724   |
| 3.2961        | 12.8095 | 44000 | 3.5354          | 0.3730   |
| 3.2229        | 13.1005 | 45000 | 3.5441          | 0.3728   |
| 3.236         | 13.3916 | 46000 | 3.5381          | 0.3734   |
| 3.252         | 13.6828 | 47000 | 3.5329          | 0.3735   |
| 3.2605        | 13.9740 | 48000 | 3.5270          | 0.3742   |
| 3.2134        | 14.2650 | 49000 | 3.5373          | 0.3738   |
| 3.232         | 14.5562 | 50000 | 3.5328          | 0.3742   |
| 3.2211        | 14.8474 | 51000 | 3.5272          | 0.3746   |
| 3.167         | 15.1383 | 52000 | 3.5363          | 0.3744   |
| 3.1923        | 15.4295 | 53000 | 3.5330          | 0.3747   |
| 3.2039        | 15.7207 | 54000 | 3.5267          | 0.3751   |
| 3.1459        | 16.0116 | 55000 | 3.5299          | 0.3750   |
| 3.173         | 16.3028 | 56000 | 3.5322          | 0.3750   |
| 3.1659        | 16.5940 | 57000 | 3.5269          | 0.3755   |
| 3.1841        | 16.8852 | 58000 | 3.5209          | 0.3758   |
| 3.1359        | 17.1762 | 59000 | 3.5285          | 0.3756   |
| 3.155         | 17.4674 | 60000 | 3.5252          | 0.3758   |
| 3.1584        | 17.7585 | 61000 | 3.5219          | 0.3762   |
| 3.1147        | 18.0495 | 62000 | 3.5243          | 0.3762   |
| 3.1315        | 18.3407 | 63000 | 3.5243          | 0.3762   |
| 3.1443        | 18.6319 | 64000 | 3.5211          | 0.3765   |
| 3.1346        | 18.9231 | 65000 | 3.5185          | 0.3768   |
| 3.1028        | 19.2140 | 66000 | 3.5234          | 0.3766   |
| 3.1071        | 19.5052 | 67000 | 3.5202          | 0.3769   |
| 3.1135        | 19.7964 | 68000 | 3.5187          | 0.3771   |

### Framework versions

- Transformers 4.55.2
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
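
A quick way to check that a local environment matches these versions (the `+cu128` suffix indicates a CUDA 12.8 build of PyTorch):

```python
import transformers, torch, datasets, tokenizers

# Expected per this card: transformers 4.55.2, torch 2.8.0+cu128,
# datasets 4.0.0, tokenizers 0.21.4.
for name, module in [("transformers", transformers), ("torch", torch),
                     ("datasets", datasets), ("tokenizers", tokenizers)]:
    print(f"{name}: {module.__version__}")
```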