---
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: exceptions_exp2_resemble_to_drop_frequency_40817
    results: []
---

exceptions_exp2_resemble_to_drop_frequency_40817

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a perplexity conversion follows the list):

  • Loss: 3.5599
  • Accuracy: 0.3693
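
For context, assuming the evaluation loss is a mean per-token cross-entropy in nats (the usual convention for Trainer-based language-model runs; the card does not state this), it corresponds to a perplexity of roughly 35.2:

```python
import math

eval_loss = 3.5599          # reported evaluation loss (assumed cross-entropy in nats)
print(math.exp(eval_loss))  # ≈ 35.2, i.e. a perplexity of about 35
```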

Model description

More information needed

Intended uses & limitations

More information needed
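
Because the base model, architecture, and intended task are not documented, any usage example is necessarily speculative. The sketch below assumes a causal language model hosted under the hypothetical repo id craa/exceptions_exp2_resemble_to_drop_frequency_40817; both assumptions should be verified against the actual repository files before use.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id: the account name and the causal-LM task type are
# assumptions, not stated anywhere on this card.
repo_id = "craa/exceptions_exp2_resemble_to_drop_frequency_40817"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Illustrative prompt only; the training data and domain are unknown.
inputs = tokenizer("The children resemble", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```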

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 40817
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW (torch fused, i.e. adamw_torch_fused) with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
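
Assuming a single training device (consistent with 16 × 5 = 80 total train batch size), these settings map onto transformers.TrainingArguments roughly as follows; the output directory and the fp16 flag (one way "Native AMP" could have been enabled) are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="exceptions_exp2_resemble_to_drop_frequency_40817",  # placeholder path
    learning_rate=6e-4,
    per_device_train_batch_size=16,  # 16 per device x 5 accumulation steps = 80 effective
    per_device_eval_batch_size=16,
    seed=40817,
    gradient_accumulation_steps=5,
    optim="adamw_torch_fused",       # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50.0,
    fp16=True,                       # "Native AMP"; could equally have been bf16
)
```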

Training results

Training Loss Epoch Step Accuracy Validation Loss
4.8224 0.2912 1000 0.2556 4.7425
4.3378 0.5824 2000 0.2993 4.2818
4.1566 0.8737 3000 0.3152 4.0977
4.0035 1.1648 4000 0.3248 3.9935
3.9277 1.4561 5000 0.3315 3.9178
3.8713 1.7473 6000 0.3367 3.8610
3.7496 2.0384 7000 0.3412 3.8191
3.7479 2.3297 8000 0.3442 3.7871
3.7319 2.6209 9000 0.3471 3.7568
3.7207 2.9121 10000 0.3496 3.7290
3.6445 3.2033 11000 0.3512 3.7183
3.638 3.4945 12000 0.3530 3.7005
3.644 3.7857 13000 0.3546 3.6842
3.5463 4.0769 14000 0.3558 3.6768
3.5757 4.3681 15000 0.3571 3.6637
3.5779 4.6593 16000 0.3580 3.6513
3.5792 4.9506 17000 0.3592 3.6368
3.5045 5.2417 18000 0.3599 3.6393
3.5331 5.5329 19000 0.3607 3.6282
3.5289 5.8242 20000 0.3619 3.6173
3.4422 6.1153 21000 0.3623 3.6198
3.4741 6.4065 22000 0.3628 3.6147
3.4904 6.6978 23000 0.3636 3.6040
3.501 6.9890 24000 0.3645 3.5937
3.4339 7.2802 25000 0.3641 3.6035
3.455 7.5714 26000 0.3646 3.5924
3.4594 7.8626 27000 0.3659 3.5831
3.3875 8.1538 28000 0.3657 3.5931
3.4211 8.4450 29000 0.3664 3.5864
3.4212 8.7362 30000 0.3668 3.5770
3.321 9.0274 31000 0.3670 3.5832
3.366 9.3186 32000 0.3673 3.5819
3.396 9.6098 33000 0.3679 3.5705
3.4047 9.9010 34000 0.3682 3.5623
3.338 10.1922 35000 0.3676 3.5795
3.3659 10.4834 36000 0.3682 3.5702
3.3898 10.7747 37000 0.3688 3.5623
3.2971 11.0658 38000 0.3683 3.5745
3.3319 11.3570 39000 0.3689 3.5685
3.3606 11.6483 40000 0.3693 3.5599
3.3765 11.9395 41000 0.3700 3.5517
3.3071 12.2306 42000 0.3694 3.5665
3.3382 12.5219 43000 0.3697 3.5611
3.3432 12.8131 44000 0.3703 3.5539
3.2774 13.1043 45000 0.3698 3.5656
3.3203 13.3955 46000 0.3701 3.5587
3.3321 13.6867 47000 0.3706 3.5533
3.3432 13.9779 48000 0.3712 3.5446
3.2841 14.2691 49000 0.3706 3.5570
3.3093 14.5603 50000 0.3711 3.5519
3.3143 14.8515 51000 0.3714 3.5468
3.2379 15.1427 52000 0.3710 3.5631
3.2886 15.4339 53000 0.3716 3.5532
3.2951 15.7251 54000 0.3716 3.5456
3.2131 16.0163 55000 0.3715 3.5552
3.2656 16.3075 56000 0.3716 3.5535
3.2846 16.5988 57000 0.3718 3.5487
3.2982 16.8900 58000 0.3723 3.5398
3.2207 17.1811 59000 0.3714 3.5598
3.2529 17.4724 60000 0.3722 3.5470
3.2698 17.7636 61000 0.3727 3.5408
3.1966 18.0547 62000 0.3723 3.5535
3.2358 18.3460 63000 0.3722 3.5529
3.2554 18.6372 64000 0.3726 3.5485
3.2751 18.9284 65000 0.3734 3.5337
3.216 19.2196 66000 0.3723 3.5552
3.237 19.5108 67000 0.3727 3.5466
3.262 19.8020 68000 0.3733 3.5380
3.1768 20.0932 69000 0.3728 3.5504
3.2179 20.3844 70000 0.3730 3.5505
3.2536 20.6756 71000 0.3734 3.5418
3.2631 20.9669 72000 0.3738 3.5371
3.1927 21.2580 73000 0.3729 3.5517
3.2222 21.5492 74000 0.3733 3.5468
3.2368 21.8405 75000 0.3737 3.5347
3.1727 22.1316 76000 0.3732 3.5522
3.212 22.4229 77000 0.3731 3.5493
3.2375 22.7141 78000 0.3741 3.5402
3.192 23.0052 79000 0.3732 3.5477
3.1802 23.2965 80000 0.3735 3.5489
3.1754 23.5877 81000 0.3733 3.5537
3.2145 23.8789 82000 0.3738 3.5460
3.1655 24.1704 83000 0.3733 3.5544
3.1919 24.4616 84000 0.3736 3.5478
3.2069 24.7528 85000 0.3742 3.5397
3.1352 25.0440 86000 0.3734 3.5540
3.1726 25.3352 87000 0.3740 3.5518
3.1976 25.6264 88000 0.3741 3.5416
3.208 25.9176 89000 0.3748 3.5372
3.1393 26.2088 90000 0.3736 3.5591
3.1859 26.5000 91000 0.3741 3.5467
3.1887 26.7913 92000 0.3746 3.5395
3.1086 27.0824 93000 0.3739 3.5559
3.1497 27.3736 94000 0.3741 3.5484
3.1811 27.6649 95000 0.3744 3.5423
3.1784 27.9561 96000 0.3752 3.5334
3.1246 28.2472 97000 0.3740 3.5516
3.1485 28.5385 98000 0.3745 3.5450
3.1696 28.8297 99000 0.3751 3.5352
3.1022 29.1209 100000 0.3740 3.5585
3.1415 29.4121 101000 0.3746 3.5489
3.1502 29.7033 102000 0.3748 3.5397
3.1789 29.9945 103000 0.3753 3.5358
3.1169 30.2857 104000 0.3745 3.5475
3.1414 30.5769 105000 0.3748 3.5454
3.1597 30.8681 106000 0.3753 3.5382
3.0998 31.1593 107000 0.3744 3.5522
3.1201 31.4505 108000 0.3746 3.5504
3.141 31.7417 109000 0.3751 3.5412
3.0687 32.0329 110000 0.3745 3.5534
3.1126 32.3241 111000 0.3745 3.5527
3.1269 32.6154 112000 0.3749 3.5461
3.1394 32.9066 113000 0.3754 3.5385
3.09 33.1977 114000 0.3746 3.5536
3.1099 33.4890 115000 0.3749 3.5493
3.1335 33.7802 116000 0.3753 3.5414

Framework versions

  • Transformers 4.55.2
  • PyTorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4