metadata
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: exceptions_exp2_resemble_to_hit_frequency_1001
    results: []

exceptions_exp2_resemble_to_hit_frequency_1001

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5557
  • Accuracy: 0.3700
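The loss above can be converted to a perplexity for easier comparison with other language models. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats; the card does not state this, so treat the result as indicative only. The function name `perplexity` is a hypothetical helper, not part of any library.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the loss reported in this card is such a mean; the card
    itself does not confirm how the loss was computed.
    """
    return math.exp(mean_cross_entropy_nats)


# Evaluation loss reported in this card.
eval_loss = 3.5557
print(f"perplexity ~ {perplexity(eval_loss):.1f}")
```

Under that assumption, the evaluation loss of 3.5557 corresponds to a perplexity of roughly 35.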

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 1001
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
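The batch-size and schedule settings above can be checked with a little arithmetic. The following is a minimal sketch, not the training code: it assumes a single device (consistent with 16 × 5 = 80), and it assumes 3432 optimizer steps per epoch, a figure inferred from the epoch and step columns of the training results (step 1000 at epoch 0.2914), not stated in the card. The schedule formula mirrors the usual linear warmup and decay shape; `linear_lr` and the constant names are hypothetical.

```python
# Values copied from the hyperparameter list in this card.
PEAK_LR = 0.0006
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 5
WARMUP_STEPS = 100
NUM_EPOCHS = 50

# Assumptions (not stated in the card): one device, and 3432 optimizer
# steps per epoch as inferred from the epoch/step columns of the table.
NUM_DEVICES = 1
STEPS_PER_EPOCH = 3432
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

# Effective batch size per optimizer step.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


print(total_train_batch_size)
for s in (0, 50, 100, 40000, TOTAL_STEPS):
    print(s, f"{linear_lr(s):.6f}")
```

If the run was planned for a different total number of steps, only `TOTAL_STEPS` needs to change; the warmup portion is unaffected.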

Training results

Training Loss Epoch Step Accuracy Validation Loss
4.8439 0.2914 1000 0.2509 4.7752
4.329 0.5828 2000 0.3000 4.2779
4.1511 0.8741 3000 0.3167 4.0918
3.9927 1.1655 4000 0.3258 3.9809
3.9166 1.4569 5000 0.3327 3.9076
3.8693 1.7483 6000 0.3379 3.8500
3.7327 2.0396 7000 0.3422 3.8088
3.7487 2.3310 8000 0.3450 3.7771
3.7285 2.6224 9000 0.3478 3.7496
3.7212 2.9138 10000 0.3504 3.7192
3.6257 3.2051 11000 0.3525 3.7095
3.6324 3.4965 12000 0.3538 3.6917
3.6403 3.7879 13000 0.3554 3.6738
3.5471 4.0793 14000 0.3564 3.6670
3.5627 4.3706 15000 0.3577 3.6559
3.5772 4.6620 16000 0.3592 3.6416
3.5756 4.9534 17000 0.3605 3.6272
3.4948 5.2448 18000 0.3604 3.6304
3.5348 5.5361 19000 0.3613 3.6220
3.5111 5.8275 20000 0.3627 3.6120
3.4294 6.1189 21000 0.3626 3.6138
3.468 6.4103 22000 0.3633 3.6076
3.463 6.7016 23000 0.3641 3.5994
3.4931 6.9930 24000 0.3651 3.5879
3.4302 7.2844 25000 0.3649 3.5963
3.4563 7.5758 26000 0.3654 3.5880
3.4508 7.8671 27000 0.3665 3.5766
3.3689 8.1585 28000 0.3662 3.5876
3.409 8.4499 29000 0.3672 3.5788
3.4213 8.7413 30000 0.3675 3.5736
3.3257 9.0326 31000 0.3677 3.5775
3.3722 9.3240 32000 0.3677 3.5775
3.3982 9.6154 33000 0.3684 3.5674
3.4083 9.9068 34000 0.3689 3.5608
3.3294 10.1981 35000 0.3687 3.5735
3.3591 10.4895 36000 0.3689 3.5666
3.3894 10.7809 37000 0.3694 3.5561
3.2944 11.0723 38000 0.3693 3.5667
3.3381 11.3636 39000 0.3691 3.5636
3.3591 11.6550 40000 0.3700 3.5557
3.357 11.9464 41000 0.3706 3.5481
3.3056 12.2378 42000 0.3699 3.5649
3.3235 12.5291 43000 0.3705 3.5532
3.3355 12.8205 44000 0.3705 3.5489
3.2631 13.1119 45000 0.3701 3.5618
3.3058 13.4033 46000 0.3707 3.5577
3.3095 13.6946 47000 0.3713 3.5500
3.3335 13.9860 48000 0.3718 3.5389
3.2777 14.2774 49000 0.3712 3.5574
3.313 14.5688 50000 0.3716 3.5501
3.3186 14.8601 51000 0.3722 3.5429
3.2468 15.1515 52000 0.3714 3.5522
3.2767 15.4429 53000 0.3715 3.5495
3.2932 15.7343 54000 0.3725 3.5390
3.2057 16.0256 55000 0.3721 3.5518
3.2652 16.3170 56000 0.3720 3.5508
3.2886 16.6084 57000 0.3725 3.5458
3.2946 16.8998 58000 0.3731 3.5339
3.2236 17.1911 59000 0.3719 3.5545
3.268 17.4825 60000 0.3725 3.5442
3.2783 17.7739 61000 0.3730 3.5379
3.1921 18.0653 62000 0.3721 3.5566
3.2344 18.3566 63000 0.3727 3.5482
3.2715 18.6480 64000 0.3732 3.5396
3.2683 18.9394 65000 0.3736 3.5344
3.2061 19.2308 66000 0.3728 3.5469
3.2432 19.5221 67000 0.3731 3.5429
3.2545 19.8135 68000 0.3736 3.5355
3.1751 20.1049 69000 0.3732 3.5499
3.2055 20.3963 70000 0.3729 3.5491
3.2364 20.6876 71000 0.3736 3.5382
3.2603 20.9790 72000 0.3741 3.5327
3.1853 21.2704 73000 0.3733 3.5471
3.2229 21.5618 74000 0.3736 3.5428
3.2503 21.8531 75000 0.3743 3.5331
3.1733 22.1445 76000 0.3730 3.5514
3.1996 22.4359 77000 0.3737 3.5417
3.2241 22.7273 78000 0.3742 3.5335
3.1207 23.0186 79000 0.3738 3.5484
3.1867 23.3100 80000 0.3737 3.5505
3.1736 23.6014 81000 0.3738 3.5509
3.1804 23.8928 82000 0.3740 3.5459
3.166 24.1841 83000 0.3734 3.5539
3.193 24.4755 84000 0.3740 3.5468
3.2055 24.7669 85000 0.3745 3.5372
3.125 25.0583 86000 0.3741 3.5497
3.1831 25.3497 87000 0.3737 3.5500
3.1967 25.6410 88000 0.3745 3.5418
3.198 25.9324 89000 0.3752 3.5323
3.1476 26.2238 90000 0.3741 3.5518
3.1699 26.5152 91000 0.3745 3.5434
3.1932 26.8065 92000 0.3746 3.5361
3.129 27.0979 93000 0.3737 3.5533
3.1585 27.3893 94000 0.3741 3.5470
3.1692 27.6807 95000 0.3751 3.5368
3.2099 27.9720 96000 0.3755 3.5339
3.1312 28.2634 97000 0.3748 3.5470
3.1501 28.5548 98000 0.3748 3.5463
3.1644 28.8462 99000 0.3749 3.5356
3.1114 29.1375 100000 0.3742 3.5504
3.1421 29.4289 101000 0.3746 3.5466
3.1544 29.7203 102000 0.3751 3.5421
3.1002 30.0117 103000 0.3748 3.5492
3.1165 30.3030 104000 0.3747 3.5489
3.1392 30.5944 105000 0.3749 3.5429
3.1409 30.8858 106000 0.3754 3.5419
3.101 31.1772 107000 0.3743 3.5554
3.1215 31.4685 108000 0.3751 3.5474
3.1317 31.7599 109000 0.3754 3.5389
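A log like the one above is easier to inspect once parsed. The sketch below shows one way to pick the checkpoint with the lowest validation loss. It uses only four sample rows from the table, written in the header's column order (training loss, epoch, step, accuracy, validation loss), so its result applies to the sample, not the full log. `EvalRow`, `parse_rows`, and `best_by_validation_loss` are hypothetical names.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    accuracy: float
    validation_loss: float


# A few sample rows from the training results, in header column order.
SAMPLE_LOG = """
4.8439 0.2914 1000 0.2509 4.7752
3.3591 11.6550 40000 0.3700 3.5557
3.2603 20.9790 72000 0.3741 3.5327
3.2503 21.8531 75000 0.3743 3.5331
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows, skipping blank lines."""
    rows = []
    for line in text.strip().splitlines():
        tl, ep, st, acc, vl = line.split()
        rows.append(EvalRow(float(tl), float(ep), int(st), float(acc), float(vl)))
    return rows


def best_by_validation_loss(rows: List[EvalRow]) -> EvalRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.validation_loss)


rows = parse_rows(SAMPLE_LOG)
best = best_by_validation_loss(rows)
print(best.step, best.validation_loss, best.accuracy)
```

Pasting the full table into `SAMPLE_LOG` extends the same selection to every logged checkpoint.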

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4