exceptions_exp2_resemble_to_carry_frequency_1032

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5563
  • Accuracy: 0.3698
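
The card does not name the training objective. If the reported loss is the usual mean token-level cross-entropy in nats from causal language modeling (an assumption, not confirmed by the card), it corresponds to a perplexity of roughly 35:

```python
import math

# Perplexity implied by the reported evaluation loss, assuming the
# loss is mean cross-entropy in nats (not stated in the card).
print(math.exp(3.5563))  # ≈ 35.04
```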

Model description

More information needed

Intended uses & limitations

More information needed
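
Since the card does not state the intended task, the sketch below is purely hypothetical: it assumes the model is a causal language model (consistent with the loss/accuracy metrics reported above), and `<user>/exceptions_exp2_resemble_to_carry_frequency_1032` is a placeholder repo id, not a confirmed hub path.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; replace with the actual hub path if published.
repo_id = "<user>/exceptions_exp2_resemble_to_carry_frequency_1032"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```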

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 1032
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
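
These values map directly onto `transformers.TrainingArguments`. The sketch below reconstructs the configuration; `output_dir` is a placeholder, and `fp16=True` is an assumption standing in for "Native AMP":

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="exceptions_exp2_resemble_to_carry_frequency_1032",  # placeholder
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=1032,
    gradient_accumulation_steps=5,  # 16 × 5 = 80 effective train batch size
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50.0,
    fp16=True,  # assumed; "Native AMP" could also have been bf16
)
```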

Training results

| Training Loss | Epoch   | Step   | Validation Loss | Accuracy |
|:-------------:|:-------:|:------:|:---------------:|:--------:|
| 4.8151        | 0.2914  | 1000   | 4.7505          | 0.2556   |
| 4.3375        | 0.5827  | 2000   | 4.2844          | 0.2997   |
| 4.1552        | 0.8741  | 3000   | 4.1024          | 0.3146   |
| 3.9891        | 1.1652  | 4000   | 3.9927          | 0.3249   |
| 3.9302        | 1.4566  | 5000   | 3.9203          | 0.3316   |
| 3.8696        | 1.7479  | 6000   | 3.8574          | 0.3371   |
| 3.7346        | 2.0390  | 7000   | 3.8154          | 0.3413   |
| 3.7567        | 2.3304  | 8000   | 3.7872          | 0.3442   |
| 3.7472        | 2.6218  | 9000   | 3.7543          | 0.3471   |
| 3.7276        | 2.9131  | 10000  | 3.7289          | 0.3498   |
| 3.6452        | 3.2042  | 11000  | 3.7182          | 0.3512   |
| 3.6426        | 3.4956  | 12000  | 3.6990          | 0.3532   |
| 3.6409        | 3.7870  | 13000  | 3.6796          | 0.3542   |
| 3.5359        | 4.0781  | 14000  | 3.6743          | 0.3559   |
| 3.5631        | 4.3694  | 15000  | 3.6616          | 0.3571   |
| 3.5839        | 4.6608  | 16000  | 3.6494          | 0.3585   |
| 3.5796        | 4.9522  | 17000  | 3.6362          | 0.3593   |
| 3.4993        | 5.2433  | 18000  | 3.6374          | 0.3601   |
| 3.5203        | 5.5346  | 19000  | 3.6254          | 0.3610   |
| 3.5312        | 5.8260  | 20000  | 3.6138          | 0.3618   |
| 3.4332        | 6.1171  | 21000  | 3.6176          | 0.3624   |
| 3.4679        | 6.4085  | 22000  | 3.6101          | 0.3630   |
| 3.4913        | 6.6998  | 23000  | 3.6001          | 0.3638   |
| 3.4932        | 6.9912  | 24000  | 3.5904          | 0.3647   |
| 3.4321        | 7.2823  | 25000  | 3.6017          | 0.3645   |
| 3.4607        | 7.5737  | 26000  | 3.5915          | 0.3652   |
| 3.4683        | 7.8650  | 27000  | 3.5823          | 0.3661   |
| 3.3852        | 8.1562  | 28000  | 3.5911          | 0.3657   |
| 3.4233        | 8.4475  | 29000  | 3.5825          | 0.3664   |
| 3.4189        | 8.7389  | 30000  | 3.5747          | 0.3671   |
| 3.3179        | 9.0300  | 31000  | 3.5819          | 0.3668   |
| 3.3698        | 9.3214  | 32000  | 3.5779          | 0.3673   |
| 3.3953        | 9.6127  | 33000  | 3.5695          | 0.3677   |
| 3.4189        | 9.9041  | 34000  | 3.5615          | 0.3685   |
| 3.3492        | 10.1952 | 35000  | 3.5734          | 0.3683   |
| 3.3692        | 10.4866 | 36000  | 3.5659          | 0.3682   |
| 3.3769        | 10.7779 | 37000  | 3.5614          | 0.3691   |
| 3.2909        | 11.0691 | 38000  | 3.5702          | 0.3689   |
| 3.3467        | 11.3604 | 39000  | 3.5621          | 0.3692   |
| 3.3439        | 11.6518 | 40000  | 3.5563          | 0.3698   |
| 3.379         | 11.9431 | 41000  | 3.5475          | 0.3703   |
| 3.298         | 12.2343 | 42000  | 3.5615          | 0.3698   |
| 3.3236        | 12.5256 | 43000  | 3.5566          | 0.3702   |
| 3.3447        | 12.8170 | 44000  | 3.5487          | 0.3710   |
| 3.2707        | 13.1081 | 45000  | 3.5598          | 0.3701   |
| 3.3002        | 13.3995 | 46000  | 3.5572          | 0.3709   |
| 3.3277        | 13.6908 | 47000  | 3.5512          | 0.3707   |
| 3.3321        | 13.9822 | 48000  | 3.5418          | 0.3717   |
| 3.2824        | 14.2733 | 49000  | 3.5538          | 0.3711   |
| 3.3063        | 14.5647 | 50000  | 3.5496          | 0.3714   |
| 3.3166        | 14.8560 | 51000  | 3.5428          | 0.3718   |
| 3.2526        | 15.1471 | 52000  | 3.5585          | 0.3710   |
| 3.2776        | 15.4385 | 53000  | 3.5513          | 0.3718   |
| 3.2994        | 15.7299 | 54000  | 3.5421          | 0.3723   |
| 3.2006        | 16.0210 | 55000  | 3.5537          | 0.3717   |
| 3.2597        | 16.3123 | 56000  | 3.5512          | 0.3719   |
| 3.2761        | 16.6037 | 57000  | 3.5433          | 0.3722   |
| 3.3034        | 16.8951 | 58000  | 3.5379          | 0.3726   |
| 3.2251        | 17.1862 | 59000  | 3.5503          | 0.3719   |
| 3.2615        | 17.4775 | 60000  | 3.5454          | 0.3725   |
| 3.2727        | 17.7689 | 61000  | 3.5391          | 0.3728   |
| 3.1903        | 18.0600 | 62000  | 3.5508          | 0.3724   |
| 3.2306        | 18.3514 | 63000  | 3.5500          | 0.3725   |
| 3.2571        | 18.6427 | 64000  | 3.5427          | 0.3729   |
| 3.2779        | 18.9341 | 65000  | 3.5332          | 0.3733   |
| 3.1944        | 19.2252 | 66000  | 3.5532          | 0.3728   |
| 3.2425        | 19.5166 | 67000  | 3.5411          | 0.3732   |
| 3.2552        | 19.8079 | 68000  | 3.5351          | 0.3736   |
| 3.179         | 20.0991 | 69000  | 3.5491          | 0.3730   |
| 3.2293        | 20.3904 | 70000  | 3.5452          | 0.3732   |
| 3.2432        | 20.6818 | 71000  | 3.5398          | 0.3734   |
| 3.2542        | 20.9731 | 72000  | 3.5285          | 0.3742   |
| 3.1957        | 21.2643 | 73000  | 3.5454          | 0.3733   |
| 3.2124        | 21.5556 | 74000  | 3.5387          | 0.3740   |
| 3.232         | 21.8470 | 75000  | 3.5344          | 0.3741   |
| 3.1604        | 22.1381 | 76000  | 3.5498          | 0.3738   |
| 3.2007        | 22.4295 | 77000  | 3.5454          | 0.3737   |
| 3.2344        | 22.7208 | 78000  | 3.5340          | 0.3742   |
| 3.1573        | 23.0119 | 79000  | 3.5502          | 0.3737   |
| 3.172         | 23.3033 | 80000  | 3.5457          | 0.3737   |
| 3.1989        | 23.5947 | 81000  | 3.5390          | 0.3743   |
| 3.2253        | 23.8860 | 82000  | 3.5343          | 0.3747   |
| 3.1577        | 24.1771 | 83000  | 3.5499          | 0.3739   |
| 3.1992        | 24.4685 | 84000  | 3.5441          | 0.3741   |
| 3.2114        | 24.7599 | 85000  | 3.5361          | 0.3744   |
| 3.136         | 25.0510 | 86000  | 3.5512          | 0.3738   |
| 3.1778        | 25.3423 | 87000  | 3.5465          | 0.3740   |
| 3.19          | 25.6337 | 88000  | 3.5394          | 0.3745   |
| 3.202         | 25.9251 | 89000  | 3.5281          | 0.3753   |
| 3.143         | 26.2162 | 90000  | 3.5491          | 0.3742   |
| 3.1724        | 26.5075 | 91000  | 3.5433          | 0.3747   |
| 3.191         | 26.7989 | 92000  | 3.5359          | 0.3749   |
| 3.1136        | 27.0900 | 93000  | 3.5501          | 0.3744   |
| 3.1409        | 27.3814 | 94000  | 3.5449          | 0.3746   |
| 3.1708        | 27.6727 | 95000  | 3.5332          | 0.3753   |
| 3.1923        | 27.9641 | 96000  | 3.5295          | 0.3754   |
| 3.1359        | 28.2552 | 97000  | 3.5481          | 0.3746   |
| 3.1533        | 28.5466 | 98000  | 3.5434          | 0.3750   |
| 3.1673        | 28.8379 | 99000  | 3.5337          | 0.3753   |
| 3.1066        | 29.1291 | 100000 | 3.5478          | 0.3748   |
| 3.1326        | 29.4204 | 101000 | 3.5462          | 0.3748   |
| 3.1638        | 29.7118 | 102000 | 3.5371          | 0.3754   |
| 3.1602        | 30.0029 | 103000 | 3.5441          | 0.3749   |
| 3.123         | 30.2943 | 104000 | 3.5486          | 0.3748   |
| 3.1469        | 30.5856 | 105000 | 3.5419          | 0.3754   |
| 3.1581        | 30.8770 | 106000 | 3.5348          | 0.3755   |
| 3.0881        | 31.1681 | 107000 | 3.5499          | 0.3748   |
| 3.1309        | 31.4595 | 108000 | 3.5441          | 0.3749   |
| 3.1413        | 31.7508 | 109000 | 3.5393          | 0.3755   |

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4