metadata
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: exceptions_exp2_last_to_push_frequency_40817
    results: []

exceptions_exp2_last_to_push_frequency_40817

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5566
  • Accuracy: 0.3699
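The evaluation loss can be easier to interpret as a perplexity. The sketch below assumes the reported loss is a mean token-level cross-entropy in natural-log units, which this card does not state; the function name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean cross-entropy (in nats) to perplexity.

    Assumption: the loss is an average negative log-likelihood per token.
    """
    return math.exp(mean_cross_entropy_nats)


# Evaluation loss reported above.
eval_loss = 3.5566
print(f"perplexity: {perplexity(eval_loss):.2f}")
```

If the loss were averaged differently (for example per sequence), this conversion would not apply.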

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 40817
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.98), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
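Two of the values above follow from the others and can be checked by hand. The sketch below recomputes the total train batch size and approximates a linear schedule with warmup. It assumes a single device, a decay to zero at the end of training, and a total of 171,700 optimizer steps (50 epochs at roughly 3,434 steps per epoch, estimated from the results table, where 1,000 steps correspond to about 0.2912 epochs). The function names and the step total are illustrative and not taken from the training script.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def linear_lr(step: int,
              peak_lr: float = 6e-4,
              warmup_steps: int = 100,
              total_steps: int = 171_700) -> float:
    """Linear warmup to peak_lr, then linear decay to zero.

    total_steps is an estimate (50 epochs x ~3,434 steps/epoch), not a
    value reported in this card.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(16, 5))   # train_batch_size x gradient_accumulation_steps
for s in (0, 50, 100, 40_000, 100_000):
    print(s, linear_lr(s))
```

Under these assumptions the learning rate reaches its peak of 0.0006 at step 100 and declines steadily afterwards.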

Training results

Training Loss Epoch Step Accuracy Validation Loss
4.8226 0.2912 1000 0.2548 4.7595
4.3377 0.5824 2000 0.2989 4.2932
4.1529 0.8737 3000 0.3155 4.0958
3.9874 1.1648 4000 0.3251 3.9857
3.918 1.4561 5000 0.3322 3.9150
3.8691 1.7473 6000 0.3375 3.8541
3.7395 2.0384 7000 0.3417 3.8098
3.75 2.3297 8000 0.3447 3.7835
3.7328 2.6209 9000 0.3474 3.7519
3.7218 2.9121 10000 0.3499 3.7271
3.6198 3.2033 11000 0.3519 3.7125
3.6521 3.4945 12000 0.3538 3.6945
3.6425 3.7857 13000 0.3554 3.6775
3.5318 4.0769 14000 0.3566 3.6691
3.5609 4.3681 15000 0.3576 3.6563
3.5807 4.6593 16000 0.3586 3.6454
3.5623 4.9506 17000 0.3599 3.6315
3.4969 5.2417 18000 0.3605 3.6372
3.515 5.5329 19000 0.3612 3.6257
3.5407 5.8242 20000 0.3623 3.6118
3.4379 6.1153 21000 0.3627 3.6162
3.4808 6.4065 22000 0.3631 3.6114
3.4801 6.6978 23000 0.3639 3.6001
3.5032 6.9890 24000 0.3650 3.5888
3.4223 7.2802 25000 0.3644 3.5988
3.4484 7.5714 26000 0.3656 3.5903
3.4644 7.8626 27000 0.3663 3.5784
3.3616 8.1538 28000 0.3662 3.5913
3.4078 8.4450 29000 0.3665 3.5818
3.4191 8.7362 30000 0.3672 3.5741
3.3218 9.0274 31000 0.3671 3.5782
3.3814 9.3186 32000 0.3672 3.5793
3.3876 9.6098 33000 0.3680 3.5690
3.4114 9.9010 34000 0.3689 3.5615
3.3468 10.1922 35000 0.3682 3.5728
3.3582 10.4834 36000 0.3688 3.5659
3.3759 10.7747 37000 0.3691 3.5594
3.2953 11.0658 38000 0.3690 3.5687
3.32 11.3570 39000 0.3697 3.5641
3.3552 11.6483 40000 0.3699 3.5566
3.3642 11.9395 41000 0.3705 3.5502
3.3008 12.2306 42000 0.3700 3.5611
3.3318 12.5219 43000 0.3704 3.5559
3.3508 12.8131 44000 0.3708 3.5485
3.2646 13.1043 45000 0.3705 3.5609
3.3065 13.3955 46000 0.3705 3.5541
3.333 13.6867 47000 0.3712 3.5471
3.3475 13.9779 48000 0.3716 3.5394
3.2779 14.2691 49000 0.3710 3.5555
3.2964 14.5603 50000 0.3715 3.5499
3.3015 14.8515 51000 0.3717 3.5408
3.2465 15.1427 52000 0.3710 3.5548
3.2841 15.4339 53000 0.3715 3.5491
3.2881 15.7251 54000 0.3720 3.5432
3.1978 16.0163 55000 0.3723 3.5493
3.259 16.3075 56000 0.3721 3.5498
3.2834 16.5988 57000 0.3722 3.5432
3.2864 16.8900 58000 0.3729 3.5378
3.2375 17.1811 59000 0.3720 3.5509
3.2556 17.4724 60000 0.3723 3.5462
3.2668 17.7636 61000 0.3731 3.5369
3.1791 18.0547 62000 0.3726 3.5498
3.2225 18.3460 63000 0.3729 3.5450
3.2548 18.6372 64000 0.3733 3.5395
3.2761 18.9284 65000 0.3737 3.5303
3.1968 19.2196 66000 0.3729 3.5495
3.2385 19.5108 67000 0.3734 3.5390
3.2498 19.8020 68000 0.3739 3.5317
3.1671 20.0932 69000 0.3730 3.5486
3.221 20.3844 70000 0.3734 3.5450
3.2279 20.6756 71000 0.3738 3.5397
3.2567 20.9669 72000 0.3744 3.5256
3.1976 21.2580 73000 0.3736 3.5462
3.2191 21.5492 74000 0.3737 3.5422
3.237 21.8405 75000 0.3742 3.5333
3.1506 22.1316 76000 0.3734 3.5470
3.2 22.4229 77000 0.3737 3.5437
3.2196 22.7141 78000 0.3744 3.5352
3.1812 23.0052 79000 0.3737 3.5444
3.1755 23.2965 80000 0.3743 3.5429
3.1678 23.5877 81000 0.3735 3.5462
3.1976 23.8789 82000 0.3741 3.5400
3.1531 24.1704 83000 0.3736 3.5527
3.1952 24.4616 84000 0.3741 3.5442
3.1897 24.7528 85000 0.3748 3.5352
3.1156 25.0440 86000 0.3742 3.5469
3.1616 25.3352 87000 0.3742 3.5476
3.1868 25.6264 88000 0.3746 3.5388
3.1926 25.9176 89000 0.3751 3.5338
3.14 26.2088 90000 0.3746 3.5470
3.1625 26.5000 91000 0.3748 3.5408
3.1847 26.7913 92000 0.3751 3.5317
3.1093 27.0824 93000 0.3745 3.5485
3.1523 27.3736 94000 0.3747 3.5423
3.1734 27.6649 95000 0.3748 3.5385
3.1821 27.9561 96000 0.3756 3.5296
3.1248 28.2472 97000 0.3746 3.5502
3.1562 28.5385 98000 0.3749 3.5430
3.1632 28.8297 99000 0.3753 3.5369
3.0985 29.1209 100000 0.3746 3.5492
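
To work with these results programmatically, the rows can be parsed into records and compared. This is a minimal sketch that assumes the column order given in the table header (Training Loss, Epoch, Step, Accuracy, Validation Loss) and uses three rows copied from the table as sample input; `EvalRow` and `parse_row` are hypothetical helper names.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    accuracy: float
    val_loss: float


def parse_row(line: str) -> EvalRow:
    """Parse one whitespace-separated row in the header's column order."""
    train_loss, epoch, step, accuracy, val_loss = line.split()
    return EvalRow(float(train_loss), float(epoch), int(step),
                   float(accuracy), float(val_loss))


# Three rows copied from the table above (steps 1000, 40000, 72000).
SAMPLE_ROWS = """\
4.8226 0.2912 1000 0.2548 4.7595
3.3552 11.6483 40000 0.3699 3.5566
3.2567 20.9669 72000 0.3744 3.5256
"""

rows = [parse_row(line) for line in SAMPLE_ROWS.splitlines() if line.strip()]

# Pick the sampled checkpoint with the lowest validation loss.
best = min(rows, key=lambda r: r.val_loss)
print(best.step, best.val_loss, best.accuracy)
```

Feeding in the full table instead of the sample makes it easy to see where validation loss stops improving even as training loss keeps falling.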

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4