---
library_name: transformers
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: exceptions_exp2_cost_to_drop_frequency_5039
  results: []
---
# exceptions_exp2_cost_to_drop_frequency_5039
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.5407
- Accuracy: 0.3720
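The card does not state what the loss measures. Assuming it is the mean token-level cross-entropy from a language-modeling objective (an assumption, not confirmed by the card), the corresponding perplexity is simply its exponential, roughly 34.5:

```python
import math

# The card reports an evaluation loss of 3.5407. Assuming this is mean
# token-level cross-entropy (typical for Trainer-based language-model
# fine-tuning), perplexity is the exponential of the loss.
eval_loss = 3.5407
print(f"perplexity = {math.exp(eval_loss):.1f}")  # ~34.5
```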
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0006
- train_batch_size: 16
- eval_batch_size: 16
- seed: 5039
- gradient_accumulation_steps: 5
- total_train_batch_size: 80
- optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 20.0
- mixed_precision_training: Native AMP
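For reference, the list above maps onto a `transformers.TrainingArguments` configuration roughly as sketched below. This is an assumption-based reconstruction rather than the card's own training script: `output_dir` is a placeholder, and a per-device batch size of 16 with `gradient_accumulation_steps=5` yields the total train batch size of 80 only on a single device.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments configuration matching the hyperparameters
# listed above. "output_dir" is a placeholder; 16 * 5 = 80 reproduces the
# total train batch size only when training on a single device.
training_args = TrainingArguments(
    output_dir="exceptions_exp2_cost_to_drop_frequency_5039",
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=5039,
    gradient_accumulation_steps=5,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=20.0,
    fp16=True,  # "Native AMP" mixed-precision training
)
```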
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 4.8271 | 0.2912 | 1000 | 4.7385 | 0.2568 |
| 4.3193 | 0.5824 | 2000 | 4.2775 | 0.3003 |
| 4.1342 | 0.8736 | 3000 | 4.0978 | 0.3156 |
| 3.9849 | 1.1645 | 4000 | 3.9868 | 0.3255 |
| 3.9394 | 1.4557 | 5000 | 3.9117 | 0.3322 |
| 3.8827 | 1.7469 | 6000 | 3.8548 | 0.3371 |
| 3.7475 | 2.0379 | 7000 | 3.8121 | 0.3419 |
| 3.7642 | 2.3290 | 8000 | 3.7793 | 0.3451 |
| 3.7393 | 2.6202 | 9000 | 3.7499 | 0.3475 |
| 3.711 | 2.9114 | 10000 | 3.7228 | 0.3500 |
| 3.6274 | 3.2024 | 11000 | 3.7098 | 0.3521 |
| 3.6332 | 3.4936 | 12000 | 3.6899 | 0.3539 |
| 3.6231 | 3.7848 | 13000 | 3.6698 | 0.3558 |
| 3.5326 | 4.0757 | 14000 | 3.6634 | 0.3571 |
| 3.55 | 4.3669 | 15000 | 3.6499 | 0.3585 |
| 3.5673 | 4.6581 | 16000 | 3.6389 | 0.3594 |
| 3.5487 | 4.9493 | 17000 | 3.6259 | 0.3609 |
| 3.4846 | 5.2402 | 18000 | 3.6268 | 0.3616 |
| 3.4964 | 5.5314 | 19000 | 3.6121 | 0.3623 |
| 3.5089 | 5.8226 | 20000 | 3.6031 | 0.3633 |
| 3.4157 | 6.1136 | 21000 | 3.6087 | 0.3640 |
| 3.4494 | 6.4048 | 22000 | 3.5989 | 0.3645 |
| 3.4584 | 6.6959 | 23000 | 3.5899 | 0.3652 |
| 3.4745 | 6.9871 | 24000 | 3.5784 | 0.3659 |
| 3.4072 | 7.2781 | 25000 | 3.5910 | 0.3660 |
| 3.4202 | 7.5693 | 26000 | 3.5785 | 0.3667 |
| 3.434 | 7.8605 | 27000 | 3.5679 | 0.3675 |
| 3.3574 | 8.1514 | 28000 | 3.5767 | 0.3677 |
| 3.3861 | 8.4426 | 29000 | 3.5711 | 0.3682 |
| 3.3897 | 8.7338 | 30000 | 3.5635 | 0.3687 |
| 3.2961 | 9.0248 | 31000 | 3.5644 | 0.3690 |
| 3.3332 | 9.3159 | 32000 | 3.5656 | 0.3693 |
| 3.3569 | 9.6071 | 33000 | 3.5570 | 0.3699 |
| 3.3651 | 9.8983 | 34000 | 3.5488 | 0.3702 |
| 3.2898 | 10.1893 | 35000 | 3.5583 | 0.3700 |
| 3.3241 | 10.4805 | 36000 | 3.5514 | 0.3707 |
| 3.3291 | 10.7716 | 37000 | 3.5466 | 0.3712 |
| 3.2459 | 11.0626 | 38000 | 3.5516 | 0.3711 |
| 3.2975 | 11.3538 | 39000 | 3.5510 | 0.3713 |
| 3.2958 | 11.6450 | 40000 | 3.5407 | 0.3720 |
| 3.3187 | 11.9362 | 41000 | 3.5353 | 0.3726 |
| 3.2515 | 12.2271 | 42000 | 3.5470 | 0.3723 |
| 3.2747 | 12.5183 | 43000 | 3.5423 | 0.3724 |
| 3.2961 | 12.8095 | 44000 | 3.5354 | 0.3730 |
| 3.2229 | 13.1005 | 45000 | 3.5441 | 0.3728 |
| 3.236 | 13.3916 | 46000 | 3.5381 | 0.3734 |
| 3.252 | 13.6828 | 47000 | 3.5329 | 0.3735 |
| 3.2605 | 13.9740 | 48000 | 3.5270 | 0.3742 |
| 3.2134 | 14.2650 | 49000 | 3.5373 | 0.3738 |
| 3.232 | 14.5562 | 50000 | 3.5328 | 0.3742 |
| 3.2211 | 14.8474 | 51000 | 3.5272 | 0.3746 |
| 3.167 | 15.1383 | 52000 | 3.5363 | 0.3744 |
| 3.1923 | 15.4295 | 53000 | 3.5330 | 0.3747 |
| 3.2039 | 15.7207 | 54000 | 3.5267 | 0.3751 |
| 3.1459 | 16.0116 | 55000 | 3.5299 | 0.3750 |
| 3.173 | 16.3028 | 56000 | 3.5322 | 0.3750 |
| 3.1659 | 16.5940 | 57000 | 3.5269 | 0.3755 |
| 3.1841 | 16.8852 | 58000 | 3.5209 | 0.3758 |
| 3.1359 | 17.1762 | 59000 | 3.5285 | 0.3756 |
| 3.155 | 17.4674 | 60000 | 3.5252 | 0.3758 |
| 3.1584 | 17.7585 | 61000 | 3.5219 | 0.3762 |
| 3.1147 | 18.0495 | 62000 | 3.5243 | 0.3762 |
| 3.1315 | 18.3407 | 63000 | 3.5243 | 0.3762 |
| 3.1443 | 18.6319 | 64000 | 3.5211 | 0.3765 |
| 3.1346 | 18.9231 | 65000 | 3.5185 | 0.3768 |
| 3.1028 | 19.2140 | 66000 | 3.5234 | 0.3766 |
| 3.1071 | 19.5052 | 67000 | 3.5202 | 0.3769 |
| 3.1135 | 19.7964 | 68000 | 3.5187 | 0.3771 |
### Framework versions
- Transformers 4.55.2
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
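A quick way to check that a local environment matches the versions listed above (a minimal sketch; the expected values are taken from this card):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported by this model card; a mismatch does not necessarily
# break loading, but matching them is the safest way to reproduce results.
expected = {
    "transformers": "4.55.2",
    "torch": "2.8.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}

installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}

for name, want in expected.items():
    print(f"{name}: installed {installed[name]}, card reports {want}")
```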