---
library_name: transformers
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: exceptions_exp2_cost_to_drop_frequency_5039
  results: []
---


# exceptions_exp2_cost_to_drop_frequency_5039

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a perplexity estimate derived from the loss is sketched after the list):

- Loss: 3.5407
- Accuracy: 0.3720
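
For reference, if the reported loss is a mean token-level cross-entropy in nats (an assumption; the task and dataset are not documented in this card), the corresponding perplexity is roughly exp(3.5407) ≈ 34.5:

```python
import math

# Back-of-the-envelope conversion: only valid if the evaluation loss above
# is a mean cross-entropy in nats (not confirmed by this card).
eval_loss = 3.5407
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 34.5
```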

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch reflecting these values follows the list):

- learning_rate: 0.0006
- train_batch_size: 16
- eval_batch_size: 16
- seed: 5039
- gradient_accumulation_steps: 5
- total_train_batch_size: 80
- optimizer: adamw_torch_fused with betas=(0.9, 0.98), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 20.0
- mixed_precision_training: Native AMP
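
A minimal `TrainingArguments` sketch mirroring the values above, assuming the stock 🤗 `Trainer` API (consistent with the `generated_from_trainer` tag); the output directory and the commented-out model/dataset names are placeholders, not the actual training setup:

```python
from transformers import Trainer, TrainingArguments

# Sketch of the hyperparameters listed above; not the original training script.
training_args = TrainingArguments(
    output_dir="exceptions_exp2_cost_to_drop_frequency_5039",
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=5,   # 16 * 5 = total train batch size of 80
    seed=5039,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=20.0,
    fp16=True,                       # "Native AMP"; bf16 is also possible here
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_dataset, eval_dataset=eval_dataset)
# trainer.train()
```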

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 4.8271        | 0.2912  | 1000  | 4.7385          | 0.2568   |
| 4.3193        | 0.5824  | 2000  | 4.2775          | 0.3003   |
| 4.1342        | 0.8736  | 3000  | 4.0978          | 0.3156   |
| 3.9849        | 1.1645  | 4000  | 3.9868          | 0.3255   |
| 3.9394        | 1.4557  | 5000  | 3.9117          | 0.3322   |
| 3.8827        | 1.7469  | 6000  | 3.8548          | 0.3371   |
| 3.7475        | 2.0379  | 7000  | 3.8121          | 0.3419   |
| 3.7642        | 2.3290  | 8000  | 3.7793          | 0.3451   |
| 3.7393        | 2.6202  | 9000  | 3.7499          | 0.3475   |
| 3.711         | 2.9114  | 10000 | 3.7228          | 0.3500   |
| 3.6274        | 3.2024  | 11000 | 3.7098          | 0.3521   |
| 3.6332        | 3.4936  | 12000 | 3.6899          | 0.3539   |
| 3.6231        | 3.7848  | 13000 | 3.6698          | 0.3558   |
| 3.5326        | 4.0757  | 14000 | 3.6634          | 0.3571   |
| 3.55          | 4.3669  | 15000 | 3.6499          | 0.3585   |
| 3.5673        | 4.6581  | 16000 | 3.6389          | 0.3594   |
| 3.5487        | 4.9493  | 17000 | 3.6259          | 0.3609   |
| 3.4846        | 5.2402  | 18000 | 3.6268          | 0.3616   |
| 3.4964        | 5.5314  | 19000 | 3.6121          | 0.3623   |
| 3.5089        | 5.8226  | 20000 | 3.6031          | 0.3633   |
| 3.4157        | 6.1136  | 21000 | 3.6087          | 0.3640   |
| 3.4494        | 6.4048  | 22000 | 3.5989          | 0.3645   |
| 3.4584        | 6.6959  | 23000 | 3.5899          | 0.3652   |
| 3.4745        | 6.9871  | 24000 | 3.5784          | 0.3659   |
| 3.4072        | 7.2781  | 25000 | 3.5910          | 0.3660   |
| 3.4202        | 7.5693  | 26000 | 3.5785          | 0.3667   |
| 3.434         | 7.8605  | 27000 | 3.5679          | 0.3675   |
| 3.3574        | 8.1514  | 28000 | 3.5767          | 0.3677   |
| 3.3861        | 8.4426  | 29000 | 3.5711          | 0.3682   |
| 3.3897        | 8.7338  | 30000 | 3.5635          | 0.3687   |
| 3.2961        | 9.0248  | 31000 | 3.5644          | 0.3690   |
| 3.3332        | 9.3159  | 32000 | 3.5656          | 0.3693   |
| 3.3569        | 9.6071  | 33000 | 3.5570          | 0.3699   |
| 3.3651        | 9.8983  | 34000 | 3.5488          | 0.3702   |
| 3.2898        | 10.1893 | 35000 | 3.5583          | 0.3700   |
| 3.3241        | 10.4805 | 36000 | 3.5514          | 0.3707   |
| 3.3291        | 10.7716 | 37000 | 3.5466          | 0.3712   |
| 3.2459        | 11.0626 | 38000 | 3.5516          | 0.3711   |
| 3.2975        | 11.3538 | 39000 | 3.5510          | 0.3713   |
| 3.2958        | 11.6450 | 40000 | 3.5407          | 0.3720   |
| 3.3187        | 11.9362 | 41000 | 3.5353          | 0.3726   |
| 3.2515        | 12.2271 | 42000 | 3.5470          | 0.3723   |
| 3.2747        | 12.5183 | 43000 | 3.5423          | 0.3724   |
| 3.2961        | 12.8095 | 44000 | 3.5354          | 0.3730   |
| 3.2229        | 13.1005 | 45000 | 3.5441          | 0.3728   |
| 3.236         | 13.3916 | 46000 | 3.5381          | 0.3734   |
| 3.252         | 13.6828 | 47000 | 3.5329          | 0.3735   |
| 3.2605        | 13.9740 | 48000 | 3.5270          | 0.3742   |
| 3.2134        | 14.2650 | 49000 | 3.5373          | 0.3738   |
| 3.232         | 14.5562 | 50000 | 3.5328          | 0.3742   |
| 3.2211        | 14.8474 | 51000 | 3.5272          | 0.3746   |
| 3.167         | 15.1383 | 52000 | 3.5363          | 0.3744   |
| 3.1923        | 15.4295 | 53000 | 3.5330          | 0.3747   |
| 3.2039        | 15.7207 | 54000 | 3.5267          | 0.3751   |
| 3.1459        | 16.0116 | 55000 | 3.5299          | 0.3750   |
| 3.173         | 16.3028 | 56000 | 3.5322          | 0.3750   |
| 3.1659        | 16.5940 | 57000 | 3.5269          | 0.3755   |
| 3.1841        | 16.8852 | 58000 | 3.5209          | 0.3758   |
| 3.1359        | 17.1762 | 59000 | 3.5285          | 0.3756   |
| 3.155         | 17.4674 | 60000 | 3.5252          | 0.3758   |
| 3.1584        | 17.7585 | 61000 | 3.5219          | 0.3762   |
| 3.1147        | 18.0495 | 62000 | 3.5243          | 0.3762   |
| 3.1315        | 18.3407 | 63000 | 3.5243          | 0.3762   |
| 3.1443        | 18.6319 | 64000 | 3.5211          | 0.3765   |
| 3.1346        | 18.9231 | 65000 | 3.5185          | 0.3768   |
| 3.1028        | 19.2140 | 66000 | 3.5234          | 0.3766   |
| 3.1071        | 19.5052 | 67000 | 3.5202          | 0.3769   |
| 3.1135        | 19.7964 | 68000 | 3.5187          | 0.3771   |

### Framework versions

- Transformers 4.55.2
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
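
A quick way to check that a local environment matches these versions (the `+cu128` suffix indicates a CUDA 12.8 build of PyTorch):

```python
import transformers, torch, datasets, tokenizers

# Expected per this card: transformers 4.55.2, torch 2.8.0+cu128,
# datasets 4.0.0, tokenizers 0.21.4.
for name, module in [("transformers", transformers), ("torch", torch),
                     ("datasets", datasets), ("tokenizers", tokenizers)]:
    print(f"{name}: {module.__version__}")
```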