exceptions_exp2_resemble_to_carry_frequency_1032

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5563
  • Accuracy: 0.3698
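
The card does not name the training objective. If the reported loss is the usual mean token-level cross-entropy in nats from causal language modeling (an assumption, not confirmed by the card), it corresponds to a perplexity of roughly 35:

```python
import math

# Perplexity implied by the reported evaluation loss, assuming the
# loss is mean cross-entropy in nats (not stated in the card).
print(math.exp(3.5563))  # ≈ 35.04
```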

Model description

More information needed

Intended uses & limitations

More information needed
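
Since the card does not state the intended task, the sketch below is purely hypothetical: it assumes the model is a causal language model (consistent with the loss/accuracy metrics reported above), and `<user>/exceptions_exp2_resemble_to_carry_frequency_1032` is a placeholder repo id, not a confirmed hub path.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; replace with the actual hub path if published.
repo_id = "<user>/exceptions_exp2_resemble_to_carry_frequency_1032"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```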

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 1032
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
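
These values map directly onto `transformers.TrainingArguments`. The sketch below reconstructs the configuration; `output_dir` is a placeholder, and `fp16=True` is an assumption standing in for "Native AMP":

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="exceptions_exp2_resemble_to_carry_frequency_1032",  # placeholder
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=1032,
    gradient_accumulation_steps=5,  # 16 × 5 = 80 effective train batch size
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50.0,
    fp16=True,  # assumed; "Native AMP" could also have been bf16
)
```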

Training results

| Training Loss | Epoch   | Step   | Validation Loss | Accuracy |
|:-------------:|:-------:|:------:|:---------------:|:--------:|
| 4.8151        | 0.2914  | 1000   | 4.7505          | 0.2556   |
| 4.3375        | 0.5827  | 2000   | 4.2844          | 0.2997   |
| 4.1552        | 0.8741  | 3000   | 4.1024          | 0.3146   |
| 3.9891        | 1.1652  | 4000   | 3.9927          | 0.3249   |
| 3.9302        | 1.4566  | 5000   | 3.9203          | 0.3316   |
| 3.8696        | 1.7479  | 6000   | 3.8574          | 0.3371   |
| 3.7346        | 2.0390  | 7000   | 3.8154          | 0.3413   |
| 3.7567        | 2.3304  | 8000   | 3.7872          | 0.3442   |
| 3.7472        | 2.6218  | 9000   | 3.7543          | 0.3471   |
| 3.7276        | 2.9131  | 10000  | 3.7289          | 0.3498   |
| 3.6452        | 3.2042  | 11000  | 3.7182          | 0.3512   |
| 3.6426        | 3.4956  | 12000  | 3.6990          | 0.3532   |
| 3.6409        | 3.7870  | 13000  | 3.6796          | 0.3542   |
| 3.5359        | 4.0781  | 14000  | 3.6743          | 0.3559   |
| 3.5631        | 4.3694  | 15000  | 3.6616          | 0.3571   |
| 3.5839        | 4.6608  | 16000  | 3.6494          | 0.3585   |
| 3.5796        | 4.9522  | 17000  | 3.6362          | 0.3593   |
| 3.4993        | 5.2433  | 18000  | 3.6374          | 0.3601   |
| 3.5203        | 5.5346  | 19000  | 3.6254          | 0.3610   |
| 3.5312        | 5.8260  | 20000  | 3.6138          | 0.3618   |
| 3.4332        | 6.1171  | 21000  | 3.6176          | 0.3624   |
| 3.4679        | 6.4085  | 22000  | 3.6101          | 0.3630   |
| 3.4913        | 6.6998  | 23000  | 3.6001          | 0.3638   |
| 3.4932        | 6.9912  | 24000  | 3.5904          | 0.3647   |
| 3.4321        | 7.2823  | 25000  | 3.6017          | 0.3645   |
| 3.4607        | 7.5737  | 26000  | 3.5915          | 0.3652   |
| 3.4683        | 7.8650  | 27000  | 3.5823          | 0.3661   |
| 3.3852        | 8.1562  | 28000  | 3.5911          | 0.3657   |
| 3.4233        | 8.4475  | 29000  | 3.5825          | 0.3664   |
| 3.4189        | 8.7389  | 30000  | 3.5747          | 0.3671   |
| 3.3179        | 9.0300  | 31000  | 3.5819          | 0.3668   |
| 3.3698        | 9.3214  | 32000  | 3.5779          | 0.3673   |
| 3.3953        | 9.6127  | 33000  | 3.5695          | 0.3677   |
| 3.4189        | 9.9041  | 34000  | 3.5615          | 0.3685   |
| 3.3492        | 10.1952 | 35000  | 3.5734          | 0.3683   |
| 3.3692        | 10.4866 | 36000  | 3.5659          | 0.3682   |
| 3.3769        | 10.7779 | 37000  | 3.5614          | 0.3691   |
| 3.2909        | 11.0691 | 38000  | 3.5702          | 0.3689   |
| 3.3467        | 11.3604 | 39000  | 3.5621          | 0.3692   |
| 3.3439        | 11.6518 | 40000  | 3.5563          | 0.3698   |
| 3.379         | 11.9431 | 41000  | 3.5475          | 0.3703   |
| 3.298         | 12.2343 | 42000  | 3.5615          | 0.3698   |
| 3.3236        | 12.5256 | 43000  | 3.5566          | 0.3702   |
| 3.3447        | 12.8170 | 44000  | 3.5487          | 0.3710   |
| 3.2707        | 13.1081 | 45000  | 3.5598          | 0.3701   |
| 3.3002        | 13.3995 | 46000  | 3.5572          | 0.3709   |
| 3.3277        | 13.6908 | 47000  | 3.5512          | 0.3707   |
| 3.3321        | 13.9822 | 48000  | 3.5418          | 0.3717   |
| 3.2824        | 14.2733 | 49000  | 3.5538          | 0.3711   |
| 3.3063        | 14.5647 | 50000  | 3.5496          | 0.3714   |
| 3.3166        | 14.8560 | 51000  | 3.5428          | 0.3718   |
| 3.2526        | 15.1471 | 52000  | 3.5585          | 0.3710   |
| 3.2776        | 15.4385 | 53000  | 3.5513          | 0.3718   |
| 3.2994        | 15.7299 | 54000  | 3.5421          | 0.3723   |
| 3.2006        | 16.0210 | 55000  | 3.5537          | 0.3717   |
| 3.2597        | 16.3123 | 56000  | 3.5512          | 0.3719   |
| 3.2761        | 16.6037 | 57000  | 3.5433          | 0.3722   |
| 3.3034        | 16.8951 | 58000  | 3.5379          | 0.3726   |
| 3.2251        | 17.1862 | 59000  | 3.5503          | 0.3719   |
| 3.2615        | 17.4775 | 60000  | 3.5454          | 0.3725   |
| 3.2727        | 17.7689 | 61000  | 3.5391          | 0.3728   |
| 3.1903        | 18.0600 | 62000  | 3.5508          | 0.3724   |
| 3.2306        | 18.3514 | 63000  | 3.5500          | 0.3725   |
| 3.2571        | 18.6427 | 64000  | 3.5427          | 0.3729   |
| 3.2779        | 18.9341 | 65000  | 3.5332          | 0.3733   |
| 3.1944        | 19.2252 | 66000  | 3.5532          | 0.3728   |
| 3.2425        | 19.5166 | 67000  | 3.5411          | 0.3732   |
| 3.2552        | 19.8079 | 68000  | 3.5351          | 0.3736   |
| 3.179         | 20.0991 | 69000  | 3.5491          | 0.3730   |
| 3.2293        | 20.3904 | 70000  | 3.5452          | 0.3732   |
| 3.2432        | 20.6818 | 71000  | 3.5398          | 0.3734   |
| 3.2542        | 20.9731 | 72000  | 3.5285          | 0.3742   |
| 3.1957        | 21.2643 | 73000  | 3.5454          | 0.3733   |
| 3.2124        | 21.5556 | 74000  | 3.5387          | 0.3740   |
| 3.232         | 21.8470 | 75000  | 3.5344          | 0.3741   |
| 3.1604        | 22.1381 | 76000  | 3.5498          | 0.3738   |
| 3.2007        | 22.4295 | 77000  | 3.5454          | 0.3737   |
| 3.2344        | 22.7208 | 78000  | 3.5340          | 0.3742   |
| 3.1573        | 23.0119 | 79000  | 3.5502          | 0.3737   |
| 3.172         | 23.3033 | 80000  | 3.5457          | 0.3737   |
| 3.1989        | 23.5947 | 81000  | 3.5390          | 0.3743   |
| 3.2253        | 23.8860 | 82000  | 3.5343          | 0.3747   |
| 3.1577        | 24.1771 | 83000  | 3.5499          | 0.3739   |
| 3.1992        | 24.4685 | 84000  | 3.5441          | 0.3741   |
| 3.2114        | 24.7599 | 85000  | 3.5361          | 0.3744   |
| 3.136         | 25.0510 | 86000  | 3.5512          | 0.3738   |
| 3.1778        | 25.3423 | 87000  | 3.5465          | 0.3740   |
| 3.19          | 25.6337 | 88000  | 3.5394          | 0.3745   |
| 3.202         | 25.9251 | 89000  | 3.5281          | 0.3753   |
| 3.143         | 26.2162 | 90000  | 3.5491          | 0.3742   |
| 3.1724        | 26.5075 | 91000  | 3.5433          | 0.3747   |
| 3.191         | 26.7989 | 92000  | 3.5359          | 0.3749   |
| 3.1136        | 27.0900 | 93000  | 3.5501          | 0.3744   |
| 3.1409        | 27.3814 | 94000  | 3.5449          | 0.3746   |
| 3.1708        | 27.6727 | 95000  | 3.5332          | 0.3753   |
| 3.1923        | 27.9641 | 96000  | 3.5295          | 0.3754   |
| 3.1359        | 28.2552 | 97000  | 3.5481          | 0.3746   |
| 3.1533        | 28.5466 | 98000  | 3.5434          | 0.3750   |
| 3.1673        | 28.8379 | 99000  | 3.5337          | 0.3753   |
| 3.1066        | 29.1291 | 100000 | 3.5478          | 0.3748   |
| 3.1326        | 29.4204 | 101000 | 3.5462          | 0.3748   |
| 3.1638        | 29.7118 | 102000 | 3.5371          | 0.3754   |
| 3.1602        | 30.0029 | 103000 | 3.5441          | 0.3749   |
| 3.123         | 30.2943 | 104000 | 3.5486          | 0.3748   |
| 3.1469        | 30.5856 | 105000 | 3.5419          | 0.3754   |
| 3.1581        | 30.8770 | 106000 | 3.5348          | 0.3755   |
| 3.0881        | 31.1681 | 107000 | 3.5499          | 0.3748   |
| 3.1309        | 31.4595 | 108000 | 3.5441          | 0.3749   |
| 3.1413        | 31.7508 | 109000 | 3.5393          | 0.3755   |

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4