tiny_bert_bc_rand_5_v1

This model was fine-tuned on the Hartunka/processed_book_corpus-rand-5 dataset; the base checkpoint is not specified. It achieves the following results on the evaluation set (a perplexity conversion is sketched after the list):

  • Loss: 3.1055
  • Accuracy: 0.6822
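
If the reported loss is the mean token-level cross-entropy (an assumption; the card does not state how the loss is defined), it corresponds to a perplexity of roughly exp(3.1055) ≈ 22.3:

```python
# Hedged conversion: perplexity from the reported evaluation loss,
# assuming it is an average negative log-likelihood per (masked) token.
import math

eval_loss = 3.1055
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 22.3
```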

Model description

More information needed

Intended uses & limitations

More information needed
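
As an illustration only: assuming the checkpoint is a BERT-style masked language model published on the Hugging Face Hub (the repo id below is an assumed placeholder, not confirmed by this card), it could be exercised with the fill-mask pipeline:

```python
# Hedged illustration: masked-token prediction with this checkpoint.
# "Hartunka/tiny_bert_bc_rand_5_v1" is an assumed repo id; replace with the real one.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="Hartunka/tiny_bert_bc_rand_5_v1")
print(fill_mask("The book was very [MASK]."))
```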

Training and evaluation data

More information needed
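
The only detail given elsewhere in the card is the dataset name, Hartunka/processed_book_corpus-rand-5. As a hedged sketch (split names and column layout are not stated here and should be checked against the dataset card), it can be inspected with datasets.load_dataset:

```python
# Hedged sketch: loading the dataset named in this card for inspection.
# Split names and columns are unknown here; printing the DatasetDict shows them.
from datasets import load_dataset

ds = load_dataset("Hartunka/processed_book_corpus-rand-5")
print(ds)  # available splits, columns, and row counts
```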

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 96
  • eval_batch_size: 96
  • seed: 10
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 25
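
The training script itself is not part of this card. As a hedged sketch only, the values above map onto transformers.TrainingArguments roughly as follows (output_dir and anything not listed above are assumptions):

```python
# Hedged sketch: TrainingArguments mirroring the hyperparameters listed above.
# output_dir is hypothetical; any setting not named in the card is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tiny_bert_bc_rand_5_v1",
    learning_rate=1e-4,
    per_device_train_batch_size=96,
    per_device_eval_batch_size=96,
    seed=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10_000,
    num_train_epochs=25,
)
```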

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:------:|:---------------:|:--------:|
| 7.2824 | 0.4215 | 10000 | 7.1207 | 0.1583 |
| 7.1705 | 0.8431 | 20000 | 6.9795 | 0.1726 |
| 4.6634 | 1.2646 | 30000 | 4.2812 | 0.5084 |
| 4.2844 | 1.6861 | 40000 | 3.9252 | 0.5556 |
| 4.0555 | 2.1077 | 50000 | 3.7210 | 0.5845 |
| 3.9068 | 2.5292 | 60000 | 3.5822 | 0.6047 |
| 3.8047 | 2.9507 | 70000 | 3.4940 | 0.6183 |
| 3.734 | 3.3723 | 80000 | 3.4309 | 0.6281 |
| 3.678 | 3.7938 | 90000 | 3.3727 | 0.6365 |
| 3.6347 | 4.2153 | 100000 | 3.3363 | 0.6426 |
| 3.6013 | 4.6369 | 110000 | 3.3058 | 0.6471 |
| 3.5724 | 5.0584 | 120000 | 3.2759 | 0.6515 |
| 3.5458 | 5.4799 | 130000 | 3.2562 | 0.6551 |
| 3.53 | 5.9014 | 140000 | 3.2334 | 0.6588 |
| 3.5054 | 6.3230 | 150000 | 3.2178 | 0.6610 |
| 3.4971 | 6.7445 | 160000 | 3.2043 | 0.6632 |
| 3.4724 | 7.1660 | 170000 | 3.1913 | 0.6659 |
| 3.4615 | 7.5876 | 180000 | 3.1805 | 0.6671 |
| 3.4516 | 8.0091 | 190000 | 3.1677 | 0.6690 |
| 3.4405 | 8.4306 | 200000 | 3.1585 | 0.6710 |
| 3.4284 | 8.8522 | 210000 | 3.1508 | 0.6722 |
| 3.4212 | 9.2737 | 220000 | 3.1426 | 0.6731 |
| 3.4078 | 9.6952 | 230000 | 3.1356 | 0.6746 |
| 3.3963 | 10.1168 | 240000 | 3.1310 | 0.6762 |
| 3.3982 | 10.5383 | 250000 | 3.1214 | 0.6771 |
| 3.3879 | 10.9598 | 260000 | 3.1157 | 0.6781 |
| 3.3778 | 11.3814 | 270000 | 3.1170 | 0.6780 |
| 3.3824 | 11.8029 | 280000 | 3.1097 | 0.6796 |
| 3.3619 | 12.2244 | 290000 | 3.1111 | 0.6801 |
| 3.3641 | 12.6460 | 300000 | 3.1050 | 0.6809 |
| 3.3471 | 13.0675 | 310000 | 3.1109 | 0.6812 |
| 3.3512 | 13.4890 | 320000 | 3.1056 | 0.6817 |
| 3.3523 | 13.9106 | 330000 | 3.1034 | 0.6820 |
| 3.3385 | 14.3321 | 340000 | 3.1033 | 0.6824 |
| 3.3393 | 14.7536 | 350000 | 3.1061 | 0.6828 |
| 3.3183 | 15.1751 | 360000 | 3.1164 | 0.6829 |
| 3.3261 | 15.5967 | 370000 | 3.1082 | 0.6833 |
| 3.2993 | 16.0182 | 380000 | 3.1230 | 0.6837 |
| 3.3079 | 16.4397 | 390000 | 3.1179 | 0.6836 |
| 3.3071 | 16.8613 | 400000 | 3.1099 | 0.6837 |
| 3.2867 | 17.2828 | 410000 | 3.1292 | 0.6843 |
| 3.2879 | 17.7043 | 420000 | 3.1274 | 0.6841 |
| 3.2591 | 18.1259 | 430000 | 3.1492 | 0.6841 |
| 3.265 | 18.5474 | 440000 | 3.1469 | 0.6839 |
| 3.2726 | 18.9689 | 450000 | 3.1432 | 0.6846 |
| 3.2429 | 19.3905 | 460000 | 3.1709 | 0.6847 |
| 3.2518 | 19.8120 | 470000 | 3.1598 | 0.6846 |
| 3.2136 | 20.2335 | 480000 | 3.1932 | 0.6846 |
| 3.2214 | 20.6551 | 490000 | 3.1829 | 0.6848 |
| 3.1855 | 21.0766 | 500000 | 3.1981 | 0.6848 |
| 3.1918 | 21.4981 | 510000 | 3.2118 | 0.6851 |
| 3.2026 | 21.9197 | 520000 | 3.1942 | 0.6851 |
| 3.1785 | 22.3412 | 530000 | 3.2237 | 0.6851 |
| 3.1744 | 22.7627 | 540000 | 3.2269 | 0.6853 |
| 3.1528 | 23.1843 | 550000 | 3.2419 | 0.6853 |
| 3.155 | 23.6058 | 560000 | 3.2405 | 0.6858 |
| 3.1321 | 24.0273 | 570000 | 3.2574 | 0.6859 |
| 3.1351 | 24.4488 | 580000 | 3.2498 | 0.6862 |
| 3.133 | 24.8704 | 590000 | 3.2519 | 0.6859 |

Framework versions

  • Transformers 4.40.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.19.1

Model size

  • 33.3M params (Safetensors, F32)

Evaluation results

  • Accuracy on Hartunka/processed_book_corpus-rand-5 (self-reported): 0.682