eng_wiki_clm_13

This model is a fine-tuned version of an unspecified base model (not stated in this card), trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.2516
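
Interpreted as mean cross-entropy per token in nats (the usual Trainer reduction, which this card does not state explicitly), this loss corresponds to a perplexity of about exp(4.2516) ≈ 70.2. A minimal sketch of the conversion:

```python
import math

# Assumption: the reported eval loss is mean cross-entropy per token
# in nats, so perplexity = exp(loss).
eval_loss = 4.2516
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.1f}")  # ~70.2
```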

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 13
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
  • mixed_precision_training: Native AMP
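
As a hedged sketch only (the original training script is not part of this card), the values above map onto transformers.TrainingArguments roughly as follows; output_dir is a placeholder:

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters as TrainingArguments.
# output_dir is hypothetical; the actual training script is not shown here.
training_args = TrainingArguments(
    output_dir="eng_wiki_clm_13",   # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=13,
    gradient_accumulation_steps=2,  # 16 per device x 2 steps = 32 total
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
    fp16=True,                      # Native AMP mixed precision
)
```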

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 1.1319  | 2000   | 7.5861          |
| 7.672         | 2.2637  | 4000   | 6.6896          |
| 7.672         | 3.3956  | 6000   | 6.2889          |
| 6.3419        | 4.5274  | 8000   | 6.0171          |
| 6.3419        | 5.6593  | 10000  | 5.7843          |
| 5.8297        | 6.7912  | 12000  | 5.5648          |
| 5.8297        | 7.9230  | 14000  | 5.3653          |
| 5.4199        | 9.0549  | 16000  | 5.1962          |
| 5.4199        | 10.1868 | 18000  | 5.0526          |
| 5.0949        | 11.3186 | 20000  | 4.9287          |
| 5.0949        | 12.4505 | 22000  | 4.8257          |
| 4.8548        | 13.5823 | 24000  | 4.7442          |
| 4.8548        | 14.7142 | 26000  | 4.6666          |
| 4.6698        | 15.8461 | 28000  | 4.6017          |
| 4.6698        | 16.9779 | 30000  | 4.5493          |
| 4.5194        | 18.1098 | 32000  | 4.4992          |
| 4.5194        | 19.2417 | 34000  | 4.4609          |
| 4.3912        | 20.3735 | 36000  | 4.4266          |
| 4.3912        | 21.5054 | 38000  | 4.3924          |
| 4.2859        | 22.6372 | 40000  | 4.3671          |
| 4.2859        | 23.7691 | 42000  | 4.3373          |
| 4.1853        | 24.9010 | 44000  | 4.3159          |
| 4.1853        | 26.0328 | 46000  | 4.3016          |
| 4.0873        | 27.1647 | 48000  | 4.2892          |
| 4.0873        | 28.2965 | 50000  | 4.2784          |
| 4.0071        | 29.4284 | 52000  | 4.2694          |
| 4.0071        | 30.5603 | 54000  | 4.2630          |
| 3.9439        | 31.6921 | 56000  | 4.2547          |
| 3.9439        | 32.8240 | 58000  | 4.2468          |
| 3.8867        | 33.9559 | 60000  | 4.2447          |
| 3.8867        | 35.0877 | 62000  | 4.2471          |
| 3.83          | 36.2196 | 64000  | 4.2458          |
| 3.83          | 37.3514 | 66000  | 4.2433          |
| 3.787         | 38.4833 | 68000  | 4.2433          |
| 3.787         | 39.6152 | 70000  | 4.2433          |
| 3.7489        | 40.7470 | 72000  | 4.2429          |
| 3.7489        | 41.8789 | 74000  | 4.2410          |
| 3.7122        | 43.0108 | 76000  | 4.2438          |
| 3.7122        | 44.1426 | 78000  | 4.2496          |
| 3.6739        | 45.2745 | 80000  | 4.2494          |
| 3.6739        | 46.4063 | 82000  | 4.2477          |
| 3.6465        | 47.5382 | 84000  | 4.2488          |
| 3.6465        | 48.6701 | 86000  | 4.2504          |
| 3.6201        | 49.8019 | 88000  | 4.2490          |
| 3.6201        | 50.9338 | 90000  | 4.2500          |
| 3.5947        | 52.0656 | 92000  | 4.2516          |
| 3.5947        | 53.1975 | 94000  | 4.2530          |
| 3.5712        | 54.3294 | 96000  | 4.2526          |
| 3.5712        | 55.4612 | 98000  | 4.2529          |
| 3.5532        | 56.5931 | 100000 | 4.2516          |

Framework versions

  • Transformers 4.45.2
  • PyTorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
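
Assuming the model is published on the Hugging Face Hub under the id fpadovani/en_wiki_clm_13_new and loads through the standard transformers causal-LM API, a minimal generation sketch:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: this Hub id matches the model described by this card;
# adjust it if the canonical repository id differs.
repo_id = "fpadovani/en_wiki_clm_13_new"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "The English language"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```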

Model size

  • 12.7M parameters (F32, Safetensors)