de_wiki_clm_30

This model is a fine-tuned version of an unspecified base model (the base checkpoint is not named in the original card) on an unknown dataset. It achieves the following result on the evaluation set (converted to perplexity in the sketch below):

  • Loss: 4.0348
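
For a causal language model, validation cross-entropy loss converts directly to perplexity via exp(loss). The figure below is derived from the reported loss; it is not stated in the original card.

```python
import math

# Perplexity derived from the reported validation cross-entropy loss
# (a computed figure, not reported in the original card).
eval_loss = 4.0348
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 56.5
```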

Model description

More information needed

Intended uses & limitations

More information needed
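
The card gives no usage example. Below is a minimal generation sketch, assuming the checkpoint is published under a Hub repo id like `username/de_wiki_clm_30` (hypothetical) and loads with the standard Auto classes; the German prompt is only an illustration based on the model name.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with the actual Hub path of this checkpoint.
repo_id = "username/de_wiki_clm_30"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# The model name suggests German-Wikipedia text; the prompt is only an example.
inputs = tokenizer("Berlin ist", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```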

Training and evaluation data

More information needed
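
The training corpus is undocumented. The model name (`de_wiki_clm_30`) hints at German Wikipedia, so the sketch below is a guess using the public `wikimedia/wikipedia` dump on the Hub, not a confirmed source; the dump date is illustrative.

```python
from datasets import load_dataset

# Assumption: German Wikipedia as the training corpus, inferred only from
# the model name "de_wiki_clm_30". The dump date below is illustrative.
dataset = load_dataset("wikimedia/wikipedia", "20231101.de", split="train")
print(dataset[0]["text"][:200])
```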

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 30
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
  • mixed_precision_training: Native AMP
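
A minimal sketch of how these values map onto `transformers.TrainingArguments`; `output_dir` is a placeholder, and the listed total batch size of 32 follows from 16 per device × 2 gradient-accumulation steps.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; "output_dir" is a placeholder.
args = TrainingArguments(
    output_dir="de_wiki_clm_30",
    learning_rate=1e-4,
    per_device_train_batch_size=16,   # effective batch = 16 * 2 accumulation = 32
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,
    seed=30,
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
    fp16=True,                        # "Native AMP" mixed precision
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```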

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 1.0796  | 2000   | 7.8191          |
| 7.928         | 2.1592  | 4000   | 7.0870          |
| 7.928         | 3.2389  | 6000   | 6.6422          |
| 6.6946        | 4.3185  | 8000   | 6.2840          |
| 6.6946        | 5.3981  | 10000  | 5.9706          |
| 6.037         | 6.4777  | 12000  | 5.6935          |
| 6.037         | 7.5574  | 14000  | 5.4614          |
| 5.5288        | 8.6370  | 16000  | 5.2527          |
| 5.5288        | 9.7166  | 18000  | 5.0790          |
| 5.1465        | 10.7962 | 20000  | 4.9348          |
| 5.1465        | 11.8758 | 22000  | 4.8114          |
| 4.8667        | 12.9555 | 24000  | 4.7085          |
| 4.8667        | 14.0351 | 26000  | 4.6242          |
| 4.6478        | 15.1147 | 28000  | 4.5389          |
| 4.6478        | 16.1943 | 30000  | 4.4701          |
| 4.4727        | 17.2740 | 32000  | 4.4099          |
| 4.4727        | 18.3536 | 34000  | 4.3633          |
| 4.3307        | 19.4332 | 36000  | 4.3184          |
| 4.3307        | 20.5128 | 38000  | 4.2779          |
| 4.2116        | 21.5924 | 40000  | 4.2453          |
| 4.2116        | 22.6721 | 42000  | 4.2135          |
| 4.1017        | 23.7517 | 44000  | 4.1839          |
| 4.1017        | 24.8313 | 46000  | 4.1570          |
| 4.0019        | 25.9109 | 48000  | 4.1387          |
| 4.0019        | 26.9906 | 50000  | 4.1239          |
| 3.9164        | 28.0702 | 52000  | 4.1119          |
| 3.9164        | 29.1498 | 54000  | 4.1000          |
| 3.8451        | 30.2294 | 56000  | 4.0912          |
| 3.8451        | 31.3090 | 58000  | 4.0843          |
| 3.7863        | 32.3887 | 60000  | 4.0820          |
| 3.7863        | 33.4683 | 62000  | 4.0735          |
| 3.7356        | 34.5479 | 64000  | 4.0649          |
| 3.7356        | 35.6275 | 66000  | 4.0574          |
| 3.6893        | 36.7072 | 68000  | 4.0564          |
| 3.6893        | 37.7868 | 70000  | 4.0526          |
| 3.6492        | 38.8664 | 72000  | 4.0485          |
| 3.6492        | 39.9460 | 74000  | 4.0457          |
| 3.6111        | 41.0256 | 76000  | 4.0483          |
| 3.6111        | 42.1053 | 78000  | 4.0443          |
| 3.5749        | 43.1849 | 80000  | 4.0452          |
| 3.5749        | 44.2645 | 82000  | 4.0453          |
| 3.5442        | 45.3441 | 84000  | 4.0435          |
| 3.5442        | 46.4238 | 86000  | 4.0421          |
| 3.5184        | 47.5034 | 88000  | 4.0403          |
| 3.5184        | 48.5830 | 90000  | 4.0411          |
| 3.4926        | 49.6626 | 92000  | 4.0383          |
| 3.4926        | 50.7422 | 94000  | 4.0385          |
| 3.4715        | 51.8219 | 96000  | 4.0355          |
| 3.4715        | 52.9015 | 98000  | 4.0359          |
| 3.4519        | 53.9811 | 100000 | 4.0348          |

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
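
A quick runtime check against the versions listed above, purely illustrative; on a CUDA 12.4 build, `torch.__version__` would read `2.5.1+cu124`.

```python
# Compare the installed library versions against those listed in the card.
import transformers, torch, datasets, tokenizers

for mod, want in [(transformers, "4.45.2"), (torch, "2.5.1"),
                  (datasets, "3.0.1"), (tokenizers, "0.20.1")]:
    print(mod.__name__, mod.__version__, "(card lists", want + ")")
```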

Model details

  • Format: Safetensors
  • Model size: 12.7M params
  • Tensor type: F32