de_wiki_clm_42

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.0375
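
Since the loss is the only reported metric, one way to interpret it is as perplexity. Below is a minimal sketch, assuming the value is the mean per-token cross-entropy in nats (the Transformers default for causal language models):

```python
import math

# Perplexity is exp(cross-entropy) when the loss is measured in nats.
eval_loss = 4.0375
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 56.68
```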

Model description

More information needed

Intended uses & limitations

More information needed
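
The card does not document usage. The following is a hypothetical inference sketch using the standard Transformers causal-LM API; the repository id "de_wiki_clm_42" and the German prompt are assumptions based only on the model name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id inferred from the card title; adjust to the actual path.
model_id = "de_wiki_clm_42"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The name suggests a German-Wikipedia causal LM, so a German prompt is assumed.
inputs = tokenizer("Berlin ist", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```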

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of an equivalent configuration follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
  • mixed_precision_training: Native AMP
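
The card does not say how training was launched. Assuming the standard Hugging Face Transformers Trainer, the values above would map onto TrainingArguments roughly as follows; only the listed values are taken from the card, and output_dir is a placeholder:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="de_wiki_clm_42",      # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,    # effective train batch size: 16 * 2 = 32
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
    fp16=True,                        # "Native AMP" mixed-precision training
)
```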

Training results

Training loss was apparently logged every 4,000 steps while evaluation ran every 2,000, so each training-loss value spans two rows and the first row predates the first log ("No log").

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 1.0796  | 2000   | 7.8013          |
| 7.9145        | 2.1592  | 4000   | 7.0872          |
| 7.9145        | 3.2389  | 6000   | 6.6353          |
| 6.6919        | 4.3185  | 8000   | 6.2753          |
| 6.6919        | 5.3981  | 10000  | 5.9640          |
| 6.0331        | 6.4777  | 12000  | 5.6939          |
| 6.0331        | 7.5574  | 14000  | 5.4531          |
| 5.5261        | 8.6370  | 16000  | 5.2534          |
| 5.5261        | 9.7166  | 18000  | 5.0814          |
| 5.1455        | 10.7962 | 20000  | 4.9420          |
| 5.1455        | 11.8758 | 22000  | 4.8196          |
| 4.8678        | 12.9555 | 24000  | 4.7144          |
| 4.8678        | 14.0351 | 26000  | 4.6195          |
| 4.6498        | 15.1147 | 28000  | 4.5415          |
| 4.6498        | 16.1943 | 30000  | 4.4792          |
| 4.4749        | 17.2740 | 32000  | 4.4163          |
| 4.4749        | 18.3536 | 34000  | 4.3686          |
| 4.3329        | 19.4332 | 36000  | 4.3249          |
| 4.3329        | 20.5128 | 38000  | 4.2838          |
| 4.2153        | 21.5924 | 40000  | 4.2503          |
| 4.2153        | 22.6721 | 42000  | 4.2175          |
| 4.105         | 23.7517 | 44000  | 4.1878          |
| 4.105         | 24.8313 | 46000  | 4.1638          |
| 4.0056        | 25.9109 | 48000  | 4.1418          |
| 4.0056        | 26.9906 | 50000  | 4.1273          |
| 3.9206        | 28.0702 | 52000  | 4.1156          |
| 3.9206        | 29.1498 | 54000  | 4.1051          |
| 3.8488        | 30.2294 | 56000  | 4.0962          |
| 3.8488        | 31.3090 | 58000  | 4.0877          |
| 3.7907        | 32.3887 | 60000  | 4.0822          |
| 3.7907        | 33.4683 | 62000  | 4.0746          |
| 3.7405        | 34.5479 | 64000  | 4.0698          |
| 3.7405        | 35.6275 | 66000  | 4.0621          |
| 3.6943        | 36.7072 | 68000  | 4.0587          |
| 3.6943        | 37.7868 | 70000  | 4.0554          |
| 3.6534        | 38.8664 | 72000  | 4.0520          |
| 3.6534        | 39.9460 | 74000  | 4.0502          |
| 3.6158        | 41.0256 | 76000  | 4.0489          |
| 3.6158        | 42.1053 | 78000  | 4.0538          |
| 3.5796        | 43.1849 | 80000  | 4.0492          |
| 3.5796        | 44.2645 | 82000  | 4.0464          |
| 3.5501        | 45.3441 | 84000  | 4.0478          |
| 3.5501        | 46.4238 | 86000  | 4.0443          |
| 3.5235        | 47.5034 | 88000  | 4.0431          |
| 3.5235        | 48.5830 | 90000  | 4.0419          |
| 3.4985        | 49.6626 | 92000  | 4.0406          |
| 3.4985        | 50.7422 | 94000  | 4.0393          |
| 3.4768        | 51.8219 | 96000  | 4.0391          |
| 3.4768        | 52.9015 | 98000  | 4.0383          |
| 3.458         | 53.9811 | 100000 | 4.0375          |
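
The validation loss plateaus from roughly step 76,000 onward. Converted to perplexity (again assuming the loss is in nats), a few checkpoints from the table above illustrate the trajectory:

```python
import math

# Selected (step, validation loss) pairs from the table, as perplexity.
checkpoints = {2_000: 7.8013, 50_000: 4.1273, 100_000: 4.0375}
for step, loss in checkpoints.items():
    print(f"step {step:>6}: perplexity ≈ {math.exp(loss):.1f}")
# step   2000: perplexity ≈ 2443.8
# step  50000: perplexity ≈ 62.0
# step 100000: perplexity ≈ 56.7
```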

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
Model size

  • 12.7M parameters (F32, stored as Safetensors)