de_wiki_mlm_30

This model is a fine-tuned version of an unspecified base model (not recorded in the original card) on an unknown dataset. It achieves the following result on the evaluation set:

  • Loss: 3.0055
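
For a masked-language-modeling objective, the reported cross-entropy loss can be converted to a pseudo-perplexity via exp(loss). The card does not report perplexity itself; the value below is derived here for convenience:

```python
import math

# Final evaluation loss reported in the card above.
eval_loss = 3.0055

# Pseudo-perplexity implied by that loss (derived, not reported in the card).
perplexity = math.exp(eval_loss)  # roughly 20.2
```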

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 30
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
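
With `lr_scheduler_type: linear` and 40,000 warmup steps, the learning rate ramps from 0 to 1e-4 over the first 40,000 steps and then decays linearly to 0 at step 100,000. A minimal sketch of that schedule in plain Python, mirroring the hyperparameters above (the function name is illustrative, not from the training code):

```python
def lr_at_step(step, base_lr=1e-4, warmup_steps=40_000, total_steps=100_000):
    """Linear warmup followed by linear decay to zero.

    This mirrors the shape of the standard `linear` scheduler with
    warmup; defaults match the hyperparameters listed above.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

For example, the schedule peaks at 1e-4 exactly at step 40,000 and reaches 5e-5 halfway through the decay phase (step 70,000).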

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|---------------|---------|--------|-----------------|
| No log        | 1.0796  | 2000   | 8.1295          |
| 8.165         | 2.1592  | 4000   | 7.4511          |
| 8.165         | 3.2389  | 6000   | 7.3578          |
| 7.369         | 4.3185  | 8000   | 7.2654          |
| 7.369         | 5.3981  | 10000  | 7.1874          |
| 7.2089        | 6.4777  | 12000  | 7.1034          |
| 7.2089        | 7.5574  | 14000  | 7.0611          |
| 7.0701        | 8.6370  | 16000  | 6.9701          |
| 7.0701        | 9.7166  | 18000  | 6.9493          |
| 6.9567        | 10.7962 | 20000  | 6.8926          |
| 6.9567        | 11.8758 | 22000  | 6.8479          |
| 6.863         | 12.9555 | 24000  | 6.7844          |
| 6.863         | 14.0351 | 26000  | 6.7093          |
| 6.7236        | 15.1147 | 28000  | 6.5634          |
| 6.7236        | 16.1943 | 30000  | 6.4365          |
| 6.4847        | 17.2740 | 32000  | 6.1951          |
| 6.4847        | 18.3536 | 34000  | 5.8373          |
| 5.9256        | 19.4332 | 36000  | 5.2895          |
| 5.9256        | 20.5128 | 38000  | 4.9567          |
| 5.0992        | 21.5924 | 40000  | 4.6883          |
| 5.0992        | 22.6721 | 42000  | 4.4813          |
| 4.6231        | 23.7517 | 44000  | 4.3045          |
| 4.6231        | 24.8313 | 46000  | 4.1386          |
| 4.2683        | 25.9109 | 48000  | 4.0062          |
| 4.2683        | 26.9906 | 50000  | 3.8906          |
| 4.0096        | 28.0702 | 52000  | 3.7984          |
| 4.0096        | 29.1498 | 54000  | 3.7150          |
| 3.8199        | 30.2294 | 56000  | 3.6128          |
| 3.8199        | 31.3090 | 58000  | 3.5471          |
| 3.6694        | 32.3887 | 60000  | 3.4932          |
| 3.6694        | 33.4683 | 62000  | 3.4384          |
| 3.5482        | 34.5479 | 64000  | 3.4021          |
| 3.5482        | 35.6275 | 66000  | 3.3532          |
| 3.4458        | 36.7072 | 68000  | 3.3054          |
| 3.4458        | 37.7868 | 70000  | 3.2852          |
| 3.3631        | 38.8664 | 72000  | 3.2286          |
| 3.3631        | 39.9460 | 74000  | 3.1990          |
| 3.3013        | 41.0256 | 76000  | 3.1797          |
| 3.3013        | 42.1053 | 78000  | 3.1476          |
| 3.2413        | 43.1849 | 80000  | 3.1266          |
| 3.2413        | 44.2645 | 82000  | 3.1271          |
| 3.1916        | 45.3441 | 84000  | 3.0851          |
| 3.1916        | 46.4238 | 86000  | 3.0758          |
| 3.1512        | 47.5034 | 88000  | 3.0595          |
| 3.1512        | 48.5830 | 90000  | 3.0410          |
| 3.1197        | 49.6626 | 92000  | 3.0217          |
| 3.1197        | 50.7422 | 94000  | 3.0267          |
| 3.0925        | 51.8219 | 96000  | 3.0220          |
| 3.0925        | 52.9015 | 98000  | 3.0122          |
| 3.0794        | 53.9811 | 100000 | 3.0055          |

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1

Model size

  • 14.9M params
  • Tensor type: F32 (Safetensors)