gpt2_mini_wiki_10M_32768_76

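This model was trained from scratch on an unknown dataset. At the last logged step (48000) it reaches a validation loss of 5.0077 and an accuracy of 0.2545 on the evaluation set; the full training log is in the table below.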

Model description

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Accuracy
7.8547	2.29	2000	7.5703	0.1096
6.6377	4.57	4000	6.8304	0.1425
6.1444	6.86	6000	6.4234	0.1597
5.8048	9.14	8000	6.1487	0.1705
5.5206	11.43	10000	5.9251	0.1797
5.2636	13.71	12000	5.7249	0.1895
5.0262	16.0	14000	5.5745	0.1978
4.8069	18.29	16000	5.4446	0.2056
4.6106	20.57	18000	5.3286	0.2138
4.44	22.86	20000	5.2347	0.2200
4.2848	25.14	22000	5.1628	0.2274
4.1572	27.43	24000	5.1007	0.2334
4.0505	29.71	26000	5.0545	0.2387
3.9584	32.0	28000	5.0119	0.2423
3.8711	34.29	30000	4.9895	0.2449
3.7987	36.57	32000	4.9796	0.2472
3.7352	38.86	34000	4.9712	0.2487
3.6685	41.14	36000	4.9788	0.2494
3.6159	43.43	38000	4.9708	0.2503
3.5701	45.71	40000	4.9766	0.2520
3.5188	48.0	42000	4.9741	0.2528
3.4595	50.29	44000	4.9929	0.2530
3.4161	52.57	46000	5.0111	0.2529
3.3782	54.86	48000	5.0077	0.2545
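
The log above shows validation loss bottoming out near 4.97 and then rising while training loss keeps falling. The sketch below locates the lowest-validation-loss row and converts that loss to a perplexity. It is only an illustration: the values are copied from the table, the names `ROWS`, `best_checkpoint`, and `perplexity` are hypothetical, and the perplexity conversion assumes the reported loss is a mean token-level cross-entropy in nats, which this card does not state.

```python
import math

# (step, training loss, validation loss), copied from the table above.
ROWS = [
    (2000, 7.8547, 7.5703), (4000, 6.6377, 6.8304), (6000, 6.1444, 6.4234),
    (8000, 5.8048, 6.1487), (10000, 5.5206, 5.9251), (12000, 5.2636, 5.7249),
    (14000, 5.0262, 5.5745), (16000, 4.8069, 5.4446), (18000, 4.6106, 5.3286),
    (20000, 4.44, 5.2347), (22000, 4.2848, 5.1628), (24000, 4.1572, 5.1007),
    (26000, 4.0505, 5.0545), (28000, 3.9584, 5.0119), (30000, 3.8711, 4.9895),
    (32000, 3.7987, 4.9796), (34000, 3.7352, 4.9712), (36000, 3.6685, 4.9788),
    (38000, 3.6159, 4.9708), (40000, 3.5701, 4.9766), (42000, 3.5188, 4.9741),
    (44000, 3.4595, 4.9929), (46000, 3.4161, 5.0111), (48000, 3.3782, 5.0077),
]


def best_checkpoint(rows):
    """Return the (step, train_loss, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def perplexity(loss):
    """Convert a loss to perplexity, ASSUMING it is mean cross-entropy in nats."""
    return math.exp(loss)


step, train_loss, val_loss = best_checkpoint(ROWS)
print(f"lowest validation loss {val_loss} at step {step}")
print(f"gap to training loss at that step: {val_loss - train_loss:.4f}")
print(f"perplexity under the stated assumption: {perplexity(val_loss):.1f}")
```

On these numbers the minimum falls at step 38000, so later rows may reflect overfitting rather than further improvement; whether the published weights correspond to that step or to the final one is not stated in this card.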
