gpt2_small_wiki_30M_32768_42

This model was trained from scratch on an unknown dataset. At the last logged step (50,000), it reaches a validation loss of 4.4251 and an accuracy of 0.3037 on the evaluation set.

Model description

More information needed


The training hyperparameters are not listed. Training and validation metrics were logged every 2,000 steps:

Training Loss	Epoch	Step	Validation Loss	Accuracy
7.2039	0.73	2000	7.1375	0.1361
6.309	1.45	4000	6.4816	0.1650
5.7803	2.18	6000	6.0273	0.1829
5.3511	2.91	8000	5.6998	0.1980
4.9952	3.64	10000	5.4465	0.2120
4.6769	4.36	12000	5.2268	0.2280
4.3962	5.09	14000	5.0234	0.2458
4.1668	5.82	16000	4.8830	0.2571
3.9872	6.55	18000	4.7580	0.2660
3.8436	7.27	20000	4.6953	0.2713
3.7679	8.0	22000	4.6170	0.2770
3.6628	8.73	24000	4.5636	0.2822
3.5621	9.45	26000	4.5358	0.2852
3.501	10.18	28000	4.5071	0.2874
3.4653	10.91	30000	4.4665	0.2915
3.3857	11.64	32000	4.4574	0.2929
3.2994	12.36	34000	4.4433	0.2949
3.305	13.09	36000	4.4334	0.2964
3.2671	13.82	38000	4.4206	0.2970
3.198	14.55	40000	4.4230	0.2990
3.1285	15.27	42000	4.4186	0.2996
3.1591	16.0	44000	4.3842	0.3027
3.0791	16.73	46000	4.3949	0.3032
2.9947	17.45	48000	4.4071	0.3033
2.9731	18.18	50000	4.4251	0.3037
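
In the log above, validation loss is lowest at step 44000 (4.3842) and rises over the last three evaluations while training loss keeps falling, which may indicate the onset of overfitting. The following minimal sketch picks the row with the lowest validation loss and converts losses to perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, which is typical for causal language models but is not confirmed by this card. The names `LOG`, `best_checkpoint`, and `perplexity` are illustrative only.

```python
import math

# Rows copied from the last six lines of the training log above:
# (training loss, epoch, step, validation loss, accuracy)
LOG = [
    (3.1980, 14.55, 40000, 4.4230, 0.2990),
    (3.1285, 15.27, 42000, 4.4186, 0.2996),
    (3.1591, 16.00, 44000, 4.3842, 0.3027),
    (3.0791, 16.73, 46000, 4.3949, 0.3032),
    (2.9947, 17.45, 48000, 4.4071, 0.3033),
    (2.9731, 18.18, 50000, 4.4251, 0.3037),
]


def best_checkpoint(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[3])


def perplexity(loss):
    """Convert a loss to perplexity, ASSUMING mean cross-entropy in nats."""
    return math.exp(loss)


best = best_checkpoint(LOG)
final = LOG[-1]
print(f"best step:  {best[2]}, val loss {best[3]:.4f}, ppl {perplexity(best[3]):.2f}")
print(f"final step: {final[2]}, val loss {final[3]:.4f}, ppl {perplexity(final[3]):.2f}")
```

Note that accuracy still increases slightly after step 44000, so the best checkpoint depends on which metric is used for selection.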
