gpt2_mini_wiki_10M_32768_42

This model was trained from scratch on an unknown dataset. At the last logged evaluation (step 42000, epoch 48.0) it achieves the following results on the evaluation set:

Loss: 4.9730
Accuracy: 0.2523

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training hyperparameters

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
7.8668	2.29	2000	7.5727	0.1094
6.6433	4.57	4000	6.8287	0.1424
6.1455	6.86	6000	6.4265	0.1593
5.8047	9.14	8000	6.1424	0.1703
5.5178	11.43	10000	5.9197	0.1792
5.2628	13.71	12000	5.7265	0.1893
5.0215	16.0	14000	5.5729	0.1983
4.8039	18.29	16000	5.4367	0.2067
4.6083	20.57	18000	5.3292	0.2133
4.4403	22.86	20000	5.2371	0.2207
4.2847	25.14	22000	5.1595	0.2275
4.1617	27.43	24000	5.1019	0.2331
4.0524	29.71	26000	5.0581	0.2380
3.9608	32.0	28000	5.0102	0.2415
3.8727	34.29	30000	5.0070	0.2440
3.7981	36.57	32000	4.9780	0.2466
3.7366	38.86	34000	4.9629	0.2491
3.6679	41.14	36000	4.9628	0.2499
3.6185	43.43	38000	4.9748	0.2506
3.5711	45.71	40000	4.9740	0.2516
3.5205	48.0	42000	4.9730	0.2523
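
The table can be summarized with a few lines of Python. The sketch below is not part of the original training code: it copies the rows above into a list and derives three quantities from them. It assumes the reported losses are mean per-token cross-entropy in nats, so that perplexity is `exp(loss)`; the card itself does not state this. The names `ROWS`, `perplexity`, `best_row`, and `steps_per_epoch` are illustrative only.

```python
import math

# Rows copied from the table above:
# (training loss, epoch, step, validation loss, accuracy)
ROWS = [
    (7.8668, 2.29, 2000, 7.5727, 0.1094),
    (6.6433, 4.57, 4000, 6.8287, 0.1424),
    (6.1455, 6.86, 6000, 6.4265, 0.1593),
    (5.8047, 9.14, 8000, 6.1424, 0.1703),
    (5.5178, 11.43, 10000, 5.9197, 0.1792),
    (5.2628, 13.71, 12000, 5.7265, 0.1893),
    (5.0215, 16.0, 14000, 5.5729, 0.1983),
    (4.8039, 18.29, 16000, 5.4367, 0.2067),
    (4.6083, 20.57, 18000, 5.3292, 0.2133),
    (4.4403, 22.86, 20000, 5.2371, 0.2207),
    (4.2847, 25.14, 22000, 5.1595, 0.2275),
    (4.1617, 27.43, 24000, 5.1019, 0.2331),
    (4.0524, 29.71, 26000, 5.0581, 0.2380),
    (3.9608, 32.0, 28000, 5.0102, 0.2415),
    (3.8727, 34.29, 30000, 5.0070, 0.2440),
    (3.7981, 36.57, 32000, 4.9780, 0.2466),
    (3.7366, 38.86, 34000, 4.9629, 0.2491),
    (3.6679, 41.14, 36000, 4.9628, 0.2499),
    (3.6185, 43.43, 38000, 4.9748, 0.2506),
    (3.5711, 45.71, 40000, 4.9740, 0.2516),
    (3.5205, 48.0, 42000, 4.9730, 0.2523),
]


def perplexity(loss):
    """Convert a cross-entropy loss to perplexity (assumes the loss is in nats)."""
    return math.exp(loss)


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _, _ = rows[-1]
    return step / epoch


if __name__ == "__main__":
    best = best_row(ROWS)
    final = ROWS[-1]
    print("best step:", best[2], "validation loss:", best[3])
    print("perplexity at best step:", round(perplexity(best[3]), 1))
    print("perplexity at final step:", round(perplexity(final[3]), 1))
    print("steps per epoch:", steps_per_epoch(ROWS))
```

On these rows the lowest validation loss is 4.9628 at step 36000. After that point the validation loss stays within about 0.012 of the minimum while the training loss keeps falling from 3.6679 to 3.5205, so the gap between training and validation loss widens over the last 6000 steps. Accuracy still rises slightly over the same span, from 0.2499 to 0.2523.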
