art-gpt2-base

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss
6.6246	3.23	100	6.0652
5.5673	6.45	200	5.5941
4.8544	9.68	300	5.2210
4.1301	12.9	400	4.9281
3.4252	16.13	500	4.8584
2.8133	19.35	600	4.8369
2.2897	22.58	700	4.8968
1.8635	25.81	800	5.0623
1.4989	29.03	900	5.1647
1.1677	32.26	1000	5.3719
0.9198	35.48	1100	5.4282
0.7353	38.71	1200	5.6292
0.6025	41.94	1300	5.6874
0.5122	45.16	1400	5.7219
0.432	48.39	1500	5.8266
0.3801	51.61	1600	5.8598
0.3457	54.84	1700	5.9109
0.3131	58.06	1800	5.9386
0.2904	61.29	1900	5.9634
0.265	64.52	2000	5.9652
0.2526	67.74	2100	5.9944
0.2363	70.97	2200	6.0083
0.2276	74.19	2300	6.0417
0.2155	77.42	2400	6.0281
0.2083	80.65	2500	6.0560
0.2056	83.87	2600	6.0612
0.2008	87.1	2700	6.0770
0.1958	90.32	2800	6.0843
0.192	93.55	2900	6.0831
0.1889	96.77	3000	6.0930
0.1881	100.0	3100	6.0920

Safetensors

Model size

93.1M params

Tensor type

F32

Base model

Finetuned

this model