gpt2_m070_tiny-stories_1024_dpos

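This model was fine-tuned on the roneneldan/TinyStories dataset; the base checkpoint is not named in this card. At the last logged evaluation (step 19000), it reaches a validation loss of 1.2045 and an accuracy of 0.6793 on the evaluation set.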

Model description

More information needed

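The hyperparameters used during training are not listed in this card. Training and validation metrics were logged every 1000 steps over just under one epoch: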

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.9037	0.0523	1000	2.4444	0.4487
1.9735	0.1045	2000	1.7962	0.5686
1.7239	0.1568	3000	1.6086	0.6016
1.6013	0.2091	4000	1.5058	0.6211
1.5297	0.2613	5000	1.4461	0.6315
1.478	0.3136	6000	1.4016	0.6401
1.4343	0.3659	7000	1.3635	0.6475
1.4042	0.4181	8000	1.3342	0.6532
1.3786	0.4704	9000	1.3108	0.6576
1.3566	0.5227	10000	1.2912	0.6617
1.3389	0.5750	11000	1.2732	0.6651
1.3228	0.6272	12000	1.2624	0.6675
1.3105	0.6795	13000	1.2480	0.6703
1.2968	0.7318	14000	1.2376	0.6723
1.2894	0.7840	15000	1.2294	0.6742
1.2795	0.8363	16000	1.2207	0.6758
1.2718	0.8886	17000	1.2131	0.6774
1.2679	0.9408	18000	1.2084	0.6785
1.2646	0.9931	19000	1.2045	0.6793
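The validation losses above are easier to compare as perplexities. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The helper name `perplexity` and the choice of three sample rows are illustrative.

```python
import math

# (step, validation_loss, accuracy) copied from three rows of the table above.
rows = [
    (1000, 2.4444, 0.4487),
    (10000, 1.2912, 0.6617),
    (19000, 1.2045, 0.6793),
]


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity.

    Assumption: the card's "Validation Loss" is a mean cross-entropy in nats.
    """
    return math.exp(mean_nll_nats)


for step, val_loss, acc in rows:
    print(
        f"step {step:>5}: val loss {val_loss:.4f} "
        f"-> perplexity {perplexity(val_loss):.2f}, accuracy {acc:.2%}"
    )
```

Under that assumption, the final validation loss of 1.2045 corresponds to a perplexity of roughly 3.3 per token.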

Model size: 0.1B params
Tensor type: F32
Weights format: Safetensors
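A minimal sketch of how a checkpoint like this might be loaded for text generation with the `transformers` library. Several things here are assumptions rather than facts from this card: that the repository ships a tokenizer alongside the weights, that the model is a causal language model (suggested only by "gpt2" in its name), and the sampling settings. The function name `generate_story` and the `repo_id` argument are hypothetical; the full repository path is not given in this card and must be supplied by the reader.

```python
def generate_story(repo_id: str, prompt: str, max_new_tokens: int = 60) -> str:
    """Generate a continuation of `prompt` with the model stored at `repo_id`.

    `repo_id` is a placeholder for the full Hugging Face repository path,
    which this card does not state.
    """
    # Imported lazily so the sketch can be read and imported without the
    # heavy dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository contains tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # The card lists the tensor type as F32, so load in float32.
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float32)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,  # sampling settings are illustrative, not tuned
            top_p=0.95,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_story("<namespace>/gpt2_m070_tiny-stories_1024_dpos", "Once upon a time")` would download the weights on first use; `<namespace>` stands in for the unknown repository owner.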
