gpt2_m080_tiny-stories_1024_dpos

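This model is a fine-tuned version of an unspecified base model on the roneneldan/TinyStories dataset. At the last logged evaluation (step 19000, epoch 0.9910) it achieves a validation loss of 1.2021 and an accuracy of 0.6801 on the evaluation set.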

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation metrics, logged every 1000 steps, were:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.8808	0.0522	1000	2.4260	0.4516
1.9542	0.1043	2000	1.7748	0.5731
1.7122	0.1565	3000	1.6012	0.6042
1.5962	0.2086	4000	1.5008	0.6221
1.5228	0.2608	5000	1.4413	0.6331
1.4706	0.3129	6000	1.3981	0.6413
1.4342	0.3651	7000	1.3601	0.6485
1.4000	0.4173	8000	1.3316	0.6542
1.3759	0.4694	9000	1.3088	0.6585
1.3551	0.5216	10000	1.2908	0.6622
1.3322	0.5737	11000	1.2735	0.6657
1.3179	0.6259	12000	1.2587	0.6685
1.3075	0.6780	13000	1.2458	0.6711
1.2997	0.7302	14000	1.2362	0.6730
1.2869	0.7824	15000	1.2277	0.6747
1.2766	0.8345	16000	1.2178	0.6769
1.2710	0.8867	17000	1.2117	0.6782
1.2624	0.9388	18000	1.2055	0.6794
1.2593	0.9910	19000	1.2021	0.6801
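
The validation losses in the table above can be read as perplexities, which are often easier to compare across language models. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The `perplexity` helper and the `eval_log` list are illustrative names, and only a few rows of the table are copied in.

```python
import math

# (step, validation_loss) pairs copied from selected rows of the results table.
eval_log = [
    (1000, 2.4260),
    (5000, 1.4413),
    (10000, 1.2908),
    (19000, 1.2021),
]


def perplexity(cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(cross_entropy_nats)


for step, loss in eval_log:
    print(f"step {step:>5}: validation loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Under that assumption, the final validation loss of 1.2021 corresponds to a perplexity of roughly 3.3 on the evaluation set.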

Model size: 0.1B params
Tensor type: F32
Weights format: Safetensors
