gpt2_m020_tiny-stories_1024_dpos

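This model is a fine-tuned version of a base model (not named in this card) on the roneneldan/TinyStories dataset. At the last logged step (19000), it reaches a validation loss of 1.2137 and an accuracy of 0.6773 on the evaluation set.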

Model description

More information needed

Training results, logged every 1000 steps (the training hyperparameters are not listed):

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.9182	0.0525	1000	2.4652	0.4419
1.9788	0.1050	2000	1.7994	0.5669
1.7281	0.1575	3000	1.6132	0.6002
1.6112	0.2100	4000	1.5177	0.6176
1.5356	0.2625	5000	1.4536	0.6300
1.4817	0.3150	6000	1.4055	0.6394
1.4437	0.3675	7000	1.3716	0.6457
1.4136	0.4200	8000	1.3434	0.6511
1.3892	0.4725	9000	1.3195	0.6558
1.3683	0.5250	10000	1.3011	0.6593
1.3477	0.5775	11000	1.2862	0.6623
1.3348	0.6300	12000	1.2694	0.6658
1.3189	0.6825	13000	1.2579	0.6682
1.3068	0.7350	14000	1.2471	0.6702
1.2964	0.7875	15000	1.2372	0.6723
1.2879	0.8400	16000	1.2296	0.6738
1.2855	0.8925	17000	1.2213	0.6756
1.2750	0.9450	18000	1.2167	0.6766
1.2738	0.9975	19000	1.2137	0.6773
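
The losses above can be easier to interpret as perplexities. The following is a minimal sketch, assuming the reported validation loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not confirmed by this card). The helper names `perplexity` and `steps_per_epoch` are illustrative and not part of any library; the rows are copied from the table.

```python
import math

# (step, epoch, validation_loss, accuracy), copied from the table above
ROWS = [
    (1000, 0.0525, 2.4652, 0.4419),
    (10000, 0.5250, 1.3011, 0.6593),
    (19000, 0.9975, 1.2137, 0.6773),
]


def perplexity(loss: float) -> float:
    """Convert a loss to perplexity, ASSUMING mean per-token cross-entropy in nats."""
    return math.exp(loss)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps in one epoch from a logged (step, epoch) pair."""
    return step / epoch


for step, epoch, val_loss, acc in ROWS:
    print(
        f"step {step:>5} (epoch {epoch:.4f}): "
        f"val loss {val_loss:.4f}, perplexity {perplexity(val_loss):.2f}, "
        f"accuracy {acc:.4f}"
    )

# The epoch column is rounded, so this is only an approximation.
print(f"approx. steps per epoch: {steps_per_epoch(19000, 0.9975):.0f}")
```

The last logged row sits at epoch 0.9975, so the table covers just under one full pass over the training data.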

Model size: 0.1B params (Safetensors)
Tensor type: F32
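
The model size and tensor type listed above give a rough idea of the memory needed just to hold the weights. This is a back-of-the-envelope sketch, assuming the rounded figure of 0.1B parameters and 4 bytes per F32 value; the exact parameter count is not stated in this card, and activations and framework overhead are not included. `weight_memory_gb` is a hypothetical helper written for this illustration.

```python
# Bytes per parameter for common tensor types.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_memory_gb(num_params: float, tensor_type: str = "F32") -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[tensor_type] / 1e9


# ASSUMPTION: 0.1B is the rounded parameter count shown on the card.
approx_params = 0.1e9

print(f"F32 weights: ~{weight_memory_gb(approx_params, 'F32'):.1f} GB")
print(f"F16 weights: ~{weight_memory_gb(approx_params, 'F16'):.1f} GB")
```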
