Se124M10KInfPrompt

This model is a PEFT adapter fine-tuned from gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7128
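
Since this repository contains a PEFT adapter on top of the gpt2 base model (see the framework versions below), a minimal inference sketch could look like the following. The repository id comes from this page; the prompt and generation settings are illustrative assumptions.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the adapter; peft resolves and attaches the gpt2 base model automatically.
model = AutoPeftModelForCausalLM.from_pretrained("augustocsc/Se124M10KInfPrompt")
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # base-model tokenizer

inputs = tokenizer("Example prompt", return_tensors="pt")  # placeholder prompt
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```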

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
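
For reference, these settings map onto transformers' TrainingArguments roughly as shown below; output_dir is a placeholder, fp16=True stands in for "Native AMP", and the per-device batch-size interpretation is an assumption. The PEFT/LoRA configuration itself is not documented on this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Se124M10KInfPrompt",      # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",                  # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,                            # Native AMP mixed precision
)
```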

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.4014        | 1.0   | 267   | 1.0141          |
| 0.2422        | 2.0   | 534   | 0.8523          |
| 0.2202        | 3.0   | 801   | 0.8168          |
| 0.2129        | 4.0   | 1068  | 0.7993          |
| 0.2059        | 5.0   | 1335  | 0.7837          |
| 0.2041        | 6.0   | 1602  | 0.7695          |
| 0.2031        | 7.0   | 1869  | 0.7635          |
| 0.1982        | 8.0   | 2136  | 0.7586          |
| 0.1975        | 9.0   | 2403  | 0.7532          |
| 0.1974        | 10.0  | 2670  | 0.7483          |
| 0.1978        | 11.0  | 2937  | 0.7467          |
| 0.1939        | 12.0  | 3204  | 0.7445          |
| 0.1953        | 13.0  | 3471  | 0.7439          |
| 0.1929        | 14.0  | 3738  | 0.7362          |
| 0.1937        | 15.0  | 4005  | 0.7328          |
| 0.1934        | 16.0  | 4272  | 0.7329          |
| 0.1927        | 17.0  | 4539  | 0.7323          |
| 0.1927        | 18.0  | 4806  | 0.7257          |
| 0.1909        | 19.0  | 5073  | 0.7276          |
| 0.1919        | 20.0  | 5340  | 0.7251          |
| 0.1919        | 21.0  | 5607  | 0.7239          |
| 0.1912        | 22.0  | 5874  | 0.7260          |
| 0.1897        | 23.0  | 6141  | 0.7241          |
| 0.1916        | 24.0  | 6408  | 0.7235          |
| 0.1905        | 25.0  | 6675  | 0.7225          |
| 0.1919        | 26.0  | 6942  | 0.7188          |
| 0.1883        | 27.0  | 7209  | 0.7207          |
| 0.1898        | 28.0  | 7476  | 0.7198          |
| 0.1874        | 29.0  | 7743  | 0.7195          |
| 0.188         | 30.0  | 8010  | 0.7194          |
| 0.1873        | 31.0  | 8277  | 0.7182          |
| 0.1878        | 32.0  | 8544  | 0.7212          |
| 0.1866        | 33.0  | 8811  | 0.7171          |
| 0.1883        | 34.0  | 9078  | 0.7151          |
| 0.1881        | 35.0  | 9345  | 0.7176          |
| 0.1868        | 36.0  | 9612  | 0.7149          |
| 0.1871        | 37.0  | 9879  | 0.7157          |
| 0.1876        | 38.0  | 10146 | 0.7162          |
| 0.188         | 39.0  | 10413 | 0.7142          |
| 0.1861        | 40.0  | 10680 | 0.7149          |
| 0.1862        | 41.0  | 10947 | 0.7144          |
| 0.1862        | 42.0  | 11214 | 0.7128          |
| 0.186         | 43.0  | 11481 | 0.7136          |
| 0.1868        | 44.0  | 11748 | 0.7137          |
| 0.1837        | 45.0  | 12015 | 0.7138          |
| 0.1868        | 46.0  | 12282 | 0.7141          |
| 0.187         | 47.0  | 12549 | 0.7133          |

The best validation loss, 0.7128, was reached at epoch 42 (step 11214) and matches the evaluation loss reported above.

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1