Se124M100KInfSimple

This model is a fine-tuned version of gpt2 on an unknown dataset. On the evaluation set it reaches a validation loss of 0.4582 after the final (50th) epoch.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 50
mixed_precision_training: Native AMP
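
To make the `lr_scheduler_type: linear` setting concrete, here is a minimal sketch of how the learning rate would evolve over the run. It assumes no warmup (the card lists no warmup setting) and takes the total step count of 110250 from the last row of the results table; the function name `linear_lr` and its parameters are illustrative and not part of the original training code.

```python
def linear_lr(
    step: int,
    total_steps: int = 110_250,  # final step reported in the results table
    base_lr: float = 5e-05,      # learning_rate from the list above
    warmup_steps: int = 0,       # assumption: the card lists no warmup
) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, 55_125, 110_250):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, is halved at the midpoint (step 55125, end of epoch 25), and reaches zero at the last step.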

Training Loss	Epoch	Step	Validation Loss
0.1445	1.0	2205	0.5442
0.1358	2.0	4410	0.5179
0.1317	3.0	6615	0.5090
0.1297	4.0	8820	0.5015
0.1299	5.0	11025	0.4954
0.1301	6.0	13230	0.4917
0.1258	7.0	15435	0.4875
0.1254	8.0	17640	0.4834
0.1231	9.0	19845	0.4816
0.1254	10.0	22050	0.4798
0.125	11.0	24255	0.4778
0.1225	12.0	26460	0.4775
0.1233	13.0	28665	0.4753
0.1213	14.0	30870	0.4737
0.1231	15.0	33075	0.4719
0.1233	16.0	35280	0.4716
0.1225	17.0	37485	0.4702
0.1218	18.0	39690	0.4696
0.1213	19.0	41895	0.4678
0.1213	20.0	44100	0.4673
0.121	21.0	46305	0.4675
0.122	22.0	48510	0.4663
0.1195	23.0	50715	0.4657
0.1221	24.0	52920	0.4647
0.1212	25.0	55125	0.4647
0.121	26.0	57330	0.4640
0.1213	27.0	59535	0.4637
0.1184	28.0	61740	0.4629
0.12	29.0	63945	0.4627
0.1191	30.0	66150	0.4622
0.1195	31.0	68355	0.4624
0.1188	32.0	70560	0.4619
0.1202	33.0	72765	0.4620
0.119	34.0	74970	0.4605
0.1206	35.0	77175	0.4608
0.1197	36.0	79380	0.4601
0.1199	37.0	81585	0.4597
0.1204	38.0	83790	0.4601
0.1185	39.0	85995	0.4596
0.1184	40.0	88200	0.4591
0.119	41.0	90405	0.4594
0.1181	42.0	92610	0.4591
0.1178	43.0	94815	0.4588
0.1188	44.0	97020	0.4586
0.1189	45.0	99225	0.4584
0.1183	46.0	101430	0.4583
0.1184	47.0	103635	0.4582
0.1185	48.0	105840	0.4581
0.1198	49.0	108045	0.4582
0.1207	50.0	110250	0.4582
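
A few quantities can be read off the table with simple arithmetic. The sketch below uses a handful of rows copied from the table and is hedged accordingly: the perplexity conversion assumes the validation loss is a mean per-token cross-entropy in nats, which the card does not state, and the examples-per-epoch figure assumes a single device with no gradient accumulation. All names in the snippet are illustrative.

```python
import math

# (epoch, step, validation_loss) rows copied from the table above.
ROWS = [
    (1, 2205, 0.5442),
    (10, 22050, 0.4798),
    (25, 55125, 0.4647),
    (48, 105840, 0.4581),
    (50, 110250, 0.4582),
]

BATCH_SIZE = 32  # train_batch_size from the hyperparameter list


def perplexity(loss: float) -> float:
    """exp(loss); meaningful only if loss is mean cross-entropy in nats (assumption)."""
    return math.exp(loss)


def steps_per_epoch(rows) -> int:
    """Optimizer steps per epoch, derived from the first listed row."""
    epoch, step, _ = rows[0]
    return step // epoch


def max_examples_per_epoch(rows, batch_size: int = BATCH_SIZE) -> int:
    """Upper bound on training examples per epoch (the last batch may be partial)."""
    return steps_per_epoch(rows) * batch_size


def best_row(rows):
    """Row with the lowest validation loss among those listed."""
    return min(rows, key=lambda r: r[2])


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch(ROWS))
    print("max examples per epoch:", max_examples_per_epoch(ROWS))
    print("best listed row:", best_row(ROWS))
    print("perplexity at final epoch:", round(perplexity(ROWS[-1][2]), 3))
```

In the table, validation loss falls from 0.5442 to roughly 0.458 and changes by at most 0.0001 between consecutive epochs from epoch 46 onward, so the lowest value (0.4581 at epoch 48) is effectively tied with the final checkpoint.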
