gpt2-13K_NC_V4

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9979

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
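
As a hedged sketch, the hyperparameters above map onto a Hugging Face TrainingArguments object roughly as follows. The output_dir is a placeholder assumption; the remaining values mirror the list, and betas/epsilon are the adamw_torch defaults:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
# output_dir is an assumption, not taken from the card.
training_args = TrainingArguments(
    output_dir="gpt2-13K_NC_V4",       # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",               # AdamW with betas=(0.9, 0.999), epsilon=1e-08 (defaults)
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```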

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 41   | 3.3578          |
| No log        | 2.0   | 82   | 3.0393          |
| 0.8629        | 3.0   | 123  | 2.7994          |
| 0.8629        | 4.0   | 164  | 2.6013          |
| 0.6962        | 5.0   | 205  | 2.4413          |
| 0.6962        | 6.0   | 246  | 2.3348          |
| 0.6962        | 7.0   | 287  | 2.2546          |
| 0.6088        | 8.0   | 328  | 2.2033          |
| 0.6088        | 9.0   | 369  | 2.1726          |
| 0.5725        | 10.0  | 410  | 2.1390          |
| 0.5725        | 11.0  | 451  | 2.1300          |
| 0.5725        | 12.0  | 492  | 2.1048          |
| 0.5556        | 13.0  | 533  | 2.0996          |
| 0.5556        | 14.0  | 574  | 2.0854          |
| 0.5428        | 15.0  | 615  | 2.0780          |
| 0.5428        | 16.0  | 656  | 2.0707          |
| 0.5428        | 17.0  | 697  | 2.0680          |
| 0.534         | 18.0  | 738  | 2.0577          |
| 0.534         | 19.0  | 779  | 2.0511          |
| 0.5294        | 20.0  | 820  | 2.0464          |
| 0.5294        | 21.0  | 861  | 2.0412          |
| 0.5239        | 22.0  | 902  | 2.0414          |
| 0.5239        | 23.0  | 943  | 2.0338          |
| 0.5239        | 24.0  | 984  | 2.0300          |
| 0.5221        | 25.0  | 1025 | 2.0279          |
| 0.5221        | 26.0  | 1066 | 2.0214          |
| 0.5164        | 27.0  | 1107 | 2.0210          |
| 0.5164        | 28.0  | 1148 | 2.0211          |
| 0.5164        | 29.0  | 1189 | 2.0170          |
| 0.5155        | 30.0  | 1230 | 2.0181          |
| 0.5155        | 31.0  | 1271 | 2.0138          |
| 0.5129        | 32.0  | 1312 | 2.0139          |
| 0.5129        | 33.0  | 1353 | 2.0070          |
| 0.5129        | 34.0  | 1394 | 2.0102          |
| 0.5116        | 35.0  | 1435 | 2.0063          |
| 0.5116        | 36.0  | 1476 | 2.0022          |
| 0.5115        | 37.0  | 1517 | 2.0008          |
| 0.5115        | 38.0  | 1558 | 2.0020          |
| 0.5115        | 39.0  | 1599 | 1.9979          |
| 0.5105        | 40.0  | 1640 | 1.9982          |
| 0.5105        | 41.0  | 1681 | 1.9981          |

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1
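
Since the framework versions list PEFT, this model is a PEFT adapter on top of the gpt2 base model. A minimal loading sketch follows, assuming the adapter is hosted at augustocsc/gpt2-13K_NC_V4 (the repo id from the model page); the prompt and generation settings are illustrative only:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the gpt2 base model, then attach the fine-tuned adapter.
base = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base, "augustocsc/gpt2-13K_NC_V4")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Illustrative generation call; prompt and max_new_tokens are placeholders.
inputs = tokenizer("Example prompt", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```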