---
library_name: peft
license: mit
base_model: gpt2
tags:
- generated_from_trainer
model-index:
- name: Se124M10KInfPrompt_endtoken
  results: []
---
# Se124M10KInfPrompt_endtoken
This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.7284
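
Because this is a PEFT adapter trained on top of `gpt2`, it is loaded by attaching the adapter to the base model. A minimal loading sketch, assuming the adapter is hosted on the Hub; the repo id below is a placeholder, not a confirmed path:

```python
# Minimal sketch: attach the PEFT adapter to the gpt2 base model.
# The adapter repo id is a placeholder, not a confirmed Hub path.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base_model, "your-username/Se124M10KInfPrompt_endtoken")

inputs = tokenizer("Example prompt:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```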
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows this list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
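
A hedged reconstruction of these settings as `TrainingArguments`; the `output_dir` and anything not listed above are assumptions, not taken from the card:

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
# output_dir is a placeholder; any argument not listed above is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Se124M10KInfPrompt_endtoken",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,  # "Native AMP" mixed precision
)
```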
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.9762 | 1.0 | 610 | 0.8681 |
| 0.8832 | 2.0 | 1220 | 0.8117 |
| 0.8432 | 3.0 | 1830 | 0.7922 |
| 0.8183 | 4.0 | 2440 | 0.7809 |
| 0.8131 | 5.0 | 3050 | 0.7687 |
| 0.8004 | 6.0 | 3660 | 0.7640 |
| 0.7972 | 7.0 | 4270 | 0.7584 |
| 0.7941 | 8.0 | 4880 | 0.7528 |
| 0.789 | 9.0 | 5490 | 0.7480 |
| 0.7806 | 10.0 | 6100 | 0.7454 |
| 0.7736 | 11.0 | 6710 | 0.7442 |
| 0.769 | 12.0 | 7320 | 0.7444 |
| 0.7734 | 13.0 | 7930 | 0.7402 |
| 0.7671 | 14.0 | 8540 | 0.7385 |
| 0.7605 | 15.0 | 9150 | 0.7365 |
| 0.7651 | 16.0 | 9760 | 0.7357 |
| 0.7657 | 17.0 | 10370 | 0.7340 |
| 0.763 | 18.0 | 10980 | 0.7318 |
| 0.7552 | 19.0 | 11590 | 0.7305 |
| 0.7563 | 20.0 | 12200 | 0.7284 |
| 0.7558 | 21.0 | 12810 | 0.7285 |
| 0.7465 | 22.0 | 13420 | 0.7290 |
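
The run was configured for 50 epochs but the log stops at epoch 22, and the reported evaluation loss (0.7284) matches epoch 20; that pattern is consistent with early stopping on validation loss with a patience of 2, though the card does not confirm this. A speculative sketch:

```python
# Speculative: an early-stopping setup that would end a run two epochs
# after the best validation loss. Not confirmed by the card.
from transformers import EarlyStoppingCallback

early_stopping = EarlyStoppingCallback(early_stopping_patience=2)
# Would be passed as Trainer(callbacks=[early_stopping], ...) together with
# load_best_model_at_end=True and metric_for_best_model="eval_loss".
```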
### Framework versions
- PEFT 0.15.1
- Transformers 4.51.3
- PyTorch 2.6.0+cu118
- Datasets 3.5.0
- Tokenizers 0.21.1