---
license: apache-2.0
base_model: distilgpt2
tags:
  - generated_from_trainer
model-index:
  - name: StatementOfWork_Generator_Omega2
    results: []
---

# StatementOfWork_Generator_Omega2

This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.9899
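
As a quick smoke test, the checkpoint can be loaded through the `text-generation` pipeline. This is a minimal sketch: the repo id `gjonesQ02/StatementOfWork_Generator_Omega2` and the prompt are assumptions, since neither the published path nor the training data format is documented here.

```python
from transformers import pipeline

# Hypothetical repo id, inferred from the model name; adjust to the actual path.
generator = pipeline(
    "text-generation",
    model="gjonesQ02/StatementOfWork_Generator_Omega2",
)

# Hypothetical prompt; the training data is undocumented.
prompt = "Statement of Work:"
result = generator(prompt, max_new_tokens=100, do_sample=True, temperature=0.8)
print(result[0]["generated_text"])
```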

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 50
- eval_batch_size: 50
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
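
These values map directly onto `TrainingArguments`; the Adam betas and epsilon above are the library defaults. The sketch below shows the correspondence under the standard `Trainer` workflow; the output directory, datasets, and tokenization are placeholders, not taken from the original training script.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model = AutoModelForCausalLM.from_pretrained("distilgpt2")
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")

args = TrainingArguments(
    output_dir="StatementOfWork_Generator_Omega2",  # placeholder path
    learning_rate=2e-05,
    per_device_train_batch_size=50,
    per_device_eval_batch_size=50,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",  # matches the per-epoch validation losses below
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08 are the defaults.
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder: the dataset is undocumented
    eval_dataset=eval_dataset,    # placeholder
    tokenizer=tokenizer,
)
trainer.train()
```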

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 15   | 1.0194          |
| No log        | 2.0   | 30   | 1.0190          |
| No log        | 3.0   | 45   | 1.0147          |
| No log        | 4.0   | 60   | 1.0102          |
| No log        | 5.0   | 75   | 1.0135          |
| No log        | 6.0   | 90   | 1.0091          |
| No log        | 7.0   | 105  | 1.0055          |
| No log        | 8.0   | 120  | 1.0024          |
| No log        | 9.0   | 135  | 0.9965          |
| No log        | 10.0  | 150  | 0.9984          |
| No log        | 11.0  | 165  | 1.0013          |
| No log        | 12.0  | 180  | 0.9934          |
| No log        | 13.0  | 195  | 0.9952          |
| No log        | 14.0  | 210  | 0.9917          |
| No log        | 15.0  | 225  | 0.9911          |
| No log        | 16.0  | 240  | 0.9900          |
| No log        | 17.0  | 255  | 0.9918          |
| No log        | 18.0  | 270  | 0.9902          |
| No log        | 19.0  | 285  | 0.9900          |
| No log        | 20.0  | 300  | 0.9899          |

### Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
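
Results are most likely to reproduce with these exact versions installed. The snippet below is a small environment sanity check, not part of the original card.

```python
import transformers, torch, datasets, tokenizers

# Expected: 4.38.2, 2.2.1+cu121, 2.18.0, 0.15.2
print(transformers.__version__, torch.__version__,
      datasets.__version__, tokenizers.__version__)
```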