---
license: apache-2.0
base_model: distilgpt2
tags:
  - generated_from_trainer
model-index:
  - name: StatementOfWork_Generator_Omega2
    results: []
---

# StatementOfWork_Generator_Omega2

This model is a fine-tuned version of distilgpt2 on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.9660

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 50
- eval_batch_size: 50
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
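With `lr_scheduler_type: linear` and no warmup configured, the Trainer's scheduler decays the learning rate linearly from `2e-05` at step 0 to zero at the final step (300 steps total: 15 steps per epoch over 20 epochs). A minimal sketch of that schedule, assuming zero warmup steps (the default when no warmup is set):

```python
def linear_lr(step: int, base_lr: float = 2e-5, total_steps: int = 300) -> float:
    """Linear decay from base_lr down to 0, with no warmup phase.

    Mirrors the shape of the Trainer's "linear" schedule when
    warmup is left at its default of 0 steps.
    """
    return base_lr * max(0.0, (total_steps - step) / total_steps)

print(linear_lr(0))    # full learning rate at the first step
print(linear_lr(150))  # half the base rate at the midpoint
print(linear_lr(300))  # fully decayed at the last step
```

Note this is an illustration of the schedule's shape, not the actual training code; the real schedule is constructed internally by `transformers.Trainer`.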

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 15   | 0.9874          |
| No log        | 2.0   | 30   | 0.9890          |
| No log        | 3.0   | 45   | 0.9814          |
| No log        | 4.0   | 60   | 0.9776          |
| No log        | 5.0   | 75   | 0.9778          |
| No log        | 6.0   | 90   | 0.9790          |
| No log        | 7.0   | 105  | 0.9780          |
| No log        | 8.0   | 120  | 0.9726          |
| No log        | 9.0   | 135  | 0.9727          |
| No log        | 10.0  | 150  | 0.9716          |
| No log        | 11.0  | 165  | 0.9711          |
| No log        | 12.0  | 180  | 0.9686          |
| No log        | 13.0  | 195  | 0.9690          |
| No log        | 14.0  | 210  | 0.9693          |
| No log        | 15.0  | 225  | 0.9679          |
| No log        | 16.0  | 240  | 0.9670          |
| No log        | 17.0  | 255  | 0.9660          |
| No log        | 18.0  | 270  | 0.9646          |
| No log        | 19.0  | 285  | 0.9661          |
| No log        | 20.0  | 300  | 0.9660          |
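The results above imply 15 optimizer steps per epoch; with a train batch size of 50 that works out to roughly 750 training examples, a back-of-the-envelope estimate that assumes no gradient accumulation. The reported final loss of 0.9660 is the last epoch's value, while the lowest validation loss actually occurred earlier. A small sketch over the transcribed rows:

```python
# (epoch, step, validation_loss) rows transcribed from the table above
results = [
    (1, 15, 0.9874), (2, 30, 0.9890), (3, 45, 0.9814), (4, 60, 0.9776),
    (5, 75, 0.9778), (6, 90, 0.9790), (7, 105, 0.9780), (8, 120, 0.9726),
    (9, 135, 0.9727), (10, 150, 0.9716), (11, 165, 0.9711), (12, 180, 0.9686),
    (13, 195, 0.9690), (14, 210, 0.9693), (15, 225, 0.9679), (16, 240, 0.9670),
    (17, 255, 0.9660), (18, 270, 0.9646), (19, 285, 0.9661), (20, 300, 0.9660),
]

steps_per_epoch = results[0][1]                  # 15 steps in epoch 1
approx_train_examples = steps_per_epoch * 50     # times train_batch_size

# Epoch with the lowest validation loss (not necessarily the final one)
best_epoch, best_step, best_loss = min(results, key=lambda r: r[2])
print(best_epoch, best_step, best_loss)
```

The loss curve is nearly flat after epoch 12, so training for fewer epochs (or saving the best checkpoint rather than the last) would likely give a comparable or slightly better model.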

## Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2