train_sst2_789_1760637962

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the sst2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0559
  • Num Input Tokens Seen: 67736640
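The fine-tuned weights are a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct. A minimal sketch of loading it for inference with `transformers` and `peft` (the adapter repo id `rbelanec/train_sst2_789_1760637962` is taken from this card; access to the gated base model and the prompt wording are assumptions, since the card does not document the training prompt template):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_sst2_789_1760637962"  # adapter repo from this card

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter

# SST-2 is binary sentiment classification; this prompt format is illustrative only.
prompt = ("Classify the sentiment of this sentence as positive or negative: "
          "\"a charming and funny film.\"")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```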

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
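A warmup ratio of 0.1 over the 303,080 total steps reported in the training results implies 30,308 warmup steps (one epoch). A minimal sketch of the resulting cosine schedule with linear warmup (step counts are taken from this card; the exact scheduler implementation in `transformers` may differ slightly):

```python
import math

def lr_at_step(step, total_steps=303080, warmup_steps=30308, peak_lr=0.03):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The peak learning rate of 0.03 is reached at the end of epoch 1 and decays to 0 by step 303,080.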

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.3456        | 1.0   | 15154  | 0.3410          | 3386112           |
| 0.0128        | 2.0   | 30308  | 0.0674          | 6772240           |
| 0.0664        | 3.0   | 45462  | 0.0585          | 10157776          |
| 0.0085        | 4.0   | 60616  | 0.0584          | 13544064          |
| 0.016         | 5.0   | 75770  | 0.0572          | 16932736          |
| 0.016         | 6.0   | 90924  | 0.0576          | 20317792          |
| 0.1063        | 7.0   | 106078 | 0.0569          | 23704608          |
| 0.0522        | 8.0   | 121232 | 0.0591          | 27091680          |
| 0.1168        | 9.0   | 136386 | 0.0560          | 30478800          |
| 0.0764        | 10.0  | 151540 | 0.0559          | 33864736          |
| 0.0529        | 11.0  | 166694 | 0.0569          | 37250256          |
| 0.0034        | 12.0  | 181848 | 0.0565          | 40636544          |
| 0.1384        | 13.0  | 197002 | 0.0573          | 44025456          |
| 0.0985        | 14.0  | 212156 | 0.0566          | 47407856          |
| 0.0278        | 15.0  | 227310 | 0.0588          | 50794368          |
| 0.0634        | 16.0  | 242464 | 0.0585          | 54183904          |
| 0.0322        | 17.0  | 257618 | 0.0586          | 57571280          |
| 0.035         | 18.0  | 272772 | 0.0585          | 60960480          |
| 0.0613        | 19.0  | 287926 | 0.0585          | 64347696          |
| 0.0152        | 20.0  | 303080 | 0.0586          | 67736640          |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4