train_sst2_789_1760637965

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the sst2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8324
  • Num Input Tokens Seen: 67736640
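
A minimal usage sketch follows, assuming this repository holds a PEFT adapter (PEFT 0.17.1 is listed under framework versions) that can be attached to the base model with `PeftModel.from_pretrained`; check `adapter_config.json` for the exact adapter type before relying on it:

```python
# Hedged sketch: attach the adapter in this repo to the base model.
# Assumes a standard PEFT adapter layout; not an official usage recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_sst2_789_1760637965"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference; adjust to your hardware
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
```

Note that the prompt template used to cast sst2 examples into instruction form is not documented here, so inference quality will depend on reproducing that format.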

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
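
The card names sst2 but documents no preprocessing. As a starting point, the dataset can be loaded with the `datasets` library; the hub ID `stanfordnlp/sst2` is an assumption about which SST-2 copy was used:

```python
# Hedged sketch: load SST-2 (GLUE sentiment task) from the Hugging Face Hub.
# The hub ID and any prompt formatting are assumptions, not documented facts.
from datasets import load_dataset

sst2 = load_dataset("stanfordnlp/sst2")
print(sst2)              # splits: train / validation / test
print(sst2["train"][0])  # fields: idx, sentence, label (0 = negative, 1 = positive)
```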

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
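
For reference, here is a sketch of `TrainingArguments` matching the listed values; the actual training script, PEFT configuration, and precision settings are not published, so treat this as illustrative only:

```python
# Hedged sketch: TrainingArguments approximating the listed hyperparameters.
# output_dir is hypothetical; gradient accumulation, precision, and PEFT
# settings are unknown and therefore omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_sst2_789_1760637965",  # hypothetical output path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```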

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.8968        | 1.0   | 15154  | 0.9032          | 3386112           |
| 0.6697        | 2.0   | 30308  | 0.8532          | 6772240           |
| 0.6965        | 3.0   | 45462  | 0.8426          | 10157776          |
| 0.7205        | 4.0   | 60616  | 0.8370          | 13544064          |
| 0.8718        | 5.0   | 75770  | 0.8324          | 16932736          |
| 0.9504        | 6.0   | 90924  | 0.8377          | 20317792          |
| 0.7258        | 7.0   | 106078 | 0.8358          | 23704608          |
| 0.6478        | 8.0   | 121232 | 0.8385          | 27091680          |
| 0.8947        | 9.0   | 136386 | 0.8364          | 30478800          |
| 0.6324        | 10.0  | 151540 | 0.8355          | 33864736          |
| 0.82          | 11.0  | 166694 | 0.8372          | 37250256          |
| 0.6553        | 12.0  | 181848 | 0.8372          | 40636544          |
| 0.9774        | 13.0  | 197002 | 0.8352          | 44025456          |
| 0.8403        | 14.0  | 212156 | 0.8352          | 47407856          |
| 0.818         | 15.0  | 227310 | 0.8352          | 50794368          |
| 0.8611        | 16.0  | 242464 | 0.8352          | 54183904          |
| 0.7869        | 17.0  | 257618 | 0.8352          | 57571280          |
| 0.7168        | 18.0  | 272772 | 0.8352          | 60960480          |
| 0.8928        | 19.0  | 287926 | 0.8352          | 64347696          |
| 0.7413        | 20.0  | 303080 | 0.8352          | 67736640          |

The reported evaluation loss of 0.8324 matches the epoch-5 checkpoint, the lowest validation loss in the run.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4