train_sst2_789_1760637963

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the sst2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1405
  • Num Input Tokens Seen: 67736640

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
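
The schedule above (cosine with a 0.1 warmup ratio) can be sketched in plain Python. This is an illustrative reconstruction, not the trainer's own code; the total step count (303080) is taken from the training log below, and the helper name `lr_at` is hypothetical.

```python
import math

# Linear warmup over the first 10% of steps (warmup_ratio=0.1),
# then cosine decay from the peak learning rate down to 0.
BASE_LR = 0.001
TOTAL_STEPS = 303080
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 30308

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        # warmup: scale linearly from 0 up to BASE_LR
        return BASE_LR * step / WARMUP_STEPS
    # cosine decay: progress goes 0 -> 1 over the remaining steps
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(WARMUP_STEPS))  # peak: 0.001
```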

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.0462        | 1.0   | 15154  | 0.0622          | 3386112           |
| 0.0073        | 2.0   | 30308  | 0.0589          | 6772240           |
| 0.053         | 3.0   | 45462  | 0.0584          | 10157776          |
| 0.0038        | 4.0   | 60616  | 0.0590          | 13544064          |
| 0.0336        | 5.0   | 75770  | 0.0582          | 16932736          |
| 0.0495        | 6.0   | 90924  | 0.0573          | 20317792          |
| 0.1296        | 7.0   | 106078 | 0.0609          | 23704608          |
| 0.0074        | 8.0   | 121232 | 0.0594          | 27091680          |
| 0.1052        | 9.0   | 136386 | 0.0626          | 30478800          |
| 0.076         | 10.0  | 151540 | 0.0622          | 33864736          |
| 0.0051        | 11.0  | 166694 | 0.0662          | 37250256          |
| 0.0007        | 12.0  | 181848 | 0.0686          | 40636544          |
| 0.1467        | 13.0  | 197002 | 0.0768          | 44025456          |
| 0.0389        | 14.0  | 212156 | 0.0783          | 47407856          |
| 0.0045        | 15.0  | 227310 | 0.0875          | 50794368          |
| 0.1181        | 16.0  | 242464 | 0.0990          | 54183904          |
| 0.0012        | 17.0  | 257618 | 0.1037          | 57571280          |
| 0.0008        | 18.0  | 272772 | 0.1107          | 60960480          |
| 0.001         | 19.0  | 287926 | 0.1143          | 64347696          |
| 0.001         | 20.0  | 303080 | 0.1158          | 67736640          |
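
The validation loss in the table bottoms out at epoch 6 (0.0573) and climbs steadily afterwards while the training loss stays near zero, a typical overfitting pattern; early stopping around that epoch would likely give the best checkpoint. A small sketch of that check, with the values copied from the table (this analysis is illustrative, not part of the original training run):

```python
# (epoch, validation_loss) pairs copied from the table above.
val_losses = {
    1: 0.0622, 2: 0.0589, 3: 0.0584, 4: 0.0590, 5: 0.0582,
    6: 0.0573, 7: 0.0609, 8: 0.0594, 9: 0.0626, 10: 0.0622,
    11: 0.0662, 12: 0.0686, 13: 0.0768, 14: 0.0783, 15: 0.0875,
    16: 0.0990, 17: 0.1037, 18: 0.1107, 19: 0.1143, 20: 0.1158,
}

# Epoch with the lowest validation loss.
best_epoch = min(val_losses, key=val_losses.get)
print(best_epoch, val_losses[best_epoch])  # → 6 0.0573
```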

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
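
Since this is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct, it can be loaded with the `peft` library. A minimal sketch, assuming `peft` and `transformers` are installed and you have access to the gated base model (weights are downloaded at load time, so this is not runnable offline):

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model_id = "rbelanec/train_sst2_789_1760637963"

# Loads the base model and applies the adapter weights on top.
model = AutoPeftModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```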

Model tree for rbelanec/train_sst2_789_1760637963

  • Adapter of meta-llama/Meta-Llama-3-8B-Instruct