train_wic_42_1767887009

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the wic dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3170
  • Input tokens seen: 4,067,384
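
Because this checkpoint is a PEFT adapter rather than a full model (see the framework versions below), it has to be loaded on top of the base model. The following is a minimal loading sketch, assuming access to the gated base checkpoint and an installed accelerate for device_map="auto"; the prompt format used for the wic task is not documented in this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_wic_42_1767887009"

# Load the base model first, then attach the PEFT adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)
model.eval()
```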

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
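
For reference, these settings map onto transformers' TrainingArguments roughly as follows. This is a reconstruction sketch, not the original training script; output_dir is a placeholder name, and the betas/epsilon above are the adamw_torch defaults:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="train_wic_42_1767887009",
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",        # betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```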

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Input Tokens Seen |
|---------------|--------|-------|-----------------|-------------------|
| 0.6066        | 0.5002 | 1222  | 0.3170          | 203312            |
| 0.096         | 1.0004 | 2444  | 0.3510          | 406960            |
| 0.6999        | 1.5006 | 3666  | 0.3758          | 610320            |
| 0.0235        | 2.0008 | 4888  | 0.3447          | 814208            |
| 0.2523        | 2.5010 | 6110  | 0.4392          | 1017424           |
| 0.4419        | 3.0012 | 7332  | 0.3659          | 1221232           |
| 0.2423        | 3.5014 | 8554  | 0.3769          | 1424736           |
| 0.1568        | 4.0016 | 9776  | 0.3933          | 1628304           |
| 0.411         | 4.5018 | 10998 | 0.3961          | 1831936           |
| 0.6195        | 5.0020 | 12220 | 0.3968          | 2035280           |
| 0.27          | 5.5023 | 13442 | 0.3906          | 2239168           |
| 0.1055        | 6.0025 | 14664 | 0.4025          | 2442032           |
| 0.2196        | 6.5027 | 15886 | 0.4531          | 2645696           |
| 0.4288        | 7.0029 | 17108 | 0.4529          | 2848960           |
| 0.0026        | 7.5031 | 18330 | 0.4631          | 3052368           |
| 0.388         | 8.0033 | 19552 | 0.4850          | 3255736           |
| 0.2477        | 8.5035 | 20774 | 0.4756          | 3459704           |
| 0.1481        | 9.0037 | 21996 | 0.4933          | 3662296           |
| 0.3577        | 9.5039 | 23218 | 0.4952          | 3866024           |

Note that the reported evaluation loss of 0.3170 corresponds to the first logged checkpoint (step 1222); validation loss trends upward over later epochs, which suggests the reported result comes from the best early checkpoint rather than the final one.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
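
To reproduce this environment, the pinned versions above should install from PyPI under their standard package names, e.g. `pip install peft==0.17.1 transformers==4.51.3 torch==2.9.1 datasets==4.0.0 tokenizers==0.21.4`; note that the CUDA 12.8 build of torch (+cu128) may need to be installed from PyTorch's own wheel index.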