train_wic_456_1760637806

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the wic (Word-in-Context) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2461 (best checkpoint; see the training results table below)
  • Input tokens seen: 8,434,688

Model description

More information needed

Intended uses & limitations

More information needed
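
Although the card does not document intended uses, the repository holds a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, so inference requires loading the adapter on top of the base model. Below is a minimal sketch: the adapter id rbelanec/train_wic_456_1760637806 refers to this repository, while the WiC-style prompt is purely illustrative, since the prompt template used during fine-tuning is not documented here.

```python
# Minimal inference sketch; assumptions are flagged in comments.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"   # base model named in this card
ADAPTER_ID = "rbelanec/train_wic_456_1760637806"  # this repository

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the `accelerate` package
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)
model.eval()

# Hypothetical WiC-style prompt; the actual training template is undocumented.
prompt = (
    "Does the word 'bank' have the same meaning in both sentences?\n"
    "Sentence 1: She sat on the bank of the river.\n"
    "Sentence 2: He opened an account at the bank.\n"
    "Answer (yes or no):"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```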

Training and evaluation data

More information needed
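
The card does not describe the data, but "wic" most plausibly refers to the Word-in-Context task from the SuperGLUE benchmark. A hedged loading sketch follows; the repository name is an assumption, and recent releases of the datasets library may require a parquet-converted mirror (e.g. aps/super_glue) rather than the original script-based dataset.

```python
# Hedged sketch: assumes "wic" is the SuperGLUE Word-in-Context task.
from datasets import load_dataset

wic = load_dataset("super_glue", "wic")
# Expected splits: train / validation / test, with fields such as
# word, sentence1, sentence2, and label (same sense = 1, different = 0).
print(wic)
```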

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto transformers' TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
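
The actual training script is not included in this card, so the following is only a sketch of how the listed hyperparameters would map onto a standard transformers Trainer configuration; the output directory name is an assumption.

```python
# Hedged reconstruction of the training configuration from the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_wic_456_1760637806",  # assumed output directory name
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```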

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.29          | 1.0   | 1222  | 0.2461          | 421520            |
| 0.2825        | 2.0   | 2444  | 0.2747          | 843032            |
| 0.1207        | 3.0   | 3666  | 0.2840          | 1265032           |
| 0.0992        | 4.0   | 4888  | 0.3801          | 1687192           |
| 0.0989        | 5.0   | 6110  | 0.3892          | 2108776           |
| 0.0005        | 6.0   | 7332  | 0.5116          | 2530232           |
| 0.0947        | 7.0   | 8554  | 0.7093          | 2952296           |
| 0.0           | 8.0   | 9776  | 0.7158          | 3374128           |
| 0.0           | 9.0   | 10998 | 0.6899          | 3795712           |
| 0.0           | 10.0  | 12220 | 0.9105          | 4217816           |
| 0.0051        | 11.0  | 13442 | 0.7330          | 4639632           |
| 0.0           | 12.0  | 14664 | 0.8850          | 5060952           |
| 0.0           | 13.0  | 15886 | 1.0206          | 5482656           |
| 0.0           | 14.0  | 17108 | 1.0846          | 5904024           |
| 0.0           | 15.0  | 18330 | 1.1273          | 6325800           |
| 0.0           | 16.0  | 19552 | 1.1668          | 6747856           |
| 0.0           | 17.0  | 20774 | 1.1959          | 7169800           |
| 0.0           | 18.0  | 21996 | 1.2175          | 7591280           |
| 0.0           | 19.0  | 23218 | 1.2244          | 8013240           |
| 0.0           | 20.0  | 24440 | 1.2365          | 8434688           |

Note that the reported evaluation loss of 0.2461 corresponds to the epoch-1 checkpoint: validation loss trends upward from epoch 2 onward while training loss collapses toward zero, a clear sign of overfitting, so the best checkpoint comes from early in the run.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4