train_wic_456_1760637805

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the WiC (Word-in-Context) dataset. It achieves the following results on the evaluation set (a minimal loading example follows the results):

  • Loss: 0.3739
  • Num Input Tokens Seen: 8434688
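The card does not yet include usage instructions, so here is a minimal loading sketch, assuming this is a standard PEFT adapter checkpoint on the listed base model. The repo ids come from this card; the dtype and device placement are assumptions, and the exact WiC prompt format used during fine-tuning is not documented here.

```python
# Minimal loading sketch, assuming a standard PEFT adapter checkpoint.
# The WiC prompt format used during fine-tuning is not documented on this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_wic_456_1760637805"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption; the training dtype is not stated
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```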

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
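The list above maps onto transformers' TrainingArguments roughly as follows. This is a hedged reconstruction, not the authors' script: only the listed values come from the card, and unlisted settings such as output_dir, logging, and checkpointing are placeholders.

```python
# Hedged reconstruction of the training configuration from the list above.
# Only the listed values are from the card; everything else is a placeholder.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_wic_456_1760637805",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```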

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.3101        | 1.0   | 1222  | 0.3184          | 421520            |
| 0.2245        | 2.0   | 2444  | 0.2564          | 843032            |
| 0.2296        | 3.0   | 3666  | 0.2356          | 1265032           |
| 0.1747        | 4.0   | 4888  | 0.2350          | 1687192           |
| 0.1759        | 5.0   | 6110  | 0.2247          | 2108776           |
| 0.1888        | 6.0   | 7332  | 0.2343          | 2530232           |
| 0.304         | 7.0   | 8554  | 0.2261          | 2952296           |
| 0.1299        | 8.0   | 9776  | 0.2323          | 3374128           |
| 0.1748        | 9.0   | 10998 | 0.2457          | 3795712           |
| 0.1812        | 10.0  | 12220 | 0.2425          | 4217816           |
| 0.221         | 11.0  | 13442 | 0.3058          | 4639632           |
| 0.0642        | 12.0  | 14664 | 0.3305          | 5060952           |
| 0.1347        | 13.0  | 15886 | 0.3712          | 5482656           |
| 0.1404        | 14.0  | 17108 | 0.4667          | 5904024           |
| 0.0056        | 15.0  | 18330 | 0.6260          | 6325800           |
| 0.0004        | 16.0  | 19552 | 0.7366          | 6747856           |
| 0.0005        | 17.0  | 20774 | 0.8914          | 7169800           |
| 0.0006        | 18.0  | 21996 | 0.9300          | 7591280           |
| 0.0004        | 19.0  | 23218 | 0.9629          | 8013240           |
| 0.0004        | 20.0  | 24440 | 0.9570          | 8434688           |
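Validation loss bottoms out at 0.2247 after epoch 5 and climbs steadily afterward while training loss collapses toward zero, a classic overfitting pattern; note also that the summary loss above (0.3739) does not match the epoch-20 row, so the published weights may come from an intermediate checkpoint. For anyone re-running this recipe, keeping the best checkpoint by validation loss is a cheap guard. A sketch, not part of the original setup, reusing the args object from the hyperparameters section; `model`, `train_ds`, and `eval_ds` are hypothetical stand-ins:

```python
# Sketch only (not from the card): keep the best checkpoint by validation loss.
# In practice these flags would be passed when constructing TrainingArguments.
from transformers import EarlyStoppingCallback, Trainer

args.eval_strategy = "epoch"
args.save_strategy = "epoch"
args.load_best_model_at_end = True
args.metric_for_best_model = "eval_loss"
args.greater_is_better = False

trainer = Trainer(
    model=model,             # the PEFT-wrapped model from the loading sketch
    args=args,
    train_dataset=train_ds,  # hypothetical WiC splits; preprocessing undocumented
    eval_dataset=eval_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```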

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4