train_wic_789_1760637918

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the wic dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3416
  • Num Input Tokens Seen: 8431032

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
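With a cosine scheduler and a warmup ratio of 0.1, the learning rate ramps linearly over the first 10% of training steps and then decays along a cosine curve to zero. A minimal pure-Python sketch of this schedule (step counts taken from the training results table below; the helper name `cosine_lr` is illustrative, and the formula mirrors the usual warmup-plus-cosine-decay recipe rather than any library's exact implementation):

```python
import math

BASE_LR = 0.03
TOTAL_STEPS = 24440                     # 20 epochs x 1222 steps/epoch
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # lr_scheduler_warmup_ratio: 0.1 -> 2444 steps

def cosine_lr(step: int) -> float:
    """Linear warmup to BASE_LR, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The rate therefore peaks at 0.03 exactly when warmup ends (step 2444) and reaches zero at the final step.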

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.3708        | 1.0   | 1222  | 0.3733          | 421768            |
| 0.3019        | 2.0   | 2444  | 0.3525          | 843296            |
| 0.3389        | 3.0   | 3666  | 0.3579          | 1265072           |
| 0.3524        | 4.0   | 4888  | 0.3498          | 1687136           |
| 0.3418        | 5.0   | 6110  | 0.3608          | 2108680           |
| 0.3994        | 6.0   | 7332  | 0.3656          | 2530168           |
| 0.3609        | 7.0   | 8554  | 0.3498          | 2951208           |
| 0.3591        | 8.0   | 9776  | 0.3432          | 3372504           |
| 0.3286        | 9.0   | 10998 | 0.3435          | 3793768           |
| 0.3725        | 10.0  | 12220 | 0.3433          | 4214928           |
| 0.3660        | 11.0  | 13442 | 0.3461          | 4636520           |
| 0.3456        | 12.0  | 14664 | 0.3423          | 5057560           |
| 0.3513        | 13.0  | 15886 | 0.3416          | 5479248           |
| 0.3413        | 14.0  | 17108 | 0.3492          | 5901056           |
| 0.3378        | 15.0  | 18330 | 0.3466          | 6323016           |
| 0.3189        | 16.0  | 19552 | 0.3435          | 6744792           |
| 0.3355        | 17.0  | 20774 | 0.3431          | 7165960           |
| 0.3558        | 18.0  | 21996 | 0.3427          | 7587872           |
| 0.3131        | 19.0  | 23218 | 0.3424          | 8009040           |
| 0.3187        | 20.0  | 24440 | 0.3444          | 8431032           |
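The reported evaluation loss of 0.3416 corresponds to the best epoch in the table rather than the final one. A small sketch that reads the best checkpoint off the validation losses above (values copied verbatim from the table):

```python
# Validation loss per epoch, copied from the training results table.
val_loss = {
    1: 0.3733, 2: 0.3525, 3: 0.3579, 4: 0.3498, 5: 0.3608,
    6: 0.3656, 7: 0.3498, 8: 0.3432, 9: 0.3435, 10: 0.3433,
    11: 0.3461, 12: 0.3423, 13: 0.3416, 14: 0.3492, 15: 0.3466,
    16: 0.3435, 17: 0.3431, 18: 0.3427, 19: 0.3424, 20: 0.3444,
}

# Pick the epoch with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # -> 13 0.3416
```

Training continued for seven more epochs after epoch 13 without improving on that loss.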

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
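Because this repository contains a PEFT adapter rather than a full model checkpoint, it must be loaded on top of the base model. A minimal sketch, assuming access to the gated meta-llama base weights and the `peft`/`transformers` versions listed above (the helper name `load_adapted_model` is illustrative):

```python
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_wic_789_1760637918"

def load_adapted_model():
    # Imports are kept local so the sketch can be defined without the
    # libraries installed; they are required to actually load weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    # Attach the fine-tuned adapter weights to the frozen base model.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    return tokenizer, model
```

Calling `load_adapted_model()` downloads the 8B base weights, so it needs a machine with sufficient memory and Hugging Face Hub access to the gated base model.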