train_record_42_1767887029

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3422
  • Num Input Tokens Seen: 437806496

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
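The learning-rate schedule implied by the hyperparameters above (cosine decay with a 10% linear warmup) can be sketched in plain Python. The step counts are taken from the training results table below (62,484 steps per epoch × 10 epochs); this is an illustrative reconstruction, not the Trainer's exact implementation.

```python
import math

PEAK_LR = 5e-5          # learning_rate
TOTAL_STEPS = 624_840   # 10 epochs x 62,484 steps/epoch (from the results table)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # lr_scheduler_warmup_ratio: 0.1

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup + cosine decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS  # linear warmup from 0 to peak
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay to 0

print(lr_at(WARMUP_STEPS))  # peak: 5e-05
```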

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.5063        | 0.5   | 31242  | 0.4476          | 21898848          |
| 0.1726        | 1.0   | 62484  | 0.3965          | 43776656          |
| 0.2595        | 1.5   | 93726  | 0.3683          | 65652560          |
| 0.4387        | 2.0   | 124968 | 0.3669          | 87565488          |
| 0.2084        | 2.5   | 156210 | 0.3661          | 109452304         |
| 0.1666        | 3.0   | 187452 | 0.3422          | 131339088         |
| 0.4472        | 3.5   | 218694 | 0.3466          | 153229552         |
| 0.4956        | 4.0   | 249936 | 0.3591          | 175123504         |
| 0.3682        | 4.5   | 281178 | 0.3471          | 197013152         |
| 0.2085        | 5.0   | 312420 | 0.3514          | 218911872         |
| 0.1899        | 5.5   | 343662 | 0.3630          | 240801840         |
| 0.1348        | 6.0   | 374904 | 0.3488          | 262683472         |
| 0.3945        | 6.5   | 406146 | 0.3498          | 284576304         |
| 0.1822        | 7.0   | 437388 | 0.3578          | 306462144         |
| 0.3186        | 7.5   | 468630 | 0.3653          | 328347904         |
| 0.2993        | 8.0   | 499872 | 0.3467          | 350243504         |
| 0.674         | 8.5   | 531114 | 0.3593          | 372129760         |
| 0.5828        | 9.0   | 562356 | 0.3585          | 394023312         |
| 0.5389        | 9.5   | 593598 | 0.3613          | 415924976         |
| 0.3445        | 10.0  | 624840 | 0.3607          | 437806496         |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
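Given the framework versions above, a minimal sketch of loading this PEFT adapter on top of its base model might look like the following. It assumes the adapter repo id `rbelanec/train_record_42_1767887029` (this card's name) and access to the gated `meta-llama` base weights; it downloads several GB and is not a tested recipe.

```python
# Hedged sketch: attach the PEFT adapter to the base model.
# Requires `pip install peft transformers` and Hugging Face access
# to meta-llama/Meta-Llama-3-8B-Instruct (gated).
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = PeftModel.from_pretrained(base, "rbelanec/train_record_42_1767887029")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```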