train_record_789_1769693143

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3057
  • Num Input Tokens Seen: 928969632

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
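The schedule above (cosine decay with a 0.1 warmup ratio) can be sketched in plain Python. This is an illustrative reimplementation, not the training script itself; the function name and the per-step formula are assumptions matching the standard linear-warmup-then-cosine shape.

```python
import math

def lr_at_step(step, total_steps, base_lr=5e-05, warmup_ratio=0.1):
    """Learning rate at a given optimizer step: linear warmup for the first
    warmup_ratio fraction of steps, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# 20 epochs x 31242 steps/epoch = 624840 total steps (from the results table)
total = 624840
print(lr_at_step(0, total))       # start of warmup: 0.0
print(lr_at_step(62484, total))   # end of warmup: peak LR 5e-05
```

With these numbers the warmup covers the first 62 484 steps (the first two epochs), after which the rate decays smoothly toward zero at step 624 840.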

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|---------------|-------|--------|-----------------|-------------------|
| 0.3777        | 1.0   | 31242  | 0.4123          | 46450496          |
| 0.2694        | 2.0   | 62484  | 0.3361          | 92891936          |
| 0.2715        | 3.0   | 93726  | 0.3147          | 139349440         |
| 0.2601        | 4.0   | 124968 | 0.3057          | 185796864         |
| 0.3844        | 5.0   | 156210 | 0.3073          | 232235744         |
| 0.1761        | 6.0   | 187452 | 0.3192          | 278704192         |
| 0.412         | 7.0   | 218694 | 0.3145          | 325156032         |
| 0.276         | 8.0   | 249936 | 0.3270          | 371599168         |
| 0.1668        | 9.0   | 281178 | 0.3507          | 418050784         |
| 0.1615        | 10.0  | 312420 | 0.3446          | 464504128         |
| 0.342         | 11.0  | 343662 | 0.3588          | 510961472         |
| 0.2353        | 12.0  | 374904 | 0.3750          | 557400608         |
| 0.1512        | 13.0  | 406146 | 0.3819          | 603828768         |
| 0.1639        | 14.0  | 437388 | 0.3826          | 650269472         |
| 0.1389        | 15.0  | 468630 | 0.3933          | 696703648         |
| 0.2677        | 16.0  | 499872 | 0.4045          | 743153504         |
| 0.1971        | 17.0  | 531114 | 0.4039          | 789592640         |
| 0.2313        | 18.0  | 562356 | 0.4147          | 836057504         |
| 0.2264        | 19.0  | 593598 | 0.4170          | 882513984         |
| 0.1414        | 20.0  | 624840 | 0.4171          | 928969632         |
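As a quick sanity check on the table (my own arithmetic, not from the card): the step counts imply 31242 optimizer steps per epoch, and dividing total input tokens by total steps gives the average tokens consumed per step with a batch size of 4.

```python
steps_total = 624840       # final step count from the table
tokens_total = 928969632   # final "Input Tokens Seen" from the table
epochs = 20

print(steps_total // epochs)           # steps per epoch: 31242
print(round(tokens_total / steps_total, 1))  # avg tokens per step (batch of 4)
```

At roughly 1487 tokens per step across a batch of 4, the average example is on the order of 370 input tokens.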

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1