train_record_42_1773159746

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4153
  • Num Input Tokens Seen: 245808128

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
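The cosine scheduler with a 0.1 warmup ratio listed above can be sketched in plain Python. This mirrors the usual linear-warmup-then-cosine-decay formula (as in transformers' `get_cosine_schedule_with_warmup`); the function name and the step counts in the example are illustrative, not taken from the training script.

```python
import math

def cosine_lr_with_warmup(step, total_steps, base_lr=5e-05, warmup_ratio=0.1):
    """Linear warmup for the first warmup_ratio of steps, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

At the end of warmup the learning rate peaks at 5e-05, then decays smoothly to 0 at the final step.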

Training results

Training Loss   Epoch    Step     Validation Loss   Input Tokens Seen
0.6215          0.2500    3906    0.6393             12292032
0.533           0.5001    7812    0.5500             24620672
0.5013          0.7501   11718    0.5069             36894016
0.3826          1.0002   15624    0.4868             49176512
0.5616          1.2502   19530    0.4701             61465280
0.4093          1.5003   23436    0.4595             73739776
0.4707          1.7503   27342    0.4470             86015936
0.4051          2.0004   31248    0.4402             98341056
0.4188          2.2504   35154    0.4361            110649216
0.3232          2.5005   39060    0.4313            122910592
0.3899          2.7505   42966    0.4261            135222656
0.3395          3.0006   46872    0.4219            147516736
0.2609          3.2506   50778    0.4271            159826368
0.493           3.5007   54684    0.4194            172084032
0.4139          3.7507   58590    0.4182            184402752
0.3616          4.0008   62496    0.4153            196687936
0.322           4.2508   66402    0.4205            209017024
0.3264          4.5009   70308    0.4171            221278272
0.3119          4.7509   74214    0.4178            233564288

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
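Since this checkpoint is a PEFT adapter on top of meta-llama/Llama-3.2-1B-Instruct, it would typically be loaded by attaching the adapter to the base model. The sketch below assumes the adapter repo id matches this card's title and that you have access to the gated base model; adjust as needed.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "rbelanec/train_record_42_1773159746"  # repo id assumed from this card

# Load the base model, then attach the PEFT adapter weights on top of it.
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)
```

`model` can then be used like any causal LM, or merged into the base weights with `model.merge_and_unload()` for adapter-free inference.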