train_record_42_1773159747

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4154
  • Num Input Tokens Seen: 245808128

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.6189 0.2500 3906 0.6394 12292032
0.5393 0.5001 7812 0.5500 24620672
0.5028 0.7501 11718 0.5072 36894016
0.3815 1.0002 15624 0.4872 49176512
0.5595 1.2502 19530 0.4700 61465280
0.4085 1.5003 23436 0.4594 73739776
0.4661 1.7503 27342 0.4472 86015936
0.4054 2.0004 31248 0.4402 98341056
0.4217 2.2504 35154 0.4363 110649216
0.3233 2.5005 39060 0.4314 122910592
0.3875 2.7505 42966 0.4261 135222656
0.336 3.0006 46872 0.4220 147516736
0.2617 3.2506 50778 0.4272 159826368
0.4998 3.5007 54684 0.4193 172084032
0.4191 3.7507 58590 0.4184 184402752
0.3614 4.0008 62496 0.4154 196687936
0.3211 4.2508 66402 0.4205 209017024
0.3259 4.5009 70308 0.4172 221278272
0.314 4.7509 74214 0.4179 233564288

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Downloads last month
195
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for rbelanec/train_record_42_1773159747

Adapter
(578)
this model