train_record_42_1779207275

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3958
  • Num Input Tokens Seen: 245808128

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
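
The cosine schedule with a 0.1 warmup ratio listed above can be sketched in plain Python. This is a minimal illustration of linear warmup followed by half-cosine decay, not the exact scheduler code used for this run. The total step count (78,105) is an assumption, estimated as roughly 15,621 steps per epoch times 5 epochs from the step and epoch values logged in the training results; the function name `cosine_lr` is hypothetical.

```python
import math

BASE_LR = 2e-06       # learning_rate from the card
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 78_105  # ASSUMPTION: ~15,621 steps per epoch x 5 epochs


def cosine_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR,
              warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the rate at a few steps, including the step with the lowest
# validation loss in the training results (31248).
for step in (0, 3906, 7811, 31248, TOTAL_STEPS):
    print(f"step {step:>6}: lr = {cosine_lr(step):.3e}")
```

Under these assumptions the rate peaks at the configured 2e-06 once warmup ends (about step 7,811) and decays toward zero by the final step.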

Training results

Training Loss  Epoch   Step   Validation Loss  Input Tokens Seen
0.5223         0.2500  3906   0.4993           12292032
0.4299         0.5001  7812   0.4629           24620672
0.5401         0.7501  11718  0.4265           36894016
0.3044         1.0002  15624  0.4150           49176512
0.3211         1.2502  19530  0.4428           61465280
0.2995         1.5003  23436  0.4194           73739776
0.3781         1.7503  27342  0.4182           86015936
0.2566         2.0004  31248  0.3958           98341056
0.2249         2.2504  35154  0.4533           110649216
0.1628         2.5005  39060  0.4516           122910592
0.2075         2.7505  42966  0.4763           135222656
0.1128         3.0006  46872  0.4709           147516736
0.0907         3.2506  50778  0.5112           159826368
0.1686         3.5007  54684  0.5226           172084032
0.1204         3.7507  58590  0.4965           184402752
0.1592         4.0008  62496  0.5179           196687936
0.0903         4.2508  66402  0.6059           209017024
0.1712         4.5009  70308  0.5779           221278272
0.1391         4.7509  74214  0.5740           233564288
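
The logged values suggest that validation loss bottoms out near the end of epoch 2 while training loss keeps falling afterwards, which is the usual sign of overfitting. A minimal sketch for locating the best evaluation point is shown here; the rows are transcribed from the table above, and the names `ROWS`, `best` and `last` are illustrative only.

```python
# (train_loss, epoch, step, val_loss, input_tokens_seen), from the results table
ROWS = [
    (0.5223, 0.2500, 3906, 0.4993, 12292032),
    (0.4299, 0.5001, 7812, 0.4629, 24620672),
    (0.5401, 0.7501, 11718, 0.4265, 36894016),
    (0.3044, 1.0002, 15624, 0.4150, 49176512),
    (0.3211, 1.2502, 19530, 0.4428, 61465280),
    (0.2995, 1.5003, 23436, 0.4194, 73739776),
    (0.3781, 1.7503, 27342, 0.4182, 86015936),
    (0.2566, 2.0004, 31248, 0.3958, 98341056),
    (0.2249, 2.2504, 35154, 0.4533, 110649216),
    (0.1628, 2.5005, 39060, 0.4516, 122910592),
    (0.2075, 2.7505, 42966, 0.4763, 135222656),
    (0.1128, 3.0006, 46872, 0.4709, 147516736),
    (0.0907, 3.2506, 50778, 0.5112, 159826368),
    (0.1686, 3.5007, 54684, 0.5226, 172084032),
    (0.1204, 3.7507, 58590, 0.4965, 184402752),
    (0.1592, 4.0008, 62496, 0.5179, 196687936),
    (0.0903, 4.2508, 66402, 0.6059, 209017024),
    (0.1712, 4.5009, 70308, 0.5779, 221278272),
    (0.1391, 4.7509, 74214, 0.5740, 233564288),
]

# Row with the lowest validation loss, and the final logged row for comparison.
best = min(ROWS, key=lambda row: row[3])
last = ROWS[-1]

print(f"best: step {best[2]} (epoch {best[1]}), validation loss {best[3]}")
print(f"last: step {last[2]} (epoch {last[1]}), validation loss {last[3]}")
print(f"validation loss increase after best point: {last[3] - best[3]:.4f}")
```

The minimum found this way matches the evaluation loss of 0.3958 reported at the top of the card.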

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4

Safetensors

  • Model size: 1B params
  • Tensor type: F32