train_record_42_1779354541

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3557
  • Num Input Tokens Seen: 49166912

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.7605 0.0501 782 0.6366 2474432
0.6538 0.1001 1564 0.5419 4931328
0.5239 0.1502 2346 0.5114 7397056
0.5889 0.2002 3128 0.4917 9832064
0.4277 0.2503 3910 0.4677 12304064
0.3708 0.3004 4692 0.4652 14775488
0.5873 0.3504 5474 0.4432 17259840
0.3556 0.4005 6256 0.4279 19707456
0.3775 0.4505 7038 0.4363 22178432
0.3997 0.5006 7820 0.4178 24646208
0.3435 0.5507 8602 0.4014 27101056
0.4129 0.6007 9384 0.3946 29544576
0.324 0.6508 10166 0.3816 32010176
0.4286 0.7009 10948 0.3744 34475136
0.3097 0.7509 11730 0.3673 36931648
0.3395 0.8010 12512 0.3655 39382144
0.2868 0.8510 13294 0.3591 41847872
0.3511 0.9011 14076 0.3564 44318848
0.2686 0.9512 14858 0.3557 46767552

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Downloads last month
-
Safetensors
Model size
1B params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for rbelanec/train_record_42_1779354541

Finetuned
(1745)
this model