train_mrpc_42_1774791060

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the mrpc dataset (the Microsoft Research Paraphrase Corpus from GLUE). It achieves the following results on the evaluation set:

  • Loss: 0.1332
  • Num Input Tokens Seen: 1780000
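
Since the framework versions below include PEFT, the checkpoint is an adapter rather than a full model. The following is a minimal, hedged sketch of loading it with the peft and transformers libraries; the adapter repo id rbelanec/train_mrpc_42_1774791060 is taken from this card, but the use of PeftModel is an assumption, not a documented usage recipe.

```python
# Minimal sketch: load the base model and apply this adapter with PEFT.
# Assumes the adapter is published as rbelanec/train_mrpc_42_1774791060
# (the repo id shown on this card) and loads via PeftModel.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)

model = PeftModel.from_pretrained(base_model, "rbelanec/train_mrpc_42_1774791060")
model.eval()
```

The prompt format used during fine-tuning is not documented on this card, so inputs should be serialized to match however the mrpc sentence pairs were formatted during training.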

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
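
As a rough illustration, these settings map onto transformers TrainingArguments as sketched below. This is a hedged reconstruction from the list above, not the actual training script; the output_dir value and any arguments not listed are assumptions.

```python
# Hedged sketch: TrainingArguments approximating the hyperparameters above.
# output_dir is assumed; dataset loading and the Trainer setup are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_mrpc_42_1774791060",  # assumed name
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5,
)
```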

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| 0.2006        | 0.2518 | 104  | 0.2291          | 89600             |
| 0.1908        | 0.5036 | 208  | 0.1950          | 178688            |
| 0.1721        | 0.7554 | 312  | 0.1814          | 267968            |
| 0.1644        | 1.0073 | 416  | 0.1772          | 357488            |
| 0.1866        | 1.2591 | 520  | 0.1557          | 446896            |
| 0.1538        | 1.5109 | 624  | 0.1509          | 536176            |
| 0.1846        | 1.7627 | 728  | 0.1462          | 626992            |
| 0.1810        | 2.0145 | 832  | 0.1442          | 716344            |
| 0.0969        | 2.2663 | 936  | 0.1435          | 806712            |
| 0.1447        | 2.5182 | 1040 | 0.1429          | 895736            |
| 0.0919        | 2.7700 | 1144 | 0.1439          | 985592            |
| 0.1187        | 3.0218 | 1248 | 0.1343          | 1074624           |
| 0.1757        | 3.2736 | 1352 | 0.1521          | 1164544           |
| 0.1714        | 3.5254 | 1456 | 0.1346          | 1253248           |
| 0.1512        | 3.7772 | 1560 | 0.1374          | 1344000           |
| 0.0868        | 4.0291 | 1664 | 0.1423          | 1432880           |
| 0.1456        | 4.2809 | 1768 | 0.1339          | 1522544           |
| 0.1223        | 4.5327 | 1872 | 0.1340          | 1611760           |
| 0.1404        | 4.7845 | 1976 | 0.1332          | 1702832           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4