---
library_name: peft
license: llama3.2
base_model: meta-llama/Llama-3.2-1B-Instruct
tags:
  - base_model:adapter:meta-llama/Llama-3.2-1B-Instruct
  - llama-factory
  - transformers
pipeline_tag: text-generation
model-index:
  - name: test
    results: []
---

# test

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) on the wsc (Winograd Schema Challenge) dataset. It achieves the following results on the evaluation set:

- Loss: 0.4947
- Num Input Tokens Seen: 43904
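
Since the card does not yet include usage instructions, below is a minimal, hedged sketch of how a PEFT adapter like this one is typically loaded for inference. The adapter repo id `rbelanec/test` and the WSC-style prompt are illustrative assumptions; the card does not specify the prompt format used during training.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "rbelanec/test"  # assumed repo id for this adapter; may differ

# Load the base model, then attach the PEFT adapter weights on top.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Illustrative WSC-style prompt; the actual training prompt template is not
# documented in this card.
prompt = "The trophy doesn't fit in the suitcase because it is too big. What is too big?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```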

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 123
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
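
For readers who want to reproduce a comparable run outside LLaMA-Factory, here is a rough sketch of how these values map onto `transformers.TrainingArguments`. The `output_dir` is a placeholder, and LLaMA-Factory-specific pieces (PEFT config, dataset preprocessing) are not reproduced here.

```python
from transformers import TrainingArguments

# Approximate mapping of the reported hyperparameters.
args = TrainingArguments(
    output_dir="test",              # placeholder output directory
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
)
```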

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| 0.9316        | 0.0522 | 13   | 0.9549          | 2288              |
| 1.1199        | 0.1044 | 26   | 0.8822          | 4656              |
| 0.8317        | 0.1566 | 39   | 0.8176          | 6944              |
| 0.7882        | 0.2088 | 52   | 0.7668          | 9232              |
| 0.7909        | 0.2610 | 65   | 0.6973          | 11424             |
| 0.7007        | 0.3133 | 78   | 0.6643          | 13760             |
| 0.7416        | 0.3655 | 91   | 0.6244          | 16048             |
| 0.8212        | 0.4177 | 104  | 0.5990          | 18272             |
| 0.4927        | 0.4699 | 117  | 0.5652          | 20656             |
| 0.5708        | 0.5221 | 130  | 0.5375          | 23056             |
| 0.4855        | 0.5743 | 143  | 0.5332          | 25312             |
| 0.5239        | 0.6265 | 156  | 0.5173          | 27552             |
| 0.4772        | 0.6787 | 169  | 0.5134          | 29984             |
| 0.4958        | 0.7309 | 182  | 0.5051          | 32080             |
| 0.6547        | 0.7831 | 195  | 0.5062          | 34176             |
| 0.6246        | 0.8353 | 208  | 0.5012          | 36512             |
| 0.5174        | 0.8876 | 221  | 0.4947          | 38912             |
| 0.5318        | 0.9398 | 234  | 0.4977          | 41120             |
| 0.445         | 0.9920 | 247  | 0.5010          | 43600             |

### Framework versions

- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.1+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4