---
library_name: peft
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
  - llama-factory
  - lora
  - generated_from_trainer
model-index:
  - name: Llama-3.1-8B-Instruct-PsyCourse-fold7
    results: []
---

# Llama-3.1-8B-Instruct-PsyCourse-fold7

This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the course-train-fold7 dataset. It achieves the following results on the evaluation set:

- Loss: 0.0333

## Model description

This is a LoRA adapter for meta-llama/Llama-3.1-8B-Instruct, trained with LLaMA-Factory; see the training details below. More information needed.

## Intended uses & limitations

More information needed
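
A minimal loading sketch with `peft` and `transformers` follows. Two assumptions are baked in: the adapter repo id (`chchen/Llama-3.1-8B-Instruct-PsyCourse-fold7`) is inferred from this card rather than confirmed by it, and the prompt is purely illustrative.

```python
# Minimal loading sketch. Assumptions (not confirmed by the card): the adapter
# is published as "chchen/Llama-3.1-8B-Instruct-PsyCourse-fold7", and the
# example prompt is illustrative only.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.1-8B-Instruct"
adapter_id = "chchen/Llama-3.1-8B-Instruct-PsyCourse-fold7"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights

messages = [{"role": "user", "content": "Briefly explain operant conditioning."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```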

## Training and evaluation data

The model was fine-tuned on the course-train-fold7 dataset (fold 7 of the PsyCourse cross-validation splits, judging by the model name); more information needed.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
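
For reference, the list above maps onto Hugging Face `TrainingArguments` roughly as sketched below. Training was actually launched through LLaMA-Factory, so this is an approximate reconstruction rather than the original launch config; `output_dir` is a placeholder.

```python
# Approximate reconstruction of the hyperparameters above as TrainingArguments.
# Assumption: LLaMA-Factory builds arguments like these internally; output_dir
# is a placeholder, not the actual path used.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Llama-3.1-8B-Instruct-PsyCourse-fold7",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective train batch size: 1 x 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```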

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.9681 | 0.0764 | 50 | 0.6998 |
| 0.2016 | 0.1528 | 100 | 0.1420 |
| 0.0868 | 0.2292 | 150 | 0.0756 |
| 0.0662 | 0.3056 | 200 | 0.0589 |
| 0.0623 | 0.3820 | 250 | 0.0552 |
| 0.0458 | 0.4584 | 300 | 0.0502 |
| 0.0681 | 0.5348 | 350 | 0.0517 |
| 0.0451 | 0.6112 | 400 | 0.0472 |
| 0.0596 | 0.6875 | 450 | 0.0469 |
| 0.0478 | 0.7639 | 500 | 0.0419 |
| 0.0329 | 0.8403 | 550 | 0.0406 |
| 0.0545 | 0.9167 | 600 | 0.0410 |
| 0.0586 | 0.9931 | 650 | 0.0452 |
| 0.0407 | 1.0695 | 700 | 0.0391 |
| 0.029 | 1.1459 | 750 | 0.0369 |
| 0.0345 | 1.2223 | 800 | 0.0397 |
| 0.0399 | 1.2987 | 850 | 0.0395 |
| 0.0419 | 1.3751 | 900 | 0.0393 |
| 0.0482 | 1.4515 | 950 | 0.0405 |
| 0.0329 | 1.5279 | 1000 | 0.0361 |
| 0.0306 | 1.6043 | 1050 | 0.0381 |
| 0.0308 | 1.6807 | 1100 | 0.0385 |
| 0.0612 | 1.7571 | 1150 | 0.0365 |
| 0.0369 | 1.8335 | 1200 | 0.0347 |
| 0.0394 | 1.9099 | 1250 | 0.0394 |
| 0.0325 | 1.9862 | 1300 | 0.0373 |
| 0.0267 | 2.0626 | 1350 | 0.0364 |
| 0.0236 | 2.1390 | 1400 | 0.0353 |
| 0.0178 | 2.2154 | 1450 | 0.0401 |
| 0.0261 | 2.2918 | 1500 | 0.0350 |
| 0.024 | 2.3682 | 1550 | 0.0350 |
| 0.0215 | 2.4446 | 1600 | 0.0339 |
| 0.0316 | 2.5210 | 1650 | 0.0384 |
| 0.0264 | 2.5974 | 1700 | 0.0362 |
| 0.027 | 2.6738 | 1750 | 0.0379 |
| 0.0366 | 2.7502 | 1800 | 0.0333 |
| 0.0303 | 2.8266 | 1850 | 0.0336 |
| 0.0264 | 2.9030 | 1900 | 0.0353 |
| 0.0213 | 2.9794 | 1950 | 0.0371 |
| 0.0188 | 3.0558 | 2000 | 0.0368 |
| 0.0199 | 3.1322 | 2050 | 0.0359 |
| 0.0149 | 3.2086 | 2100 | 0.0406 |
| 0.0179 | 3.2850 | 2150 | 0.0434 |
| 0.0209 | 3.3613 | 2200 | 0.0373 |
| 0.0199 | 3.4377 | 2250 | 0.0443 |
| 0.0169 | 3.5141 | 2300 | 0.0365 |
| 0.024 | 3.5905 | 2350 | 0.0377 |
| 0.0182 | 3.6669 | 2400 | 0.0404 |
| 0.0138 | 3.7433 | 2450 | 0.0410 |
| 0.0197 | 3.8197 | 2500 | 0.0382 |
| 0.0201 | 3.8961 | 2550 | 0.0362 |
| 0.0129 | 3.9725 | 2600 | 0.0420 |
| 0.0084 | 4.0489 | 2650 | 0.0419 |
| 0.01 | 4.1253 | 2700 | 0.0444 |
| 0.009 | 4.2017 | 2750 | 0.0554 |
| 0.0075 | 4.2781 | 2800 | 0.0449 |
| 0.0123 | 4.3545 | 2850 | 0.0445 |
| 0.0086 | 4.4309 | 2900 | 0.0446 |
| 0.0128 | 4.5073 | 2950 | 0.0410 |
| 0.011 | 4.5837 | 3000 | 0.0446 |
| 0.0079 | 4.6600 | 3050 | 0.0467 |
| 0.0063 | 4.7364 | 3100 | 0.0447 |
| 0.0081 | 4.8128 | 3150 | 0.0446 |
| 0.0092 | 4.8892 | 3200 | 0.0423 |
| 0.0105 | 4.9656 | 3250 | 0.0434 |
| 0.0049 | 5.0420 | 3300 | 0.0503 |
| 0.006 | 5.1184 | 3350 | 0.0521 |
| 0.0037 | 5.1948 | 3400 | 0.0545 |
| 0.0032 | 5.2712 | 3450 | 0.0743 |
| 0.0047 | 5.3476 | 3500 | 0.0558 |
| 0.0037 | 5.4240 | 3550 | 0.0517 |
| 0.0054 | 5.5004 | 3600 | 0.0526 |
| 0.0053 | 5.5768 | 3650 | 0.0507 |
| 0.0094 | 5.6532 | 3700 | 0.0504 |
| 0.0067 | 5.7296 | 3750 | 0.0492 |
| 0.0038 | 5.8060 | 3800 | 0.0524 |
| 0.0091 | 5.8824 | 3850 | 0.0443 |
| 0.006 | 5.9587 | 3900 | 0.0490 |
| 0.0042 | 6.0351 | 3950 | 0.0518 |
| 0.0023 | 6.1115 | 4000 | 0.0607 |
| 0.0012 | 6.1879 | 4050 | 0.0625 |
| 0.0046 | 6.2643 | 4100 | 0.0562 |
| 0.0017 | 6.3407 | 4150 | 0.0639 |
| 0.0029 | 6.4171 | 4200 | 0.0585 |
| 0.0023 | 6.4935 | 4250 | 0.0586 |
| 0.0021 | 6.5699 | 4300 | 0.0601 |
| 0.0004 | 6.6463 | 4350 | 0.0675 |
| 0.0021 | 6.7227 | 4400 | 0.0667 |
| 0.0024 | 6.7991 | 4450 | 0.0701 |
| 0.0022 | 6.8755 | 4500 | 0.0674 |
| 0.0033 | 6.9519 | 4550 | 0.0609 |
| 0.0009 | 7.0283 | 4600 | 0.0551 |
| 0.0011 | 7.1047 | 4650 | 0.0607 |
| 0.0014 | 7.1811 | 4700 | 0.0657 |
| 0.0003 | 7.2574 | 4750 | 0.0645 |
| 0.0013 | 7.3338 | 4800 | 0.0692 |
| 0.0004 | 7.4102 | 4850 | 0.0737 |
| 0.0004 | 7.4866 | 4900 | 0.0669 |
| 0.0028 | 7.5630 | 4950 | 0.0651 |
| 0.0008 | 7.6394 | 5000 | 0.0633 |
| 0.0014 | 7.7158 | 5050 | 0.0643 |
| 0.0012 | 7.7922 | 5100 | 0.0659 |
| 0.0006 | 7.8686 | 5150 | 0.0663 |
| 0.0005 | 7.9450 | 5200 | 0.0700 |
| 0.0007 | 8.0214 | 5250 | 0.0659 |
| 0.0009 | 8.0978 | 5300 | 0.0691 |
| 0.0002 | 8.1742 | 5350 | 0.0709 |
| 0.0004 | 8.2506 | 5400 | 0.0735 |
| 0.0006 | 8.3270 | 5450 | 0.0750 |
| 0.0007 | 8.4034 | 5500 | 0.0772 |
| 0.0001 | 8.4798 | 5550 | 0.0785 |
| 0.0007 | 8.5561 | 5600 | 0.0807 |
| 0.0013 | 8.6325 | 5650 | 0.0787 |
| 0.0005 | 8.7089 | 5700 | 0.0770 |
| 0.0012 | 8.7853 | 5750 | 0.0768 |
| 0.0004 | 8.8617 | 5800 | 0.0756 |
| 0.0032 | 8.9381 | 5850 | 0.0763 |
| 0.0008 | 9.0145 | 5900 | 0.0764 |
| 0.0002 | 9.0909 | 5950 | 0.0777 |
| 0.0002 | 9.1673 | 6000 | 0.0781 |
| 0.0003 | 9.2437 | 6050 | 0.0786 |
| 0.0007 | 9.3201 | 6100 | 0.0790 |
| 0.0002 | 9.3965 | 6150 | 0.0798 |
| 0.0003 | 9.4729 | 6200 | 0.0796 |
| 0.0008 | 9.5493 | 6250 | 0.0798 |
| 0.0002 | 9.6257 | 6300 | 0.0800 |
| 0.0012 | 9.7021 | 6350 | 0.0801 |
| 0.0008 | 9.7785 | 6400 | 0.0801 |
| 0.0001 | 9.8549 | 6450 | 0.0801 |
| 0.0005 | 9.9312 | 6500 | 0.0802 |
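
Note that the validation loss reaches its minimum of 0.0333 around epoch 2.75 (step 1800) and drifts upward over the remaining epochs, so the evaluation loss reported above presumably reflects the best checkpoint rather than the final one.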

### Framework versions

- PEFT 0.12.0
- Transformers 4.46.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3