train_codealpacapy_42_1760664611

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6779
  • Num Input Tokens Seen: 24887720

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.4991        | 1.0   | 1908  | 0.4799          | 1243048           |
| 0.3548        | 2.0   | 3816  | 0.4697          | 2489456           |
| 0.4295        | 3.0   | 5724  | 0.4684          | 3733736           |
| 0.515         | 4.0   | 7632  | 0.4609          | 4976128           |
| 0.5856        | 5.0   | 9540  | 0.4563          | 6219592           |
| 0.5946        | 6.0   | 11448 | 0.4549          | 7467968           |
| 0.4937        | 7.0   | 13356 | 0.4562          | 8709168           |
| 0.4793        | 8.0   | 15264 | 0.4539          | 9958360           |
| 0.4566        | 9.0   | 17172 | 0.4536          | 11204000          |
| 0.4859        | 10.0  | 19080 | 0.4540          | 12446408          |
| 0.3744        | 11.0  | 20988 | 0.4555          | 13691904          |
| 0.4353        | 12.0  | 22896 | 0.4597          | 14937216          |
| 0.5904        | 13.0  | 24804 | 0.4587          | 16179624          |
| 0.3804        | 14.0  | 26712 | 0.4592          | 17425368          |
| 0.3276        | 15.0  | 28620 | 0.4626          | 18668536          |
| 0.458         | 16.0  | 30528 | 0.4635          | 19916008          |
| 0.2553        | 17.0  | 32436 | 0.4646          | 21158120          |
| 0.4614        | 18.0  | 34344 | 0.4666          | 22400368          |
| 0.402         | 19.0  | 36252 | 0.4667          | 23645440          |
| 0.301         | 20.0  | 38160 | 0.4669          | 24887720          |
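
Validation loss bottoms out mid-run and then drifts back up, a typical overfitting signature. Selecting the best epoch from the table can be sketched as follows (the loss values are transcribed directly from the table above):

```python
# Validation loss per epoch, transcribed from the training-results table.
val_loss = {
    1: 0.4799, 2: 0.4697, 3: 0.4684, 4: 0.4609, 5: 0.4563,
    6: 0.4549, 7: 0.4562, 8: 0.4539, 9: 0.4536, 10: 0.4540,
    11: 0.4555, 12: 0.4597, 13: 0.4587, 14: 0.4592, 15: 0.4626,
    16: 0.4635, 17: 0.4646, 18: 0.4666, 19: 0.4667, 20: 0.4669,
}

# Epoch with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # -> 9 0.4536
```

By this measure, the epoch-9 checkpoint (step 17172) is the strongest; training past that point only increases validation loss.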

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1