train_conala_456_1760637780

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0248
  • Num Input Tokens Seen: 3043720
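Given the PEFT version listed under framework versions below, this checkpoint is a parameter-efficient adapter on top of the base model rather than a full set of weights. A minimal loading sketch, assuming the adapter repo id rbelanec/train_conala_456_1760637780 referenced on this page (access to the gated Meta-Llama-3 base weights is required):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_456_1760637780"  # assumed repo id from this card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter

# CoNaLa pairs natural-language intents with Python snippets, so a
# short code-generation prompt is a reasonable smoke test.
messages = [{"role": "user", "content": "How do I reverse a list in Python?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```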

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
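While the card does not document the exact data preparation, the conala dataset named above is the CoNaLa code-generation corpus. A hedged loading sketch, assuming the neulab/conala Hub id and its curated configuration (the copy and split actually used for this run are not specified here):

```python
from datasets import load_dataset

# Assumed Hub id and configuration; this card does not state which
# copy of CoNaLa or which split was used for training and evaluation.
conala = load_dataset("neulab/conala", "curated")
print(conala["train"][0])  # each example pairs an intent with a code snippet
```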

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
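As a rough illustration, these settings map onto transformers TrainingArguments as follows. This is a hedged reconstruction from the list above, not the training script actually used for this run:

```python
from transformers import TrainingArguments

# Hedged reconstruction of the hyperparameters listed above; the
# original training script is not included in this card.
args = TrainingArguments(
    output_dir="train_conala_456_1760637780",
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```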

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:--------------|:------|:------|:----------------|:------------------|
| 0.7785        | 1.0   | 536   | 0.6903          | 152088            |
| 0.5883        | 2.0   | 1072  | 0.6959          | 303784            |
| 0.5464        | 3.0   | 1608  | 0.6616          | 456552            |
| 0.568         | 4.0   | 2144  | 0.6424          | 608704            |
| 0.5957        | 5.0   | 2680  | 0.6443          | 761184            |
| 0.723         | 6.0   | 3216  | 0.6424          | 912912            |
| 0.5193        | 7.0   | 3752  | 0.6437          | 1065128           |
| 0.5701        | 8.0   | 4288  | 0.6507          | 1216496           |
| 0.6251        | 9.0   | 4824  | 0.6445          | 1368880           |
| 0.3634        | 10.0  | 5360  | 0.6633          | 1522016           |
| 0.481         | 11.0  | 5896  | 0.6657          | 1674136           |
| 0.6605        | 12.0  | 6432  | 0.6991          | 1826160           |
| 0.4597        | 13.0  | 6968  | 0.7306          | 1978984           |
| 0.3195        | 14.0  | 7504  | 0.7241          | 2130656           |
| 0.1868        | 15.0  | 8040  | 0.7772          | 2282720           |
| 0.4085        | 16.0  | 8576  | 0.7817          | 2434896           |
| 0.4356        | 17.0  | 9112  | 0.8402          | 2586968           |
| 0.225         | 18.0  | 9648  | 0.8539          | 2738448           |
| 0.1774        | 19.0  | 10184 | 0.8778          | 2891056           |
| 0.1177        | 20.0  | 10720 | 0.8808          | 3043720           |
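Validation loss bottoms out at 0.6424 around epochs 4–6 and rises steadily afterward while training loss keeps falling, a typical overfitting pattern. If rerunning this recipe, early stopping on eval loss would capture the best checkpoint; a hedged sketch, not part of the original run:

```python
from transformers import EarlyStoppingCallback, TrainingArguments

# Hedged sketch: evaluate and save every epoch, keep the checkpoint
# with the lowest eval loss, and stop after 3 epochs without improvement.
args = TrainingArguments(
    output_dir="train_conala_456_1760637780",
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    num_train_epochs=20,
)
stopper = EarlyStoppingCallback(early_stopping_patience=3)
# Pass as Trainer(..., args=args, callbacks=[stopper]) alongside the model and datasets.
```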

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4