train_cola_101112_1760638048

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the CoLA (Corpus of Linguistic Acceptability, `cola`) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1406
  • Num Input Tokens Seen: 7325256
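
Since this repository hosts a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct, it can be loaded with the `peft` library. The snippet below is a minimal sketch, assuming a causal-LM adapter and the generic Llama 3 chat template; the prompt format actually used during fine-tuning is not documented in this card, so the example question is illustrative only.

```python
# Minimal loading sketch (assumptions: causal-LM PEFT adapter, access to the
# gated base model meta-llama/Meta-Llama-3-8B-Instruct).
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_cola_101112_1760638048",  # this adapter repo
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Generic chat-template call; not the card's verified prompt format.
messages = [{"role": "user",
             "content": "Is this sentence grammatically acceptable? "
                        "'The book was by John written.'"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=16)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```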

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
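
As referenced above, these values map roughly onto a `transformers` `TrainingArguments` configuration. The sketch below is a reconstruction for illustration, not the actual training script: `output_dir` and the per-device batch-size parameter names are assumptions, and any PEFT-specific settings (adapter type, rank, target modules) are not given in the card.

```python
# Hedged reconstruction of the listed hyperparameters; not the card's script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_cola_101112_1760638048",  # assumed
    learning_rate=5e-05,
    per_device_train_batch_size=4,  # "train_batch_size" above
    per_device_eval_batch_size=4,   # "eval_batch_size" above
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```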

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.362         | 1.0   | 1924  | 0.1874          | 366136            |
| 0.1343        | 2.0   | 3848  | 0.1626          | 732880            |
| 0.0912        | 3.0   | 5772  | 0.1570          | 1099816           |
| 0.0998        | 4.0   | 7696  | 0.1502          | 1465464           |
| 0.152         | 5.0   | 9620  | 0.1472          | 1831728           |
| 0.1648        | 6.0   | 11544 | 0.1446          | 2198176           |
| 0.1692        | 7.0   | 13468 | 0.1489          | 2564208           |
| 0.1091        | 8.0   | 15392 | 0.1440          | 2930240           |
| 0.1358        | 9.0   | 17316 | 0.1425          | 3297136           |
| 0.0906        | 10.0  | 19240 | 0.1429          | 3663392           |
| 0.1369        | 11.0  | 21164 | 0.1406          | 4028760           |
| 0.1161        | 12.0  | 23088 | 0.1461          | 4394320           |
| 0.1274        | 13.0  | 25012 | 0.1438          | 4761000           |
| 0.1108        | 14.0  | 26936 | 0.1421          | 5127440           |
| 0.1319        | 15.0  | 28860 | 0.1421          | 5494368           |
| 0.1346        | 16.0  | 30784 | 0.1416          | 5860888           |
| 0.1997        | 17.0  | 32708 | 0.1416          | 6226952           |
| 0.0858        | 18.0  | 34632 | 0.1416          | 6593400           |
| 0.082         | 19.0  | 36556 | 0.1418          | 6959600           |
| 0.1285        | 20.0  | 38480 | 0.1421          | 7325256           |

The lowest validation loss (0.1406) was reached at epoch 11 and matches the evaluation loss reported at the top of this card, consistent with the best checkpoint being retained; the reported input-token count (7325256) corresponds to the end of the full 20-epoch run.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
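
To reproduce results it may help to confirm that the local environment matches the versions above. A small check, assuming the package names map directly to the listed versions (the PyTorch version string includes the CUDA build tag):

```python
# Print installed versions next to the versions listed in this card.
import datasets, peft, tokenizers, torch, transformers

expected = {peft: "0.17.1", transformers: "4.51.3", torch: "2.9.0+cu128",
            datasets: "4.0.0", tokenizers: "0.21.4"}
for module, version in expected.items():
    print(f"{module.__name__}: installed {module.__version__}, card lists {version}")
```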