metadata
library_name: peft
license: apache-2.0
base_model: allenai/led-base-16384
tags:
  - generated_from_trainer
metrics:
  - rouge
  - bleu
  - precision
  - recall
  - f1
model-index:
  - name: Lora_LED_sum_challenge
    results: []

Lora_LED_sum_challenge

This model is a PEFT adapter fine-tuned from allenai/led-base-16384; the training dataset is not documented. After the final epoch (epoch 10, step 70), it achieves the following results on the evaluation set:

  • Loss: 4.1010
  • Rouge1: 0.3015
  • Rouge2: 0.0982
  • Rougel: 0.2325
  • Rougelsum: 0.234
  • Gen Len: 27.86
  • Bleu: 0.0493
  • Precisions: 0.1077
  • Brevity Penalty: 0.8669
  • Length Ratio: 0.875
  • Translation Length: 1057.0
  • Reference Length: 1208.0
  • BERTScore Precision: 0.8803
  • BERTScore Recall: 0.8763
  • BERTScore F1: 0.8783
  • BERTScore Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
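The BLEU length statistics above are related by the standard BLEU definitions: the length ratio is the translation length divided by the reference length, and the brevity penalty is exp(1 - reference/translation) whenever the output is shorter than the reference. The sketch below recomputes both from the reported lengths as a consistency check. The function name `brevity_penalty` is illustrative and is not taken from the training code.

```python
import math


def brevity_penalty(translation_length: float, reference_length: float) -> float:
    """Standard BLEU brevity penalty (illustrative helper, not from the training code)."""
    if translation_length <= 0:
        return 0.0
    if translation_length >= reference_length:
        # Outputs at least as long as the reference are not penalized.
        return 1.0
    return math.exp(1.0 - reference_length / translation_length)


# Lengths reported for the final evaluation.
translation_length = 1057.0
reference_length = 1208.0

length_ratio = translation_length / reference_length
bp = brevity_penalty(translation_length, reference_length)

print(f"length ratio:    {length_ratio:.4f}")
print(f"brevity penalty: {bp:.4f}")
```

Rounded to four decimals, both values should agree with the Length Ratio and Brevity Penalty entries reported above.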

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.002
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
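The relationship between these values can be sketched in a few lines. The example below derives the total train batch size from the per-step batch size and the gradient accumulation steps, and evaluates a linear learning-rate decay. It assumes a single device and zero warmup steps, since the card lists neither, and it takes the total of 70 optimizer steps from the final row of the training results. The helper `linear_lr` is an illustrative name, not a library function.

```python
learning_rate = 0.002
train_batch_size = 1
gradient_accumulation_steps = 16
total_steps = 70  # final step reported in the training results

# Assumption: one device, so the effective batch is batch size x accumulation steps.
effective_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step: int,
              base_lr: float = learning_rate,
              num_training_steps: int = total_steps,
              num_warmup_steps: int = 0) -> float:
    """Linear warmup then linear decay to zero (warmup of 0 is an assumption)."""
    if step < num_warmup_steps:
        return base_lr * step / max(1, num_warmup_steps)
    remaining = max(0, num_training_steps - step)
    return base_lr * remaining / max(1, num_training_steps - num_warmup_steps)


print("effective batch size:", effective_batch_size)
for step in (0, 35, 70):
    print(f"step {step:2d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the learning rate starts at 0.002, is halved by step 35, and reaches zero at step 70.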

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8.1516 | 1.0 | 7 | 7.5736 | 0.2121 | 0.0491 | 0.1581 | 0.1587 | 32.0 | 0.0174 | 0.0574 | 1.0 | 1.0728 | 1296.0 | 1208.0 | 0.8534 | 0.8583 | 0.8557 |
| 5.8141 | 2.0 | 14 | 5.0888 | 0.2526 | 0.0765 | 0.1991 | 0.2004 | 26.88 | 0.0316 | 0.0882 | 0.822 | 0.8361 | 1010.0 | 1208.0 | 0.8772 | 0.8715 | 0.8742 |
| 4.3777 | 3.0 | 21 | 4.4191 | 0.2668 | 0.0907 | 0.2057 | 0.2072 | 24.04 | 0.0421 | 0.1088 | 0.7134 | 0.7475 | 903.0 | 1208.0 | 0.8824 | 0.8719 | 0.877 |
| 3.9067 | 4.0 | 28 | 4.2179 | 0.2684 | 0.0813 | 0.2084 | 0.2085 | 25.14 | 0.0378 | 0.1006 | 0.7488 | 0.7757 | 937.0 | 1208.0 | 0.8799 | 0.8705 | 0.8751 |
| 3.6847 | 5.0 | 35 | 4.1231 | 0.2897 | 0.0861 | 0.2227 | 0.2226 | 29.34 | 0.0362 | 0.0876 | 0.9412 | 0.9429 | 1139.0 | 1208.0 | 0.8751 | 0.8761 | 0.8756 |
| 3.5317 | 6.0 | 42 | 4.1113 | 0.2644 | 0.0858 | 0.2097 | 0.2107 | 26.66 | 0.0395 | 0.0983 | 0.8063 | 0.8228 | 994.0 | 1208.0 | 0.8826 | 0.8744 | 0.8784 |
| 3.4303 | 7.0 | 49 | 4.0934 | 0.2866 | 0.0945 | 0.2219 | 0.2226 | 27.02 | 0.0407 | 0.1017 | 0.8413 | 0.8526 | 1030.0 | 1208.0 | 0.8827 | 0.8773 | 0.8799 |
| 3.3587 | 8.0 | 56 | 4.0800 | 0.2956 | 0.1007 | 0.2287 | 0.2302 | 28.1 | 0.0467 | 0.1031 | 0.8734 | 0.8808 | 1064.0 | 1208.0 | 0.8805 | 0.8756 | 0.878 |
| 3.3033 | 9.0 | 63 | 4.0926 | 0.2924 | 0.0982 | 0.2205 | 0.2225 | 27.06 | 0.0481 | 0.1062 | 0.8461 | 0.8568 | 1035.0 | 1208.0 | 0.8813 | 0.8747 | 0.8779 |
| 3.2828 | 10.0 | 70 | 4.1010 | 0.3015 | 0.0982 | 0.2325 | 0.234 | 27.86 | 0.0493 | 0.1077 | 0.8669 | 0.875 | 1057.0 | 1208.0 | 0.8803 | 0.8763 | 0.8783 |

Hashcode (identical for every row): roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
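The per-epoch results do not all peak at the same epoch: validation loss is lowest at epoch 8, while Rouge1 is highest at epoch 10. The sketch below shows one way to compare candidate checkpoints by either criterion, using the epoch, validation loss, and Rouge1 values copied from the results above. The variable names are illustrative, and the card does not state which checkpoint selection rule, if any, was used.

```python
# (epoch, validation_loss, rouge1) copied from the training results.
results = [
    (1, 7.5736, 0.2121),
    (2, 5.0888, 0.2526),
    (3, 4.4191, 0.2668),
    (4, 4.2179, 0.2684),
    (5, 4.1231, 0.2897),
    (6, 4.1113, 0.2644),
    (7, 4.0934, 0.2866),
    (8, 4.0800, 0.2956),
    (9, 4.0926, 0.2924),
    (10, 4.1010, 0.3015),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(results, key=lambda row: row[1])
best_by_rouge1 = max(results, key=lambda row: row[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]:.4f})")
print(f"highest Rouge1:         epoch {best_by_rouge1[0]} ({best_by_rouge1[2]:.4f})")
```

The gap between the two candidates is small on both metrics, so either choice is defensible; the point of the comparison is only to make the trade-off explicit.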

Framework versions

  • PEFT 0.15.2
  • Transformers 4.53.1
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1