---
library_name: peft
license: apache-2.0
base_model: allenai/led-base-16384
tags:
- generated_from_trainer
metrics:
- rouge
- bleu
- precision
- recall
- f1
model-index:
- name: Lora_LED_sum_challenge
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Lora_LED_sum_challenge

This model is a PEFT adapter (LoRA, judging by the model name) fine-tuned from [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384). The training dataset was not recorded.
It achieves the following results on the evaluation set:
- Loss: 4.1010
- Rouge1: 0.3015
- Rouge2: 0.0982
- Rougel: 0.2325
- Rougelsum: 0.234
- Gen Len: 27.86
- Bleu: 0.0493
- Precisions: 0.1077
- Brevity Penalty: 0.8669
- Length Ratio: 0.875
- Translation Length: 1057.0
- Reference Length: 1208.0
- Precision: 0.8803
- Recall: 0.8763
- F1: 0.8783
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
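
The length-related BLEU fields above can be cross-checked against each other. The sketch below assumes the standard BLEU definitions (length ratio = translation length / reference length; brevity penalty = `exp(1 - reference/translation)` when the output is shorter than the reference). The variable names are illustrative and are not taken from the training script.

```python
import math

# Values copied from the evaluation results listed above.
translation_length = 1057.0
reference_length = 1208.0

# Length ratio: how long the generated summaries are relative to the references.
length_ratio = translation_length / reference_length

# Standard BLEU brevity penalty (assumed definition): no penalty when the
# output is at least as long as the reference, exponential penalty otherwise.
if translation_length >= reference_length:
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1.0 - reference_length / translation_length)

print(round(length_ratio, 4))     # 0.875
print(round(brevity_penalty, 4))  # 0.8669
```

Both values match the reported Length Ratio and Brevity Penalty, which suggests the generated summaries are about 12.5% shorter than the references overall.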

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
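
As a rough illustration of how these settings combine, the sketch below derives the effective batch size and a linear learning-rate decay. It assumes a single device, no warmup (none is listed above), and 7 optimizer steps per epoch as shown in the Step column of the training results. `linear_lr` is a hypothetical helper written for this example, not a function from the training code.

```python
# Hyperparameters copied from the list above.
learning_rate = 0.002
train_batch_size = 1
gradient_accumulation_steps = 16
num_epochs = 10

# Assumption: 7 optimizer steps per epoch, read from the training results table.
steps_per_epoch = 7

# Effective batch size per optimizer step (assuming one device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps
total_steps = steps_per_epoch * num_epochs


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size)  # 16
print(total_steps)             # 70
for step in (0, 35, 70):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.002, is halved by the midpoint of training, and reaches zero at step 70.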

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu   | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1     | Hashcode                                                  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
| 8.1516        | 1.0   | 7    | 7.5736          | 0.2121 | 0.0491 | 0.1581 | 0.1587    | 32.0    | 0.0174 | 0.0574     | 1.0             | 1.0728       | 1296.0             | 1208.0           | 0.8534    | 0.8583 | 0.8557 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 5.8141        | 2.0   | 14   | 5.0888          | 0.2526 | 0.0765 | 0.1991 | 0.2004    | 26.88   | 0.0316 | 0.0882     | 0.822           | 0.8361       | 1010.0             | 1208.0           | 0.8772    | 0.8715 | 0.8742 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 4.3777        | 3.0   | 21   | 4.4191          | 0.2668 | 0.0907 | 0.2057 | 0.2072    | 24.04   | 0.0421 | 0.1088     | 0.7134          | 0.7475       | 903.0              | 1208.0           | 0.8824    | 0.8719 | 0.877  | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.9067        | 4.0   | 28   | 4.2179          | 0.2684 | 0.0813 | 0.2084 | 0.2085    | 25.14   | 0.0378 | 0.1006     | 0.7488          | 0.7757       | 937.0              | 1208.0           | 0.8799    | 0.8705 | 0.8751 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.6847        | 5.0   | 35   | 4.1231          | 0.2897 | 0.0861 | 0.2227 | 0.2226    | 29.34   | 0.0362 | 0.0876     | 0.9412          | 0.9429       | 1139.0             | 1208.0           | 0.8751    | 0.8761 | 0.8756 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.5317        | 6.0   | 42   | 4.1113          | 0.2644 | 0.0858 | 0.2097 | 0.2107    | 26.66   | 0.0395 | 0.0983     | 0.8063          | 0.8228       | 994.0              | 1208.0           | 0.8826    | 0.8744 | 0.8784 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.4303        | 7.0   | 49   | 4.0934          | 0.2866 | 0.0945 | 0.2219 | 0.2226    | 27.02   | 0.0407 | 0.1017     | 0.8413          | 0.8526       | 1030.0             | 1208.0           | 0.8827    | 0.8773 | 0.8799 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.3587        | 8.0   | 56   | 4.0800          | 0.2956 | 0.1007 | 0.2287 | 0.2302    | 28.1    | 0.0467 | 0.1031     | 0.8734          | 0.8808       | 1064.0             | 1208.0           | 0.8805    | 0.8756 | 0.878  | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.3033        | 9.0   | 63   | 4.0926          | 0.2924 | 0.0982 | 0.2205 | 0.2225    | 27.06   | 0.0481 | 0.1062     | 0.8461          | 0.8568       | 1035.0             | 1208.0           | 0.8813    | 0.8747 | 0.8779 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.2828        | 10.0  | 70   | 4.1010          | 0.3015 | 0.0982 | 0.2325 | 0.234     | 27.86   | 0.0493 | 0.1077     | 0.8669          | 0.875        | 1057.0             | 1208.0           | 0.8803    | 0.8763 | 0.8783 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
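
The final checkpoint is not necessarily the best one on every metric. The sketch below copies three columns from the table above and picks the best epoch by validation loss and by Rouge1. It only illustrates how the table can be read and does not describe how the published checkpoint was selected.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
results = [
    (1, 7.5736, 0.2121),
    (2, 5.0888, 0.2526),
    (3, 4.4191, 0.2668),
    (4, 4.2179, 0.2684),
    (5, 4.1231, 0.2897),
    (6, 4.1113, 0.2644),
    (7, 4.0934, 0.2866),
    (8, 4.0800, 0.2956),
    (9, 4.0926, 0.2924),
    (10, 4.1010, 0.3015),
]

# Lowest validation loss and highest Rouge1 may fall on different epochs.
best_by_loss = min(results, key=lambda row: row[1])
best_by_rouge1 = max(results, key=lambda row: row[2])

print("lowest validation loss: epoch", best_by_loss[0])  # epoch 8
print("highest Rouge1: epoch", best_by_rouge1[0])        # epoch 10
```

Validation loss bottoms out at epoch 8 (4.0800) and rises slightly afterwards, while Rouge1 peaks at epoch 10 (0.3015).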


### Framework versions

- PEFT 0.15.2
- Transformers 4.53.1
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1