---
library_name: peft
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
metrics:
- rouge
- bleu
- precision
- recall
- f1
model-index:
- name: Lora_long_T5_sum_approach
  results: []
---


# Lora_long_T5_sum_approach

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
It achieves the following results on the evaluation set (Precision, Recall, and F1 are BERTScore values; a metric-computation sketch follows the list):
- Loss: 0.8610
- Rouge1: 0.4804
- Rouge2: 0.2605
- Rougel: 0.4126
- Rougelsum: 0.4141
- Gen Len: 28.18
- Bleu: 0.1536
- Precisions: 0.2428
- Brevity Penalty: 0.772
- Length Ratio: 0.7944
- Translation Length: 970.0
- Reference Length: 1221.0
- Precision (BERTScore): 0.914
- Recall (BERTScore): 0.9024
- F1 (BERTScore): 0.9081
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
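
The hashcode indicates the BERTScore values were computed with `roberta-large` at layer 17, without IDF weighting. A minimal sketch of how such metrics can be reproduced with the `evaluate` library, assuming you already have lists of generated summaries and references (the example strings below are illustrative only):

```python
import evaluate

# Illustrative data; replace with real model outputs and gold summaries.
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))
# lang="en" selects roberta-large at layer 17 by default,
# which matches the hashcode reported above.
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```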

## Model description

No detailed description was provided. From the card metadata, this appears to be a LoRA adapter (trained with the PEFT library) on top of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base), targeting a summarization task.
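
Since this is a PEFT adapter rather than a full model, inference loads the base model first and then attaches the adapter. A minimal sketch, assuming the adapter weights live in this repository; the repo id `your-username/Lora_long_T5_sum_approach` is a placeholder:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_id = "google/long-t5-tglobal-base"
adapter_id = "your-username/Lora_long_T5_sum_approach"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter
model.eval()

text = "..."  # long document to summarize
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    # The eval table reports Gen Len around 28 tokens, so a small budget suffices.
    ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```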

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 0.002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
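
A minimal sketch of a matching training setup with `peft` and `Seq2SeqTrainer`. The LoRA rank, alpha, and target modules are assumptions, since the card does not record the adapter config, and the one-example dataset stands in for the real (unknown) training data:

```python
from datasets import Dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

base_id = "google/long-t5-tglobal-base"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

# Adapter config is not recorded in the card; r/alpha/targets are assumptions.
lora = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=16, lora_alpha=32,
                  target_modules=["q", "v"], lora_dropout=0.05)
model = get_peft_model(model, lora)

# Placeholder dataset; the actual training data is unknown.
raw = Dataset.from_dict({"document": ["..."], "summary": ["..."]})

def tok(batch):
    x = tokenizer(batch["document"], truncation=True, max_length=4096)
    y = tokenizer(text_target=batch["summary"], truncation=True, max_length=64)
    x["labels"] = y["input_ids"]
    return x

train_ds = eval_ds = raw.map(tok, batched=True, remove_columns=raw.column_names)

# Mirrors the hyperparameters listed above.
args = Seq2SeqTrainingArguments(
    output_dir="Lora_long_T5_sum_approach",
    learning_rate=2e-3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # effective train batch size of 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```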

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu   | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1     | Hashcode                                                  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
| 20.3106       | 1.0   | 7    | 4.7803          | 0.0666 | 0.0125 | 0.0566 | 0.057     | 31.0    | 0.0062 | 0.0248     | 0.5139          | 0.6003       | 733.0              | 1221.0           | 0.7656    | 0.817  | 0.79   | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 6.3299        | 2.0   | 14   | 4.0381          | 0.3252 | 0.1232 | 0.2298 | 0.2295    | 30.3    | 0.0656 | 0.1136     | 0.8066          | 0.8231       | 1005.0             | 1221.0           | 0.8607    | 0.8665 | 0.8635 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.881         | 3.0   | 21   | 3.2332          | 0.3357 | 0.141  | 0.2638 | 0.2643    | 28.78   | 0.0835 | 0.1391     | 0.8086          | 0.8247       | 1007.0             | 1221.0           | 0.8722    | 0.8719 | 0.872  | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.138         | 4.0   | 28   | 2.8019          | 0.3883 | 0.1806 | 0.3285 | 0.3283    | 29.14   | 0.0964 | 0.1631     | 0.7978          | 0.8157       | 996.0              | 1221.0           | 0.8856    | 0.8835 | 0.8845 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.6873        | 5.0   | 35   | 2.2161          | 0.452  | 0.2271 | 0.3854 | 0.3859    | 27.96   | 0.1276 | 0.2114     | 0.781           | 0.8018       | 979.0              | 1221.0           | 0.9067    | 0.8967 | 0.9016 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.0184        | 6.0   | 42   | 1.3080          | 0.463  | 0.2487 | 0.4009 | 0.4028    | 27.62   | 0.1481 | 0.239      | 0.764           | 0.7879       | 962.0              | 1221.0           | 0.9111    | 0.8991 | 0.9049 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.3413        | 7.0   | 49   | 0.9692          | 0.4678 | 0.2529 | 0.401  | 0.4025    | 28.06   | 0.1473 | 0.2354     | 0.773           | 0.7952       | 971.0              | 1221.0           | 0.9109    | 0.8996 | 0.9051 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.0888        | 8.0   | 56   | 0.8996          | 0.4784 | 0.259  | 0.4102 | 0.4118    | 28.2    | 0.1468 | 0.2363     | 0.775           | 0.7969       | 973.0              | 1221.0           | 0.9126    | 0.9013 | 0.9068 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 0.9722        | 9.0   | 63   | 0.8690          | 0.4824 | 0.262  | 0.4112 | 0.4129    | 28.22   | 0.1523 | 0.2416     | 0.776           | 0.7977       | 974.0              | 1221.0           | 0.9131    | 0.9019 | 0.9074 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 0.948         | 10.0  | 70   | 0.8610          | 0.4804 | 0.2605 | 0.4126 | 0.4141    | 28.18   | 0.1536 | 0.2428     | 0.772           | 0.7944       | 970.0              | 1221.0           | 0.914     | 0.9024 | 0.9081 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |


### Framework versions

- PEFT 0.15.2
- Transformers 4.53.1
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1