floflodebilbao committed
Commit e1da713 · verified · Parent: 2d418f3

End of training
README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
  
  This model is a fine-tuned version of [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 5.5433
- - Rouge1: 0.2187
- - Rouge2: 0.0623
- - Rougel: 0.1668
- - Rougelsum: 0.1667
- - Gen Len: 20.44
- - Bleu: 0.0207
- - Precisions: 0.0758
- - Brevity Penalty: 0.6408
- - Length Ratio: 0.6921
- - Translation Length: 836.0
+ - Loss: 3.8042
+ - Rouge1: 0.2495
+ - Rouge2: 0.0724
+ - Rougel: 0.1912
+ - Rougelsum: 0.192
+ - Gen Len: 20.5
+ - Bleu: 0.0232
+ - Precisions: 0.0926
+ - Brevity Penalty: 0.6016
+ - Length Ratio: 0.6631
+ - Translation Length: 801.0
  - Reference Length: 1208.0
- - Precision: 0.8764
- - Recall: 0.8604
- - F1: 0.8683
+ - Precision: 0.8797
+ - Recall: 0.8676
+ - F1: 0.8736
  - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
  
  ## Model description
@@ -64,17 +64,23 @@ The following hyperparameters were used during training:
  - total_train_batch_size: 16
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - num_epochs: 4
+ - num_epochs: 10
  - mixed_precision_training: Native AMP
  
  ### Training results
  
  | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
  |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | No log | 1.0 | 7 | 8.2286 | 0.2197 | 0.0522 | 0.1659 | 0.1664 | 21.0 | 0.0189 | 0.0718 | 0.6574 | 0.7045 | 851.0 | 1208.0 | 0.8675 | 0.8576 | 0.8624 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
- | No log | 2.0 | 14 | 7.2616 | 0.214 | 0.052 | 0.1635 | 0.1643 | 21.0 | 0.0123 | 0.0672 | 0.6563 | 0.7036 | 850.0 | 1208.0 | 0.8696 | 0.8595 | 0.8644 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
- | No log | 3.0 | 21 | 6.2084 | 0.2388 | 0.0663 | 0.182 | 0.1819 | 20.42 | 0.0192 | 0.0832 | 0.6275 | 0.6821 | 824.0 | 1208.0 | 0.8782 | 0.862 | 0.8699 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
- | No log | 4.0 | 28 | 5.5433 | 0.2187 | 0.0623 | 0.1668 | 0.1667 | 20.44 | 0.0207 | 0.0758 | 0.6408 | 0.6921 | 836.0 | 1208.0 | 0.8764 | 0.8604 | 0.8683 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 1.0 | 7 | 8.1739 | 0.2255 | 0.0527 | 0.1686 | 0.1688 | 21.0 | 0.0157 | 0.069 | 0.6607 | 0.707 | 854.0 | 1208.0 | 0.8668 | 0.8574 | 0.862 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 2.0 | 14 | 6.9457 | 0.2251 | 0.0588 | 0.1702 | 0.1685 | 20.7 | 0.0171 | 0.0737 | 0.6408 | 0.6921 | 836.0 | 1208.0 | 0.8737 | 0.8597 | 0.8666 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 3.0 | 21 | 5.4862 | 0.2391 | 0.0632 | 0.181 | 0.1805 | 20.52 | 0.021 | 0.0825 | 0.6431 | 0.6937 | 838.0 | 1208.0 | 0.8798 | 0.862 | 0.8708 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 4.0 | 28 | 4.7435 | 0.243 | 0.0758 | 0.1901 | 0.1892 | 20.72 | 0.0266 | 0.0886 | 0.6095 | 0.6689 | 808.0 | 1208.0 | 0.8775 | 0.8662 | 0.8717 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 5.0 | 35 | 4.3805 | 0.2557 | 0.0788 | 0.1924 | 0.1921 | 20.48 | 0.0248 | 0.1003 | 0.5857 | 0.6515 | 787.0 | 1208.0 | 0.8811 | 0.8686 | 0.8747 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 6.0 | 42 | 4.1441 | 0.2485 | 0.0701 | 0.1886 | 0.1894 | 20.52 | 0.0209 | 0.0929 | 0.5982 | 0.6606 | 798.0 | 1208.0 | 0.8816 | 0.868 | 0.8747 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 7.0 | 49 | 3.9952 | 0.2574 | 0.0713 | 0.1994 | 0.1997 | 20.54 | 0.0213 | 0.0954 | 0.6073 | 0.6672 | 806.0 | 1208.0 | 0.8811 | 0.8689 | 0.8749 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 8.0 | 56 | 3.8994 | 0.2524 | 0.067 | 0.192 | 0.192 | 20.58 | 0.0203 | 0.0908 | 0.614 | 0.6722 | 812.0 | 1208.0 | 0.8782 | 0.8675 | 0.8727 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 9.0 | 63 | 3.8355 | 0.2512 | 0.0676 | 0.1917 | 0.1925 | 20.54 | 0.0201 | 0.0901 | 0.6062 | 0.6664 | 805.0 | 1208.0 | 0.8793 | 0.8681 | 0.8736 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
+ | No log | 10.0 | 70 | 3.8042 | 0.2495 | 0.0724 | 0.1912 | 0.192 | 20.5 | 0.0232 | 0.0926 | 0.6016 | 0.6631 | 801.0 | 1208.0 | 0.8797 | 0.8676 | 0.8736 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0) |
  
  
  ### Framework versions
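The BLEU length statistics in the updated card are internally consistent: with a translation length of 801 tokens against a reference length of 1208, the standard BLEU definitions give the reported Length Ratio (0.6631) and Brevity Penalty (0.6016). A minimal stdlib sanity check (lengths from the final evaluation row; the formulas are the usual BLEU definitions, not code from this repository):

```python
# Sanity-check the BLEU length statistics reported in the final eval row.
# brevity_penalty = exp(1 - ref_len / trans_len) when trans_len < ref_len.
import math

translation_length = 801.0   # from the card
reference_length = 1208.0    # from the card

length_ratio = translation_length / reference_length
brevity_penalty = math.exp(1 - reference_length / translation_length)

print(round(length_ratio, 4))     # 0.6631, matching the card
print(round(brevity_penalty, 4))  # 0.6016, matching the card
```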
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3a07bdda305d53b98d918e37f5cca091f748da54aceec91cc96d8bef28599160
+ oid sha256:d035bfc685423dc13f2d7f1a0b04d9c407339063a7c8c60d2ebb26c3d5db0e65
  size 647614116
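The `version` / `oid` / `size` triples in this diff are Git LFS pointer files: the repository stores only these small pointers, and the actual weights are fetched by hash. A minimal sketch (assuming the standard Git LFS pointer layout of space-separated key/value lines) of reading one:

```python
# Hedged sketch: parse a Git LFS pointer file (version / oid / size lines)
# into a dict. Assumes the standard "key value" line layout of the LFS spec.
def parse_lfs_pointer(text: str) -> dict:
    """Split each non-empty line at the first space into key and value."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The model.safetensors pointer from this commit:
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d035bfc685423dc13f2d7f1a0b04d9c407339063a7c8c60d2ebb26c3d5db0e65
size 647614116
"""
info = parse_lfs_pointer(pointer)
print(info["size"])                  # 647614116 (bytes, ~648 MB)
print(info["oid"].split(":", 1)[0])  # sha256
```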
runs/Jul01_10-33-09_tardis/events.out.tfevents.1751358790.tardis.47057.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dbd4b0e7025cc9156c97516effb96b571d5096cb85ed9b44a9ff7bbcdb805991
+ size 17254
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b5eea38dfaf301d7be04ff3dd9dff0c43cbfafbb2d93f0c0a654870586775b33
+ oid sha256:aa21559a837b627e52472d81d7041b89c20c04f5cbe8c6ef952379c192f1a3db
  size 5905
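`training_args.bin` is the pickled training configuration. A hedged sketch of the `Seq2SeqTrainingArguments` it plausibly serializes, reconstructed from the README's hyperparameter list: only the values named in the card (total batch size 16, ADAMW_TORCH with default betas/epsilon, linear schedule, 10 epochs, Native AMP) come from the source; `output_dir` and the per-device/accumulation split are assumptions for illustration.

```python
# Hedged reconstruction of the training configuration; NOT the author's file.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="led-base-16384-finetuned",  # placeholder, not the author's path
    per_device_train_batch_size=2,          # assumption: 2 * 8 accumulation = 16 total
    gradient_accumulation_steps=8,          # assumption
    optim="adamw_torch",                    # card: OptimizerNames.ADAMW_TORCH, default betas/epsilon
    lr_scheduler_type="linear",             # from the card
    num_train_epochs=10,                    # from the card
    fp16=True,                              # card: "Native AMP" mixed precision
)
```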