Archeane committed on
Commit 744c159 · 1 Parent(s): 1f13a49

update model card README.md

Files changed (1)
  1. README.md +17 -21
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- license: apache-2.0
+ license: bsd-3-clause
  tags:
  - generated_from_trainer
  metrics:
@@ -14,14 +14,12 @@ should probably proofread and complete it, then remove this comment. -->
 
  # first_tldr
 
- This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the None dataset.
+ This model is a fine-tuned version of [pszemraj/long-t5-tglobal-base-16384-book-summary](https://huggingface.co/pszemraj/long-t5-tglobal-base-16384-book-summary) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.2734
- - Rouge1: 0.2199
- - Rouge2: 0.1062
- - Rougel: 0.1867
- - Rougelsum: 0.1869
- - Gen Len: 18.8003
+ - Loss: 1.4427
+ - Rouge1: 0.3859
+ - Rouge2: 0.1307
+ - Rougel: 0.2443
 
  ## Model description
 
@@ -40,24 +38,22 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 16
- - eval_batch_size: 16
+ - learning_rate: 0.0005
+ - train_batch_size: 1
+ - eval_batch_size: 1
  - seed: 42
+ - gradient_accumulation_steps: 128
+ - total_train_batch_size: 128
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 5
- - mixed_precision_training: Native AMP
+ - lr_scheduler_type: cosine
+ - num_epochs: 2
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
- |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
- | No log | 1.0 | 194 | 2.3676 | 0.2112 | 0.0983 | 0.1783 | 0.1782 | 18.8131 |
- | No log | 2.0 | 388 | 2.3128 | 0.2152 | 0.1022 | 0.182 | 0.1819 | 18.8209 |
- | 2.5218 | 3.0 | 582 | 2.2884 | 0.2185 | 0.1046 | 0.1849 | 0.1849 | 18.8222 |
- | 2.5218 | 4.0 | 776 | 2.2772 | 0.2196 | 0.1062 | 0.1865 | 0.1866 | 18.799 |
- | 2.5218 | 5.0 | 970 | 2.2734 | 0.2199 | 0.1062 | 0.1867 | 0.1869 | 18.8003 |
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|
+ | No log | 0.98 | 47 | 1.4361 | 0.3848 | 0.1323 | 0.2419 |
+ | No log | 1.97 | 94 | 1.4427 | 0.3859 | 0.1307 | 0.2443 |
 
 
  ### Framework versions