emilstabil/DanSumT5-largeV_84227V_29665

+---
+license: apache-2.0
+base_model: emilstabil/DanSumT5-largeV_84227
+tags:
+- generated_from_trainer
+metrics:
+- rouge
+model-index:
+- name: DanSumT5-largeV_84227V_29665
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# DanSumT5-largeV_84227V_29665
+This model is a fine-tuned version of [emilstabil/DanSumT5-largeV_84227](https://huggingface.co/emilstabil/DanSumT5-largeV_84227) on an unspecified dataset (the Trainer recorded no dataset name).
+It achieves the following results on the evaluation set:
+- Loss: 2.4261
+- Rouge1: 32.3558
+- Rouge2: 8.5749
+- RougeL: 19.1338
+- RougeLsum: 29.7211
+- Gen Len: 122.67
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 4
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 20
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | RougeL  | RougeLsum | Gen Len |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
+| No log        | 1.0   | 200  | 2.3019          | 31.5752 | 8.1099 | 18.3519 | 29.1739   | 124.3   |
+| No log        | 2.0   | 400  | 2.3070          | 31.9164 | 8.7613 | 19.0514 | 29.6226   | 124.78  |
+| 1.7988        | 3.0   | 600  | 2.3043          | 32.1339 | 8.5015 | 18.9179 | 29.2816   | 124.63  |
+| 1.7988        | 4.0   | 800  | 2.3136          | 31.6595 | 8.1675 | 18.7604 | 29.2322   | 124.76  |
+| 1.6472        | 5.0   | 1000 | 2.3283          | 32.2728 | 8.3023 | 18.9738 | 29.8167   | 124.88  |
+| 1.6472        | 6.0   | 1200 | 2.3351          | 32.0678 | 8.5944 | 18.7781 | 29.595    | 123.61  |
+| 1.6472        | 7.0   | 1400 | 2.3524          | 32.3119 | 8.2027 | 19.0197 | 29.8383   | 123.66  |
+| 1.5261        | 8.0   | 1600 | 2.3595          | 32.3179 | 8.2699 | 18.869  | 29.7615   | 124.09  |
+| 1.5261        | 9.0   | 1800 | 2.3669          | 31.6759 | 8.2793 | 18.7806 | 29.3134   | 123.97  |
+| 1.4329        | 10.0  | 2000 | 2.3767          | 32.3248 | 8.6537 | 19.0477 | 29.9719   | 124.87  |
+| 1.4329        | 11.0  | 2200 | 2.3960          | 31.6656 | 8.2021 | 18.932  | 29.5141   | 125.2   |
+| 1.4329        | 12.0  | 2400 | 2.4107          | 31.8985 | 8.0749 | 18.8192 | 29.4881   | 124.02  |
+| 1.3681        | 13.0  | 2600 | 2.4135          | 31.5421 | 8.233  | 18.9548 | 29.2292   | 125.34  |
+| 1.3681        | 14.0  | 2800 | 2.4136          | 32.1472 | 8.0262 | 18.8115 | 29.7394   | 123.28  |
+| 1.3366        | 15.0  | 3000 | 2.4125          | 31.8903 | 8.4487 | 18.9192 | 29.3627   | 124.31  |
+| 1.3366        | 16.0  | 3200 | 2.4199          | 31.9141 | 8.1944 | 18.8633 | 29.6124   | 123.63  |
+| 1.3366        | 17.0  | 3400 | 2.4205          | 32.0654 | 8.333  | 18.6209 | 29.5897   | 124.23  |
+| 1.3135        | 18.0  | 3600 | 2.4248          | 32.1559 | 8.4127 | 18.9745 | 29.8147   | 122.88  |
+| 1.3135        | 19.0  | 3800 | 2.4244          | 32.5409 | 8.7406 | 19.0785 | 29.9493   | 123.22  |
+| 1.3006        | 20.0  | 4000 | 2.4261          | 32.3558 | 8.5749 | 19.1338 | 29.7211   | 122.67  |
+### Framework versions
+- Transformers 4.32.1
+- Pytorch 2.1.0
+- Datasets 2.12.0
+- Tokenizers 0.13.3
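
The hyperparameters and the results table in the card above can be cross-checked with a little arithmetic. The sketch below is not part of the commit. It recomputes the effective batch size, estimates how many training examples one epoch covers, and picks the epochs with the lowest validation loss and the highest Rouge1. It assumes a single training device and that the Step column counts optimizer updates, so the per-epoch example count is approximate.

```python
# Values copied from the hyperparameter list and the training-results table.
train_batch_size = 1
gradient_accumulation_steps = 4
steps_per_epoch = 200  # the Step column advances by 200 per epoch

# Assumption: one device, so there is no extra data-parallel multiplier.
effective_batch_size = train_batch_size * gradient_accumulation_steps
approx_examples_per_epoch = steps_per_epoch * effective_batch_size

# (epoch, validation_loss, rouge1) for every row of the table.
rows = [
    (1, 2.3019, 31.5752), (2, 2.3070, 31.9164), (3, 2.3043, 32.1339),
    (4, 2.3136, 31.6595), (5, 2.3283, 32.2728), (6, 2.3351, 32.0678),
    (7, 2.3524, 32.3119), (8, 2.3595, 32.3179), (9, 2.3669, 31.6759),
    (10, 2.3767, 32.3248), (11, 2.3960, 31.6656), (12, 2.4107, 31.8985),
    (13, 2.4135, 31.5421), (14, 2.4136, 32.1472), (15, 2.4125, 31.8903),
    (16, 2.4199, 31.9141), (17, 2.4205, 32.0654), (18, 2.4248, 32.1559),
    (19, 2.4244, 32.5409), (20, 2.4261, 32.3558),
]

best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("effective batch size:", effective_batch_size)
print("approx. examples per epoch:", approx_examples_per_epoch)
print("lowest validation loss at epoch", best_by_loss[0])
print("highest Rouge1 at epoch", best_by_rouge1[0])
```

Validation loss is lowest after the first epoch and rises from 2.3019 to 2.4261 while training loss falls from 1.7988 to 1.3006, whereas Rouge1 stays within about one point throughout. The reported final metrics are those of epoch 20, not of the best epoch under either criterion.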

generation_config.json ADDED

	@@ -0,0 +1,11 @@

+{
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "length_penalty": 5.0,
+  "max_length": 128,
+  "min_length": 9,
+  "no_repeat_ngram_size": 3,
+  "num_beams": 2,
+  "pad_token_id": 0,
+  "transformers_version": "4.32.1"
+}
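
The decoding settings in this file are the defaults that `generate()` uses when the checkpoint is loaded. As a minimal sketch, not taken from the commit, the same values can be passed explicitly. It assumes the repository loads with `AutoTokenizer` and `AutoModelForSeq2SeqLM`; the helper name `summarize` and the 512-token input truncation are hypothetical choices, since the card does not state an input length.

```python
# Decoding settings copied from generation_config.json. The token ids are
# omitted here because they are read from the checkpoint's own config.
GENERATION_KWARGS = {
    "max_length": 128,
    "min_length": 9,
    "num_beams": 2,
    "length_penalty": 5.0,
    "no_repeat_ngram_size": 3,
}

MODEL_ID = "emilstabil/DanSumT5-largeV_84227V_29665"


def summarize(text: str, model_id: str = MODEL_ID) -> str:
    """Hypothetical helper: summarize one document with beam search."""
    # Imported lazily so the settings above can be inspected without
    # transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: truncate inputs to 512 tokens; the card does not say what
    # input length was used during training.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

In beam search, a positive `length_penalty` favors longer sequences, so a value of 5.0 together with `max_length` 128 is consistent with the generated lengths of roughly 123 to 125 tokens reported in the card.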