Model save

Files changed (3)

README.md +81 -0
generation_config.json +7 -0
model.safetensors +1 -1

README.md ADDED

@@ -0,0 +1,81 @@

+---
+tags:
+- generated_from_trainer
+metrics:
+- rouge
+model-index:
+- name: Swin-Bert_Mimic
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# Swin-Bert_Mimic
+This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1025
+- Rouge1: 35.8104
+- Rouge2: 22.5915
+- Rougel: 34.3056
+- Rougelsum: 35.1416
+- Gen Len: 21.289
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 20
+### Training results
+| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
+|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
+| 0.0677        | 1.0   | 7500   | 0.0742          | 34.0952 | 25.4639 | 34.0546 | 34.0407   | 14.412  |
+| 0.0621        | 2.0   | 15000  | 0.0686          | 37.767  | 26.9356 | 37.0596 | 37.4647   | 18.921  |
+| 0.0595        | 3.0   | 22500  | 0.0670          | 38.07   | 26.9203 | 37.1384 | 37.7633   | 22.422  |
+| 0.0536        | 4.0   | 30000  | 0.0655          | 38.064  | 27.0799 | 37.3483 | 37.7981   | 18.476  |
+| 0.0484        | 5.0   | 37500  | 0.0655          | 38.8419 | 27.551  | 37.992  | 38.573    | 19.552  |
+| 0.0436        | 6.0   | 45000  | 0.0672          | 39.2556 | 27.3445 | 38.1583 | 38.9199   | 19.699  |
+| 0.0394        | 7.0   | 52500  | 0.0680          | 38.6881 | 27.1077 | 37.6518 | 38.3678   | 19.322  |
+| 0.0355        | 8.0   | 60000  | 0.0697          | 39.2775 | 27.1638 | 38.1169 | 38.786    | 20.125  |
+| 0.0318        | 9.0   | 67500  | 0.0719          | 38.8973 | 27.0819 | 37.8138 | 38.4725   | 20.237  |
+| 0.0265        | 10.0  | 75000  | 0.0746          | 38.2854 | 26.3015 | 37.0627 | 37.8955   | 20.799  |
+| 0.0241        | 11.0  | 82500  | 0.0769          | 37.7814 | 25.9821 | 36.6626 | 37.3682   | 20.437  |
+| 0.0204        | 12.0  | 90000  | 0.0810          | 37.7945 | 26.012  | 36.5089 | 37.3188   | 20.945  |
+| 0.0172        | 13.0  | 97500  | 0.0846          | 37.5296 | 25.3082 | 36.2752 | 36.9433   | 20.397  |
+| 0.0147        | 14.0  | 105000 | 0.0876          | 36.6675 | 24.5001 | 35.264  | 36.034    | 22.044  |
+| 0.012         | 15.0  | 112500 | 0.0907          | 35.8928 | 23.4706 | 34.3812 | 35.2234   | 21.344  |
+| 0.0103        | 16.0  | 120000 | 0.0947          | 35.6648 | 22.8131 | 34.1013 | 35.0637   | 22.095  |
+| 0.0084        | 17.0  | 127500 | 0.0971          | 35.7702 | 22.9984 | 34.2882 | 35.1362   | 21.501  |
+| 0.0068        | 18.0  | 135000 | 0.0996          | 35.4212 | 22.3513 | 33.9646 | 34.8255   | 22.152  |
+| 0.0058        | 19.0  | 142500 | 0.1019          | 35.9704 | 23.1195 | 34.4672 | 35.3553   | 21.404  |
+| 0.0048        | 20.0  | 150000 | 0.1025          | 35.8104 | 22.5915 | 34.3056 | 35.1416   | 21.289  |
+### Framework versions
+- Transformers 4.37.1
+- Pytorch 1.13.1+cu117
+- Datasets 2.15.0
+- Tokenizers 0.15.1
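
Reading the training results in the README hunk above: validation loss bottoms out at 0.0655 (epochs 4 and 5) and Rouge1 peaks at 39.2775 (epoch 8), while the card's headline numbers are the epoch-20 row. The sketch below shows one way to pick a checkpoint from those logged values. The `RESULTS` layout and the `best_epoch` helper are illustrative assumptions, not part of the commit, and only three of the table's columns are transcribed.

```python
# (epoch, validation_loss, rouge1), transcribed from the training results table.
RESULTS = [
    (1, 0.0742, 34.0952), (2, 0.0686, 37.767), (3, 0.0670, 38.07),
    (4, 0.0655, 38.064), (5, 0.0655, 38.8419), (6, 0.0672, 39.2556),
    (7, 0.0680, 38.6881), (8, 0.0697, 39.2775), (9, 0.0719, 38.8973),
    (10, 0.0746, 38.2854), (11, 0.0769, 37.7814), (12, 0.0810, 37.7945),
    (13, 0.0846, 37.5296), (14, 0.0876, 36.6675), (15, 0.0907, 35.8928),
    (16, 0.0947, 35.6648), (17, 0.0971, 35.7702), (18, 0.0996, 35.4212),
    (19, 0.1019, 35.9704), (20, 0.1025, 35.8104),
]


def best_epoch(rows, column, higher_is_better):
    """Return the first row that is best on the given column index.

    Hypothetical helper: ties resolve to the earliest epoch because
    min() and max() return the first best element they encounter.
    """
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


if __name__ == "__main__":
    by_loss = best_epoch(RESULTS, column=1, higher_is_better=False)
    by_rouge = best_epoch(RESULTS, column=2, higher_is_better=True)
    final = RESULTS[-1]
    print(f"lowest validation loss: epoch {by_loss[0]} ({by_loss[1]})")
    print(f"highest Rouge1:         epoch {by_rouge[0]} ({by_rouge[2]})")
    print(f"final checkpoint:       epoch {final[0]} ({final[1]}, {final[2]})")
```

The commit does not say whether the epoch-20 weights are the intended upload. If an earlier checkpoint is preferred, `load_best_model_at_end` together with `metric_for_best_model` in `TrainingArguments` is the usual Trainer mechanism for keeping it.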

generation_config.json ADDED

@@ -0,0 +1,7 @@

+{
+  "decoder_start_token_id": 101,
+  "eos_token_id": 102,
+  "max_length": 200,
+  "pad_token_id": 0,
+  "transformers_version": "4.37.1"
+}
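
To make the role of each field in the new generation_config.json concrete, here is a toy decoding loop driven by the same values. It is a sketch under stated assumptions, not the Transformers implementation: `greedy_decode`, `pad_batch`, and the `next_token` callable are hypothetical names, and `max_length` is assumed to count the start token. The ids 101, 102, and 0 coincide with `[CLS]`, `[SEP]`, and `[PAD]` in the standard BERT uncased vocabulary, but the commit does not name the tokenizer.

```python
import json

# Same content as the generation_config.json added in this commit.
GENERATION_CONFIG = json.loads("""
{
  "decoder_start_token_id": 101,
  "eos_token_id": 102,
  "max_length": 200,
  "pad_token_id": 0,
  "transformers_version": "4.37.1"
}
""")


def greedy_decode(next_token, config):
    """Toy loop: start from the decoder start id, stop at EOS or max_length.

    next_token is a hypothetical callable mapping the ids generated so far
    to the next id; a real model would run the decoder here.
    """
    ids = [config["decoder_start_token_id"]]
    while len(ids) < config["max_length"]:
        token = next_token(ids)
        ids.append(token)
        if token == config["eos_token_id"]:
            break
    return ids


def pad_batch(sequences, config):
    """Right-pad sequences to a common length with pad_token_id."""
    width = max(len(seq) for seq in sequences)
    pad = config["pad_token_id"]
    return [seq + [pad] * (width - len(seq)) for seq in sequences]


if __name__ == "__main__":
    # Stub "model": emits token 7 twice, then the EOS id.
    def stub(ids):
        return 7 if len(ids) < 3 else GENERATION_CONFIG["eos_token_id"]

    print(greedy_decode(stub, GENERATION_CONFIG))
```

With `max_length` at 200 and the logged Gen Len near 21, generation in the evaluation runs presumably ended on the EOS token well before the cap.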

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ca81b1c72885738b551cb7ce3432ff331f76becd57f2d71124855278fe241c79
+oid sha256:41c790b4b4c30828476a2183fc1de35cca6457f3d937f6efc0825fd1f5f685ad
 size 906269048
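
Only the Git LFS pointer changes here: the size stays at 906269048 bytes and the `oid` (the SHA-256 of the file contents) is new, so the weights were replaced by a file of identical size. A minimal sketch for checking a downloaded model.safetensors against the new pointer follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is an assumption.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the pointer's size and SHA-256."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in chunks so a ~900 MB file is never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]


# The new pointer from this commit.
NEW_POINTER = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:41c790b4b4c30828476a2183fc1de35cca6457f3d937f6efc0825fd1f5f685ad\n"
    "size 906269048\n"
)

if __name__ == "__main__":
    local_copy = "model.safetensors"  # assumed download location
    if os.path.exists(local_copy):
        print("matches new pointer:", matches_pointer(local_copy, NEW_POINTER))
```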