srvmishra832/Amazon_MultiLingual_Review_Summarization_with_google_mT5_small

+---
+library_name: transformers
+license: apache-2.0
+base_model: google/mt5-small
+tags:
+- summarization
+- generated_from_trainer
+metrics:
+- rouge
+model-index:
+- name: mt5-small
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mt5-small
+This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.9368
+- Model Preparation Time: 0.0038
+- Rouge1: 16.1955
+- Rouge2: 8.1292
+- Rougel: 15.9218
+- Rougelsum: 15.9516
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5.6e-05
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 10
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Rouge1  | Rouge2 | Rougel  | Rougelsum |
+|:-------------:|:-----:|:----:|:---------------:|:----------------------:|:-------:|:------:|:-------:|:---------:|
+| 9.0889        | 1.0   | 500  | 3.4117          | 0.0038                 | 12.541  | 5.1023 | 11.9039 | 11.8749   |
+| 4.3977        | 2.0   | 1000 | 3.1900          | 0.0038                 | 15.342  | 6.747  | 14.9223 | 14.8598   |
+| 3.9595        | 3.0   | 1500 | 3.0817          | 0.0038                 | 15.3976 | 6.2063 | 15.0635 | 15.069    |
+| 3.7525        | 4.0   | 2000 | 3.0560          | 0.0038                 | 15.7991 | 6.8536 | 15.4657 | 15.5263   |
+| 3.6191        | 5.0   | 2500 | 3.0048          | 0.0038                 | 16.3791 | 7.3671 | 16.0817 | 16.059    |
+| 3.5155        | 6.0   | 3000 | 2.9779          | 0.0038                 | 16.2311 | 7.5629 | 15.7492 | 15.758    |
+| 3.4497        | 7.0   | 3500 | 2.9663          | 0.0038                 | 16.2554 | 8.1464 | 15.9499 | 15.9152   |
+| 3.3889        | 8.0   | 4000 | 2.9438          | 0.0038                 | 16.5764 | 8.3698 | 16.3225 | 16.2848   |
+| 3.3656        | 9.0   | 4500 | 2.9365          | 0.0038                 | 16.1416 | 8.0266 | 15.8921 | 15.8913   |
+| 3.3562        | 10.0  | 5000 | 2.9368          | 0.0038                 | 16.1955 | 8.1292 | 15.9218 | 15.9516   |
+### Framework versions
+- Transformers 4.50.0
+- Pytorch 2.6.0+cu124
+- Datasets 3.4.1
+- Tokenizers 0.21.1

generation_config.json ADDED

@@ -0,0 +1,6 @@

+{
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "pad_token_id": 0,
+  "transformers_version": "4.50.0"
+}
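
The three token ids in this config drive decoding for T5-style models: generation begins from `decoder_start_token_id`, stops at `eos_token_id`, and pads with `pad_token_id`. The sketch below illustrates those roles in plain Python rather than through the library. `shift_right` and `truncate_at_eos` are hypothetical helper names, the token ids in the usage lines are made up, and `-100` is the label value conventionally ignored by the loss.

```python
import json

# The generation config as added in this commit.
GENERATION_CONFIG = json.loads("""
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.50.0"
}
""")

IGNORE_INDEX = -100  # label positions excluded from the loss


def shift_right(labels, config=GENERATION_CONFIG):
    """Build teacher-forcing decoder inputs from label ids.

    Prepends the decoder start token, drops the last label, and replaces
    ignored positions with the pad token.
    """
    start = config["decoder_start_token_id"]
    pad = config["pad_token_id"]
    shifted = [start] + list(labels[:-1])
    return [pad if tok == IGNORE_INDEX else tok for tok in shifted]


def truncate_at_eos(generated, config=GENERATION_CONFIG):
    """Drop the leading start token and everything from the first EOS onward."""
    eos = config["eos_token_id"]
    tokens = list(generated[1:])  # skip decoder_start_token_id
    return tokens[: tokens.index(eos)] if eos in tokens else tokens


if __name__ == "__main__":
    # Made-up token ids, for illustration only.
    print(shift_right([259, 1234, 1, IGNORE_INDEX]))
    print(truncate_at_eos([0, 259, 1234, 1, 0, 0]))
```

Because the start token and the pad token share id 0 here, a padded decoder input and a fresh start look the same at position 0.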

runs/Mar27_06-35-11_fb6522921b78/events.out.tfevents.1743057325.fb6522921b78.695.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b02825bfd1b7754a138a5f200fc317e17bc9d81d6c875db0b571bc3245811dfd
-size 13439
+oid sha256:53af83ce91149e34f3fe6abeb387fca8c4f791b7b7f08b9aec1c851e1a955d75
+size 14333