---
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: mt5-large-bn_sum_total_data
    results: []
---

# mt5-large-bn_sum_total_data

This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on this dataset.

## Model description

Our fine-tuned mT5-large model significantly outperforms the previous state-of-the-art model for Bangla text summarization ([csebuetnlp/mT5_multilingual_XLSum](https://huggingface.co/csebuetnlp/mT5_multilingual_XLSum)). It achieves a ROUGE score of approximately 43, surpassing the previous record of 29.5653. This improvement demonstrates the effectiveness of our approach and highlights the potential for further advances in Bangla NLP.

## Intended uses & limitations

The model is intended for summarizing Bangla text. It has a known limitation with Banglish (romanized Bangla) input, where quality degrades.
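A minimal inference sketch using the `transformers` library is shown below. The repository id `MahdiSUST/mt5-large-bn_sum_total_data` is inferred from the model name above, and the generation settings (beam search, length limits) are illustrative assumptions, not values from the card:

```python
# Hedged sketch: summarizing Bangla text with this checkpoint.
# The repo id is inferred from the model name; generation settings
# (num_beams, max_new_tokens) are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

def summarize(text: str,
              model_id: str = "MahdiSUST/mt5-large-bn_sum_total_data",
              max_new_tokens: int = 84) -> str:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    # Truncate long articles to the encoder's input budget (assumed 512).
    inputs = tokenizer(text, return_tensors="pt",
                       truncation=True, max_length=512)
    output_ids = model.generate(**inputs, num_beams=4,
                                max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("<Bangla article text>")` downloads the checkpoint on first use and returns a single-string summary.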

## Training and evaluation data

Dataset

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 6
  • eval_batch_size: 6
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
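The hyperparameters above map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a sketch: the output path and `predict_with_generate` flag are assumptions, since the card does not show the training script itself:

```python
# Sketch of the training configuration implied by the listed hyperparameters.
# output_dir and predict_with_generate are assumptions, not from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-large-bn_sum_total_data",  # assumed output path
    learning_rate=5.6e-5,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    # Adam betas/epsilon match the card; these are also the defaults
    # of the AdamW optimizer used by transformers.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,  # assumed, needed for ROUGE evaluation
)
```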

### Framework versions

  • Transformers 4.27.1
  • Pytorch 1.13.1+cu116
  • Datasets 2.10.1
  • Tokenizers 0.13.2