Fine-tuned Text Summarization Model

This repository contains a fine-tuned version of csebuetnlp/mT5_m2o_english_crossSum for abstractive text summarization.
The model has been optimized specifically for generating concise, coherent English summaries from long-form text.


Model Description

This model is based on the multilingual T5 architecture (mT5) and has been fine-tuned to improve performance on English abstractive summarization tasks. It is capable of generating well-structured summaries that preserve essential meaning while reducing verbosity.

The model uses the encoder–decoder architecture of mT5 (roughly 0.6B parameters, stored as F32 Safetensors) and benefits from pretrained multilingual representations, which can also help it generalize to noisy or domain-specific English text.
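
Below is a minimal inference sketch using the 🤗 Transformers API. The generation settings (beam count, length caps, n-gram blocking) are illustrative assumptions, not values the model was tuned with.

```python
# Minimal inference sketch; generation settings are illustrative, not tuned values.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "RamshaAnwar/summarization_model_trained_on_reduced_dataset"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "..."  # long-form English text to summarize

inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(
    **inputs,
    max_length=84,            # cap the summary length
    num_beams=4,              # beam search for more coherent output
    no_repeat_ngram_size=2,   # discourage repeated phrases
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```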


Intended Uses & Limitations

Intended Uses

  • Abstractive summarization of news articles, reports, social media text, academic paragraphs, and general long-form English content.
  • Use in applications such as:
    • content condensation tools
    • research assistants
    • note-generation tools
    • automated documentation systems

Limitations

  • May hallucinate facts in cases where the input is ambiguous or overly short.
  • Not optimized for:
    • non-English summarization
    • extractive summarization
    • legal, medical, or highly specialized summaries requiring domain accuracy
  • Summary quality may decline on extremely long inputs unless chunking is applied (see the sketch after this list).
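
When inputs exceed the encoder's context window, a simple chunk-and-resummarize pass can help. The sketch below is one naive approach; the chunk size, the overlap-free splitting, and the final re-summarization step are assumptions, not part of any released training code.

```python
# Naive chunk-and-resummarize sketch for long inputs; chunk/summary lengths
# are illustrative assumptions.
def summarize_long(text, tokenizer, model, chunk_tokens=512, summary_tokens=84):
    ids = tokenizer(text, truncation=False)["input_ids"]
    chunks = [ids[i:i + chunk_tokens] for i in range(0, len(ids), chunk_tokens)]

    # Summarize each chunk independently.
    partial = []
    for chunk in chunks:
        piece = tokenizer.decode(chunk, skip_special_tokens=True)
        inputs = tokenizer(piece, return_tensors="pt",
                           truncation=True, max_length=chunk_tokens)
        out = model.generate(**inputs, max_length=summary_tokens, num_beams=4)
        partial.append(tokenizer.decode(out[0], skip_special_tokens=True))

    # Second pass: summarize the concatenated partial summaries.
    joined = " ".join(partial)
    inputs = tokenizer(joined, return_tensors="pt",
                       truncation=True, max_length=chunk_tokens)
    out = model.generate(**inputs, max_length=summary_tokens, num_beams=4)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```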

Training and Evaluation Data

This model was trained on a combined dataset of English news, long-form articles, and instructional text.
Data was preprocessed to remove duplicates, extremely short samples, and malformed text.

The validation set consisted of structurally similar English articles to ensure reliable ROUGE evaluation.
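
The exact preprocessing code was not released; the sketch below shows one way the described filtering (removing duplicates and extremely short or malformed samples) could be done with the 🤗 Datasets library. The file name, field name, and length threshold are hypothetical.

```python
# Hypothetical filtering sketch; "train.jsonl", the "text" field, and the
# 30-word threshold are assumptions, not the original pipeline.
from datasets import load_dataset

ds = load_dataset("json", data_files="train.jsonl")["train"]
seen = set()

def keep(example):
    text = example["text"].strip()
    if not text or len(text.split()) < 30:  # drop empty/extremely short samples
        return False
    if text in seen:                        # drop exact duplicates
        return False
    seen.add(text)
    return True

ds = ds.filter(keep)
```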


Training Procedure

Training Hyperparameters

The following hyperparameters were used:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam (β1 = 0.9, β2 = 0.999, ε = 1e-08)
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
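
For reference, here is how these hyperparameters map onto Transformers' Seq2SeqTrainingArguments. The output directory and evaluation cadence are placeholders, and the Adam betas/epsilon listed above are the optimizer defaults.

```python
# Mapping of the listed hyperparameters onto training arguments;
# output_dir and eval_strategy are placeholders, not from the original run.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./summarization_model",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                 # Native AMP mixed-precision training
    eval_strategy="epoch",     # assumed; matches the per-epoch results below
    predict_with_generate=True,
)
```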

Training Results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Generated Length |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:----------------:|
| 4.1823        | 1.0   | 190  | 3.7432          | 0.1825 | 0.0547 | 0.1382 | 0.1383    | 33.99            |
| 3.5210        | 2.0   | 380  | 3.1028          | 0.2496 | 0.0913 | 0.1987 | 0.1994    | 36.41            |
| 2.9844        | 3.0   | 570  | 2.8471          | 0.2874 | 0.1185 | 0.2312 | 0.2320    | 37.22            |
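
The ROUGE columns above can be reproduced with the `evaluate` library (which wraps `rouge_score`); the predictions and references below are placeholders.

```python
# ROUGE scoring sketch; requires `pip install evaluate rouge_score`.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["model-generated summary"],   # placeholder
    references=["human-written reference"],    # placeholder
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum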

Framework Versions

  • Transformers 4.44.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.0
  • Tokenizers 0.19.1
