# Fine-tuned Text Summarization Model
This repository contains a fine-tuned version of csebuetnlp/mT5_m2o_english_crossSum for abstractive text summarization.
The model has been optimized specifically for generating concise, coherent English summaries from long-form text.
## Model Description
This model is based on the multilingual T5 (mT5) architecture and has been fine-tuned to improve performance on English abstractive summarization. It generates well-structured summaries that preserve essential meaning while reducing verbosity.
The model uses mT5's encoder–decoder architecture and benefits from its pretrained multilingual representations, which can also help it generalize to noisy or domain-specific English text.
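A minimal usage sketch with the `transformers` API is shown below. The generation parameters (`max_new_tokens`, `num_beams`, `no_repeat_ngram_size`) are illustrative choices, not values taken from this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "RamshaAnwar/summarization_model_trained_on_reduced_dataset"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "Long-form English text to summarize ..."

# Truncate so the input fits the mT5 encoder in a single pass.
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)

# Beam search tends to give more coherent abstractive summaries than greedy decoding.
summary_ids = model.generate(
    **inputs,
    max_new_tokens=84,
    num_beams=4,
    no_repeat_ngram_size=2,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```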
## Intended Uses & Limitations
### Intended Uses
- Abstractive summarization of news articles, reports, social media text, academic paragraphs, and general long-form English content.
- Use in applications such as:
  - content condensation tools
  - research assistants
  - note-generation tools
  - automated documentation systems
### Limitations
- May hallucinate facts in cases where the input is ambiguous or overly short.
- Not optimized for:
  - non-English summarization
  - extractive summarization
  - legal, medical, or highly specialized summaries requiring domain accuracy
- Summary quality may decline on extremely long inputs unless chunking is applied; a chunking sketch follows below.
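One common workaround for long documents (not part of this model's training) is to split the input into overlapping token windows, summarize each window, and optionally re-summarize the concatenated window summaries. A minimal sketch, assuming the `tokenizer` loaded above and illustrative window/stride sizes:

```python
def chunk_text(text, tokenizer, max_tokens=512, stride=64):
    """Split text into overlapping windows of at most max_tokens tokens."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    chunks = []
    step = max_tokens - stride  # consecutive windows overlap by `stride` tokens
    for start in range(0, len(ids), step):
        window = ids[start:start + max_tokens]
        chunks.append(tokenizer.decode(window, skip_special_tokens=True))
        if start + max_tokens >= len(ids):
            break
    return chunks

# Summarize each chunk independently, then join (or re-summarize) the results.
```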
## Training and Evaluation Data
This model was trained on a combined dataset of English news, long-form articles, and instructional text.
Data was preprocessed to remove duplicates, extremely short samples, and malformed text.
The validation set consisted of structurally similar English articles to ensure reliable ROUGE evaluation.
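The exact preprocessing pipeline is not published; a representative sketch of the filtering described above, using the `datasets` library and a hypothetical JSON-lines file with a `text` field, might look like this:

```python
from datasets import load_dataset

# "articles.jsonl" is a hypothetical stand-in for the training corpus.
ds = load_dataset("json", data_files="articles.jsonl", split="train")

seen = set()

def keep(example):
    text = example["text"].strip()
    # Drop extremely short or malformed samples (e.g. mojibake markers).
    if len(text.split()) < 30 or "\ufffd" in text:
        return False
    # Drop exact duplicates (single-process filtering keeps `seen` consistent).
    key = hash(text)
    if key in seen:
        return False
    seen.add(key)
    return True

ds = ds.filter(keep)
```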
## Training Procedure
### Training Hyperparameters
The following hyperparameters were used:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam (β1 = 0.9, β2 = 0.999, ε = 1e-08)
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
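For reproducibility, these settings map onto `Seq2SeqTrainingArguments` roughly as follows. The output directory and evaluation strategy are assumptions (per-epoch rows appear in the results table below); the Adam betas and epsilon listed above are the library defaults.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="summarization_model",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                         # Native AMP mixed precision
    eval_strategy="epoch",             # assumed from the per-epoch results
    predict_with_generate=True,        # needed to compute ROUGE during eval
)
```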
### Training Results
| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Avg. Generated Length (tokens) |
|---|---|---|---|---|---|---|---|---|
| 4.1823 | 1.0 | 190 | 3.7432 | 0.1825 | 0.0547 | 0.1382 | 0.1383 | 33.99 |
| 3.5210 | 2.0 | 380 | 3.1028 | 0.2496 | 0.0913 | 0.1987 | 0.1994 | 36.41 |
| 2.9844 | 3.0 | 570 | 2.8471 | 0.2874 | 0.1185 | 0.2312 | 0.2320 | 37.22 |
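ROUGE scores above are F-measures on a 0–1 scale. They can be reproduced with the `evaluate` library (a generic sketch, not the exact evaluation script used here):

```python
import evaluate

rouge = evaluate.load("rouge")

predictions = ["the model's generated summary"]
references = ["the human-written reference summary"]

# Returns rouge1 / rouge2 / rougeL / rougeLsum F-measures in [0, 1].
print(rouge.compute(predictions=predictions, references=references))
```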
## Framework Versions
- Transformers 4.44.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1
## Model Tree

- Model: RamshaAnwar/summarization_model_trained_on_reduced_dataset
- Base model: csebuetnlp/mT5_m2o_english_crossSum