# Fine-tuned Text Summarization Model
This repository contains a fine-tuned version of csebuetnlp/mT5_m2o_english_crossSum for abstractive text summarization.
The model has been optimized specifically for generating concise, coherent English summaries from long-form text.
## Model Description
This model is based on the multilingual T5 (mT5) architecture and has been fine-tuned to improve performance on English abstractive summarization. It generates well-structured summaries that preserve essential meaning while reducing verbosity.
The model uses mT5's encoder–decoder architecture and benefits from its pretrained multilingual representations, which can also help it generalize to noisy or domain-specific English text.
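A minimal usage sketch with the `transformers` API is shown below. The generation parameters (`max_new_tokens`, `num_beams`, `no_repeat_ngram_size`) are illustrative choices, not values taken from this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "RamshaAnwar/summarization_model_trained_on_reduced_dataset"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "Long-form English text to summarize ..."

# Truncate so the input fits the mT5 encoder in a single pass.
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)

# Beam search tends to give more coherent abstractive summaries than greedy decoding.
summary_ids = model.generate(
    **inputs,
    max_new_tokens=84,
    num_beams=4,
    no_repeat_ngram_size=2,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```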
## Intended Uses & Limitations
### Intended Uses
- Abstractive summarization of news articles, reports, social media text, academic paragraphs, and general long-form English content.
- Use in applications such as:
  - content condensation tools
  - research assistants
  - note-generation tools
  - automated documentation systems
### Limitations
- May hallucinate facts in cases where the input is ambiguous or overly short.
- Not optimized for:
  - non-English summarization
  - extractive summarization
  - legal, medical, or highly specialized summaries requiring domain accuracy
- Summary quality may decline on extremely long inputs unless chunking is applied; a chunking sketch follows below.
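One common workaround for long documents (not part of this model's training) is to split the input into overlapping token windows, summarize each window, and optionally re-summarize the concatenated window summaries. A minimal sketch, assuming the `tokenizer` loaded above and illustrative window/stride sizes:

```python
def chunk_text(text, tokenizer, max_tokens=512, stride=64):
    """Split text into overlapping windows of at most max_tokens tokens."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    chunks = []
    step = max_tokens - stride  # consecutive windows overlap by `stride` tokens
    for start in range(0, len(ids), step):
        window = ids[start:start + max_tokens]
        chunks.append(tokenizer.decode(window, skip_special_tokens=True))
        if start + max_tokens >= len(ids):
            break
    return chunks

# Summarize each chunk independently, then join (or re-summarize) the results.
```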
## Training and Evaluation Data
This model was trained on a combined dataset of English news, long-form articles, and instructional text.
Data was preprocessed to remove duplicates, extremely short samples, and malformed text.
The validation set consisted of structurally similar English articles to ensure reliable ROUGE evaluation.
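The exact preprocessing pipeline is not published; a representative sketch of the filtering described above, using the `datasets` library and a hypothetical JSON-lines file with a `text` field, might look like this:

```python
from datasets import load_dataset

# "articles.jsonl" is a hypothetical stand-in for the training corpus.
ds = load_dataset("json", data_files="articles.jsonl", split="train")

seen = set()

def keep(example):
    text = example["text"].strip()
    # Drop extremely short or malformed samples (e.g. mojibake markers).
    if len(text.split()) < 30 or "\ufffd" in text:
        return False
    # Drop exact duplicates (single-process filtering keeps `seen` consistent).
    key = hash(text)
    if key in seen:
        return False
    seen.add(key)
    return True

ds = ds.filter(keep)
```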
## Training Procedure
### Training Hyperparameters
The following hyperparameters were used:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam (β1 = 0.9, β2 = 0.999, ε = 1e-08)
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
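For reproducibility, these settings map onto `Seq2SeqTrainingArguments` roughly as follows. The output directory and evaluation strategy are assumptions (per-epoch rows appear in the results table below); the Adam betas and epsilon listed above are the library defaults.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="summarization_model",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                         # Native AMP mixed precision
    eval_strategy="epoch",             # assumed from the per-epoch results
    predict_with_generate=True,        # needed to compute ROUGE during eval
)
```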
### Training Results
| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Avg. Generated Length (tokens) |
|---|---|---|---|---|---|---|---|---|
| 4.1823 | 1.0 | 190 | 3.7432 | 0.1825 | 0.0547 | 0.1382 | 0.1383 | 33.99 |
| 3.5210 | 2.0 | 380 | 3.1028 | 0.2496 | 0.0913 | 0.1987 | 0.1994 | 36.41 |
| 2.9844 | 3.0 | 570 | 2.8471 | 0.2874 | 0.1185 | 0.2312 | 0.2320 | 37.22 |
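ROUGE scores above are F-measures on a 0–1 scale. They can be reproduced with the `evaluate` library (a generic sketch, not the exact evaluation script used here):

```python
import evaluate

rouge = evaluate.load("rouge")

predictions = ["the model's generated summary"]
references = ["the human-written reference summary"]

# Returns rouge1 / rouge2 / rougeL / rougeLsum F-measures in [0, 1].
print(rouge.compute(predictions=predictions, references=references))
```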
## Framework Versions
- Transformers 4.44.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1
## Model Tree

- Model: RamshaAnwar/summarization_model_trained_on_reduced_dataset
- Base model: csebuetnlp/mT5_m2o_english_crossSum