---
language: en
library_name: transformers
license: apache-2.0
base_model: sshleifer/distilbart-cnn-12-6
tags:
- summarization
- text-generation
- fine-tuned-model
- bart
model-index:
- name: General Text Summarizer
  results:
  - task:
      type: summarization
      name: Text Summarization
    dataset:
      name: CNN/DailyMail
      type: cnn_dailymail
    metrics:
    - name: Rouge1
      type: rouge
      value: 36.61
    - name: Rouge2
      type: rouge
      value: 16.51
    - name: RougeL
      type: rouge
      value: 26.24
    - name: RougeLsum
      type: rouge
      value: 33.45
---

# 🧠 General Text Summarizer

This model is a fine-tuned version of [sshleifer/distilbart-cnn-12-6](https://huggingface.co/sshleifer/distilbart-cnn-12-6), trained to generate **concise and fluent summaries** of general English text — including **news articles, essays, stories, and blog posts**.

---

## 🚀 Model Description

- **Base model:** DistilBART (CNN/DailyMail)
- **Framework:** 🤗 Transformers (PyTorch)
- **Training goal:** Summarize text across multiple domains (not limited to one topic)
- **Device optimized:** CPU & Apple M-series chips (MPS compatible; see the device-selection sketch below)

This model is suitable for lightweight summarization tasks on laptops or limited-resource machines.
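
The sketch below shows one way to pick the device at load time: prefer Apple's MPS backend when it is available and fall back to CPU otherwise. This is a minimal sketch assuming a recent PyTorch build with MPS support.

```python
import torch
from transformers import pipeline

# Prefer Apple's MPS backend when available, otherwise fall back to CPU
device = "mps" if torch.backends.mps.is_available() else "cpu"

summarizer = pipeline(
    "summarization",
    model="Fathi7ma/general_text_summarizer_cpu",
    device=device,
)
```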

---

## 🧾 Example Usage

```python
from transformers import pipeline

# Load the fine-tuned summarization pipeline from the Hub
summarizer = pipeline("summarization", model="Fathi7ma/general_text_summarizer_cpu")

text = """
Climate change continues to affect weather patterns across the globe.
Scientists warn that without immediate action, rising temperatures may lead
to irreversible damage to ecosystems and human livelihoods.
"""

# Greedy decoding (do_sample=False) keeps the summary deterministic
summary = summarizer(text, max_length=80, min_length=25, do_sample=False)
print(summary[0]["summary_text"])
```

## Intended uses

This model can summarize:

- News articles
- Research abstracts
- Reports and blogs
- Long paragraphs of general English text (see the chunking sketch below)

Example domains: general news, education, business summaries, and everyday content.
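
DistilBART inherits BART's 1,024-token input limit, so anything longer is truncated. The sketch below is one illustrative way to handle long documents: summarize fixed-size character chunks and join the partial summaries. The chunk size and joining strategy are assumptions for illustration, not part of this model's training setup.

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="Fathi7ma/general_text_summarizer_cpu")

def summarize_long(text: str, chunk_chars: int = 3000) -> str:
    """Summarize each chunk separately, then join the partial summaries."""
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    parts = summarizer(chunks, max_length=80, min_length=25, do_sample=False)
    return " ".join(p["summary_text"] for p in parts)
```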

## Training

- Dataset: a subset of CNN/DailyMail, filtered and balanced for general summarization.
- Approximately 10,000 samples were used for CPU-efficient fine-tuning.
- Texts were trimmed and normalized for readability.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 1
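
For reference, these settings map onto 🤗 `TrainingArguments` roughly as follows. This is a sketch: `output_dir` is a hypothetical path, and anything not listed above is left at its default.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="general_text_summarizer_cpu",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch_fused",
    lr_scheduler_type="linear",
    num_train_epochs=1,
)
```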

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 2.2534        | 1.0   | 600  | 2.1023          | 36.61  | 16.51  | 26.24  | 33.45     |
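
The ROUGE scores above can be recomputed with the 🤗 `evaluate` library. The snippet below is a minimal sketch with placeholder prediction/reference pairs; note that `evaluate` reports ROUGE as fractions in [0, 1], while the table values are scaled by 100.

```python
import evaluate

# Placeholder pairs; replace with model outputs and gold summaries
predictions = ["Scientists warn climate change may cause irreversible damage."]
references = ["Rising temperatures may lead to irreversible damage to ecosystems."]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```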

### Framework versions

- Transformers 4.57.1
- PyTorch 2.9.0
- Datasets 4.3.0
- Tokenizers 0.22.1