🧠 GPT-2 QLoRA Summarizer

📌 Overview

This model is a fine-tuned version of GPT-2 using QLoRA (Quantized Low-Rank Adaptation) for the task of abstractive text summarization.

The goal of this project is to evaluate parameter-efficient fine-tuning techniques for LLMs under limited computational resources.

🎯 Model Details

Developed by: Prasanna Nagarale
Model type: Causal Language Model (Decoder-only Transformer)
Base model: GPT-2
Fine-tuning method: QLoRA (PEFT)
Task: Text Summarization
Framework: Hugging Face Transformers + PEFT
Language: English

⚙️ Training Details

📊 Dataset

Dataset Used: CNN/DailyMail
Contains:
- article → input text
- highlights → target summary

🧹 Preprocessing

Removed short or invalid samples, keeping only those with:
- Article length > 100 characters
- Summary length > 20 characters
Tokenized inputs with a maximum length of 512 tokens
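
The length filters above can be sketched as a small predicate. This is a minimal illustration, not the project's actual preprocessing code: the function name `is_valid_sample` and the constant names are hypothetical, while the `article` and `highlights` field names and the two thresholds come from this card. Treating missing or non-string fields as "invalid" is an assumption.

```python
MIN_ARTICLE_CHARS = 100  # article must be longer than this (from the card)
MIN_SUMMARY_CHARS = 20   # summary must be longer than this (from the card)


def is_valid_sample(example):
    """Return True if a sample passes the length filters described above."""
    article = example.get("article")
    summary = example.get("highlights")
    # Assumption: missing or non-string fields count as invalid samples.
    if not isinstance(article, str) or not isinstance(summary, str):
        return False
    return (len(article.strip()) > MIN_ARTICLE_CHARS
            and len(summary.strip()) > MIN_SUMMARY_CHARS)


samples = [
    {"article": "x" * 150, "highlights": "y" * 30},    # kept
    {"article": "too short", "highlights": "y" * 30},  # dropped: short article
    {"article": "x" * 150, "highlights": None},        # dropped: invalid summary
]
kept = [s for s in samples if is_valid_sample(s)]
```

With the `datasets` library, the same predicate could be passed to `Dataset.filter`, and truncation to 512 tokens can be requested from the tokenizer with `truncation=True, max_length=512`.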

🧠 Fine-Tuning Approach

Used QLoRA for efficient training:
- 4-bit quantization
- LoRA adapters added to transformer layers
Enabled training on limited GPU resources (Google Colab)
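
A QLoRA setup along these lines might look like the sketch below. The card does not state the LoRA rank, alpha, dropout, quantization type, or target modules, so every value in `QUANT_SETTINGS` and `LORA_SETTINGS` is an illustrative assumption, and `build_qlora_model` is a hypothetical helper name rather than the author's code.

```python
# Illustrative values only: none of these are documented in the card.
QUANT_SETTINGS = {
    "load_in_4bit": True,               # 4-bit quantization, as described above
    "bnb_4bit_quant_type": "nf4",       # assumption
    "bnb_4bit_use_double_quant": True,  # assumption
}
LORA_SETTINGS = {
    "r": 8,                             # assumption
    "lora_alpha": 16,                   # assumption
    "lora_dropout": 0.05,               # assumption
    "target_modules": ["c_attn"],       # assumption: GPT-2's fused attention projection
    "task_type": "CAUSAL_LM",
}


def build_qlora_model(base_model="gpt2"):
    """Load a 4-bit quantized base model and wrap it with LoRA adapters."""
    # Imported lazily so the settings can be inspected without a GPU setup.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    bnb_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.float16,  # float16 suits a Colab T4
        **QUANT_SETTINGS,
    )
    model = AutoModelForCausalLM.from_pretrained(
        base_model, quantization_config=bnb_config, device_map="auto"
    )
    # Prepares the quantized model for training (e.g. casts norm layers).
    model = prepare_model_for_kbit_training(model)
    # Only the small adapter matrices are trainable after this call.
    return get_peft_model(model, LoraConfig(**LORA_SETTINGS))
```

Calling `build_qlora_model()` requires a CUDA GPU with `bitsandbytes` and `accelerate` installed; `print_trainable_parameters()` on the returned model reports how few weights are actually updated.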

⚙️ Hyperparameters (Approx)

Max input length: 512
Max output tokens: 100
Training samples: ~1000
Evaluation samples: ~200
Batch size: small (Colab-friendly)

📈 Evaluation

📊 Metrics Used

ROUGE-1, ROUGE-2, ROUGE-L
Used to evaluate summary quality
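
To make the metrics concrete, here is a simplified sketch of ROUGE-N and ROUGE-L F-measures. It assumes lowercased whitespace tokenization with no stemming, so its scores can differ from those of a full ROUGE package; the function names are hypothetical and this is not the evaluation code used for the results in this card.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, pred_total, ref_total):
    """Harmonic mean of precision and recall, 0.0 when there is no overlap."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n=1):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    pred = _ngrams(prediction.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # counts clipped to the minimum
    return _f1(overlap, sum(pred.values()), sum(ref.values()))


def rouge_l(prediction, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    a, b = prediction.lower().split(), reference.lower().split()
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return _f1(prev[-1], len(a), len(b))


pred = "the cat sat on the mat"
ref = "the cat lay on the mat"
scores = (rouge_n(pred, ref, 1), rouge_n(pred, ref, 2), rouge_l(pred, ref))
```

For this pair, five of six unigrams and three of five bigrams match, and the longest common subsequence has five tokens.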

📉 Results (Baseline vs Fine-Tuned)

Model	ROUGE-1	ROUGE-2	ROUGE-L
GPT-2 (baseline)	~0.16	~0.09	~0.12
Phi-2 (baseline)	~0.17	~0.096	~0.13

👉 The fine-tuned model is reported to show improved contextual summarization capability; its ROUGE scores are not listed in the table above.

🚀 Usage

🔹 Load Model

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("prasanna030/gpt2-qlora-summarizer")
tokenizer = AutoTokenizer.from_pretrained("prasanna030/gpt2-qlora-summarizer")
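
Once the model and tokenizer are loaded, generating a summary could look like the sketch below. The input and output limits come from the hyperparameters in this card, but the prompt format used during training is not documented, so the `TL;DR:` suffix is an assumption, as are the helper names `build_prompt` and `summarize` and the `no_repeat_ngram_size` setting.

```python
MAX_INPUT_TOKENS = 512      # from the hyperparameters in this card
MAX_NEW_TOKENS = 100        # from the hyperparameters in this card
PROMPT_SUFFIX = "\nTL;DR:"  # assumption: the training prompt format is not documented


def build_prompt(article):
    """Append the (assumed) summary cue to the article text."""
    return article.strip() + PROMPT_SUFFIX


def summarize(article, model, tokenizer):
    """Generate a summary with an already loaded model and tokenizer."""
    import torch  # imported lazily; only needed when generating

    # Truncate the article so the cue still fits within the input budget.
    suffix_ids = tokenizer(PROMPT_SUFFIX)["input_ids"]
    article_ids = tokenizer(
        article.strip(),
        truncation=True,
        max_length=MAX_INPUT_TOKENS - len(suffix_ids),
    )["input_ids"]
    input_ids = torch.tensor([article_ids + suffix_ids]).to(model.device)

    output = model.generate(
        input_ids=input_ids,
        attention_mask=torch.ones_like(input_ids),
        max_new_tokens=MAX_NEW_TOKENS,
        no_repeat_ngram_size=3,               # assumption: may curb repetitive text
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

A call would then be `summarize(article_text, model, tokenizer)`, where `article_text` is any English news article as a string.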

💡 Intended Use
News article summarization
Content condensation
Educational demos for fine-tuning LLMs
Low-resource NLP experimentation

⚠️ Limitations
GPT-2 is not inherently optimized for summarization
May generate:
- repetitive text
- incomplete summaries
Performance limited due to:
- small dataset subset
- lightweight training

🚫 Out-of-Scope Use
Not suitable for:
- critical decision-making
- medical/legal summarization
- factual verification tasks

🧠 Key Insight

This project demonstrates that:

Parameter-efficient fine-tuning methods like QLoRA can improve model performance even on limited hardware, such as a single Colab T4 GPU.

📦 Training Environment
Platform: Google Colab
GPU: T4
Libraries:
- transformers
- peft
- datasets
- bitsandbytes

📜 License

This model follows the license of the base model (GPT-2).

🙌 Acknowledgements
Hugging Face 🤗
CNN/DailyMail Dataset
Open-source LLM community

📬 Contact

Developed by Prasanna Nagarale
