# Model Card for Vishal1095/Summarizer-Flan-t5-small-Samsum
A lightweight, instruction-tuned summarization model.

- Base: google/flan-t5-small, Google's most compact version of the T5 model (about 80 million parameters).
- Technique: fine-tuned with LoRA (Low-Rank Adaptation).
## Model Details
- Base Model: google/flan-t5-small
- Training Technique: LoRA (Low-Rank Adaptation)
- Task: Abstractive Summarization
- Language: English
## Model Description
This model is a fine-tuned version of google/flan-t5-small using Low-Rank Adaptation (LoRA) on the SAMSum dataset.
It is designed to take messenger-style dialogues and condense them into concise, third-person summaries. Because it uses PEFT (Parameter-Efficient Fine-Tuning), the model is lightweight and highly efficient for inference.
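The efficiency gain from LoRA can be sketched numerically. The idea (illustrative only, not the PEFT internals): instead of updating a full weight matrix, train two small low-rank matrices whose product approximates the update. The dimensions below are hypothetical; `r=16` and `alpha=32` match this model's configuration.

```python
import torch

# Illustrative LoRA sketch: a full weight W (d_out x d_in) stays frozen,
# while two small matrices A (r x d_in) and B (d_out x r) are trained,
# with r much smaller than the matrix dimensions.
d_in, d_out, r, alpha = 512, 512, 16, 32

W = torch.randn(d_out, d_in)      # frozen base weight
A = torch.randn(r, d_in) * 0.01   # trainable, low-rank
B = torch.zeros(d_out, r)         # trainable, initialised to zero

# Effective weight at inference time: W + (alpha / r) * B @ A
delta = (alpha / r) * (B @ A)
W_effective = W + delta

full_params = W.numel()
lora_params = A.numel() + B.numel()
print(f"full: {full_params}, LoRA: {lora_params} "
      f"({100 * lora_params / full_params:.2f}% of full)")
```

With these toy dimensions, the LoRA matrices hold 16,384 parameters versus 262,144 for the full matrix, which is why the saved adapter is so small relative to the base model.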
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was automatically generated.
- Developed by: Vishal Roy
- Funded by: Community / Self-funded
- Shared by: Vishal1095
- Model type: Seq2Seq language model (fine-tuned with LoRA adapters)
- Language(s) (NLP): English (en)
- License: Apache 2.0 (inherited from the base FLAN-T5 model)
- Finetuned from model: google/flan-t5-small
## How to Get Started with the Model
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

base_model_id = "google/flan-t5-small"
peft_model_id = "results"  # path used with trainer.model.save_pretrained, or the Hub repo id

# LoRA does not change the tokenizer, so load it from the base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model_id, device_map="auto")
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()  # set to evaluation mode


def summarize_text(text):
    # SAMSum/T5 usually performs better with a prompt prefix
    inputs = tokenizer(
        "summarize: " + text, return_tensors="pt", truncation=True
    ).to(model.device)  # follow the model's device instead of hard-coding "cuda"
    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=100,
            do_sample=True,
            top_p=0.9,
        )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]


sample_dialogue = """
Amanda: I baked some cookies. Do you want some?
Jerry: Sure! I'll be over in 10 minutes.
Amanda: Great, they are still warm.
Ashley: Guys, you have to read this book! <file_photo>
Marcus: Why, what's so special about it?
Erin: I think I've already heard about it from someone. Is it really that good?
Ashley: It's the best thing I've ever read! Completely life-changing! It's opened my eyes to a lot of things.
Seamus: Sorry, but I don't like books that are written to change my life. I prefer books that are simply fun to read :P
Marcus: I get what you mean. I feel like some authors are so concentrated on making their books full of wisdom that they completely forget that they should also be readable.
Erin: Do you mean Coelho? XD
Marcus: No, while I'm not a fan of his, at least I've never fallen asleep while reading his books. I meant this one for example: <file_other>
Ashley: Erm, I quite like his books.
Seamus: Did they change your life too? :D
Ashley: Wait, I meant Coelho. I've never read the other guy.
Marcus: Trust me, don't. There are lots of better ways of wasting your time.
Ashley: LOL, okay, I trust you. But the one I posted at the beginning is really good. It's not just some philosophical gibberish, it's actually a crime novel, so there's a lot of action too.
Erin: Does it have a cute detective? ;)
Ashley: Even two of them, actually. Believe me, you won't be able to decide which one to love more!
Erin: Okay, I'm already sold :D
"""

print(f"Summary: {summarize_text(sample_dialogue)}")
```
## Training Hyperparameters
The following hyperparameters were used during training:
| Parameter | Value |
|---|---|
| Learning Rate | 1e-3 |
| Train Epochs | 3 |
| LoRA R | 16 |
| LoRA Alpha | 32 |
| Target Modules | Query (q), Value (v) |
## Model tree for Vishal1095/Summarizer-Flan-t5-small-Samsum
- Base model: google/flan-t5-small