Model Card for Vishal1095/Summarizer-Flan-t5-small-Samsum

This is a lightweight, instruction-tuned model for abstractive dialogue summarization.

Base: google/flan-t5-small, the most compact model in Google's FLAN-T5 family (about 80 million parameters).

Technique: fine-tuned with LoRA (Low-Rank Adaptation) on the SAMSum dialogue-summarization dataset.

Model Details

  • Base Model: google/flan-t5-small
  • Training Technique: LoRA (Low-Rank Adaptation)
  • Task: Abstractive Summarization
  • Language: English

Model Description

This model is a fine-tuned version of google/flan-t5-small using Low-Rank Adaptation (LoRA) on the SAMSum dataset.
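
For reference, each SAMSum example pairs a messenger-style dialogue with a human-written, third-person summary. A minimal way to inspect the data (the exact Hub repo used for training is not stated in this card; knkarthick/samsum is a commonly used mirror):

from datasets import load_dataset

# "knkarthick/samsum" is an assumed mirror of SAMSum on the Hub,
# not necessarily the exact copy used for this fine-tune.
dataset = load_dataset("knkarthick/samsum")
example = dataset["train"][0]
print(example["dialogue"])  # multi-turn chat transcript
print(example["summary"])   # short third-person summary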

It is designed to take messenger-style dialogues and condense them into concise, third-person summaries. Because it uses PEFT (Parameter-Efficient Fine-Tuning), the model is lightweight and highly efficient for inference.
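
Since only the low-rank adapter weights are trained, they can also be merged back into the base model for deployment, removing the PEFT wrapper entirely. A minimal sketch using the standard PEFT API (the adapter path and output directory below are illustrative):

from transformers import AutoModelForSeq2SeqLM
from peft import PeftModel

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
# "results" is the local adapter directory used in the example below; adjust as needed.
model = PeftModel.from_pretrained(base, "results")

# Fold the low-rank updates into the base weights: the result is a plain
# seq2seq model with no adapter indirection at inference time.
merged = model.merge_and_unload()
merged.save_pretrained("flan-t5-small-samsum-merged")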

This is the model card of a 🤗 transformers model that has been pushed to the Hub.

  • Developed by: Vishal Roy
  • Funded by: Community / Self-funded
  • Shared by: Vishal1095
  • Model type: Seq2Seq language model (fine-tuned with LoRA adapters)
  • Language(s) (NLP): English (en)
  • License: Apache 2.0 (inherited from the base FLAN-T5 model)
  • Finetuned from model: google/flan-t5-small

How to Get Started with the Model


import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

base_model_id = "google/flan-t5-small"
peft_model_id = "results" # This should match your trainer.model.save_pretrained path

# Load the tokenizer from the base model (the adapter folder may not include tokenizer files)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the base model, then attach the LoRA adapter on top of it
model = AutoModelForSeq2SeqLM.from_pretrained(base_model_id, device_map="auto")
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval() # Set to evaluation mode

def summarize_text(text):
    # T5-style models usually perform better with a task prefix
    inputs = tokenizer("summarize: " + text, return_tensors="pt", truncation=True).to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"], 
            max_new_tokens=100,
            do_sample=True,
            top_p=0.9
        )
    
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

sample_dialogue = """
Ashley: Guys, you have to read this book! <file_photo>
Marcus: Why, what's so special about it?
Erin: I think I've already heard about it from someone. Is it really that good?
Ashley: It's the best thing I've ever read! Completely life-changing! It's opened my eyes to a lot of things.
Seamus: Sorry, but I don't like books that are written to change my life. I prefer books that are simply fun to read :P
Marcus: I get what you mean. I feel like some authors are so concentrated on making their books full of wisdom that they completely forget that they should also be readable.
Erin: Do you mean Coelho? XD
Marcus: No, while I'm not a fan of his, at least I've never fallen asleep while reading his books. I meant this one for example: <file_other>
Ashley: Erm, I quite like his books.
Seamus: Did they change your life too? :D
Ashley: Wait, I meant Coelho. I've never read the other guy.
Marcus: Trust me, don't. There are lots of better ways of wasting your time.
Ashley: LOL, okay, I trust you. But the one I posted at the beginning is really good. It's not just some philosophical gibberish, it's actually a crime novel, so there's a lot of action too.
Erin: Does it have a cute detective? ;)
Ashley: Even two of them, actually. Believe me, you won't be able to decide which one to love more!
Erin: Okay, I'm already sold :D
"""

print(f"Summary: {summarize_text(sample_dialogue)}")

Training Hyperparameters

The following hyperparameters were used during training:

Parameter         Value
Learning Rate     1e-3
Train Epochs      3
LoRA R            16
LoRA Alpha        32
Target Modules    Query (q), Value (v)
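
Based on the table above, the LoRA setup would look roughly like the following sketch; r, lora_alpha, and target_modules come from this card, while lora_dropout and the task type are illustrative assumptions:

from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                        # LoRA rank, from the table above
    lora_alpha=32,               # scaling factor, from the table above
    target_modules=["q", "v"],   # T5 attention query/value projections
    lora_dropout=0.05,           # assumed value; not stated in this card
)

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()  # only the adapter weights are trainable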