Dataset
knkarthick/samsum
This repository contains a fine-tuned T5-small model for abstractive conversational text summarization.
Given a multi-speaker dialogue, the model generates a concise natural-language summary that captures the main points of the conversation.
The model was fine-tuned on the SAMSum dataset, which consists of chat-style conversations paired with human-written summaries.
Each example has two fields:
dialogue: conversation text (input)
summary: reference summary (target)
Training used the Hugging Face Transformers Seq2SeqTrainer.
The model was evaluated on the test split of the SAMSum dataset using ROUGE metrics.
(ROUGE scores are not listed here; replace this note with the values obtained in the notebook.)
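To make the metric concrete, here is a minimal sketch of how a ROUGE-N F1 score compares a generated summary with a reference. It is a simplified illustration that assumes lowercased whitespace tokenization; it is not the evaluation code used for this model, and standard ROUGE packages apply their own tokenization and optional stemming, so their numbers will differ. The function names `ngrams` and `rouge_n` and the two example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(prediction, reference, n=1):
    """Simplified ROUGE-N F1 using lowercased whitespace tokens."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Made-up example pair, not taken from the SAMSum test split.
prediction = "amanda will bring cookies tomorrow"
reference = "amanda baked cookies and will bring some tomorrow"
print(round(rouge_n(prediction, reference, n=1), 3))  # unigram overlap
print(round(rouge_n(prediction, reference, n=2), 3))  # bigram overlap
```

A higher score means more n-grams are shared with the human-written summary; it says nothing on its own about factual correctness.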
This model can be used for summarizing multi-speaker, chat-style dialogues. Example usage:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
repo_id = "marvingoenner/470finalprojectmodel"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)
dialogue = "Amanda: I baked cookies. Jerry: Sounds great! Amanda: I will bring some tomorrow."
# Prepend the T5 task prefix and truncate inputs that exceed the model's maximum length.
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
# Generate at most 64 new tokens for the summary.
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
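When a conversation is available as structured turns rather than one string, the input can be assembled in the same "Speaker: utterance" form used in the example. The sketch below is one possible way to do that; the helper `build_input` and its parameters are hypothetical and not part of this repository, and the "summarize: " prefix is taken from the usage example.

```python
def build_input(turns, prefix="summarize: ", separator=" "):
    """Join (speaker, utterance) pairs into a single prefixed input string."""
    lines = [f"{speaker}: {text.strip()}" for speaker, text in turns]
    return prefix + separator.join(lines)


turns = [
    ("Amanda", "I baked cookies."),
    ("Jerry", "Sounds great!"),
    ("Amanda", "I will bring some tomorrow."),
]
text = build_input(turns)
print(text)
```

The resulting string can be passed to the tokenizer in place of "summarize: " + dialogue.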
Base model
google-t5/t5-small