---
language: en
license: apache-2.0
tags:
- summarization
- conversational-ai
- text2text-generation
- t5
datasets:
- knkarthick/samsum
metrics:
- rouge
base_model:
- google-t5/t5-small
---

# 470 Final Project Model -> Summary Model

## Model Overview

This repository contains a fine-tuned **T5-small** model for **abstractive conversational text summarization**. Given a multi-speaker dialogue, the model generates a concise natural-language summary that captures the main points of the conversation.

- **Base model:** google-t5/t5-small
- **Task:** Abstractive text summarization
- **Model type:** Encoder–decoder transformer (T5)

---

## Dataset

The model was fine-tuned on the **SAMSum** dataset, which consists of chat-style conversations paired with human-written summaries.

- **Dataset name:** knkarthick/samsum
- **Fields:**
  - `dialogue`: conversation text (input)
  - `summary`: reference summary (target)
- **Splits:** train / validation / test

---

## Training Details

- **Epochs:** 3
- **Learning rate:** 3e-4
- **Batch size:** 8
- **Max input length:** 512 tokens
- **Max target length:** 128 tokens
- **Training framework:** Hugging Face Transformers (`Seq2SeqTrainer`)
- **Hardware:** GPU (Google Colab)

---

## Evaluation

The model was evaluated on the test split of the SAMSum dataset using ROUGE metrics.

- **ROUGE-1:** 0.4538
- **ROUGE-2:** 0.2123
- **ROUGE-L:** 0.3762

---

## Intended Uses

This model can be used for:

- Summarizing chat conversations or dialogues
- Demonstrations of abstractive summarization
- Educational purposes in NLP and machine learning

---

## Limitations

- The model may omit important details in long or complex conversations.
- Generated summaries may occasionally be imprecise or incomplete.
- The model is trained on informal dialogue and may not generalize well to other domains.
---

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "marvingoenner/470finalprojectmodel"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

dialogue = "Amanda: I baked cookies. Jerry: Sounds great! Amanda: I will bring some tomorrow."

# T5 was trained with a task prefix, so prepend "summarize: " at inference time.
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```