---
language: en
license: apache-2.0
tags:
- summarization
- conversational-ai
- text2text-generation
- t5
datasets:
- knkarthick/samsum
metrics:
- rouge
base_model:
- google-t5/t5-small
---
# 470 Final Project Model: Dialogue Summarization
## Model Overview
This repository contains a fine-tuned **T5-small** model for **abstractive conversational text summarization**.
Given a multi-speaker dialogue, the model generates a concise natural-language summary that captures the main points of the conversation.
- **Base model:** google-t5/t5-small
- **Task:** Abstractive text summarization
- **Model type:** Encoder–decoder transformer (T5)
---
## Dataset
The model was fine-tuned on the **SAMSum** dataset, which consists of chat-style conversations paired with human-written summaries.
- **Dataset name:** knkarthick/samsum
- **Fields:**
- `dialogue`: conversation text (input)
- `summary`: reference summary (target)
- **Splits:** train / validation / test
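During fine-tuning, each record's `dialogue` becomes the model input and its `summary` becomes the target. A minimal sketch of this pairing (the `summarize:` prefix follows the T5 task-prefix convention; the example record below is illustrative, not an actual dataset entry):

```python
# Illustrative SAMSum-style record (not taken from the dataset).
record = {
    "dialogue": "Amanda: I baked cookies. Jerry: Sounds great!",
    "summary": "Amanda baked cookies and will share them.",
}

def to_model_pair(record):
    """Build the (input, target) text pair used for T5 fine-tuning."""
    source = "summarize: " + record["dialogue"]  # T5 task prefix
    target = record["summary"]
    return source, target

source, target = to_model_pair(record)
```

The actual notebook tokenizes these strings with the T5 tokenizer before training.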
---
## Training Details
- **Epochs:** 3
- **Learning rate:** 3e-4
- **Batch size:** 8
- **Max input length:** 512 tokens
- **Max target length:** 128 tokens
- **Training framework:** Hugging Face Transformers (`Seq2SeqTrainer`)
- **Hardware:** GPU (Google Colab)
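The hyperparameters above correspond roughly to a `Seq2SeqTrainingArguments` configuration like the following (a sketch only; argument names follow the Hugging Face Transformers API, `output_dir` is a placeholder, and the exact setup in the notebook may differ):

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration described above.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-samsum",       # placeholder path
    num_train_epochs=3,
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    predict_with_generate=True,          # generate summaries during evaluation
)
```

The max input (512) and target (128) lengths are applied in the tokenization step, not here.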
---
## Evaluation
The model was evaluated on the test split of the SAMSum dataset using ROUGE metrics.
- **ROUGE-1:** 0.4538
- **ROUGE-2:** 0.2123
- **ROUGE-L:** 0.3762
(Placeholder values: replace with the scores obtained in the evaluation notebook.)
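ROUGE-N measures n-gram overlap between a generated summary and its reference. As a rough illustration, a simplified unigram ROUGE-1 F1 can be computed in plain Python (this sketch omits stemming and other normalization that real ROUGE implementations apply; the reported scores were produced with a proper ROUGE library):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1: F1 over clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("amanda baked cookies", "amanda baked cookies for jerry")
# precision = 3/3, recall = 3/5, F1 = 0.75
```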
---
## Intended Uses
This model can be used for:
- Summarizing chat conversations or dialogues
- Demonstrations of abstractive summarization
- Educational purposes in NLP and machine learning
---
## Limitations
- The model may omit important details in long or complex conversations.
- Generated summaries may occasionally be imprecise or incomplete.
- The model is trained on informal dialogue and may not generalize well to other domains.
---
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "marvingoenner/470finalprojectmodel"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

dialogue = "Amanda: I baked cookies. Jerry: Sounds great! Amanda: I will bring some tomorrow."

# T5 expects the "summarize:" task prefix; truncation guards against
# dialogues longer than the 512-token input limit used in training.
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```