---
language: en
license: apache-2.0
tags:
- summarization
- conversational-ai
- text2text-generation
- t5
datasets:
- knkarthick/samsum
metrics:
- rouge
base_model:
- google-t5/t5-small
---

# 470 Final Project Model: Conversational Summary Model

## Model Overview

This repository contains a fine-tuned **T5-small** model for **abstractive conversational text summarization**. Given a multi-speaker dialogue, the model generates a concise natural-language summary that captures the main points of the conversation.

- **Base model:** google-t5/t5-small
- **Task:** Abstractive text summarization
- **Model type:** Encoder–decoder transformer (T5)

---

## Dataset

The model was fine-tuned on the **SAMSum** dataset, which consists of chat-style conversations paired with human-written summaries.

- **Dataset name:** knkarthick/samsum
- **Fields:**
  - `dialogue`: conversation text (input)
  - `summary`: reference summary (target)
- **Splits:** train / validation / test
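
Each record pairs a `dialogue` (model input) with a `summary` (generation target). A minimal sketch of how a record is turned into a T5 input — the `summarize: ` prefix matches the usage snippet in this card; the record values here are illustrative, not real SAMSum data:

```python
# Illustrative SAMSum-style record (made-up values, not actual dataset content).
example = {
    "dialogue": "Amanda: I baked cookies.\nJerry: Sounds great!",
    "summary": "Amanda baked cookies.",
}

def build_model_input(record, prefix="summarize: "):
    # T5 is a text-to-text model: the task is signalled by prepending a text
    # prefix to the dialogue; the summary field serves as the target sequence.
    return prefix + record["dialogue"]

print(build_model_input(example))
```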

---

## Training Details

- **Epochs:** 3
- **Learning rate:** 3e-4
- **Batch size:** 8
- **Max input length:** 512 tokens
- **Max target length:** 128 tokens
- **Training framework:** Hugging Face Transformers (`Seq2SeqTrainer`)
- **Hardware:** GPU (Google Colab)
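
The hyperparameters above map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a sketch, not the exact notebook code: `output_dir` and the `fp16` flag are assumptions, and the 512/128-token limits are applied during tokenization (`max_length`/`truncation`) rather than here.

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters from the list above; output_dir and fp16 are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-samsum",    # assumed checkpoint directory
    num_train_epochs=3,              # Epochs: 3
    learning_rate=3e-4,              # Learning rate: 3e-4
    per_device_train_batch_size=8,   # Batch size: 8
    per_device_eval_batch_size=8,
    predict_with_generate=True,      # generate summaries during evaluation
    fp16=True,                       # typical on a Colab GPU (assumption)
)
```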

---

## Evaluation

The model was evaluated on the test split of the SAMSum dataset using ROUGE metrics.

- **ROUGE-1:** 0.4538
- **ROUGE-2:** 0.2123
- **ROUGE-L:** 0.3762
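
For intuition, ROUGE-1 is the F1 score over unigram overlap between a generated summary and its reference. A plain-Python toy version (the scores above were computed with a proper ROUGE implementation, which also applies stemming and other normalization):

```python
from collections import Counter

def rouge1_f1(reference, candidate):
    # Count unigrams shared between the reference and candidate summaries.
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    overlap = sum((Counter(ref) & Counter(cand)).values())
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```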

---

## Intended Uses

This model can be used for:

- Summarizing chat conversations or dialogues
- Demonstrations of abstractive summarization
- Educational purposes in NLP and machine learning

---

## Limitations

- The model may omit important details in long or complex conversations.
- Generated summaries may occasionally be imprecise or incomplete.
- The model is trained on informal dialogue and may not generalize well to other domains.

---

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "marvingoenner/470finalprojectmodel"

# Load the fine-tuned checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

dialogue = "Amanda: I baked cookies. Jerry: Sounds great! Amanda: I will bring some tomorrow."

# T5 expects a task prefix; truncation guards against inputs over the 512-token limit.
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```