---
language: en
license: apache-2.0
tags:
- summarization
- conversational-ai
- text2text-generation
- t5
datasets:
- knkarthick/samsum
metrics:
- rouge
base_model:
- google-t5/t5-small
---
# 470 Final Project Model -> Summary Model
## Model Overview
This repository contains a fine-tuned **T5-small** model for **abstractive conversational text summarization**.
Given a multi-speaker dialogue, the model generates a concise natural-language summary that captures the main points of the conversation.
- **Base model:** google-t5/t5-small
- **Task:** Abstractive text summarization
- **Model type:** Encoder–decoder transformer (T5)
---
## Dataset
The model was fine-tuned on the **SAMSum** dataset, which consists of chat-style conversations paired with human-written summaries.
- **Dataset name:** knkarthick/samsum
- **Fields:**
- `dialogue`: conversation text (input)
- `summary`: reference summary (target)
- **Splits:** train / validation / test
---
## Training Details
- **Epochs:** 3
- **Learning rate:** 3e-4
- **Batch size:** 8
- **Max input length:** 512 tokens
- **Max target length:** 128 tokens
- **Training framework:** Hugging Face Transformers (`Seq2SeqTrainer`)
- **Hardware:** GPU (Google Colab)
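A preprocessing function matching the lengths above might look as follows. This is an illustrative sketch, not the exact notebook code; the function and variable names are placeholders:

```python
from transformers import AutoTokenizer

# Tokenizer for the base checkpoint
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")

def preprocess(batch):
    # T5 is a text-to-text model, so a task prefix is prepended to each dialogue
    model_inputs = tokenizer(
        ["summarize: " + d for d in batch["dialogue"]],
        max_length=512,   # max input length from the setup above
        truncation=True,
    )
    labels = tokenizer(
        text_target=batch["summary"],
        max_length=128,   # max target length from the setup above
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# Batches produced this way (e.g. via dataset.map(preprocess, batched=True))
# are what Seq2SeqTrainer consumes during fine-tuning
example = preprocess({
    "dialogue": ["Amanda: I baked cookies. Jerry: Sounds great!"],
    "summary": ["Amanda baked cookies."],
})
```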
---
## Evaluation
The model was evaluated on the test split of the SAMSum dataset using ROUGE metrics.
- **ROUGE-1:** 0.4538
- **ROUGE-2:** 0.2123
- **ROUGE-L:** 0.3762
---
## Intended Uses
This model can be used for:
- Summarizing chat conversations or dialogues
- Demonstrations of abstractive summarization
- Educational purposes in NLP and machine learning
---
## Limitations
- The model may omit important details in long or complex conversations.
- Generated summaries may occasionally be imprecise or incomplete.
- The model is trained on informal dialogue and may not generalize well to other domains.
---
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "marvingoenner/470finalprojectmodel"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# T5 expects a task prefix before the input text
dialogue = "Amanda: I baked cookies. Jerry: Sounds great! Amanda: I will bring some tomorrow."
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```