---
language: en
license: apache-2.0
tags:
- summarization
- conversational-ai
- text2text-generation
- t5
datasets:
- knkarthick/samsum
metrics:
- rouge
base_model:
- google-t5/t5-small
---

# 470 Final Project Model -> Summary Model

## Model Overview

This repository contains a fine-tuned **T5-small** model for **abstractive conversational text summarization**. Given a multi-speaker dialogue, the model generates a concise natural-language summary that captures the main points of the conversation.

- **Base model:** google-t5/t5-small
- **Task:** Abstractive text summarization
- **Model type:** Encoder–decoder transformer (T5)

---

## Dataset

The model was fine-tuned on the **SAMSum** dataset, which consists of chat-style conversations paired with human-written summaries.

- **Dataset name:** knkarthick/samsum
- **Fields:**
  - `dialogue`: conversation text (input)
  - `summary`: reference summary (target)
- **Splits:** train / validation / test

---

## Training Details

- **Epochs:** 3
- **Learning rate:** 3e-4
- **Batch size:** 8
- **Max input length:** 512 tokens
- **Max target length:** 128 tokens
- **Training framework:** Hugging Face Transformers (`Seq2SeqTrainer`)
- **Hardware:** GPU (Google Colab)

---

## Evaluation

The model was evaluated on the test split of the SAMSum dataset using ROUGE metrics.

- **ROUGE-1:** 0.4538
- **ROUGE-2:** 0.2123
- **ROUGE-L:** 0.3762

---

## Intended Uses

This model can be used for:

- Summarizing chat conversations or dialogues
- Demonstrations of abstractive summarization
- Educational purposes in NLP and machine learning

---

## Limitations

- The model may omit important details in long or complex conversations.
- Generated summaries may occasionally be imprecise or incomplete.
- The model is trained on informal dialogue and may not generalize well to other domains.
---

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "marvingoenner/470finalprojectmodel"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

dialogue = "Amanda: I baked cookies. Jerry: Sounds great! Amanda: I will bring some tomorrow."

# T5 was trained with a task prefix, so prepend "summarize: " at inference time.
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```