# T5 Text Summarizer

This repository contains a simple text summarization script using a pre-trained T5 model from the Hugging Face Transformers library. The script demonstrates how to use prompt-based summarization to generate a concise summary of an input text.

## Overview

The main script (`model.py`) defines a function `summarize_text` that:

- Loads the T5 tokenizer and T5 model.
- Prepends the summarization prompt (`"summarize: "`) to the input text.
- Tokenizes the prompted text and truncates it to a maximum length.
- Generates a summary using beam search.
- Decodes the generated token sequence back into human-readable text while skipping special tokens.
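
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `model.py`: the checkpoint name `t5-small`, the helper `build_prompt`, the constant `PROMPT_PREFIX`, and the generation settings (`max_summary_length=150`, `num_beams=4`) are assumptions chosen for the example. Only the `"summarize: "` prefix and the 512-token truncation come from this README. `T5Tokenizer` also requires the `sentencepiece` package to be installed.

```python
PROMPT_PREFIX = "summarize: "  # task prefix described in the Overview


def build_prompt(text: str) -> str:
    """Prepend the summarization prefix to the raw input text (hypothetical helper)."""
    return PROMPT_PREFIX + text


def summarize_text(
    text: str,
    model_name: str = "t5-small",   # assumed checkpoint; the README does not name one
    max_input_length: int = 512,    # matches the truncation length used in this README
    max_summary_length: int = 150,  # assumed value
    num_beams: int = 4,             # assumed beam width
) -> str:
    # Imported here so build_prompt stays usable without the heavy dependencies.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    # 1. Load the tokenizer and model (weights are downloaded on first use).
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    # 2-3. Add the prompt, tokenize, and truncate to the maximum input length.
    input_ids = tokenizer.encode(
        build_prompt(text),
        return_tensors="pt",
        max_length=max_input_length,
        truncation=True,
    )

    # 4. Generate the summary token ids with beam search.
    summary_ids = model.generate(
        input_ids,
        max_length=max_summary_length,
        num_beams=num_beams,
        early_stopping=True,
    )

    # 5. Decode back to text, dropping special tokens such as padding and end-of-sequence.
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Calling `summarize_text(article)` with a string returns the summary as a string. In this sketch the model is reloaded on every call, which keeps the example short; a longer-running program would likely load the tokenizer and model once and reuse them.
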
## Code Explanation

### Tokenization and Decoding

- **Tokenization:**
  The input text is first prefixed with the summarization prompt and then tokenized using:
  ```python
  input_ids = tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True)
  ```
| --- | |
| license: apache-2.0 | |
| --- | |