eungyu kim
# T5 Text Summarizer
This repository contains a simple text summarization script using a pre-trained T5 model from the Hugging Face Transformers library. The script demonstrates how to use prompt-based summarization to generate a concise summary of an input text.
## Overview
The main script (`model.py`) defines a function `summarize_text` that:
- Loads the T5 tokenizer and T5 model.
- Prepends the summarization prompt (`"summarize: "`) to the input text.
- Tokenizes the prompted text and truncates it to at most 512 tokens.
- Generates a summary using beam search.
- Decodes the generated token sequence back into human-readable text while skipping special tokens.
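The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual `model.py`: the `"t5-small"` checkpoint, the `load_t5` helper, and the `max_summary_length` and `num_beams` defaults are assumptions chosen for the example. Unlike the description above, loading is split into its own helper here so the model is loaded once and reused across calls. Note that `T5Tokenizer` requires the `sentencepiece` package.

```python
PROMPT = "summarize: "  # task prefix described in the Overview


def load_t5(model_name="t5-small"):
    """Load a T5 tokenizer and model ("t5-small" is an assumed checkpoint)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    return tokenizer, model


def summarize_text(text, tokenizer, model,
                   max_input_length=512,    # matches the truncation limit in this README
                   max_summary_length=150,  # assumed value
                   num_beams=4):            # assumed beam width
    """Summarize `text` with the given tokenizer and model."""
    # 1. Prepend the summarization prompt.
    input_text = PROMPT + text

    # 2. Tokenize and truncate to the maximum input length.
    input_ids = tokenizer.encode(
        input_text,
        return_tensors="pt",
        max_length=max_input_length,
        truncation=True,
    )

    # 3. Generate summary token IDs with beam search.
    summary_ids = model.generate(
        input_ids,
        max_length=max_summary_length,
        num_beams=num_beams,
        early_stopping=True,
    )

    # 4. Decode the first (best) sequence, dropping special tokens such as </s>.
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_t5()` followed by `summarize_text(article, tokenizer, model)`, where `article` is any input string. The first call to `load_t5` downloads the checkpoint from the Hugging Face Hub.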
## Code Explanation
### Tokenization and Decoding
- **Tokenization:**
The input text is first prefixed with the summarization prompt and then tokenized using:
```python
input_ids = tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True)
```
---
license: apache-2.0
---