# Model Card for t5_small Summarization Model
## Model Details
This model is a fine-tuned version of T5-small, trained on the CNN/Daily Mail dataset for text summarization.
## Training Data
The model was trained on the CNN/Daily Mail dataset, a corpus of English news articles paired with human-written highlight summaries.
## Training Procedure
- **Learning Rate**: 5e-5
- **Epochs**: 3
- **Batch Size**: 16
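
To give a sense of scale for these hyperparameters, the sketch below estimates how many optimizer steps they imply. It assumes the full training split of CNN/Daily Mail version 3.0.0 (287,113 examples), a single device, and no gradient accumulation. None of this is stated in the card, so treat the resulting figures as illustrative only.

```python
import math

# Hyperparameters reported in this card.
EPOCHS = 3
BATCH_SIZE = 16

# Assumption: full train split of CNN/Daily Mail 3.0.0.
# The card does not say which version or split was used.
NUM_TRAIN_EXAMPLES = 287_113


def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> tuple[int, int]:
    """Return (steps per epoch, total steps), assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


steps_per_epoch, total_steps = optimizer_steps(NUM_TRAIN_EXAMPLES, BATCH_SIZE, EPOCHS)
print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
```

Under these assumptions the run would take roughly 18,000 steps per epoch. A different split, dataset version, or accumulation setting would change the count.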
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("hskang/cnn_dailymail_t5_small")
model = AutoModelForSeq2SeqLM.from_pretrained("hskang/cnn_dailymail_t5_small")

input_text = "upstage tutorial text summarization code"
# Tokenize and truncate long inputs; 512 tokens is an illustrative limit, not a documented one.
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
# Set an explicit output budget; without one, generate() may stop at a short default length.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Evaluation
- **ROUGE-1**: 23.45
- **ROUGE-2**: 7.89
- **ROUGE-L**: 21.34
- **BLEU**: 13.56
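
For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-1 F1 score is computed from unigram overlap. The card does not say which scorer or settings produced the numbers above. This simplified version uses lowercased whitespace tokens and no stemming, so it will not reproduce them exactly. The function name `rouge1_f1` and the example sentences are illustrative only.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
# Scaled by 100 to match the convention used in the list above.
print(f"ROUGE-1 F1: {100 * rouge1_f1(reference, candidate):.2f}")
```

Here five of the six tokens overlap, so precision and recall are both 5/6. ROUGE-2 applies the same idea to bigrams, and ROUGE-L uses the longest common subsequence instead of n-gram counts.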
## Limitations
The model may generate biased or inappropriate content because of biases in its training data.
Use the model with caution and filter its output where appropriate.
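
One simple form such a filter could take is a blocklist check applied to each generated summary before it is shown to users. The sketch below is only an illustration: `BLOCKED_TERMS` and `passes_filter` are hypothetical names, the terms are placeholders, and a real deployment would likely need a more robust moderation step.

```python
import re

# Hypothetical placeholder terms; replace with a list suited to your application.
BLOCKED_TERMS = frozenset({"blockedterm", "anotherblockedterm"})


def passes_filter(summary: str, blocked_terms: frozenset[str] = BLOCKED_TERMS) -> bool:
    """Return True if the summary contains none of the blocked terms (case-insensitive, whole words)."""
    words = set(re.findall(r"[a-z0-9']+", summary.lower()))
    return words.isdisjoint(blocked_terms)


summaries = ["A harmless summary.", "This one contains blockedterm in it."]
# Keep only the summaries that pass the blocklist check.
safe = [s for s in summaries if passes_filter(s)]
print(safe)
```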
## Ethical Considerations
- **Bias**: The model may inherit biases present in the training data.
- **Misuse**: The model can be misused to generate misleading or harmful content.
## Copyright and License
This model is licensed under the MIT License.