---
language: en
license: apache-2.0
base_model: harsharajkumar273/T5-Base-Summarization
tags:
- text-generation
- story-generation
- research-paper
- seq2seq
- t5
- lora
- peft
datasets:
- custom
metrics:
- bertscore
- sbert
---

# T5-Base-Story-Generation

A fine-tuned model that transforms research paper summaries into engaging short stories. This is the second stage of a two-step **Research Paper Simplifier** pipeline, built on top of [harsharajkumar273/T5-Base-Summarization](https://huggingface.co/harsharajkumar273/T5-Base-Summarization).

## Model Description

This model takes a summary of a research paper and generates an immersive, narrative-style short story. It was fine-tuned with LoRA using the PEFT library.

## Pipeline

```
Research Paper ──► [T5-Base-Summarization] ──► Summary ──► [T5-Base-Story-Generation] ──► Story
```

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | harsharajkumar273/T5-Base-Summarization |
| Task | Story Generation |
| Max input length | 512 tokens |
| Max target length | 256 tokens |
| Learning rate | 1e-4 |
| Batch size | 4 |
| Gradient accumulation steps | 4 |
| Warmup steps | 500 |
| Weight decay | 0.01 |
| Fine-tuning method | LoRA (r=16, alpha=32, targets: q, v) |

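
The hyperparameters above imply a few derived quantities that are worth making explicit: the effective batch size per optimizer step, the LoRA scaling factor, and the approximate number of trainable LoRA parameters. The sketch below works these out in plain Python. It assumes the standard T5-Base shape (`d_model = 768`, 12 encoder and 12 decoder blocks, square 768x768 `q` and `v` projections) and a single training device. Neither is stated in this card, so treat the results as estimates.

```python
# Assumed architecture constants for T5-Base (not stated in this card).
D_MODEL = 768
ENCODER_LAYERS = 12
DECODER_LAYERS = 12

# Values taken from the training table.
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
LORA_RANK = 16
LORA_ALPHA = 32
TARGETS_PER_ATTENTION = 2  # the q and v projections


def effective_batch_size(batch_size, grad_accum_steps, num_devices=1):
    """Number of examples contributing to one optimizer step."""
    return batch_size * grad_accum_steps * num_devices


def lora_trainable_params(rank, d_in, d_out, num_matrices):
    """Each adapted matrix gains A (rank x d_in) and B (d_out x rank)."""
    return num_matrices * rank * (d_in + d_out)


# Encoder blocks have self-attention only; decoder blocks have
# self-attention plus cross-attention, so they count twice.
attention_blocks = ENCODER_LAYERS + 2 * DECODER_LAYERS
adapted_matrices = attention_blocks * TARGETS_PER_ATTENTION

print(effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS))  # 16
print(LORA_ALPHA / LORA_RANK)  # 2.0, the standard LoRA scaling alpha / r
print(lora_trainable_params(LORA_RANK, D_MODEL, D_MODEL, adapted_matrices))  # 1769472
```

Under these assumptions, roughly 1.8 million parameters are trained, a small fraction of the full T5-Base model.
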
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Stage 1: Summarize the paper
sum_tokenizer = AutoTokenizer.from_pretrained("harsharajkumar273/T5-Base-Summarization")
sum_model = AutoModelForSeq2SeqLM.from_pretrained("harsharajkumar273/T5-Base-Summarization")

paper_text = "Your research paper text here..."
word_count = len(paper_text.split())
sum_prompt = f"Summarize this part of the research paper to less than {word_count // 10} words:\n{paper_text}"

sum_inputs = sum_tokenizer(sum_prompt, return_tensors="pt", max_length=1024, truncation=True)
sum_outputs = sum_model.generate(**sum_inputs, max_length=128, num_beams=4)
summary = sum_tokenizer.decode(sum_outputs[0], skip_special_tokens=True)

# Stage 2: Generate a story from the summary
story_tokenizer = AutoTokenizer.from_pretrained("harsharajkumar273/T5-Base-Story-Generation")
story_model = AutoModelForSeq2SeqLM.from_pretrained("harsharajkumar273/T5-Base-Story-Generation")

story_inputs = story_tokenizer(summary, return_tensors="pt", max_length=512, truncation=True)
story_outputs = story_model.generate(**story_inputs, max_length=256, num_beams=4)
story = story_tokenizer.decode(story_outputs[0], skip_special_tokens=True)
print(story)
```

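
The stage-1 prompt refers to "this part of the research paper", which suggests that long papers are summarized piece by piece rather than in one pass. Here is a minimal sketch of that idea, assuming a simple word-count split. The helpers `split_into_chunks` and `build_summary_prompt` and the 300-word chunk size are hypothetical and are not taken from the author's code. Only the prompt wording and the one-tenth word budget mirror the usage example.

```python
def split_into_chunks(text, max_words=300):
    """Split text into consecutive chunks of at most max_words words.

    The 300-word default is an illustrative assumption, not a documented value.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_summary_prompt(chunk):
    """Build the stage-1 prompt with a budget of one tenth of the chunk's words."""
    # max(1, ...) avoids asking for "less than 0 words" on very short chunks.
    budget = max(1, len(chunk.split()) // 10)
    return (
        f"Summarize this part of the research paper to less than {budget} words:\n"
        f"{chunk}"
    )


paper_text = "word " * 650  # stand-in for a real paper
prompts = [build_summary_prompt(chunk) for chunk in split_into_chunks(paper_text)]
print(len(prompts))  # 3 chunks: 300 + 300 + 50 words
```

Each prompt could then be passed through the summarization tokenizer and `generate` call, and the partial summaries joined before being handed to the story model.
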
## Evaluation Metrics

Evaluated using BERTScore and SBERTScore on a held-out 10% test split.

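
This card does not define SBERTScore or describe how the split was made. One common reading is the cosine similarity between Sentence-BERT embeddings of the generated and reference stories, computed on a random 10% holdout. The sketch below illustrates both steps under that assumption. The function names are hypothetical, and the embeddings are plain lists standing in for the output of a sentence encoder.

```python
import math
import random


def train_test_split(examples, test_fraction=0.1, seed=0):
    """Shuffle deterministically and hold out test_fraction of the examples."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


train, test = train_test_split(range(100))
print(len(train), len(test))  # 90 10
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```
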
## Related Models

- [harsharajkumar273/T5-Base-Summarization](https://huggingface.co/harsharajkumar273/T5-Base-Summarization) (previous stage)
- [harsharajkumar273/Bart-Base-Story-Generation](https://huggingface.co/harsharajkumar273/Bart-Base-Story-Generation)
- [harsharajkumar273/ProphetNet-Large-Story-Generation](https://huggingface.co/harsharajkumar273/ProphetNet-Large-Story-Generation)