harsharajkumar273 committed on
Commit d45a935 · verified · 1 Parent(s): c9013ab

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +83 -0
README.md ADDED
@@ -0,0 +1,83 @@
+ ---
+ language: en
+ license: apache-2.0
+ base_model: harsharajkumar273/T5-Base-Summarization
+ tags:
+ - text-generation
+ - story-generation
+ - research-paper
+ - seq2seq
+ - t5
+ - lora
+ - peft
+ datasets:
+ - custom
+ metrics:
+ - bertscore
+ - sbert
+ ---
+
+ # T5-Base-Story-Generation
+
+ A fine-tuned model for transforming research paper summaries into engaging short stories. This is the second stage of a two-step **Research Paper Simplifier** pipeline, built on top of [harsharajkumar273/T5-Base-Summarization](https://huggingface.co/harsharajkumar273/T5-Base-Summarization).
+
+ ## Model Description
+
+ This model takes a summary of a research paper and generates an immersive, narrative-style short story. It was fine-tuned from the summarization checkpoint using LoRA via the PEFT library.
+
+ ## Pipeline
+
+ ```
+ Research Paper ──► [T5-Base-Summarization] ──► Summary ──► [T5-Base-Story-Generation] ──► Story
+ ```
+
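The two-stage flow in the diagram can be sketched as a plain function composition. This is only an illustration: `run_pipeline`, `fake_summarize`, and `fake_story` are hypothetical names that do not appear in the README, and the stand-in functions take the place of the two models so the sketch runs without downloading anything.

```python
from typing import Callable, Dict


def run_pipeline(
    paper_text: str,
    summarize: Callable[[str], str],
    generate_story: Callable[[str], str],
) -> Dict[str, str]:
    """Chain the two stages: paper -> summary -> story."""
    summary = summarize(paper_text)      # stage 1 (T5-Base-Summarization in the README)
    story = generate_story(summary)      # stage 2 (T5-Base-Story-Generation in the README)
    return {"summary": summary, "story": story}


# Hypothetical stand-ins for the two models, used only to exercise the flow.
def fake_summarize(text: str) -> str:
    return " ".join(text.split()[:5])


def fake_story(summary: str) -> str:
    return f"Once upon a time: {summary}"


result = run_pipeline(
    "Transformers use attention to model long range dependencies in text",
    fake_summarize,
    fake_story,
)
print(result["story"])
```

In real use, the two callables would wrap the tokenizer and `generate` calls for each model, so the story stage only ever sees the summary and never the full paper.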
+ ## Training Details
+
+ | Parameter | Value |
+ |-----------|-------|
+ | Base model | harsharajkumar273/T5-Base-Summarization |
+ | Task | Story Generation |
+ | Max input length | 512 tokens |
+ | Max target length | 256 tokens |
+ | Learning rate | 1e-4 |
+ | Batch size | 4 |
+ | Gradient accumulation steps | 4 |
+ | Warmup steps | 500 |
+ | Weight decay | 0.01 |
+ | Fine-tuning method | LoRA (r=16, alpha=32, targets: q, v) |
+
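A few derived quantities follow from the table by simple arithmetic. The sketch below works them out under stated assumptions: the README does not give the model dimensions, so `d_model = 768` and 12 encoder plus 12 decoder layers are taken from the standard T5-base configuration, and LoRA is assumed to be attached to every `q` and `v` projection, including decoder cross-attention. The batch size is assumed to be per device.

```python
# Values taken from the Training Details table.
batch_size = 4
grad_accum_steps = 4
lora_rank = 16
lora_alpha = 32

# Assumptions (not stated in the README): standard T5-base dimensions.
d_model = 768
encoder_layers = 12
decoder_layers = 12

# Examples contributing to each optimizer step on one device.
effective_batch_size = batch_size * grad_accum_steps

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_rank

# Attention blocks: encoder self-attention, decoder self-attention,
# and decoder cross-attention (assumed to be adapted as well).
attention_blocks = encoder_layers + 2 * decoder_layers
adapted_matrices = attention_blocks * 2  # one q and one v projection per block

# Each adapted matrix gains A (r x d_model) and B (d_model x r).
params_per_matrix = lora_rank * (d_model + d_model)
trainable_lora_params = adapted_matrices * params_per_matrix

print(effective_batch_size, lora_scaling, trainable_lora_params)
```

Under these assumptions the adapter adds roughly 1.8 million trainable parameters, a small fraction of the full model, which is the usual motivation for LoRA.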
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ # Stage 1: Summarize the paper
+ sum_tokenizer = AutoTokenizer.from_pretrained("harsharajkumar273/T5-Base-Summarization")
+ sum_model = AutoModelForSeq2SeqLM.from_pretrained("harsharajkumar273/T5-Base-Summarization")
+
+ paper_text = "Your research paper text here..."
+ # The prompt asks for a summary of about one tenth of the input's word count
+ word_count = len(paper_text.split())
+ sum_prompt = f"Summarize this part of the research paper to less than {word_count // 10} words:\n{paper_text}"
+ sum_inputs = sum_tokenizer(sum_prompt, return_tensors="pt", max_length=1024, truncation=True)
+ sum_outputs = sum_model.generate(**sum_inputs, max_length=128, num_beams=4)
+ summary = sum_tokenizer.decode(sum_outputs[0], skip_special_tokens=True)
+
+ # Stage 2: Generate a story from the summary
+ story_tokenizer = AutoTokenizer.from_pretrained("harsharajkumar273/T5-Base-Story-Generation")
+ story_model = AutoModelForSeq2SeqLM.from_pretrained("harsharajkumar273/T5-Base-Story-Generation")
+
+ # Input and output limits match the training setup (512 input tokens, 256 target tokens)
+ story_inputs = story_tokenizer(summary, return_tensors="pt", max_length=512, truncation=True)
+ story_outputs = story_model.generate(**story_inputs, max_length=256, num_beams=4)
+ story = story_tokenizer.decode(story_outputs[0], skip_special_tokens=True)
+ print(story)
+ ```
+
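One edge case in the snippet above: `word_count // 10` is 0 for inputs shorter than ten words, so the prompt would ask for "less than 0 words". The sketch below shows one possible guard, plus a simple word-based splitter for papers too long for a single pass. Both helpers are hypothetical and not part of the README; the floor of 5 words and the 300-word part size are arbitrary illustrative choices, and nothing here is claimed to match how the training prompts were built beyond the prompt template itself.

```python
from typing import List


def build_summary_prompt(part: str, ratio: int = 10, min_words: int = 5) -> str:
    """Build the stage-1 prompt with a floor on the requested word budget."""
    word_count = len(part.split())
    budget = max(min_words, word_count // ratio)  # avoid a budget of 0 words
    return f"Summarize this part of the research paper to less than {budget} words:\n{part}"


def split_into_parts(paper_text: str, words_per_part: int = 300) -> List[str]:
    """Split a paper into consecutive parts of at most words_per_part words."""
    words = paper_text.split()
    return [
        " ".join(words[i:i + words_per_part])
        for i in range(0, len(words), words_per_part)
    ]


paper = "word " * 650
parts = split_into_parts(paper)
prompts = [build_summary_prompt(p) for p in parts]
print(len(parts), prompts[0].splitlines()[0])
```

Each prompt can then be passed through the stage-1 tokenizer and model as in the usage code, and the per-part summaries joined before story generation.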
+ ## Evaluation Metrics
+
+ The model was evaluated with BERTScore and SBERTScore on a held-out 10% test split.
+
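The README does not say how the split was made or how SBERTScore is defined. As a rough sketch only, the code below assumes a seeded random 90/10 split and treats SBERTScore as the cosine similarity between sentence embeddings of the generated and reference stories. `train_test_split` and `cosine_similarity` are hypothetical helpers written for this illustration, and the toy vectors stand in for real Sentence-BERT embeddings.

```python
import math
import random
from typing import List, Sequence, Tuple


def train_test_split(
    items: Sequence, test_fraction: float = 0.1, seed: int = 42
) -> Tuple[List, List]:
    """Shuffle with a fixed seed and hold out test_fraction of the items."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


train_ids, test_ids = train_test_split(range(100))

# Toy vectors standing in for embeddings of a generated and a reference story.
generated_embedding = [1.0, 2.0, 3.0]
reference_embedding = [2.0, 4.0, 6.0]
score = cosine_similarity(generated_embedding, reference_embedding)
print(len(train_ids), len(test_ids), round(score, 4))
```

In a real evaluation, the embeddings would come from a sentence-embedding model and the per-example scores would be averaged over the test split.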
+ ## Related Models
+
+ - [harsharajkumar273/T5-Base-Summarization](https://huggingface.co/harsharajkumar273/T5-Base-Summarization) (previous stage)
+ - [harsharajkumar273/Bart-Base-Story-Generation](https://huggingface.co/harsharajkumar273/Bart-Base-Story-Generation)
+ - [harsharajkumar273/ProphetNet-Large-Story-Generation](https://huggingface.co/harsharajkumar273/ProphetNet-Large-Story-Generation)