abersbail committed on
Commit f82ff10 · verified · 1 Parent(s): 6ec00a8

Improve story quality with structured composer and educational story mode

Files changed (4)
  1. README.md +1 -0
  2. app.py +19 -7
  3. story_gpt/data.py +8 -0
  4. story_gpt/service.py +251 -14
README.md CHANGED
@@ -17,6 +17,7 @@ This is a tiny story-writing GPT-style language model project written in Python
  - Word-level tokenizer
  - Causal transformer decoder with self-attention
  - Story-focused local training corpus
+ - Structured local story composer for cleaner paragraph output
  - Local CPU training loop
  - Checkpoint save and load
  - Gradio user interface
app.py CHANGED
@@ -41,32 +41,44 @@ with gr.Blocks(
          - Causal transformer decoder
          - Word-level tokenizer
          - Story-focused local training corpus
+         - Structured local story composer for clean long-form output
          - No external pretrained LLM
          """
      )

      with gr.Tab("Write Story"):
          with gr.Row():
-             title_input = gr.Textbox(label="Title", value="The Lantern in the Rain")
+             title_input = gr.Textbox(label="Title", value="The Intelligent Project")
              genre_input = gr.Dropdown(
                  label="Genre",
-                 choices=["Fantasy", "Adventure", "Mystery", "Sci-Fi", "Friendship", "Folktale"],
-                 value="Fantasy",
+                 choices=[
+                     "Fantasy",
+                     "Adventure",
+                     "Mystery",
+                     "Sci-Fi",
+                     "Friendship",
+                     "Folktale",
+                     "Educational",
+                 ],
+                 value="Educational",
              )
              tone_input = gr.Dropdown(
                  label="Tone",
-                 choices=["Warm", "Wonder", "Suspense", "Playful", "Calm", "Heroic"],
-                 value="Wonder",
+                 choices=["Warm", "Wonder", "Suspense", "Playful", "Calm", "Heroic", "Inspiring"],
+                 value="Inspiring",
              )

          idea_input = gr.Textbox(
              label="Story Idea",
-             value="A child finds a glowing lantern that reveals hidden paths after a storm.",
+             value=(
+                 "A student builds an intelligent AI project step by step using Python, data analysis, "
+                 "machine learning, deep learning, and language models."
+             ),
              lines=5,
          )
          opening_line_input = gr.Textbox(
              label="Opening Line",
-             value="When the rain stopped, the alley behind Mira's house began to shine.",
+             value="Arman was a student who loved technology.",
              lines=2,
          )

story_gpt/data.py CHANGED
@@ -54,6 +54,14 @@ Tone: Calm
  Idea: A moon rabbit sews dreams into a blanket for a sleepless child.
  Opening: Above the sleeping city, one window was still lit.
  Story: Above the sleeping city, one window was still lit, and from the moon a rabbit noticed at once. The little rabbit gathered threads from cloud edges, silver dust from quiet stars, and a single feather from the night wind. It hopped down a beam of moonlight and sat on the child's windowsill, sewing without a sound. Into one square it stitched a forest with glowing fireflies. Into another it stitched a lake so still it could hold a whole sky. When the blanket touched the child, the room filled with the smell of rain and jasmine, and the tired eyes finally closed. By dawn the rabbit was back on the moon, but one silver thread remained on the pillow. The child kept it for years and slept as if the sky itself had tucked them in.
+
+ Instruction: Write an inspiring educational story.
+ Title: The Intelligent Project
+ Genre: Educational
+ Tone: Inspiring
+ Idea: A student builds an intelligent AI project step by step using Python, data analysis, machine learning, deep learning, and language models.
+ Opening: Arman was a student who loved technology.
+ Story: Arman was a student who loved technology. One day, he decided to build an intelligent project using Artificial Intelligence and Machine Learning. At the beginning, he did not know much, but he was ready to learn. He started with Python and simple libraries such as NumPy and pandas so he could understand data clearly. After that, he used Matplotlib and Seaborn to visualize patterns and trends, which made his project easier to improve. Then he moved to Machine Learning with scikit-learn and built small predictive models step by step. Later, he explored Deep Learning with PyTorch and TensorFlow, and his project became powerful enough to work with images and text. In the final stage, he used Hugging Face tools so the system could answer questions and understand language. When he looked back at the finished project, Arman realized that the real success was not only the system he built, but the confidence he gained while building it.
  """.strip()


story_gpt/service.py CHANGED
@@ -1,3 +1,4 @@
+ import re
  import shutil

  import torch
@@ -27,6 +28,7 @@ class StoryGPTService:
          temperature: float,
          top_k: int,
      ):
+         clean_title = (title or "A New Story").strip()
          clean_prompt = build_story_prompt(
              title=title,
              genre=genre,
@@ -34,28 +36,49 @@ class StoryGPTService:
              idea=idea,
              opening_line=opening_line,
          )
-         self._ensure_ready()
-         encoded = self.tokenizer.encode(clean_prompt, add_bos=True)
-         idx = torch.tensor(encoded, dtype=torch.long).unsqueeze(0)
-         self.model.eval()
+         candidate_story = ""
+         used_mode = "structured"

-         with torch.inference_mode():
-             output = self.model.generate(
-                 idx=idx,
+         if self.model is not None and self.tokenizer is not None:
+             candidate_story = self._generate_with_model(
+                 prompt=clean_prompt,
                  max_new_tokens=max_new_tokens,
-                 eos_id=self.tokenizer.eos_id,
                  temperature=temperature,
                  top_k=top_k,
              )
+             candidate_story = self._clean_generated_story(candidate_story)
+             if candidate_story and not self._looks_broken_story(candidate_story):
+                 used_mode = "model"
+
+         if used_mode == "model":
+             story_text = self._format_story_output(clean_title, candidate_story)
+         else:
+             story_text = self._compose_story(
+                 title=clean_title,
+                 genre=genre,
+                 tone=tone,
+                 idea=idea,
+                 opening_line=opening_line,
+             )
+
+         if self.model is None and self.config.checkpoint_path.exists():
+             self._ensure_ready()
+
+         if self.model is not None and self.tokenizer is not None:
+             model_info = (
+                 f"Architecture=causal transformer, Vocab={self.tokenizer.vocab_size}, Layers={self.config.n_layers}."
+             )
+         else:
+             model_info = (
+                 f"Architecture=causal transformer, Layers={self.config.n_layers}. "
+                 f"Model training is available in the Train tab."
+             )

-         text = self.tokenizer.decode(output[0].tolist())
-         if "Story:" in text:
-             text = text.split("Story:", 1)[1].strip()
          status = (
-             f"Generated with Story GPT Python. "
-             f"Architecture=causal transformer, Vocab={self.tokenizer.vocab_size}, Layers={self.config.n_layers}."
+             f"Generated with Story GPT Python using {used_mode} generation. "
+             f"{model_info}"
          )
-         return text, status
+         return story_text, status

      def train(self, extra_story_text: str, steps: int):
          steps = max(1, steps)
@@ -96,6 +119,220 @@ class StoryGPTService:
              return
          self._load_or_initialize(extra_text="")

+     def _generate_with_model(self, prompt: str, max_new_tokens: int, temperature: float, top_k: int) -> str:
+         encoded = self.tokenizer.encode(prompt, add_bos=True)
+         idx = torch.tensor(encoded, dtype=torch.long).unsqueeze(0)
+         self.model.eval()
+
+         with torch.inference_mode():
+             output = self.model.generate(
+                 idx=idx,
+                 max_new_tokens=max_new_tokens,
+                 eos_id=self.tokenizer.eos_id,
+                 temperature=temperature,
+                 top_k=top_k,
+             )
+
+         text = self.tokenizer.decode(output[0].tolist())
+         if "Story:" in text:
+             text = text.split("Story:", 1)[1].strip()
+         return text
+
+     def _clean_generated_story(self, text: str) -> str:
+         cleaned_lines = []
+         for raw_line in (text or "").splitlines():
+             line = raw_line.strip()
+             if not line:
+                 continue
+             if re.match(r"^(Instruction|Title|Genre|Tone|Idea|Opening|Story)\s*:", line):
+                 continue
+             cleaned_lines.append(line)
+
+         cleaned = " ".join(cleaned_lines)
+         cleaned = re.sub(r"\s+", " ", cleaned).strip()
+         cleaned = re.sub(r"([.!?])\1+", r"\1", cleaned)
+         return cleaned
+
+     def _looks_broken_story(self, text: str) -> bool:
+         if not text or len(text) < 220:
+             return True
+         if re.search(r"\b(Title|Genre|Tone|Idea|Opening|Story)\s*:", text):
+             return True
+
+         tokens = re.findall(r"[A-Za-z']+", text.lower())
+         if len(tokens) < 40:
+             return True
+
+         repeated_run = 1
+         current_run = 1
+         for prev, curr in zip(tokens, tokens[1:]):
+             if prev == curr:
+                 current_run += 1
+                 repeated_run = max(repeated_run, current_run)
+             else:
+                 current_run = 1
+
+         unique_ratio = len(set(tokens)) / max(1, len(tokens))
+         sentence_count = len(re.findall(r"[.!?]", text))
+         return repeated_run >= 4 or unique_ratio < 0.45 or sentence_count < 5
+
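Review note: the quality gate in `_looks_broken_story` decides whether the model output or the structured composer is shown, so it may help to try it in isolation. The sketch below is a standalone restatement of the same checks as a free function (the name `looks_broken_story` and the sample inputs are illustrative assumptions, not part of the commit). The passing sample reuses the first five sentences of the moon rabbit story from `story_gpt/data.py`.

```python
import re


def looks_broken_story(text: str) -> bool:
    """Standalone restatement of the heuristic in the diff (illustrative only)."""
    # Too short, or a prompt header leaked into the story body.
    if not text or len(text) < 220:
        return True
    if re.search(r"\b(Title|Genre|Tone|Idea|Opening|Story)\s*:", text):
        return True

    tokens = re.findall(r"[A-Za-z']+", text.lower())
    if len(tokens) < 40:
        return True

    # Longest run of the same word repeated back to back.
    longest_run = current_run = 1
    for prev, curr in zip(tokens, tokens[1:]):
        current_run = current_run + 1 if prev == curr else 1
        longest_run = max(longest_run, current_run)

    unique_ratio = len(set(tokens)) / len(tokens)  # tokens is non-empty here
    sentence_count = len(re.findall(r"[.!?]", text))
    return longest_run >= 4 or unique_ratio < 0.45 or sentence_count < 5


# Hypothetical inputs, each chosen to trip one check.
too_short = "A short story."
looping = "the cat sat. " * 30  # long enough, but only three distinct words
leaked_header = "Title: The Lantern. " + "Words keep coming here. " * 20

# Five sentences taken from the training corpus shown in this commit.
sample_ok = (
    "Above the sleeping city, one window was still lit, and from the moon a rabbit "
    "noticed at once. The little rabbit gathered threads from cloud edges, silver dust "
    "from quiet stars, and a single feather from the night wind. It hopped down a beam "
    "of moonlight and sat on the child's windowsill, sewing without a sound. Into one "
    "square it stitched a forest with glowing fireflies. Into another it stitched a "
    "lake so still it could hold a whole sky."
)

for name, text in [("too_short", too_short), ("looping", looping),
                   ("leaked_header", leaked_header), ("sample_ok", sample_ok)]:
    print(name, looks_broken_story(text))
```

Only `sample_ok` passes the gate; the other three fall back to the structured composer. Note that the header check here omits `Instruction`, which `_clean_generated_story` does strip, so the two patterns are not identical.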
+     def _format_story_output(self, title: str, story: str) -> str:
+         paragraphs = self._split_paragraphs(story)
+         if not paragraphs:
+             paragraphs = [story.strip()]
+         return f"Story: {title}\n\n" + "\n\n".join(paragraphs)
+
+     def _split_paragraphs(self, text: str):
+         sentences = re.split(r"(?<=[.!?])\s+", text.strip())
+         sentences = [sentence.strip() for sentence in sentences if sentence.strip()]
+         paragraphs = []
+         chunk = []
+
+         for sentence in sentences:
+             chunk.append(sentence)
+             if len(chunk) == 2:
+                 paragraphs.append(" ".join(chunk))
+                 chunk = []
+
+         if chunk:
+             paragraphs.append(" ".join(chunk))
+         return paragraphs
+
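Review note: `_split_paragraphs` groups sentences in pairs after splitting on whitespace that follows `.`, `!`, or `?`. A minimal sketch of the same idea as a free function follows; the `sentences_per_paragraph` parameter is a hypothetical generalization (the commit hard-codes 2).

```python
import re
from typing import List


def split_paragraphs(text: str, sentences_per_paragraph: int = 2) -> List[str]:
    # Split after sentence-ending punctuation; the lookbehind keeps the punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    # Group consecutive sentences into fixed-size paragraphs.
    return [
        " ".join(sentences[i : i + sentences_per_paragraph])
        for i in range(0, len(sentences), sentences_per_paragraph)
    ]


print(split_paragraphs("One. Two! Three? Four. Five."))
# ['One. Two!', 'Three? Four.', 'Five.']

# Abbreviations count as sentence ends with this regex, so "Mr." is its own "sentence".
print(split_paragraphs("Mr. Fox ran. He hid."))
# ['Mr. Fox ran.', 'He hid.']
```

The second call shows a limitation of punctuation-based splitting: an abbreviation shifts where the paragraph breaks fall, even though the joined text still reads correctly.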
+     def _compose_story(self, title: str, genre: str, tone: str, idea: str, opening_line: str) -> str:
+         if self._is_project_story(title, idea, genre):
+             return self._compose_project_story(title, tone, idea, opening_line)
+         return self._compose_narrative_story(title, genre, tone, idea, opening_line)
+
+     def _is_project_story(self, title: str, idea: str, genre: str) -> bool:
+         text = f"{title} {idea} {genre}".lower()
+         keywords = [
+             "project",
+             "technology",
+             "ai",
+             "artificial intelligence",
+             "machine learning",
+             "deep learning",
+             "python",
+             "data",
+             "model",
+             "language",
+         ]
+         return genre == "Educational" or any(keyword in text for keyword in keywords)
+
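Review note: `_is_project_story` uses substring matching, so the keyword `"ai"` also matches inside ordinary words such as "rain" or "fairy". With the previous default title, "The Lantern in the Rain", a Fantasy request would therefore be routed to the project-story template. One possible alternative, sketched below under the assumption that whole-word matching is the intent, is a word-boundary regex. The function names are hypothetical and not part of the commit.

```python
import re

KEYWORDS = [
    "project", "technology", "ai", "artificial intelligence", "machine learning",
    "deep learning", "python", "data", "model", "language",
]


def substring_match(text: str) -> bool:
    """Behaviour of the check in the diff: plain substring containment."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


# Whole-word alternative: each keyword must be delimited by word boundaries.
_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def word_match(text: str) -> bool:
    return _KEYWORD_RE.search(text) is not None


title = "The Lantern in the Rain"
print(substring_match(title))  # True, because "rain" contains "ai"
print(word_match(title))       # False
print(word_match("A student builds an AI project"))  # True
```

The trade-off is that whole-word matching no longer catches inflected forms such as "models" or "datasets"; adding those forms to the list explicitly would be one way to handle that.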
+     def _compose_project_story(self, title: str, tone: str, idea: str, opening_line: str) -> str:
+         hero = self._choose_project_name(opening_line)
+         goal = self._project_goal(idea)
+         energy = {
+             "Inspiring": "with patience and belief",
+             "Heroic": "with determination and courage",
+             "Wonder": "with curiosity and excitement",
+             "Calm": "with steady focus",
+         }.get(tone, "with steady determination")
+
+         paragraphs = [
+             (
+                 f"{opening_line.strip()} One day, {hero} decided to build {goal}. "
+                 f"At the beginning, the path looked difficult, but {hero} moved forward {energy}."
+             ),
+             (
+                 f"He started with Python and simple libraries such as NumPy and pandas so he could understand data clearly. "
+                 f"Step by step, he learned how to collect, clean, and organize information, because he knew strong data is the base of every reliable system."
+             ),
+             (
+                 f"After that, he used Matplotlib and Seaborn to visualize patterns and trends. "
+                 f"The graphs showed him what the data was trying to say, and each chart made the next improvement easier to plan."
+             ),
+             (
+                 f"Then he moved into Machine Learning with scikit-learn. "
+                 f"He trained small models, checked their predictions, and improved them little by little until the results became more accurate and useful."
+             ),
+             (
+                 f"Once the basics were strong, he explored Deep Learning with PyTorch and TensorFlow. "
+                 f"Neural networks helped the project work with richer patterns in text and images, and the system started feeling more intelligent with every experiment."
+             ),
+             (
+                 f"In the final stage, he connected modern language tools from Hugging Face so the project could answer questions, generate text, and understand language more naturally. "
+                 f"When the work was finished, {hero} realized the greatest result was not only the intelligent system itself, but the confidence he had built while creating it."
+             ),
+         ]
+         return f"Story: {title}\n\n" + "\n\n".join(paragraphs)
+
+     def _choose_project_name(self, opening_line: str) -> str:
+         match = re.match(r"^([A-Z][a-z]+)\b", (opening_line or "").strip())
+         if match:
+             return match.group(1)
+         return "Arman"
+
+     def _project_goal(self, idea: str) -> str:
+         clean_idea = (idea or "").strip().rstrip(".")
+         if not clean_idea:
+             return "an intelligent project"
+         lowered = clean_idea.lower()
+         match = re.search(r"\bbuilds?\s+(.+)", lowered)
+         if match:
+             goal = clean_idea[match.start(1) :].strip().rstrip(".")
+             goal = re.sub(r"\s+using\s+.+$", "", goal, flags=re.IGNORECASE)
+             if goal:
+                 if goal.lower().startswith(("a ", "an ", "the ")):
+                     return goal
+                 return f"a {goal}"
+         if lowered.startswith("a ") or lowered.startswith("an ") or lowered.startswith("the "):
+             return clean_idea
+         return f"a project about {clean_idea}"
+
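Review note: `_project_goal` turns the free-text idea into the noun phrase that follows "decided to build" in the first paragraph. To see what it produces for the new default idea, the logic can be copied into a free function, as in this sketch (the function name and the extra sample idea are illustrative assumptions).

```python
import re


def project_goal(idea: str) -> str:
    """Standalone copy of the goal-extraction logic in the diff (illustrative)."""
    clean_idea = (idea or "").strip().rstrip(".")
    if not clean_idea:
        return "an intelligent project"
    lowered = clean_idea.lower()
    # Take everything after "build"/"builds", then drop a trailing "using ..." clause.
    match = re.search(r"\bbuilds?\s+(.+)", lowered)
    if match:
        goal = clean_idea[match.start(1):].strip().rstrip(".")
        goal = re.sub(r"\s+using\s+.+$", "", goal, flags=re.IGNORECASE)
        if goal:
            if goal.lower().startswith(("a ", "an ", "the ")):
                return goal
            return f"a {goal}"
    if lowered.startswith(("a ", "an ", "the ")):
        return clean_idea
    return f"a project about {clean_idea}"


default_idea = (
    "A student builds an intelligent AI project step by step using Python, data analysis, "
    "machine learning, deep learning, and language models."
)
print(project_goal(default_idea))
# an intelligent AI project step by step
print(project_goal(""))
# an intelligent project
print(project_goal("Teaching a robot to paint"))
# a project about Teaching a robot to paint
```

For the default idea the opening paragraph therefore reads "decided to build an intelligent AI project step by step", since only the "using ..." clause is trimmed. The slice on `clean_idea` relies on `lower()` preserving string length, which holds for ASCII input.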
+     def _compose_narrative_story(
+         self,
+         title: str,
+         genre: str,
+         tone: str,
+         idea: str,
+         opening_line: str,
+     ) -> str:
+         scene = {
+             "Fantasy": "The world around that moment felt slightly magical, as if the ordinary street had made room for a secret.",
+             "Adventure": "It was the kind of beginning that quietly promises a journey.",
+             "Mystery": "Something about the air suggested that a hidden answer was waiting nearby.",
+             "Sci-Fi": "Even the smallest details seemed charged with future possibility.",
+             "Friendship": "It was the kind of moment that becomes important because someone chooses not to face it alone.",
+             "Folktale": "The elders would later say that such moments arrive only when the heart is ready.",
+             "Educational": "It was the kind of beginning that turns curiosity into steady progress.",
+         }.get(genre, "It was the kind of beginning that changes everything.")
+         mood = {
+             "Warm": "gently",
+             "Wonder": "with growing amazement",
+             "Suspense": "with careful attention",
+             "Playful": "with bright energy",
+             "Calm": "with quiet patience",
+             "Heroic": "with open courage",
+             "Inspiring": "with steady belief",
+         }.get(tone, "with steady purpose")
+         subject = (idea or "an unexpected challenge").strip().rstrip(".")
+         subject_text = subject[0].lower() + subject[1:] if subject else "an unexpected challenge"
+
+         paragraphs = [
+             (
+                 f"{opening_line.strip()} {scene} The story truly began when the main character faced {subject_text} "
+                 f"and chose to move forward {mood}."
+             ),
+             (
+                 f"At first, the task felt larger than expected. "
+                 f"Small clues, quiet observations, and a few brave decisions slowly revealed what needed to be done next."
+             ),
+             (
+                 f"Along the way, each mistake turned into a lesson. "
+                 f"The character learned to trust patience, notice details, and use both imagination and effort at the same time."
+             ),
+             (
+                 f"When the hardest moment finally arrived, the answer came not from luck alone but from everything learned earlier. "
+                 f"That was the point when uncertainty changed into action."
+             ),
+             (
+                 f"By the end, the world around the character had shifted in a meaningful way. "
+                 f"The problem was solved, but more importantly, the character had become wiser, stronger, and more ready for whatever came next."
+             ),
+         ]
+         return f"Story: {title}\n\n" + "\n\n".join(paragraphs)
+
      def _load_or_initialize(self, extra_text: str):
          checkpoint = self.config.checkpoint_path
          if checkpoint.exists():