Toya0421 committed
Commit 5f01a82 · verified · 1 Parent(s): d5d6aa5

Update app.py

Files changed (1)
  1. app.py +8 -3
app.py CHANGED
@@ -247,15 +247,20 @@ def extract_main_body_llm(text: str) -> str:
  """
  prompt = f"""
  Extract ONLY the main body text from the following passage.
+
  Rules:
- - Completely EXCLUDE titles, headings, chapter labels, author names, source information,
- footnotes, annotations, introductions, and any non-body content.
- - Preserve the original paragraph structure of the main text.
+ - Completely EXCLUDE titles, headings, chapter labels, section numbers,
+ author names, source information, footnotes, annotations, and introductions.
+ - Treat verse numbers, line numbers, and numbering markers (e.g., "001:001") as non-body content and remove them.
+ - Do NOT rewrite, paraphrase, summarize, or otherwise modify the text content.
+ - Preserve the original paragraph structure of the main body text.
+ (Do not treat line breaks caused by formatting or verse layout as paragraph breaks.)
  - Insert exactly ONE blank line between paragraphs.
  - Do NOT create new section breaks, chapter divisions, or headings.
  - Output only the extracted main body text.
  - Do not include explanations, comments, or metadata.
  - Do not include [TEXT START] and [TEXT END] in the output.
+
  [TEXT START]
  {text}
  [TEXT END]
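
Several rules in the updated prompt are mechanically checkable, so the model's output could be spot-checked after the call. The following is a minimal sketch only: `check_extracted_body`, `MARKER_RE`, and `SENTINELS` are hypothetical names that do not appear in app.py, and the regular expression assumes markers shaped exactly like the "001:001" example in the prompt.

```python
import re
from typing import List

# Hypothetical helper, not part of app.py.
# Assumes numbering markers look like the prompt's example "001:001":
# three digits, a colon, three digits.
MARKER_RE = re.compile(r"\b\d{3}:\d{3}\b")
SENTINELS = ("[TEXT START]", "[TEXT END]")


def check_extracted_body(output: str) -> List[str]:
    """Return a list of prompt-rule violations found in the model output."""
    problems = []
    if MARKER_RE.search(output):
        problems.append("numbering marker present")
    if any(sentinel in output for sentinel in SENTINELS):
        problems.append("sentinel present")
    # Three or more consecutive newlines mean more than one blank line.
    if re.search(r"\n{3,}", output):
        problems.append("more than one blank line between paragraphs")
    return problems
```

An empty list means none of these three checks fired. It does not show that the text was left unparaphrased or that headings were removed, since those rules need a comparison against the input.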