Guiyom committed on
Commit
319411a
·
verified ·
1 Parent(s): 1bd735b

Update app.py

Files changed (1)
  1. app.py +17 -5
app.py CHANGED
@@ -1232,6 +1232,7 @@ Important:
1232
  - You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1233
  - You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
1234
  - Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
 
1235
 
1236
  // Important
1237
  - Make it real, with anecdotes from the content
@@ -1243,6 +1244,7 @@ Important:
1243
  result = openai_call(prompt, model="o3-mini", max_tokens_param=10000)
1244
  result = result.strip().strip("```").strip()
1245
  result = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', result)
 
1246
  logging.info(f"The code produced for this focus placeholder:\n{placeholder_text}\n\n {result}\n\n")
1247
  return result
1248
 
@@ -1450,10 +1452,12 @@ Note: the output will be processed through regex and the identifiers removed, bu
1450
  Important:
1451
  - You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1452
  - You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
1453
- - Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose"""
 
1454
  )
1455
  summary_chunk = openai_call(prompt=chunk_prompt, model="gpt-4o-mini", max_tokens_param=500, temperature=0.7)
1456
  summary_chunk = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', summary_chunk)
 
1457
  global SUMMARIZATION_REQUEST_COUNT, TOTAL_SUMMARIZED_WORDS
1458
  SUMMARIZATION_REQUEST_COUNT += 1
1459
  TOTAL_SUMMARIZED_WORDS += len(summary_chunk.split())
@@ -1478,10 +1482,12 @@ Note: the output will be processed through regex and the identifiers removed, bu
1478
  Important:
1479
  - you will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1480
  - you must have at least 20 of such occurences
1481
- - don't mention this formatting thing in the report, just do it"""
 
1482
  )
1483
  final_summary = openai_call(prompt=final_prompt, model="gpt-4o-mini", max_tokens_param=target_length, temperature=0.7)
1484
  final_summary = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', final_summary)
 
1485
  return final_summary.strip()
1486
 
1487
 
@@ -1543,6 +1549,7 @@ Important:
1543
  - you will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1544
  - You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
1545
  - don't mention this formatting thing in the report, just do it
 
1546
 
1547
  IMPORTANT: Format your response as a proper JSON object with these fields:
1548
  - "relevant": "yes" or "no"
@@ -1554,6 +1561,7 @@ IMPORTANT: Format your response as a proper JSON object with these fields:
1554
  try:
1555
  response = openai_call(prompt=prompt, model="gpt-4o-mini", max_tokens_param=max_tokens, temperature=temperature)
1556
  response = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', response)
 
1557
  if not response:
1558
  logging.error("analyze_with_gpt4o: Empty response received from API.")
1559
  return {"relevant": "no", "summary": "", "followups": []}
@@ -1733,6 +1741,7 @@ Important:
1733
  - You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1734
  - You must have at least {10 * pages} of such occurences scattered around the report - this is important for the grounding of the report in verifiable facts and sources
1735
  - Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
 
1736
 
1737
  // Sources
1738
  Use the following learnings and merged reference details from a deep research process on:
@@ -1766,6 +1775,11 @@ Note: Exclude the use of html numbered lists format, they don't get correctly im
1766
  <h4> for bulletpoint title (ex: <h4>item to detail:</h4>details ...)
1767
  - Use inline formatting for the tables with homogeneous border and colors
1768
  - Avoid Chinese characters in the output (use the Pinyin version) since they won't display correcly in the pdf (black boxes)
 
 
 
 
 
1769
 
1770
  --------------- Placeholders -----------
1771
  In order to enrich the content, within the core sections (between introduction and conclusion), you can inject some placeholders that will be developped later on.
@@ -1858,9 +1872,6 @@ with:
1858
  Important note for focus placeholders:
1859
  - after [[ put "Focus Placeholder n:" explicitly (with n as the ref number of the focus box created). This will be used in a regex
1860
  - Do not add a title for the Focus placeholder just before the [[...]], the content that will replace the focus placeholder - generated later on - will already include a title
1861
- - For the Table of contents: do not mention the pages, but make each item on separate line
1862
- - The reference table at the end containing the citations details should have 4 columns: the ref number, the title of the document, the author(s, the URL - with hyperlink)
1863
- the name of the reference table should be: "Reference Summary Table"
1864
 
1865
  // Report ending required
1866
  End the report with the following sequence:
@@ -1911,6 +1922,7 @@ Important note: placeholders (visual, graph or focus) can only appear in the sec
1911
  report = openai_call(prompt, model="o3-mini", max_tokens_param=tokentarget)
1912
  # Post-processing
1913
  report = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', report)
 
1914
 
1915
  # If the report is too long, compress it.
1916
  if len(report) > MAX_MESSAGE_LENGTH:
 
1232
  - You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1233
  - You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
1234
  - Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
1235
+ - LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add "from LinkedIn Pulse"
1236
 
1237
  // Important
1238
  - Make it real, with anecdotes from the content
 
1244
  result = openai_call(prompt, model="o3-mini", max_tokens_param=10000)
1245
  result = result.strip().strip("```").strip()
1246
  result = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', result)
1247
+ result = re.sub(r'\[\{(.*?)\}\]', r'\1', result)
1248
  logging.info(f"The code produced for this focus placeholder:\n{placeholder_text}\n\n {result}\n\n")
1249
  return result
1250
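The two substitutions in this hunk can be read as a two-pass cleanup: the full `{[{...}]}` marker is unwrapped first, then any bare `[{...}]` left behind when the model drops the outer braces. A minimal sketch, using the two patterns from the diff; the names `strip_markers`, `FULL_MARKER`, `PARTIAL_MARKER` and the sample sentence are illustrative assumptions, not part of app.py:

```python
import re

# Patterns copied from the diff: full marker first, then the bare-bracket variant.
FULL_MARKER = re.compile(r'\{\[\{(.*?)\}\]\}')
PARTIAL_MARKER = re.compile(r'\[\{(.*?)\}\]')


def strip_markers(text: str) -> str:
    """Remove both marker forms, keeping only the wrapped text."""
    text = FULL_MARKER.sub(r'\1', text)
    return PARTIAL_MARKER.sub(r'\1', text)


# Hypothetical model output containing one full and one partial marker.
sample = "Revenue grew {[{Smith 2023}]} while costs fell [{Doe 2022}]."
print(strip_markers(sample))

# Running only the bare-bracket pattern on a full marker leaves stray braces,
# which is why the full pattern is applied first.
print(PARTIAL_MARKER.sub(r'\1', "{[{Smith 2023}]}"))
```

Note that `.` does not match newlines by default, so a marker that spans two lines would survive both passes unless `re.DOTALL` were added.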
 
 
1452
  Important:
1453
  - You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1454
  - You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
1455
+ - Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
1456
+ - LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add \"from LinkedIn Pulse\""""
1457
  )
1458
  summary_chunk = openai_call(prompt=chunk_prompt, model="gpt-4o-mini", max_tokens_param=500, temperature=0.7)
1459
  summary_chunk = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', summary_chunk)
1460
+ summary_chunk = re.sub(r'\[\{(.*?)\}\]', r'\1', summary_chunk)
1461
  global SUMMARIZATION_REQUEST_COUNT, TOTAL_SUMMARIZED_WORDS
1462
  SUMMARIZATION_REQUEST_COUNT += 1
1463
  TOTAL_SUMMARIZED_WORDS += len(summary_chunk.split())
 
1482
  Important:
1483
  - you will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1484
  - you must have at least 20 of such occurences
1485
+ - don't mention this formatting thing in the report, just do it
1486
+ - LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add \"from LinkedIn Pulse\""""
1487
  )
1488
  final_summary = openai_call(prompt=final_prompt, model="gpt-4o-mini", max_tokens_param=target_length, temperature=0.7)
1489
  final_summary = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', final_summary)
1490
+ final_summary = re.sub(r'\[\{(.*?)\}\]', r'\1', final_summary)
1491
  return final_summary.strip()
1492
 
1493
 
 
1549
  - you will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1550
  - You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
1551
  - don't mention this formatting thing in the report, just do it
1552
+ - LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add "from LinkedIn Pulse"
1553
 
1554
  IMPORTANT: Format your response as a proper JSON object with these fields:
1555
  - "relevant": "yes" or "no"
 
1561
  try:
1562
  response = openai_call(prompt=prompt, model="gpt-4o-mini", max_tokens_param=max_tokens, temperature=temperature)
1563
  response = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', response)
1564
+ response = re.sub(r'\[\{(.*?)\}\]', r'\1', response)
1565
  if not response:
1566
  logging.error("analyze_with_gpt4o: Empty response received from API.")
1567
  return {"relevant": "no", "summary": "", "followups": []}
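Because this response is expected to be a JSON object, it may be worth checking how the bare-bracket pattern interacts with JSON arrays. The sketch below assumes a response whose `followups` array holds objects; the source does not show the actual shape of that field, so the sample strings and the helper `is_valid_json` are hypothetical:

```python
import json
import re

# Bare-bracket pattern added in this commit.
PARTIAL_MARKER = re.compile(r'\[\{(.*?)\}\]')


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical response where "followups" is an array of objects.
raw_objects = '{"relevant": "yes", "summary": "ok", "followups": [{"q": "next"}]}'
# Hypothetical response where "followups" is an array of strings.
raw_strings = '{"relevant": "yes", "summary": "ok", "followups": ["next"]}'

stripped_objects = PARTIAL_MARKER.sub(r'\1', raw_objects)
stripped_strings = PARTIAL_MARKER.sub(r'\1', raw_strings)

print(is_valid_json(stripped_objects))  # the array of objects was unwrapped
print(is_valid_json(stripped_strings))  # an array of strings is left alone
```

Under that assumption, an array of objects written on one line is unwrapped like a marker and the result no longer parses, while an array of plain strings is untouched. One option would be to parse the JSON first and strip markers only inside the string fields.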
 
1741
  - You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
1742
  - You must have at least {10 * pages} of such occurences scattered around the report - this is important for the grounding of the report in verifiable facts and sources
1743
  - Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
1744
+ - LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add "from LinkedIn Pulse"
1745
 
1746
  // Sources
1747
  Use the following learnings and merged reference details from a deep research process on:
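The doubled braces in `{{[{{...}}]}}` and the `{10 * pages}` expression suggest these prompts are Python f-strings, in which `{{` and `}}` render as literal braces and single braces are evaluated. A minimal sketch under that assumption; the variable `rule` and the value of `pages` are illustrative only:

```python
pages = 3  # hypothetical page count

# In an f-string, "{{" and "}}" produce literal braces; "{10 * pages}" is evaluated.
rule = f"""- respect the above mentioned formatting {{[{{...}}]}}
- You must have at least {10 * pages} of such occurrences"""

print(rule)
```

The model therefore sees the marker as `{[{...}]}`, which is the form the first post-processing regex matches.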
 
1775
  <h4> for bulletpoint title (ex: <h4>item to detail:</h4>details ...)
1776
  - Use inline formatting for the tables with homogeneous border and colors
1777
  - Avoid Chinese characters in the output (use the Pinyin version) since they won't display correcly in the pdf (black boxes)
1778
+ - For the Table of contents: do not mention the pages, but make each item on separate line
1779
+ - Put "Table of contents" and "Abstract" title in h1 format.
1780
+ - The Table of contents should not mention the abstract and table of contents, the numbering should start from the introduction and end with References Summary Table
1781
+ - The reference table at the end containing the citations details should have 4 columns: the ref number, the title of the document, the author(s, the URL - with hyperlink)
1782
+ - the name of the reference table should be: "Reference Summary Table"
1783
 
1784
  --------------- Placeholders -----------
1785
  In order to enrich the content, within the core sections (between introduction and conclusion), you can inject some placeholders that will be developped later on.
 
1872
  Important note for focus placeholders:
1873
  - after [[ put "Focus Placeholder n:" explicitly (with n as the ref number of the focus box created). This will be used in a regex
1874
  - Do not add a title for the Focus placeholder just before the [[...]], the content that will replace the focus placeholder - generated later on - will already include a title
 
 
 
1875
 
1876
  // Report ending required
1877
  End the report with the following sequence:
 
1922
  report = openai_call(prompt, model="o3-mini", max_tokens_param=tokentarget)
1923
  # Post-processing
1924
  report = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', report)
1925
+ report = re.sub(r'\[\{(.*?)\}\]', r'\1', report)
1926
 
1927
  # If the report is too long, compress it.
1928
  if len(report) > MAX_MESSAGE_LENGTH: