Update app.py
app.py
CHANGED
|
@@ -1232,6 +1232,7 @@ Important:
|
|
| 1232 |
- You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1233 |
- You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
|
| 1234 |
- Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
|
|
|
|
| 1235 |
|
| 1236 |
// Important
|
| 1237 |
- Make it real, with anecdotes from the content
|
|
@@ -1243,6 +1244,7 @@ Important:
|
|
| 1243 |
result = openai_call(prompt, model="o3-mini", max_tokens_param=10000)
|
| 1244 |
result = result.strip().strip("```").strip()
|
| 1245 |
result = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', result)
|
|
|
|
| 1246 |
logging.info(f"The code produced for this focus placeholder:\n{placeholder_text}\n\n {result}\n\n")
|
| 1247 |
return result
|
| 1248 |
|
|
@@ -1450,10 +1452,12 @@ Note: the output will be processed through regex and the identifiers removed, bu
|
|
| 1450 |
Important:
|
| 1451 |
- You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1452 |
- You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
|
| 1453 |
-
- Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
|
|
|
|
| 1454 |
)
|
| 1455 |
summary_chunk = openai_call(prompt=chunk_prompt, model="gpt-4o-mini", max_tokens_param=500, temperature=0.7)
|
| 1456 |
summary_chunk = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', summary_chunk)
|
|
|
|
| 1457 |
global SUMMARIZATION_REQUEST_COUNT, TOTAL_SUMMARIZED_WORDS
|
| 1458 |
SUMMARIZATION_REQUEST_COUNT += 1
|
| 1459 |
TOTAL_SUMMARIZED_WORDS += len(summary_chunk.split())
|
|
@@ -1478,10 +1482,12 @@ Note: the output will be processed through regex and the identifiers removed, bu
|
|
| 1478 |
Important:
|
| 1479 |
- you will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1480 |
- you must have at least 20 of such occurences
|
| 1481 |
-
- don't mention this formatting thing in the report, just do it
|
|
|
|
| 1482 |
)
|
| 1483 |
final_summary = openai_call(prompt=final_prompt, model="gpt-4o-mini", max_tokens_param=target_length, temperature=0.7)
|
| 1484 |
final_summary = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', final_summary)
|
|
|
|
| 1485 |
return final_summary.strip()
|
| 1486 |
|
| 1487 |
|
|
@@ -1543,6 +1549,7 @@ Important:
|
|
| 1543 |
- you will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1544 |
- You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
|
| 1545 |
- don't mention this formatting thing in the report, just do it
|
|
|
|
| 1546 |
|
| 1547 |
IMPORTANT: Format your response as a proper JSON object with these fields:
|
| 1548 |
- "relevant": "yes" or "no"
|
|
@@ -1554,6 +1561,7 @@ IMPORTANT: Format your response as a proper JSON object with these fields:
|
|
| 1554 |
try:
|
| 1555 |
response = openai_call(prompt=prompt, model="gpt-4o-mini", max_tokens_param=max_tokens, temperature=temperature)
|
| 1556 |
response = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', response)
|
|
|
|
| 1557 |
if not response:
|
| 1558 |
logging.error("analyze_with_gpt4o: Empty response received from API.")
|
| 1559 |
return {"relevant": "no", "summary": "", "followups": []}
|
|
@@ -1733,6 +1741,7 @@ Important:
|
|
| 1733 |
- You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1734 |
- You must have at least {10 * pages} of such occurences scattered around the report - this is important for the grounding of the report in verifiable facts and sources
|
| 1735 |
- Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
|
|
|
|
| 1736 |
|
| 1737 |
// Sources
|
| 1738 |
Use the following learnings and merged reference details from a deep research process on:
|
|
@@ -1766,6 +1775,11 @@ Note: Exclude the use of html numbered lists format, they don't get correctly im
|
|
| 1766 |
<h4> for bulletpoint title (ex: <h4>item to detail:</h4>details ...)
|
| 1767 |
- Use inline formatting for the tables with homogeneous border and colors
|
| 1768 |
- Avoid Chinese characters in the output (use the Pinyin version) since they won't display correcly in the pdf (black boxes)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1769 |
|
| 1770 |
--------------- Placeholders -----------
|
| 1771 |
In order to enrich the content, within the core sections (between introduction and conclusion), you can inject some placeholders that will be developped later on.
|
|
@@ -1858,9 +1872,6 @@ with:
|
|
| 1858 |
Important note for focus placeholders:
|
| 1859 |
- after [[ put "Focus Placeholder n:" explicitly (with n as the ref number of the focus box created). This will be used in a regex
|
| 1860 |
- Do not add a title for the Focus placeholder just before the [[...]], the content that will replace the focus placeholder - generated later on - will already include a title
|
| 1861 |
-
- For the Table of contents: do not mention the pages, but make each item on separate line
|
| 1862 |
-
- The reference table at the end containing the citations details should have 4 columns: the ref number, the title of the document, the author(s, the URL - with hyperlink)
|
| 1863 |
-
the name of the reference table should be: "Reference Summary Table"
|
| 1864 |
|
| 1865 |
// Report ending required
|
| 1866 |
End the report with the following sequence:
|
|
@@ -1911,6 +1922,7 @@ Important note: placeholders (visual, graph or focus) can only appear in the sec
|
|
| 1911 |
report = openai_call(prompt, model="o3-mini", max_tokens_param=tokentarget)
|
| 1912 |
# Post-processing
|
| 1913 |
report = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', report)
|
|
|
|
| 1914 |
|
| 1915 |
# If the report is too long, compress it.
|
| 1916 |
if len(report) > MAX_MESSAGE_LENGTH:
|
|
|
|
| 1232 |
- You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1233 |
- You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
|
| 1234 |
- Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
|
| 1235 |
+
- LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add "from LinkedIn Pulse"
|
| 1236 |
|
| 1237 |
// Important
|
| 1238 |
- Make it real, with anecdotes from the content
|
|
|
|
| 1244 |
result = openai_call(prompt, model="o3-mini", max_tokens_param=10000)
|
| 1245 |
result = result.strip().strip("```").strip()
|
| 1246 |
result = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', result)
|
| 1247 |
+
result = re.sub(r'\[\{(.*?)\}\]', r'\1', result)
|
| 1248 |
logging.info(f"The code produced for this focus placeholder:\n{placeholder_text}\n\n {result}\n\n")
|
| 1249 |
return result
|
| 1250 |
|
|
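The added line runs a second substitution after the existing one, so that markers written as `[{...}]` (without the outer braces) are also stripped. The same pair of substitutions appears at each call site touched by this commit. A minimal sketch of that behaviour, assuming a hypothetical helper name `strip_safe_markers` and a made-up sample sentence (neither is in app.py):

```python
import re

# Patterns copied from the diff; the helper name is hypothetical.
FULL_MARKER = re.compile(r'\{\[\{(.*?)\}\]\}')   # {[{...}]}
SHORT_MARKER = re.compile(r'\[\{(.*?)\}\]')      # [{...}]


def strip_safe_markers(text: str) -> str:
    """Remove both marker variants, keeping the wrapped text."""
    # Order matters: strip the full form first, otherwise the short
    # pattern would match inside it and leave stray outer braces.
    text = FULL_MARKER.sub(r'\1', text)
    return SHORT_MARKER.sub(r'\1', text)


# Made-up sample input, for illustration only.
sample = "Growth reached {[{12% in 2023}]} according to [{Jane Doe}]."
cleaned = strip_safe_markers(sample)
print(cleaned)

# '.' does not match a newline here (no re.DOTALL), so a marker that
# spans two lines is left untouched by both passes.
multiline = "{[{first line\nsecond line}]}"
print(strip_safe_markers(multiline) == multiline)
```

Under these assumptions the first call prints the sentence with both markers removed, and the multi-line marker comes back unchanged.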
|
|
| 1452 |
Important:
|
| 1453 |
- You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1454 |
- You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
|
| 1455 |
+
- Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
|
| 1456 |
+
- LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add "from LinkedIn Pulse""""
|
| 1457 |
)
|
| 1458 |
summary_chunk = openai_call(prompt=chunk_prompt, model="gpt-4o-mini", max_tokens_param=500, temperature=0.7)
|
| 1459 |
summary_chunk = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', summary_chunk)
|
| 1460 |
+
summary_chunk = re.sub(r'\[\{(.*?)\}\]', r'\1', summary_chunk)
|
| 1461 |
global SUMMARIZATION_REQUEST_COUNT, TOTAL_SUMMARIZED_WORDS
|
| 1462 |
SUMMARIZATION_REQUEST_COUNT += 1
|
| 1463 |
TOTAL_SUMMARIZED_WORDS += len(summary_chunk.split())
|
|
|
|
| 1482 |
Important:
|
| 1483 |
- you will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1484 |
- you must have at least 20 of such occurences
|
| 1485 |
+
- don't mention this formatting thing in the report, just do it
|
| 1486 |
+
- LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add "from LinkedIn Pulse""""
|
| 1487 |
)
|
| 1488 |
final_summary = openai_call(prompt=final_prompt, model="gpt-4o-mini", max_tokens_param=target_length, temperature=0.7)
|
| 1489 |
final_summary = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', final_summary)
|
| 1490 |
+
final_summary = re.sub(r'\[\{(.*?)\}\]', r'\1', final_summary)
|
| 1491 |
return final_summary.strip()
|
| 1492 |
|
| 1493 |
|
|
|
|
| 1549 |
- you will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1550 |
- You must have at least 20 of such occurences - this is important for the grounding of the report in verifiable facts and sources
|
| 1551 |
- don't mention this formatting thing in the report, just do it
|
| 1552 |
+
- LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add "from LinkedIn Pulse"
|
| 1553 |
|
| 1554 |
IMPORTANT: Format your response as a proper JSON object with these fields:
|
| 1555 |
- "relevant": "yes" or "no"
|
|
|
|
| 1561 |
try:
|
| 1562 |
response = openai_call(prompt=prompt, model="gpt-4o-mini", max_tokens_param=max_tokens, temperature=temperature)
|
| 1563 |
response = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', response)
|
| 1564 |
+
response = re.sub(r'\[\{(.*?)\}\]', r'\1', response)
|
| 1565 |
if not response:
|
| 1566 |
logging.error("analyze_with_gpt4o: Empty response received from API.")
|
| 1567 |
return {"relevant": "no", "summary": "", "followups": []}
|
|
|
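In this hunk the response is expected to be a JSON object, and the new `[{...}]` pattern has the same shape as a one-line JSON array holding a single object. Whether the model ever returns `followups` in that shape is an assumption; the sample payload below is made up to show what the second pass would do if it did:

```python
import json
import re

# Second-pass pattern copied from the diff.
SHORT_MARKER = re.compile(r'\[\{(.*?)\}\]')

# Hypothetical model output: "followups" as a list containing one object.
raw = '{"relevant": "yes", "summary": "ok", "followups": [{"query": "next step"}]}'

# The pattern treats [{"query": "next step"}] as a marker and unwraps it.
stripped = SHORT_MARKER.sub(r'\1', raw)

try:
    json.loads(stripped)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False

print(stripped)
print(parsed_ok)
```

If `followups` is always a list of plain strings, this case does not arise. Separately, `re.sub` raises `TypeError` when given `None`, so if `openai_call` can return `None` the substitutions would fail before the `if not response` check is reached.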
|
| 1741 |
- You will safely mention any of these elements provided you respect the above mentioned formatting {{[{{...}}]}}
|
| 1742 |
- You must have at least {10 * pages} of such occurences scattered around the report - this is important for the grounding of the report in verifiable facts and sources
|
| 1743 |
- Don't mention the safe-formatting in the report or at the end, just do it - This is just for regex processing purpose
|
| 1744 |
+
- LinkedIn is not a source - you should check the author of the page visited, this is the real source, mention the name of the author and then add "from LinkedIn Pulse"
|
| 1745 |
|
| 1746 |
// Sources
|
| 1747 |
Use the following learnings and merged reference details from a deep research process on:
|
|
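The `{10 * pages}` expression in this prompt suggests it is built as an f-string, which would explain the doubled braces in `{{[{{...}}]}}`: inside an f-string, `{{` and `}}` render as literal `{` and `}`. A small sketch, assuming a hypothetical `pages` value and a shortened prompt:

```python
# Hypothetical value; in app.py `pages` comes from elsewhere.
pages = 3

# Shortened stand-in for the prompt text shown in the diff.
prompt = f"""Important:
- respect the above mentioned formatting {{[{{...}}]}}
- You must have at least {10 * pages} of such markers"""

# The model sees single braces and an evaluated count.
print(prompt)
```

So the marker the model is asked to produce is `{[{...}]}`, which is exactly what the first substitution pattern looks for.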
|
|
| 1775 |
<h4> for bulletpoint title (ex: <h4>item to detail:</h4>details ...)
|
| 1776 |
- Use inline formatting for the tables with homogeneous border and colors
|
| 1777 |
- Avoid Chinese characters in the output (use the Pinyin version) since they won't display correcly in the pdf (black boxes)
|
| 1778 |
+
- For the Table of contents: do not mention the pages, but make each item on separate line
|
| 1779 |
+
- Put "Table of contents" and "Abstract" title in h1 format.
|
| 1780 |
+
- The Table of contents should not mention the abstract and table of contents, the numbering should start from the introduction and end with References Summary Table
|
| 1781 |
+
- The reference table at the end containing the citations details should have 4 columns: the ref number, the title of the document, the author(s, the URL - with hyperlink)
|
| 1782 |
+
- the name of the reference table should be: "Reference Summary Table"
|
| 1783 |
|
| 1784 |
--------------- Placeholders -----------
|
| 1785 |
In order to enrich the content, within the core sections (between introduction and conclusion), you can inject some placeholders that will be developped later on.
|
|
|
|
| 1872 |
Important note for focus placeholders:
|
| 1873 |
- after [[ put "Focus Placeholder n:" explicitly (with n as the ref number of the focus box created). This will be used in a regex
|
| 1874 |
- Do not add a title for the Focus placeholder just before the [[...]], the content that will replace the focus placeholder - generated later on - will already include a title
|
|
|
|
|
|
|
|
|
|
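The note above says the "Focus Placeholder n:" prefix is later used in a regex, but the pattern itself is not part of this diff. The following is only an illustrative guess at how such a placeholder could be located; the pattern, the variable names and the sample report are all assumptions:

```python
import re

# Hypothetical pattern: [[Focus Placeholder <n>: <description>]]
FOCUS_RE = re.compile(r'\[\[\s*Focus Placeholder (\d+):\s*(.*?)\]\]', re.DOTALL)

# Made-up report fragment.
report = "Intro text.\n[[Focus Placeholder 1: history of the topic]]\nMore text."

# Each match yields the reference number and the placeholder description.
matches = FOCUS_RE.findall(report)
print(matches)
```

An explicit, fixed prefix makes the placeholder easy to match, which is presumably why the prompt insists on it.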
| 1875 |
|
| 1876 |
// Report ending required
|
| 1877 |
End the report with the following sequence:
|
|
|
|
| 1922 |
report = openai_call(prompt, model="o3-mini", max_tokens_param=tokentarget)
|
| 1923 |
# Post-processing
|
| 1924 |
report = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', report)
|
| 1925 |
+
report = re.sub(r'\[\{(.*?)\}\]', r'\1', report)
|
| 1926 |
|
| 1927 |
# If the report is too long, compress it.
|
| 1928 |
if len(report) > MAX_MESSAGE_LENGTH:
|