Guiyom committed
Commit 354a2a1 · verified
1 Parent(s): 40373c8

Update app.py

Files changed (1)
  1. app.py +35 -7
app.py CHANGED
@@ -405,7 +405,7 @@ def generate_graph_snippet(placeholder_text: str, context: str, initial_query: s
  padding: 20px;
  }
  h2 {
- text-align: center;
  margin-bottom: 0;
  }
  #chart {
@@ -534,7 +534,7 @@ def generate_graph_snippet(placeholder_text: str, context: str, initial_query: s
  font-family: Arial, sans-serif;
  }
  h1 {
- text-align: center;
  margin: 20px 0;
  }
  #chart {
@@ -692,7 +692,7 @@ def generate_graph_snippet(placeholder_text: str, context: str, initial_query: s
  background-color: #f5f5f5;
  }
  h1 {
- text-align: center;
  margin-bottom: 10px;
  color: #333;
  }
@@ -870,6 +870,7 @@ Keep in mind the:
  // Important
  - Make the visuals content rich, there's no point having a visual if its content has no real value.
  - It has to convey some relevant insights.
  - Take a deep breath, think step by step and think it well.
  - Use your judgement to decide between box plots, bubble charts, calendar view, chord diagrams, histograms, ...
  - Your response should start with <html> and end with </html> - no intro before, no comments after.
@@ -1217,6 +1218,18 @@ Note: no need to add a reference table at the end of the focus box, all the refe
  - Skip lines and add intermediate titles for key paragraphs
  - You have freedom on the structure, but it has to cover all potential aspects on the topic in between 500 and 1000 words

  // Important
  - Make it real, with anecdotes from the content
  - Ground the content on the inputs shared above (context and knowledge inputs) with facts and numbers
@@ -1227,6 +1240,7 @@ Note: no need to add a reference table at the end of the focus box, all the refe

  result = openai_call(prompt, model="o3-mini", max_tokens_param=10000)
  result = result.strip().strip("```").strip()
  logging.info(f"The code produced for this focus placeholder:\n{placeholder_text}\n\n {result}\n\n")
  return result
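A note on the fence-stripping line in this hunk: `str.strip` with a string argument removes any of the given characters from both ends, not a literal prefix or suffix, so a language tag after the opening fence would survive. The sketch below illustrates that behaviour and one possible alternative. `strip_code_fence` is a hypothetical helper, not part of `app.py`, and the sample string is invented for illustration.

```python
import re

FENCE = "`" * 3  # three backticks, built programmatically for readability

# Hypothetical helper: match one surrounding fence with an optional language tag.
_FENCED = re.compile(
    re.escape(FENCE) + r"[\w-]*\n(.*?)\n?" + re.escape(FENCE),
    flags=re.DOTALL,
)

def strip_code_fence(text: str) -> str:
    """Return the fenced body if the whole text is one fenced block, else the text unchanged."""
    text = text.strip()
    match = _FENCED.fullmatch(text)
    return match.group(1).strip() if match else text

# Invented model output wrapped in a fence with a language tag.
raw = f"{FENCE}html\n<html><body>ok</body></html>\n{FENCE}"

# Same chain of calls as in the diff: the backticks go, the "html" tag stays.
chained = raw.strip().strip(FENCE).strip()
print(repr(chained))

# The helper drops the tag together with the fence.
print(repr(strip_code_fence(raw)))
```

Whether this matters depends on how often the model wraps its answer in a tagged fence, which the diff does not show.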
 
@@ -1466,7 +1480,7 @@ Instructions:
  - Key Facts (at least 5): List the core factual claims using short, declarative sentences or bullet points. Apply lemmatization and standard abbreviations.
  - Key Figures (at least 5): Extract numerical data (statistics, dates, percentages) and include any necessary context (units, references, explanations) required to interpret these numbers. Present them concisely (list or table format).
  - Key Arguments (at least 5): Identify main arguments or claims. Summarize supporting evidence and counter-arguments concisely.
- - Key Quotes (at least 1 if any): Include significant quotes (with the author's name in parentheses). Attribute quotes correctly. Paraphrase if needed, indicating that it's a paraphrase. Use symbols (e.g., &, +, ->, =) to conserve tokens.
  - Structured Summary (10 to 50 sentences): Provide a structured summary that includes anecdotes, people, and locations to ensure the report is relatable.

  Note: General Optimization Guidelines:
@@ -1647,15 +1661,25 @@ def generate_final_report(initial_query: str, context: str, reportstyle: str, le
  word_count = pages * 500
  prompt = (f"""
  // Instructions:
- 1. Integrate numbers, quotes, and factual references systematically (We want to incorporate as many relevant numbers, statistics, factual references, quotes from the sources)
- ex: Google CEO ... mentioned that / he report xxx from yyy released in ddd revealed that ... / Mr.xyz said that "..."
  2. Whenever you mention a figure or quote, add an inline reference [x] matching its source from the references.
  3. Again, Specifically name relevant organizations, tools, project names, and people encountered in the crumbs or learnings.
  Note: This is for academic purposes, so thorough citations and referencing are essential.
  4. Focus on reputable sources that will not be disputed (generally social media posts cannot be an opposable sources, but some of them may mention reputable sources)
  Note: put the full reference url (no generic domain address), down to the html page or the pdf
  5. It must follow this writing style {reportstyle}.
- 6. All key information giving credit and materiality to the report (numbers, names, people/titles, dates, papers, reports, organisation/institute/NGO/government bodies quotes, products, project names, ... ) should be explicitly mentioned IN the report with ref to the sources from the search results.

  // Sources
  Use the following learnings and merged reference details from a deep research process on:
@@ -1668,6 +1692,7 @@ The report should be very detailed and lengthy — approximately the equivalent
  If you cannot produce a high level of details for specific topic or section, put it in <div class="improvable-chunk">...</div>

  // Requirements
  - It must include inline citations (e.g., [1], [2], etc.) from real sources provided in the search results below
  - Do not add any inline citations reference in the placeholders descriptions below (visual, graph), you can add them in focus though.
  - No more than 7 sentences per div blocks, skip lines and add line breaks when changing topic.
@@ -1824,6 +1849,9 @@ Then close the html code from the broader report
  )
  tokentarget = word_count * 5 # rough multiplier for token target
  report = openai_call(prompt, model="o3-mini", max_tokens_param=tokentarget)
  # If the report is too long, compress it.
  if len(report) > MAX_MESSAGE_LENGTH:
      report = compress_text(report, MAX_MESSAGE_LENGTH)
 
  padding: 20px;
  }
  h2 {
+ text-align: left;
  margin-bottom: 0;
  }
  #chart {
 
  font-family: Arial, sans-serif;
  }
  h1 {
+ text-align: left;
  margin: 20px 0;
  }
  #chart {
 
  background-color: #f5f5f5;
  }
  h1 {
+ text-align: left;
  margin-bottom: 10px;
  color: #333;
  }
 
  // Important
  - Make the visuals content rich, there's no point having a visual if its content has no real value.
  - It has to convey some relevant insights.
+ - Make sure all the items are visible and don't require hovering the mouse to be displayed - this report is meant to be printed.
  - Take a deep breath, think step by step and think it well.
  - Use your judgement to decide between box plots, bubble charts, calendar view, chord diagrams, histograms, ...
  - Your response should start with <html> and end with </html> - no intro before, no comments after.
 
  - Skip lines and add intermediate titles for key paragraphs
  - You have freedom on the structure, but it has to cover all potential aspects on the topic in between 500 and 1000 words

+ // Mentioning sources, organisations and individuals
+ - We will perform a post-processing on the output to remove all the identifiable organisations, people and projects
+ - For this reasons use this format for any specific name, organisation or project: \{[\{name\}]\}
+ ex1: \{[\{Google\}]\} CEO, \{[\{Sundar Pichai\}]\} ...
+ ex2: in a report from the \{[\{university of Berkeley\}]\} titled "\{[\{The great acceleration\}]\}"...
+ ex3: the CEO of \{[\{Softbank\}]\} , \{[\{Masayoshi Son\}]\}, said that "the best way to..."
+ ex4: the project \{[\{Stargate\}]\}, anounced by \{[\{OpenAI\}]\} in collaboration with \{[\{Salesforce\}]\}
+ ex5: Mr. \{[\{Michael Parrot\}]\}, Marketing director in \{[\{Panasonic\}]\}, mentioned that ...
+ Note: the output will be processed through regex and the identifiers removed, but this way we can keep track of all sources and citations without disclosing them.
+ - This should apply to names, people/titles, dates, papers, reports, organisation/institute/NGO/government bodies quotes, products, project names, ...
+ Important: you can safely mention any of these provided you respect the above mentioned formatting \{[\{...\}]\}
+
  // Important
  - Make it real, with anecdotes from the content
  - Ground the content on the inputs shared above (context and knowledge inputs) with facts and numbers
 
  result = openai_call(prompt, model="o3-mini", max_tokens_param=10000)
  result = result.strip().strip("```").strip()
+ result = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', result)
  logging.info(f"The code produced for this focus placeholder:\n{placeholder_text}\n\n {result}\n\n")
  return result
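The added `re.sub` line can be read in isolation as the sketch below. The pattern is copied from the diff. `strip_markers` and `list_marked_names` are hypothetical names introduced here, and the sample sentence is adapted from the prompt's own examples. Note that the replacement `r'\1'` keeps the captured name and removes only the `{[{` and `}]}` wrapper, and that `.` does not match a newline unless `re.DOTALL` is passed, so a marker broken across lines would be left untouched.

```python
import re

# Pattern copied from the diff: lazily capture whatever sits between "{[{" and "}]}".
MARKER = re.compile(r'\{\[\{(.*?)\}\]\}')

def strip_markers(text: str) -> str:
    """Replace each {[{name}]} marker with the bare name (the name itself is kept)."""
    return MARKER.sub(r'\1', text)

def list_marked_names(text: str) -> list:
    """Hypothetical helper: collect the marked names before they are unwrapped."""
    return MARKER.findall(text)

sample = 'the CEO of {[{Softbank}]}, {[{Masayoshi Son}]}, said that "..."'

print(list_marked_names(sample))  # ['Softbank', 'Masayoshi Son']
print(strip_markers(sample))      # the CEO of Softbank, Masayoshi Son, said that "..."

# A marker that spans a line break is not matched without re.DOTALL.
print(strip_markers('{[{split\nname}]}') == '{[{split\nname}]}')  # True
```

Collecting the names before unwrapping them is one way to "keep track of all sources", as the prompt's note puts it. The diff itself only shows the unwrapping step.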
 
 
  - Key Facts (at least 5): List the core factual claims using short, declarative sentences or bullet points. Apply lemmatization and standard abbreviations.
  - Key Figures (at least 5): Extract numerical data (statistics, dates, percentages) and include any necessary context (units, references, explanations) required to interpret these numbers. Present them concisely (list or table format).
  - Key Arguments (at least 5): Identify main arguments or claims. Summarize supporting evidence and counter-arguments concisely.
+ - Key Quotes (at least 1 if any): Include significant quotes (with the author's name in parenthesis). Attribute quotes correctly. Paraphrase if needed, indicating that it's a paraphrase. Use symbols (e.g., &, +, ->, =) to conserve tokens.
  - Structured Summary (10 to 50 sentences): Provide a structured summary that includes anecdotes, people, and locations to ensure the report is relatable.

  Note: General Optimization Guidelines:
 
  word_count = pages * 500
  prompt = (f"""
  // Instructions:
+ 1. Integrate numbers from the sources but always mention the source
  2. Whenever you mention a figure or quote, add an inline reference [x] matching its source from the references.
  3. Again, Specifically name relevant organizations, tools, project names, and people encountered in the crumbs or learnings.
  Note: This is for academic purposes, so thorough citations and referencing are essential.
  4. Focus on reputable sources that will not be disputed (generally social media posts cannot be an opposable sources, but some of them may mention reputable sources)
  Note: put the full reference url (no generic domain address), down to the html page or the pdf
  5. It must follow this writing style {reportstyle}.
+
+ // Mentioning sources, organisations and individuals
+ - We will perform a post-processing on the output to remove all the identifiable organisations, people and projects
+ - For this reasons use this format for any specific name, organisation or project: \{[\{name\}]\}
+ ex1: \{[\{Google\}]\} CEO, \{[\{Sundar Pichai\}]\} ...
+ ex2: in a report from the \{[\{university of Berkeley\}]\} titled "\{[\{The great acceleration\}]\}"...
+ ex3: the CEO of \{[\{Softbank\}]\} , \{[\{Masayoshi Son\}]\}, said that "the best way to..."
+ ex4: the project \{[\{Stargate\}]\}, anounced by \{[\{OpenAI\}]\} in collaboration with \{[\{Salesforce\}]\}
+ ex5: Mr. \{[\{Michael Parrot\}]\}, Marketing director in \{[\{Panasonic\}]\}, mentioned that ...
+ Note: the output will be processed through regex and the identifiers removed, but this way we can keep track of all sources and citations without disclosing them.
+ - This should apply to names, people/titles, dates, papers, reports, organisation/institute/NGO/government bodies quotes, products, project names, ...
+ Important: you can safely mention any of these provided you respect the above mentioned formatting \{[\{...\}]\}

  // Sources
  Use the following learnings and merged reference details from a deep research process on:
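This prompt is opened with `f"""`, and the marker examples in it are written with backslashes before the braces. The diff alone does not show how those lines behave at runtime. What is certain is that inside a Python f-string a literal brace is produced by doubling it (`{{` and `}}`), while a single brace starts a replacement field. The sketch below shows how the marker format could be written that way. `org` is a hypothetical variable used only for illustration.

```python
import re

org = "Google"  # hypothetical value, not taken from app.py

# Doubled braces yield literal braces in the rendered string.
format_line = f"use this format for any specific name: {{[{{name}}]}}"

# Triple braces: "{{" is a literal "{", "{org}" is the replacement field, "}}" is a literal "}".
example_line = f"ex1: {{[{{{org}}}]}} CEO"

print(format_line)   # use this format for any specific name: {[{name}]}
print(example_line)  # ex1: {[{Google}]} CEO

# The rendered marker is what the post-processing regex from the diff expects.
print(re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', example_line))  # ex1: Google CEO
```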
 
  If you cannot produce a high level of details for specific topic or section, put it in <div class="improvable-chunk">...</div>

  // Requirements
+ - All text alignment has to be on the left
  - It must include inline citations (e.g., [1], [2], etc.) from real sources provided in the search results below
  - Do not add any inline citations reference in the placeholders descriptions below (visual, graph), you can add them in focus though.
  - No more than 7 sentences per div blocks, skip lines and add line breaks when changing topic.
 
  )
  tokentarget = word_count * 5 # rough multiplier for token target
  report = openai_call(prompt, model="o3-mini", max_tokens_param=tokentarget)
+ # Post-processing
+ report = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', report)
+
  # If the report is too long, compress it.
  if len(report) > MAX_MESSAGE_LENGTH:
      report = compress_text(report, MAX_MESSAGE_LENGTH)
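As a rough illustration of the last hunk, the sketch below reproduces the token-budget arithmetic (`pages * 500` words, times 5) and the order of operations: markers are unwrapped first, then the length check runs on the final text. `compress_text` and `MAX_MESSAGE_LENGTH` are not shown in the diff, so `compress_text_stub` and the limit value here are stand-ins chosen for the example.

```python
import re

MAX_MESSAGE_LENGTH = 60  # hypothetical limit; the real value is not shown in the diff

def token_target(pages: int, words_per_page: int = 500, tokens_per_word: int = 5) -> int:
    """Mirror the arithmetic in the diff: pages -> words -> rough token budget."""
    return pages * words_per_page * tokens_per_word

def compress_text_stub(text: str, limit: int) -> str:
    """Stand-in for compress_text, whose implementation is not shown; it simply truncates."""
    return text[:limit]

def postprocess(report: str) -> str:
    # Unwrap {[{...}]} markers first so the length check sees the final text.
    report = re.sub(r'\{\[\{(.*?)\}\]\}', r'\1', report)
    if len(report) > MAX_MESSAGE_LENGTH:
        report = compress_text_stub(report, MAX_MESSAGE_LENGTH)
    return report

print(token_target(8))  # 20000
print(postprocess("{[{OpenAI}]} announced {[{Stargate}]}"))  # OpenAI announced Stargate
```

Running the substitution before the length check means the wrapper characters do not count toward the limit.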