Guiyom committed on
Commit
60c0305
·
verified ·
1 Parent(s): c75cae4

Update app.py

Browse files
Files changed (1) hide show
  1. app.py +54 -27
app.py CHANGED
@@ -232,7 +232,24 @@ Note: General Optimization Guidelines:
232
  4. Ensure that the summary length and level of detail is proportional to the source length.
233
  Source length: {snippet_words} words. You may produce a more detailed summary if the text is long.
234
 
235
- IMPORTANT: Format your response as a proper JSON object with these fields:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
236
  - "relevant": "yes" or "no"
237
  - url: full url
238
  - title: title
@@ -2439,33 +2456,43 @@ def generate_query_tree(context: str, breadth: int, depth: int, selected_engines
2439
  # If selected_engines is None, provide a fallback string
2440
  list_engines = "all relevant search engines" if selected_engines is None else ','.join(map(str, selected_engines))
2441
 
2442
- prompt = f"""
2443
- Generate a list of {breadth} search queries relevant to the following context:
2444
-
2445
- "{context}"
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2446
 
2447
- // Requirements
2448
- - The queries should be suitable for a search engine.
2449
- - Each query should combine terms using logical operators (AND, OR) where appropriate.
2450
- - Do not include explanations or introductory phrases,
2451
- - Just output a JSON object containing a list of strings named 'queries'.
2452
-
2453
- // IMPORTANT:
2454
- - Return only valid JSON without any markdown code fences (```) or mention of json
2455
-
2456
- // EXAMPLE (if breadth = 4):
2457
- {{
2458
- "queries": [
2459
- (Artificial Intelligence OR data science) AND mathematics,
2460
- (geometry OR algebra) AND research AND machine learning,
2461
- (calculus OR "differential equations") AND "AI applications",
2462
- "Statistics" AND "data analysis" AND "machine learning algorithms"
2463
- ]
2464
- }}
2465
-
2466
- Do not include any extra text, markdown formatting, or commentary. Output the JSON starting from "{{" and ending with "}}".
2467
- Now generate the result.
2468
- """
2469
  messages = []
2470
  llm_response = llm_call(prompt=prompt, messages=messages, model="o3-mini", temperature=0, max_tokens_param=1500)
2471
  logging.info(f"Generated query tree: {llm_response}")
 
232
  4. Ensure that the summary length and level of detail is proportional to the source length.
233
  Source length: {snippet_words} words. You may produce a more detailed summary if the text is long.
234
 
235
+ // Special guidance for follow-up search queries
236
+ 1. Query Progression:
237
+ - Begin with foundational/conceptual queries
238
+ - Progress to methodological/technical terms
239
+ - Culminate in specialized/applied combinations
240
+
241
+ 2. Term Optimization:
242
+ - Use Boolean logic (AND/OR) strategically
243
+ - Include both general terminology AND domain-specific jargon
244
+ - Add temporal filters when relevant (e.g., "since 2018", "2020-2023")
245
+ - Consider geographical/cultural modifiers if applicable
246
+
247
+ 3. Query Structure:
248
+ - Prioritize conceptual combinations over simple keyword matching
249
+ - Use quotation marks for exact phrases and hyphenation for compound terms
250
+ - Include emerging terminology variants (e.g., "LLMs" OR "large language models")
251
+
252
+ // IMPORTANT: Format your response as a proper JSON object with these fields:
253
  - "relevant": "yes" or "no"
254
  - url: full url
255
  - title: title
 
2456
  # If selected_engines is None, provide a fallback string
2457
  list_engines = "all relevant search engines" if selected_engines is None else ','.join(map(str, selected_engines))
2458
 
2459
+ prompt = f"""
2460
+ Generate {breadth} search queries for academic research exploring: "{context}"
2461
+
2462
+ // Research Strategy Requirements
2463
+ 1. Query Progression:
2464
+ - Begin with foundational/conceptual queries
2465
+ - Progress to methodological/technical terms
2466
+ - Culminate in specialized/applied combinations
2467
+
2468
+ 2. Term Optimization:
2469
+ - Use Boolean logic (AND/OR) strategically
2470
+ - Include both general terminology AND domain-specific jargon
2471
+ - Add temporal filters when relevant (e.g., "since 2018", "2020-2023")
2472
+ - Consider geographical/cultural modifiers if applicable
2473
+
2474
+ 3. Query Structure:
2475
+ - Prioritize conceptual combinations over simple keyword matching
2476
+ - Use quotation marks for exact phrases and hyphenation for compound terms
2477
+ - Include emerging terminology variants (e.g., "LLMs" OR "large language models")
2478
+
2479
+ // Output Requirements
2480
+ - Strictly valid JSON format: {{"queries": ["query1", "query2"]}}
2481
+ - No Markdown, code fences, or supplementary text
2482
+ - Clean string formatting without special characters
2483
+
2484
+ // Example (breadth=4):
2485
+ {{
2486
+ "queries": [
2487
+ "Fundamental theories AND (Artificial Intelligence OR machine learning)",
2488
+ "(Computational mathematics OR statistical modeling) AND research paradigms",
2489
+ "\"Deep learning architectures\" AND (optimization techniques OR neural networks)",
2490
+ "\"Generative AI\" AND industrial applications AND (2020-2024 OR recent developments)"
2491
+ ]
2492
+ }}
2493
+
2494
+ Generate queries that systematically explore the research landscape from multiple conceptual angles."""
2495
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2496
  messages = []
2497
  llm_response = llm_call(prompt=prompt, messages=messages, model="o3-mini", temperature=0, max_tokens_param=1500)
2498
  logging.info(f"Generated query tree: {llm_response}")