Spaces: gabejavitt/agentCourse

gabejavitt committed on Oct 28, 2025

Commit 87f9e05 · verified · 1 Parent(s): 821692a

Update app.py

Files changed (1)

app.py +82 -81

@@ -334,90 +334,91 @@ def should_continue(state: AgentState):
 # --- Basic Agent Definition ---
 class BasicAgent:
     def __init__(self):
-            print("BasicAgent (LangGraph) initializing...")
-            GROQ_API_KEY = os.getenv("GROQ_API_KEY")
-            if not GROQ_API_KEY: raise ValueError("GROQ_API_KEY secret is not set!")
-            self.tools = defined_tools
-            # Build tool descriptions separately to avoid f-string backslash issues
-            tool_desc_list = []
-            for tool in self.tools:
-                if tool.name == 'code_interpreter':
-                    desc = (f"- {tool.name}: Executes Python code. Use for calculations, data manipulation, or logic puzzles.\n"
-                            f"  **CODE INTERPRETER RULES:**\n"
-                            f"  1. ALWAYS use `print()` for final results.\n"
-                            f"  2. Write SIMPLE, single-step scripts.\n"
-                            f"  3. PLAN your next script using plain text output first.\n"
-                            f"  4. Write reasoning as Python comments (#) before code.\n"
-                            f"  'pandas' (as pd) is available.")
-                else:
-                    desc = f"- {tool.name}: {tool.description}"
-                tool_desc_list.append(desc)
-            tool_descriptions = "\n".join(tool_desc_list)
-            # ==================== SYSTEM PROMPT V4 ====================
-            self.system_prompt = f"""You are a highly intelligent and meticulous AI assistant for the GAIA benchmark.
-    Your goal is to provide the concise, factual answer by strictly following a step-by-step reasoning process.
-    **CRITICAL PROTOCOL: YOU MUST FOLLOW THIS PROCESS**
-    1.  **ANALYZE:** Read the question and all messages in the history.
-    2.  **MANDATORY FIRST STEP:** Your *first* response on *any* new task MUST be a plan in plain text. Do NOT call any tool on your first turn. Write down your logic, what you need, and which tool you *plan* to use next. Failure to provide a plan first will result in incorrect behavior.
-    3.  **EXECUTE:** After submitting your plan, you will run again. Now, execute the *next* step of your plan by calling the *one* appropriate tool using the correct JSON format.
-    4.  **ANALYZE TOOL OUTPUT:** You will receive a ToolMessage with the output. You MUST read it carefully.
-    5.  **REPEAT or FINISH:**
-        * **If more steps are needed:** Go back to step 1 (ANALYZE the new info & PLAN). Write an *updated* plan as plain text (e.g., "The search found X. My next step is to use code_interpreter to process X...").
-        * **If the ToolMessage contains the final answer:** You MUST call the `final_answer_tool`. Your answer *must* be derived *only* from the ToolMessage output, not your own knowledge.
-    **RULES:**
-    * **NEVER** call a tool on the same turn you write a plan (plain text).
-    * **NEVER** use your pre-trained "leaked" knowledge for the final answer. The answer *must* come from a ToolMessage (e.g., from `code_interpreter`'s print() or `search_tool`).
-    * **NEVER** answer a logic puzzle from memory. You *must* use `code_interpreter`, ensure it `print()`s the result, analyze that output, and then use that printed result for `final_answer_tool`.
-    * **NEVER** call `final_answer_tool` until a tool has explicitly given you the answer in its output.
-    * **Error Handling:** If a tool call returns an Error, your next step (Step 1 PLAN) MUST analyze the error message and propose a *different* approach (different tool, different arguments, different logic). Do not retry the exact same failed call.
-    **TOOLS:**
-    {tool_descriptions}
-    **TOOL FORMAT (JSON ONLY):**
-    Respond ONLY with a single JSON block like this when calling a tool:
-    ```json
-    {{
-      "tool": "tool_name",
-      "tool_input": {{ "arg_name1": "value1", ... }}
-    }}
-    ```
-    * Replace `tool_name` with the tool's name. Provide arguments in `tool_input`. Match names/types precisely.
-    * Do not add any text before or after the JSON block.
-    Example for final_answer_tool:
-    ```json
-    {{
-      "tool": "final_answer_tool",
-      "tool_input": {{
-        "answer": "The final answer string here"
-      }}
-    }}
-    ```
-    NOTE: The value for "answer" MUST be a string enclosed in double quotes.
-    """
-            print("Initializing Groq LLM Endpoint...")
-            try:
-                chat_llm = ChatGroq(
-                    temperature=0.01, # Low temperature for factual tasks
-                    groq_api_key=GROQ_API_KEY,
-                    model_name="openai/gpt-oss-120b" # <-- Switched Model
-                )
-                print("✅ Groq LLM Endpoint initialized with openai/gpt-oss-120b.")
-            except Exception as e:
-                print(f"Error initializing Groq: {e}")
-                raise
-            self.llm_with_tools = chat_llm.bind_tools(self.tools)
-            print("✅ Tools bound to LLM (using bind_tools).")
+        print("BasicAgent (LangGraph) initializing...")
+        GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+        if not GROQ_API_KEY: raise ValueError("GROQ_API_KEY secret is not set!")
+        self.tools = defined_tools
+        # Build tool descriptions separately to avoid f-string backslash issues
+        tool_desc_list = []
+        for tool in self.tools:
+            if tool.name == 'code_interpreter':
+                desc = (f"- {tool.name}: Executes Python code. Use for calculations, data manipulation, or logic puzzles.\n"
+                        f"  **CODE INTERPRETER RULES:**\n"
+                        f"  1. ALWAYS use `print()` for final results.\n"
+                        f"  2. Write SIMPLE, single-step scripts.\n"
+                        f"  3. PLAN your next script using plain text output first.\n"
+                        f"  4. Write reasoning as Python comments (#) before code.\n"
+                        f"  'pandas' (as pd) is available.")
+            else:
+                desc = f"- {tool.name}: {tool.description}"
+            tool_desc_list.append(desc)
+        tool_descriptions = "\n".join(tool_desc_list)
+        # ==================== SYSTEM PROMPT V5 (Improved) ====================
+        self.system_prompt = f"""You are a highly intelligent and meticulous AI assistant for the GAIA benchmark.
+Your goal is to provide the EXACT, concise, factual answer by strictly following a step-by-step reasoning process.
+**CRITICAL PROTOCOL: YOU MUST FOLLOW THIS PROCESS**
+1.  **ANALYZE:** Read the question carefully. Identify what format the answer should be in (number, yes/no, list, name, etc.).
+2.  **PLAN (First Turn Only):** Your *first* response MUST be a brief plan in plain text:
+    - What information do you need?
+    - Which tool will you use first?
+    - What format should the final answer be in?
+    DO NOT call any tool on your first turn.
+3.  **EXECUTE ONE TOOL:** Call exactly ONE tool per turn. Wait for the result before planning your next step.
+4.  **VERIFY TOOL OUTPUT:**
+    - Read the ToolMessage carefully
+    - Check if it contains errors - if so, plan a different approach
+    - Check if you have enough information for the final answer
+5.  **ITERATE OR FINISH:**
+    - **Need more info?** Write a brief plan (1-2 sentences) then call the next tool
+    - **Have the answer?** Call `final_answer_tool` immediately with the EXACT answer from the tool output
+**CRITICAL RULES:**
+* **ANSWER FORMAT:** Match the exact format requested (if question asks for a number, return ONLY the number; if it asks for a list, return ONLY the list)
+* **NO HALLUCINATIONS:** The answer MUST come from tool outputs, NEVER from your training data
+* **ONE TOOL PER TURN:** Never call multiple tools or make plans and tool calls in the same turn
+* **USE CODE FOR LOGIC:** For ANY calculation, counting, or logical reasoning, use `code_interpreter` and ensure it prints the result
+* **ERROR RECOVERY:** If a tool fails, analyze WHY and try a completely different approach
+* **FINAL ANSWER FORMAT:** Strip ALL explanatory text. Examples:
+  - Question asks for number → Answer: "42" (not "The answer is 42" or "42 coins")
+  - Question asks for list → Answer: "apple, banana, cherry" (not "The list is: apple, banana, cherry")
+  - Question asks for yes/no → Answer: "Yes" or "No" (not "Yes, because...")
+**TOOLS:**
+{tool_descriptions}
+**REMEMBER:**
+- Use tools, don't guess
+- One tool at a time
+- Final answer must match requested format exactly
+- No explanations in final answer
+"""
+        print("Initializing Groq LLM Endpoint...")
+        try:
+            chat_llm = ChatGroq(
+                temperature=0, # Changed from 0.01 to 0 for maximum determinism
+                groq_api_key=GROQ_API_KEY,
+                model_name="llama-3.3-70b-versatile", # Better model for reasoning
+                max_tokens=4096, # Explicit limit
+                timeout=60 # Add timeout for stability
+            )
+            print("✅ Groq LLM Endpoint initialized with llama-3.3-70b-versatile.")
+        except Exception as e:
+            print(f"Error initializing Groq: {e}")
+            raise
+        self.llm_with_tools = chat_llm.bind_tools(self.tools)
+        print("✅ Tools bound to LLM (using bind_tools).")
     # --- Agent Node with Robust Parsing Fallback ---
     def agent_node(state: AgentState):
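
The FINAL ANSWER FORMAT rule in the V5 prompt relies on the model stripping explanatory text by itself. A post-processing guard could enforce the same rule in code before the answer is submitted. The sketch below is only an illustration under stated assumptions: `normalize_final_answer` and the `expected` labels ("number", "yes_no", "list", "text") are hypothetical and do not appear in app.py, and the regular expressions cover only the phrasings listed in the prompt's examples.

```python
import re

# Hypothetical helper, not part of app.py: mirrors the FINAL ANSWER FORMAT
# examples from the V5 system prompt.
# Matches lead-ins such as "The answer is", "The list is:", "Answer:".
_LEAD_IN = re.compile(
    r"^\s*(?:the\s+)?(?:final\s+)?(?:answer|list|result)(?:\s+is\b\s*:?|\s*:)\s*",
    re.IGNORECASE,
)


def normalize_final_answer(raw: str, expected: str = "text") -> str:
    """Strip explanatory text from a model answer.

    `expected` is an assumed label: "number", "yes_no", "list", or "text".
    """
    # Drop a leading "The answer is ..." phrase and a trailing period.
    text = _LEAD_IN.sub("", raw.strip()).strip().rstrip(".")

    if expected == "number":
        # Keep the first number only; thousands separators are removed first.
        match = re.search(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
        return match.group(0) if match else text

    if expected == "yes_no":
        # Keep a leading yes/no and discard any justification after it.
        match = re.match(r"(yes|no)\b", text, re.IGNORECASE)
        return match.group(1).capitalize() if match else text

    if expected == "list":
        # Re-join comma-separated items with uniform spacing.
        items = [item.strip() for item in text.split(",")]
        return ", ".join(item for item in items if item)

    return text


if __name__ == "__main__":
    print(normalize_final_answer("The answer is 42", "number"))
    print(normalize_final_answer("The list is: apple, banana, cherry", "list"))
    print(normalize_final_answer("Yes, because the total is even.", "yes_no"))
```

A guard like this would run on the `answer` argument passed to `final_answer_tool`; whether the expected format is inferred from the question or supplied by the agent's plan is left open here.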