Spaces:

gabejavitt
/

agentCourse

Sleeping

App Files Files Community

gabejavitt commited on Oct 28, 2025

Commit

d7eb8af

verified ·

1 Parent(s): 99b68de

Update app.py

Browse files

Files changed (1) hide show

app.py +64 -35

app.py CHANGED Viewed

@@ -10,6 +10,7 @@ import torch
 import json # For robust tool call parsing/generation if needed
 import re # For finding JSON
 import uuid # For generating tool call IDs
 # --- Multimodal & Web Tool Imports ---
 from transformers import pipeline
@@ -68,30 +69,30 @@ def search_tool(query: str) -> str:
 @tool
 def code_interpreter(code: str) -> str:
-    """
-    Executes a string of Python code and returns its stdout, stderr, and any error.
-    ...
-    """
     print(f"--- Calling Code Interpreter with code:\n{code}\n---")
     output_stream = io.StringIO()
     error_stream = io.StringIO()
     try:
         with contextlib.redirect_stdout(output_stream), contextlib.redirect_stderr(error_stream):
-            # --- FIX IS HERE ---
-            # Create a safe environment that includes 'pd' and standard Python built-ins
             safe_globals = {
                 "pd": pd,
-                "__builtins__": __builtins__  # This is the crucial addition
             }
-            # Execute the code within this safe environment
             exec(code, safe_globals, {})
-            # --- END FIX ---
         stdout = output_stream.getvalue(); stderr = error_stream.getvalue()
         if stderr: return f"Error: {stderr}\nStdout: {stdout}"
         if stdout: return f"Success:\n{stdout}"
         return "Success: Code executed without error and produced no stdout."
-    except Exception as e: return f"Execution failed with error: {str(e)}"
 @tool
 def read_file(path: str) -> str:
@@ -243,7 +244,7 @@ class AgentState(TypedDict):
 def should_continue(state: AgentState):
     """
     Custom logic to decide whether to continue or end.
-    Now allows for a "reasoning loop".
     """
     last_message = state['messages'][-1]
@@ -258,15 +259,11 @@ def should_continue(state: AgentState):
                 print("--- Condition: Saw other tools, calling tools node. ---")
                 return "tools"
-    # --- THIS IS THE NEW LOGIC ---
-    # If the last message is from the AI and has NO tool calls (i.e., it's a "thought"),
     # loop back to the agent node to let it "think" again.
-    print("--- Condition: No tool call detected. Looping back to agent. ---")
     return "agent"
-    # The old "END" path is removed. The only way to END
-    # is to explicitly call final_answer_tool.
 # --- Basic Agent Definition ---
 # ----- THIS IS WERE YOU CAN BUILD WHAT YOU WANT ------
@@ -290,22 +287,46 @@ class BasicAgent:
         ])
         # ==================== MODIFIED SYSTEM PROMPT ====================
-        self.system_prompt = f"""You are a highly intelligent and meticulous AI assistant built to answer questions from the GAIA benchmark.
-Your primary goal is to provide **only the concise, factual, and direct answer** to the user's question.
-**CRITICAL INSTRUCTIONS:**
-* **DO NOT** provide the final answer as plain text.
-* **THE ONLY WAY** to provide a final answer is by calling the `final_answer_tool`.
-* **DO NOT** include conversational filler (e.g., "The answer is...").
-* **DO NOT** explain your reasoning unless it's inside a `code_interpreter` comment.
-* **DO NOT** mix plain text and tool-call JSON in the same response.
-* **DO NOT** use XML formats like `<function=...>` or `<code_interpreter>`. **THIS WILL FAIL.**
-You have access to the following tools:
 {tool_descriptions}
-**TOOL USAGE PROTOCOL:**
-* To call a tool, respond ONLY with a single JSON object formatted exactly like this:
     ```json
     {{
       "tool": "tool_name",
@@ -354,10 +375,18 @@ You have access to the following tools:
         print("Building agent graph...")
         graph_builder = StateGraph(AgentState)
         graph_builder.add_node("agent", agent_node)
-        graph_builder.add_node("tools", tool_node)
         graph_builder.add_edge(START, "agent")
-        graph_builder.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
-        graph_builder.add_edge("tools", "agent")
         self.graph = graph_builder.compile()
         print("✅ Graph compiled.")

 import json # For robust tool call parsing/generation if needed
 import re # For finding JSON
 import uuid # For generating tool call IDs
+import traceback
 # --- Multimodal & Web Tool Imports ---
 from transformers import pipeline
 @tool
 def code_interpreter(code: str) -> str:
+    """Executes Python code..."""
     print(f"--- Calling Code Interpreter with code:\n{code}\n---")
     output_stream = io.StringIO()
     error_stream = io.StringIO()
     try:
         with contextlib.redirect_stdout(output_stream), contextlib.redirect_stderr(error_stream):
             safe_globals = {
                 "pd": pd,
+                "__builtins__": __builtins__
             }
             exec(code, safe_globals, {})
         stdout = output_stream.getvalue(); stderr = error_stream.getvalue()
         if stderr: return f"Error: {stderr}\nStdout: {stdout}"
         if stdout: return f"Success:\n{stdout}"
         return "Success: Code executed without error and produced no stdout."
+    except Exception as e:
+        # --- THIS IS THE IMPROVEMENT ---
+        # Get the full traceback string
+        tb_str = traceback.format_exc()
+        print(f"--- Code Interpreter FAILED ---\n{tb_str}\n---")
+        return f"Execution failed with error:\n{tb_str}"
+        # --- END IMPROVEMENT ---
 @tool
 def read_file(path: str) -> str:
 def should_continue(state: AgentState):
     """
     Custom logic to decide whether to continue or end.
+    This now allows for a "reasoning loop".
     """
     last_message = state['messages'][-1]
                 print("--- Condition: Saw other tools, calling tools node. ---")
                 return "tools"
+    # --- THIS IS THE KEY CHANGE ---
+    # If the last message is from the AI and has NO tool calls (i.e., it's plain text),
     # loop back to the agent node to let it "think" again.
+    print("--- Condition: No tool call. Looping back to agent (reasoning loop). ---")
     return "agent"
 # --- Basic Agent Definition ---
 # ----- THIS IS WERE YOU CAN BUILD WHAT YOU WANT ------
         ])
         # ==================== MODIFIED SYSTEM PROMPT ====================
+        self.system_prompt = f"""You are a highly intelligent and meticulous AI assistant for the GAIA benchmark.
+Your goal is to provide the concise, factual answer by strictly following a step-by-step reasoning process.
+**CRITICAL PROTOCOL: YOU MUST FOLLOW THIS PROCESS**
+1.  **ANALYZE:** Read the question and all messages in the history.
+2.  **PLAN:** Your *first* response on any new task MUST be a step-by-step plan as plain text. Do NOT call a tool on your first turn. Write down your logic, what you need to find, and which tool you *plan* to use.
+3.  **EXECUTE:** After you submit your plan, you will run again. Now, execute the *first* step of your plan by calling the *one* appropriate tool.
+4.  **ANALYZE TOOL OUTPUT:** You will receive a [Tool Output] message. You MUST read it.
+5.  **REPEAT or FINISH:**
+    * **If more steps are needed:** Go back to step 2 (PLAN). Write an *updated* plan as plain text (e.g., "Step 1 was successful. My new step 2 is...").
+    * **If the [Tool Output] contains the final answer:** You MUST call the `final_answer_tool`. Your answer *must* be derived *only* from the [Tool Output], not your own knowledge.
+**RULES:**
+* **NEVER** call a tool on the same turn you write a plan.
+* **NEVER** use your pre-trained "leaked" knowledge for the final answer. The answer *must* come from a [Tool Output] (e.g., from `code_interpreter`'s print() or `search_tool`).
+* **NEVER** answer a logic puzzle from memory. You *must* use `code_interpreter`, **print the result**, and then use that printed result for your final answer.
+* **NEVER** call `final_answer_tool` until a tool has given you the answer.
+* **Error Handling:** If a tool call fails, your next step (Step 2) must be to write a plan that analyzes the error and tries a *different* approach.
+**TOOLS:**
 {tool_descriptions}
+You have access to the following tools:
+- {tool.name}: ...
+- code_interpreter: Executes Python code.
+  **CODE INTERPRETER RULES:**
+  1.  **ALWAYS** use a `print()` statement to output your final result. The tool only returns what you print.
+  2.  **NEVER** write a complex, multi-step script in one go.
+  3.  **ALWAYS** break the problem down. Call the tool with a *simple* script to get one piece of information (e.g., `print(df.head())`).
+  4.  Then, use that output (in your "think" step) to plan your *next* simple script (e.g., `print(df['column'].value_counts())`).
+  5.  **ALWAYS** write your logical plan as Python comments (`#`) inside the code block *before* you write the code itself.
+**REASONING PROCESS & STOPPING CONDITION:**
+1.  **PLAN:** First, respond with your step-by-step plan in plain text. Do not call a tool yet.
+2.  **(Graph will loop)**
+3.  **EXECUTE:** Now, call the *one* tool needed for the first step of your plan.
+4.  **ANALYZE:** You will get a [Tool Output].
+5.  **REPEAT:** Go back to step 1. Write an updated plan (e.g., "Step 1 was successful and gave me [data]. My step 2 is...").
+6.  **STOP:** Only call `final_answer_tool` when a [Tool Output] has given you the final, exact answer.
+**TOOL FORMAT (JSON ONLY):**
     ```json
     {{
       "tool": "tool_name",
         print("Building agent graph...")
         graph_builder = StateGraph(AgentState)
         graph_builder.add_node("agent", agent_node)
+        graph_builder.add_node("tools", tool_node)
         graph_builder.add_edge(START, "agent")
+        graph_builder.add_edge("tools", "agent") # This edge is correct
+        # --- REPLACE your old add_conditional_edges ---
+        graph_builder.add_conditional_edges(
+            "agent",
+            should_continue,
+            {
+                "tools": "tools",  # If tools are called
+                "agent": "agent",  # If text is generated (the new loop)
+                END: END           # If final_answer is called
+            }
         self.graph = graph_builder.compile()
         print("✅ Graph compiled.")