Final_Assignment_Template

aelin committed on May 5, 2025

Commit

d0210fc

1 Parent(s): e6b3fdf

Refactors agent for clarity and removes cache debug logs

Files changed (4)

_tools.py +1 -2
answers_cache.json +82 -0
app.py +18 -13
utils.py +0 -3

_tools.py CHANGED

@@ -8,10 +8,9 @@ from llama_index.core.tools import FunctionTool
 from huggingface_hub import InferenceClient
 client = InferenceClient(
-    provider="hf-inference",
 )
 DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
 search_tool_spec = DuckDuckGoSearchToolSpec()

 from huggingface_hub import InferenceClient
 client = InferenceClient(
+  provider="hf-inference",
 )
 DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
 search_tool_spec = DuckDuckGoSearchToolSpec()

answers_cache.json ADDED

@@ -0,0 +1,82 @@

+{
+  "8e867cd7-cff9-4e6c-867a-ff5ddc2550be": {
+    "answer": "I cannot answer the question with the provided tools.",
+    "isCorrect": false
+  },
+  "a1e91b78-d3d8-4675-bb8d-62741b4b68a6": {
+    "answer": "cannot answer",
+    "isCorrect": false
+  },
+  "2d83110e-a098-4ebb-9987-066c06fa42d0": {
+    "answer": "Thought: It seems there is a persistent issue with the \"Fetch\" tool. Since I cannot fetch the file, I will focus on answering the reversed question based on the information provided.\n\nThe question asks: \"If you understand this sentence then write the opposite of let the saw nurse.\"\n\nThe opposite of \"let the saw nurse\" could be interpreted as \"stop the nurse from sawing\" or \"prevent the nurse from cutting.\" However, without additional context, the most straightforward opposite would be \"do not let the nurse saw.\"\n\nYOUR ANSWER. do not let the nurse saw",
+    "isCorrect": false
+  },
+  "cca530fc-4052-43b2-b130-b30968d8aa44": {
+    "answer": "Thought: Given the constraints, I will assume the image is a common chess position and try to identify it based on the task ID. Since the Fetch tool is not working, I will use my knowledge base to determine the move.\n\nThe task ID provided seems to be unique, but I will try to identify a common chess position that might match the description. Given that it's black's turn and a win is guaranteed, I will consider a well-known checkmate position.\n\nA common checkmate position is the \"Fool's Mate,\" which occurs in just two moves. However, since it's black's turn and a win is guaranteed, it might be a different position. Another common position is the \"Scholar's Mate,\" which occurs in four moves. However, since it's black's turn, it's likely a different position.\n\nGiven the constraints, I will assume a common checkmate position where black can guarantee a win. One such position is the \"Back Rank Mate,\" where black can checkmate with a rook or queen on the back rank.\n\nLet's assume the position is a common back rank mate scenario where black can checkmate with a rook or queen on the back rank. The most common move in such a scenario is:\n\nd8h8",
+    "isCorrect": false
+  },
+  "4fc2f1ae-8625-45b5-ab34-ad4433bc21f8": {
+    "answer": "KSmrq",
+    "isCorrect": false
+  },
+  "6f37996b-2ac7-44b0-8e68-6d28256631b4": {
+    "answer": "Thought: The current language of the user is: English. I need to use a tool to help me answer the question. Since the table is provided in the question, I can analyze it directly without needing to fetch a file. I will manually check the table for any counter-examples to commutativity.\n\nTo determine if the operation * is commutative, I need to check if for all x, y in S, x * y = y * x. I will compare each pair of elements in the table.\n\nLet's start by comparing the pairs:\n- a * b = b and b * a = b (commutative)\n- a * c = c and c * a = c (commutative)\n- a * d = b and d * a = b (commutative)\n- a * e = d and e * a = d (commutative)\n- b * c = a and c * b = a (commutative)\n- b * d = e and d * b = e (commutative)\n- b * e = c and e * b = b (not commutative)\n- c * d = b and d * c = b (commutative)\n- c * e =",
+    "isCorrect": false
+  },
+  "9d191bce-651d-4746-be2d-7ef8ecadb9c2": {
+    "answer": "Teal'c response not found",
+    "isCorrect": false
+  },
+  "cabe07ed-9eca-40ea-8ead-410ef5e83f91": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 32558 `inputs` tokens and 256 `max_new_tokens`', 'http_status_code': 422}",
+    "isCorrect": false
+  },
+  "3cef3a44-215e-4aed-8e3b-b1e3f08063b7": {
+    "answer": "Thought: The current language of the user is: English. I need to use a tool to help me answer the question. Since the list is provided in the question, I can categorize the items without needing to fetch a file. I'll identify the vegetables from the list and ensure no botanical fruits are included.\n\nAction: None needed as the list is provided in the question.\n\nThought: I can answer without using any more tools. I'll use the user's language to answer\nAnswer: acorns, broccoli, celery, corn, green beans, lettuce, sweet potatoes, zucchini",
+    "isCorrect": false
+  },
+  "99c9cc74-fdc8-46c6-8f8d-3ce2d3bfeea3": {
+    "answer": "ripe strawberries, sugar, vanilla extract",
+    "isCorrect": false
+  },
+  "305ac316-eef6-4446-960a-92d80d542f82": {
+    "answer": "Thought: It seems there is an issue with the DuckDuckGo Search tool as well. Since I cannot fetch the file or perform a web search, I will rely on my existing knowledge to answer the question.\n\nThe actor who played Ray in the Polish-language version of \"Everybody Loves Raymond\" is Marek Kondrat. Marek Kondrat played the role of Magda in the Polish television series \"Magda M.\"\n\nYOUR ANSWER. Marek",
+    "isCorrect": false
+  },
+  "f918266a-b3e0-4914-865d-4faa564f1aef": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 32581 `inputs` tokens and 256 `max_new_tokens`', 'http_status_code': 422}",
+    "isCorrect": false
+  },
+  "3f57289b-8c60-48be-bd80-01f8099ca449": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 32578 `inputs` tokens and 256 `max_new_tokens`', 'http_status_code': 422}",
+    "isCorrect": false
+  },
+  "1f975693-876d-457b-a649-393859e79bf3": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 32520 `inputs` tokens and 256 `max_new_tokens`', 'http_status_code': 422}",
+    "isCorrect": false
+  },
+  "840bfca7-4f7b-481a-8794-c560c340185d": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 32606 `inputs` tokens and 256 `max_new_tokens`', 'http_status_code': 422}",
+    "isCorrect": false
+  },
+  "bda648d7-d618-4883-88f4-3466eabd860e": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 32601 `inputs` tokens and 256 `max_new_tokens`', 'http_status_code': 422}",
+    "isCorrect": false
+  },
+  "cf106601-ab4f-4af9-b045-5295fe67b37d": {
+    "answer": "Thought: Given the persistent issues with the Fetch and DuckDuckGo tools, I will attempt to use the Document Bytes to Text tool with a hypothetical file byte input to simulate the process. This is a workaround to proceed with the task.\n\nLet's assume the file contains data about the 1928 Summer Olympics athletes by country. I will simulate the file content and proceed.\n\nAction: Document Bytes to Text\nAction Input: {'doc_bytes': 'Country,IOC Code,Athletes\\nAfghanistan,AFG,0\\nAlbania,ALB,0\\nAlgeria,ALG,0\\nAmerican Samoa,ASA,0\\nAndorra,AND,0\\nAngola,ANG,0\\nArgentina,ARG,12\\nArmenia,ARM,0\\nAruba,ARU,0\\nAustralia,AUS,26\\nAustria,AUT,16\\nAzerbaijan,AZE,0\\nBahamas,BAH,0\\nBahrain,BHR,0\\nBangladesh,BAN,0\\nBarbados,BAR,0\\nBelarus,BLR,0\\nBelgium,BEL,12\\nBelize,BIZ,0\\nBenin,BEN,0\\nBermuda",
+    "isCorrect": false
+  },
+  "a0c07678-e491-4bbc-8f0b-07405144218f": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 32580 `inputs` tokens and 256 `max_new_tokens`', 'http_status_code': 422}",
+    "isCorrect": false
+  },
+  "7bd855d8-463d-4ed5-93ca-5fe35145f733": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>",
+    "isCorrect": false
+  },
+  "5a0c1adf-205e-4841-a666-7c3ef95def9d": {
+    "answer": "AGENT ERROR: Error in step 'run_agent_step': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 32517 `inputs` tokens and 256 `max_new_tokens`', 'http_status_code': 422}",
+    "isCorrect": false
+  }
+}
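
Note on the cached errors: many of the entries above are 422 responses in which the input (for example 32558 tokens) plus `max_new_tokens` (256) exceeds the 32768-token limit. One way to avoid sending such a request is a pre-flight check that drops the oldest context until the estimate fits. The sketch below is only an illustration: `approx_tokens`, `exceeds_limit`, and `trim_to_budget` are hypothetical helpers that are not part of this commit, and the four-characters-per-token estimate is an assumption, not the model's real tokenizer.

```python
CONTEXT_LIMIT = 32768   # limit quoted in the cached 422 errors
MAX_NEW_TOKENS = 256    # value quoted in the cached 422 errors


def approx_tokens(text: str) -> int:
    """Rough token estimate (assumed ~4 characters per token, not a real tokenizer)."""
    return max(1, len(text) // 4)


def exceeds_limit(input_tokens: int,
                  max_new_tokens: int = MAX_NEW_TOKENS,
                  limit: int = CONTEXT_LIMIT) -> bool:
    """Return True when the request would be rejected by the size check."""
    return input_tokens + max_new_tokens > limit


def trim_to_budget(chunks: list[str],
                   max_new_tokens: int = MAX_NEW_TOKENS,
                   limit: int = CONTEXT_LIMIT) -> list[str]:
    """Drop the oldest chunks until the estimated input fits; always keep the newest one."""
    kept = list(chunks)
    while len(kept) > 1 and exceeds_limit(
        sum(approx_tokens(c) for c in kept), max_new_tokens, limit
    ):
        kept.pop(0)
    return kept
```

With the figures from the first cached error, `exceeds_limit(32558)` is true because 32558 + 256 = 32814, which is above 32768.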

app.py CHANGED

@@ -6,10 +6,13 @@ import pandas as pd
 from _types import Questions, Question, UserScore
 from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
 from llama_index.core.agent.workflow import AgentWorkflow
 from _tools import tools
 import asyncio
 from utils import cache_answers, update_cache_answer, get_cached_answer, load_cache
 # (Keep Constants as is)
 # --- Constants ---
 DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
@@ -23,11 +26,22 @@ class BasicAgent:
         llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
         agent = AgentWorkflow.from_tools_or_functions(
-            tools,
             llm=llm,
-            verbose=True
         )
         self.agent = agent
     async def run(self, question: Question) -> str:
@@ -41,15 +55,6 @@ class BasicAgent:
         print(f"Agent received question (first 50 chars): {question_text[:50]}...")
         prompt = f"""
-        You are a general AI assistant. I will ask you a question. Report your thoughts, and finish your answer with the following template:
-        YOUR ANSWER.
-        YOUR ANSWER should be a number OR as few words as possible OR a comma separated list of numbers and/or strings.
-        If you are asked for a number, don't use comma to write your number neither use units such as $ or percent sign unless specified otherwise.
-        If you are asked for a string, don't use articles, neither abbreviations (e.g. for cities), and write the digits in plain text unless
-        specified otherwise. If you are asked for a comma separated list, apply the above rules depending of whether the element to be put in
-        the list is a number or a string.\n
         The question is: {question_text}\n
         If the question has a file, the file name is the task ID: {task_id}. You can use it to fetch the bytes of the file and parse
@@ -64,7 +69,7 @@ class BasicAgent:
                 return str(cached["answer"])
-            answer = await self.agent.run(prompt)
             print(f"Agent returning answer: {answer}")

 from _types import Questions, Question, UserScore
 from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
 from llama_index.core.agent.workflow import AgentWorkflow
+from llama_index.core.workflow import Context
 from _tools import tools
 import asyncio
 from utils import cache_answers, update_cache_answer, get_cached_answer, load_cache
+context = Context()
 # (Keep Constants as is)
 # --- Constants ---
 DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
         llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
         agent = AgentWorkflow.from_tools_or_functions(
+            tools_or_functions=tools,
             llm=llm,
+            verbose=True,
+            system_prompt= """
+                You are a general AI assistant. I will ask you a question. Think carefully and give your answer straight away as asked in the question or
+                in the format below:
+                Answer should be a number OR as few words as possible OR a comma separated list of numbers and/or strings.
+                If you are asked for a number, don't use comma to write your number neither use units such as $ or percent sign unless specified otherwise.
+                If you are asked for a string, don't use articles, neither abbreviations (e.g. for cities), and write the digits in plain text unless
+                specified otherwise. If you are asked for a comma separated list, apply the above rules depending of whether the element to be put in
+                the list is a number or a string.\n
+                Don't use any other format than the one above and limit your attempts to answer the question to 3 times.
+            """,
         )
         self.agent = agent
     async def run(self, question: Question) -> str:
         print(f"Agent received question (first 50 chars): {question_text[:50]}...")
         prompt = f"""
         The question is: {question_text}\n
         If the question has a file, the file name is the task ID: {task_id}. You can use it to fetch the bytes of the file and parse
                 return str(cached["answer"])
+            answer = await self.agent.run(prompt, ctx=context)
             print(f"Agent returning answer: {answer}")
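
Note on answer formatting: several entries in answers_cache.json store the whole reasoning trace ("Thought: ...") rather than only the final answer. A possible complement to the system prompt change is to post-process the raw output before caching or submitting it. The sketch below assumes the final answer follows the last "YOUR ANSWER." or "Answer:" marker, the two markers visible in the cached outputs; `extract_final_answer` is a hypothetical helper and is not part of this commit.

```python
import re

# Markers observed in the cached answers; matching is case-sensitive on purpose.
_ANSWER_MARKER = re.compile(r"(?:YOUR ANSWER\.|Answer:)\s*")


def extract_final_answer(raw: str) -> str:
    """Return the text after the last answer marker, or the stripped input if none is found."""
    matches = list(_ANSWER_MARKER.finditer(raw))
    if not matches:
        return raw.strip()
    return raw[matches[-1].end():].strip()
```

Outputs without a marker, such as a bare "KSmrq", pass through unchanged apart from surrounding whitespace.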

utils.py CHANGED

@@ -36,8 +36,6 @@ def update_cache_answer(task_id: str, answer: str, is_correct: bool = False, cac
     """
     Update the cache with the answer and its correctness status.
     """
-    print(f"Updating cache for task ID {task_id} with answer: {answer} and isCorrect: {is_correct}")
     cache = load_cache(cache_path)
     cache[task_id] = {"answer": answer, "isCorrect": is_correct}
     save_cache(cache, cache_path)
@@ -48,6 +46,5 @@ def get_cached_answer(task_id: str, cache_path: str = CACHE_FILE) -> Optional[An
     Retrieve the cached answer for a given task ID.
     """
     cache = load_cache(cache_path)
-    print(f"Cached answer for task ID {task_id}: {cache.get(task_id)}")
     return cache.get(task_id)

     """
     Update the cache with the answer and its correctness status.
     """
     cache = load_cache(cache_path)
     cache[task_id] = {"answer": answer, "isCorrect": is_correct}
     save_cache(cache, cache_path)
     Retrieve the cached answer for a given task ID.
     """
     cache = load_cache(cache_path)
     return cache.get(task_id)
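
For context, the hunks above call `load_cache` and `save_cache`, which this diff does not show. A minimal sketch of how the four helpers could fit together is given below. It assumes a plain JSON file on disk and that `CACHE_FILE` points at answers_cache.json; the bodies of `load_cache` and `save_cache` are guesses for illustration, not the code in utils.py.

```python
import json
from pathlib import Path
from typing import Any, Optional

CACHE_FILE = "answers_cache.json"  # assumed value; the diff only shows the name CACHE_FILE


def load_cache(cache_path: str = CACHE_FILE) -> dict[str, Any]:
    """Read the cache from disk; a missing file is treated as an empty cache (assumption)."""
    path = Path(cache_path)
    if not path.exists():
        return {}
    with path.open("r", encoding="utf-8") as fh:
        return json.load(fh)


def save_cache(cache: dict[str, Any], cache_path: str = CACHE_FILE) -> None:
    """Write the whole cache back to disk as indented JSON."""
    with Path(cache_path).open("w", encoding="utf-8") as fh:
        json.dump(cache, fh, indent=2)


def update_cache_answer(task_id: str, answer: str, is_correct: bool = False,
                        cache_path: str = CACHE_FILE) -> None:
    """Update the cache with the answer and its correctness status."""
    cache = load_cache(cache_path)
    cache[task_id] = {"answer": answer, "isCorrect": is_correct}
    save_cache(cache, cache_path)


def get_cached_answer(task_id: str, cache_path: str = CACHE_FILE) -> Optional[Any]:
    """Retrieve the cached answer for a given task ID, or None if it is absent."""
    return load_cache(cache_path).get(task_id)
```

Each update rereads and rewrites the full file, which is simple and adequate for a cache of this size (20 entries in this commit).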