Final_Assignment_Template

sqfoo committed on May 15, 2025

Commit

21595fa

verified ·

1 Parent(s): 6d176e2

Update agent.py

Files changed (1)

agent.py +4 -5

@@ -94,6 +94,7 @@ def image_caption(dir: str) -> str:
     return metadata[0].page_content
 # 2. Coding
 # 3. Multi-Modality
 # ("human", f"Question: {question}\nReport to validate: {final_answer}")
@@ -123,7 +124,8 @@ class BasicAgent:
                 - youtube_transcript: fetch the transcript of the Youtube video by passing the video url as input if the question asks for watching a Youtube video
                 - read_file: read the content of the attached file by passing the file directory as input
                 - image_caption: understand the visual content of the attached image by passing the image directory as input
                 HERE are some examples illustrating how and what tools to call.
                 ---------------
                 TASK: "In the video https://www.youtube.com/watch?v=L1vXCYZAYYM, what is the highest number of bird species to be on camera simultaneously?"
@@ -132,16 +134,13 @@ class BasicAgent:
                 TASK: How many Grammy Awards that Taylor Swift has won.
                 ACTION: Call the web_search tools with the query: 'how many Grammy Awards that Taylor Swift has won.' to extract the answer.
-                TASK: Count how many people in this image.
-                ACTION: Call the image_caption tool by passing the image directory as input. Then, use LLM to understand the image caption and answer the question.
                 TASK: How much the total expense in this spreadsheet?
                 ACTION: Call the read_file tool to extract the content of the provided spreadfile. Then, use LLM to extract the amount of every expense and sum them up.
                 TASK: How many All England Title that Lee Chong Wei won?
                 ACTION: Call wiki_search with the query: "Lee Chong Wei". Extract the relevant row of All England Title and count how many rows is there.
         """
-        self.tools = [web_search, visit_webpage, wiki_search, youtube_transcript, read_file, image_caption]
         self.prompt = ChatPromptTemplate.from_messages([
             ("system", self.sys_prompt),
             ("human", "{input}")

     return metadata[0].page_content
 # 2. Coding
+from langchain_experimental.tools import PythonREPLTool
 # 3. Multi-Modality
 # ("human", f"Question: {question}\nReport to validate: {final_answer}")
                 - youtube_transcript: fetch the transcript of the Youtube video by passing the video url as input if the question asks for watching a Youtube video
                 - read_file: read the content of the attached file by passing the file directory as input
                 - image_caption: understand the visual content of the attached image by passing the image directory as input
+                - PythonREPLTool: run the python code
                 HERE are some examples illustrating how and what tools to call.
                 ---------------
                 TASK: "In the video https://www.youtube.com/watch?v=L1vXCYZAYYM, what is the highest number of bird species to be on camera simultaneously?"
                 TASK: How many Grammy Awards that Taylor Swift has won.
                 ACTION: Call the web_search tools with the query: 'how many Grammy Awards that Taylor Swift has won.' to extract the answer.
                 TASK: How much the total expense in this spreadsheet?
                 ACTION: Call the read_file tool to extract the content of the provided spreadfile. Then, use LLM to extract the amount of every expense and sum them up.
                 TASK: How many All England Title that Lee Chong Wei won?
                 ACTION: Call wiki_search with the query: "Lee Chong Wei". Extract the relevant row of All England Title and count how many rows is there.
         """
+        self.tools = [web_search, visit_webpage, wiki_search, youtube_transcript, read_file, image_caption, PythonREPLTool()]
         self.prompt = ChatPromptTemplate.from_messages([
             ("system", self.sys_prompt),
             ("human", "{input}")
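
Note on the added tool: the commit registers `PythonREPLTool` so the agent can run Python code, for example to sum the expenses that `read_file` extracts from a spreadsheet. The sketch below is a standard-library stand-in that illustrates the general idea of such a tool (execute a code string, return what it printed). The name `run_python` is hypothetical and is not part of `langchain_experimental`; the real tool's behavior may differ, and neither this sketch nor an unsandboxed REPL should be exposed to untrusted input.

```python
import contextlib
import io


def run_python(code: str) -> str:
    """Execute a code string and return its printed output.

    Hypothetical helper for illustration only; it is NOT the
    langchain_experimental implementation and offers no sandboxing.
    """
    buffer = io.StringIO()
    try:
        # Capture anything the executed code writes to stdout.
        with contextlib.redirect_stdout(buffer):
            exec(code, {"__name__": "__main__"})
    except Exception as exc:
        # Return the error as text so an agent could read it and retry.
        return f"{type(exc).__name__}: {exc}"
    return buffer.getvalue()


if __name__ == "__main__":
    # Example in the spirit of the spreadsheet task in the system prompt:
    # the amounts here are made up for the demonstration.
    print(run_python("print(sum([12.5, 7.5, 30]))"), end="")
```

Returning errors as strings rather than raising is a design choice made here so that a calling agent can inspect the failure and try again.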