Final_Project_Agent_Course


Thanh Vinh Vo committed on Jul 9, 2025

Commit 329838b (parent: b913f84)

update


Files changed (1)

app.py +3 -19


@@ -30,34 +30,17 @@ def extract_table_from_html(html: str, match: str | None = None) -> list:
     """
     A tool that extracts HTML tables from HTML content and returns them as pandas DataFrames.
     Example usecases include extracting tables from Wikipedia pages, HTML emails, or other web content.
-    This function uses pandas.read_html() to parse HTML tables from the provided HTML content
-    and returns the extracted tables as a list of pandas DataFrames. It can optionally filter
-    tables based on a text pattern match.
     Args:
         html (str): The HTML content containing HTML tables to extract. This can be raw HTML
                    string content or a URL to a webpage.
         match (str | None, optional): A string or regular expression pattern to match
-                                    against table text content. Only tables containing
-                                    this pattern will be returned. If None, all tables
                                     are extracted. Defaults to None.
                                     DO NOT use HTML strings / tags in this parameter.
     Returns:
         list: A list of pandas DataFrames, where each DataFrame represents a table found
               in the HTML content. Returns an empty list if no tables are found.
-    Raises:
-        ValueError: If the HTML content is invalid or cannot be parsed.
-        Exception: If HTML parsing fails or other unexpected errors occur.
-    Note:
-        - Uses pandas.read_html() which requires lxml, html5lib, or BeautifulSoup4
-        - Tables must be properly formatted HTML <table> elements
-        - The match parameter is case-sensitive
-        - Returns native pandas DataFrames for direct manipulation and analysis
-        - Can accept either raw HTML content or URLs (pandas.read_html supports both)
-        - Returns empty list instead of raising error when no tables are found
     """
     import pandas as pd
@@ -256,6 +239,7 @@ class BasicAgent:
                     - Solving chess problems.
                 This agent follow rules below when possible:
                     1. `wikipedia` Python package is provided to interact with Wikipedia pages.
                     2. `chess` Python package is provided. Please use it when there is need to solve chess problems.
                     3. Please take the question literally! Do not add any additional information or assumptions.
@@ -268,7 +252,7 @@ class BasicAgent:
             model=InferenceClientModel(
                 "Qwen/Qwen2.5-32B-Instruct"
             ),
-            tools=[get_file, audio_to_text, extract_table_from_html],
             managed_agents=[
                 self.multimodal_agent,
                 self.code_agent],

     """
     A tool that extracts HTML tables from HTML content and returns them as pandas DataFrames.
     Example usecases include extracting tables from Wikipedia pages, HTML emails, or other web content.
     Args:
         html (str): The HTML content containing HTML tables to extract. This can be raw HTML
                    string content or a URL to a webpage.
         match (str | None, optional): A string or regular expression pattern to match
+                                    against table text content. If None, all tables
                                     are extracted. Defaults to None.
                                     DO NOT use HTML strings / tags in this parameter.
     Returns:
         list: A list of pandas DataFrames, where each DataFrame represents a table found
               in the HTML content. Returns an empty list if no tables are found.
     """
     import pandas as pd
                     - Solving chess problems.
                 This agent follow rules below when possible:
                     1. `wikipedia` Python package is provided to interact with Wikipedia pages.
+                    2. Use `extract_table_from_html` tool to process Wikipedia pages first before other approaches.
                     2. `chess` Python package is provided. Please use it when there is need to solve chess problems.
                     3. Please take the question literally! Do not add any additional information or assumptions.
             model=InferenceClientModel(
                 "Qwen/Qwen2.5-32B-Instruct"
             ),
+            tools=[get_file, audio_to_text],
             managed_agents=[
                 self.multimodal_agent,
                 self.code_agent],
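
The docstring in this diff describes the tool's contract: every table is returned when `match` is None, only tables whose text contains the pattern are returned otherwise, and an empty list is returned when nothing is found. The sketch below illustrates that contract only. It is not the code in `app.py`, which per the removed docstring lines relies on `pandas.read_html()`. As a dependency-free stand-in it uses the standard-library `html.parser`, returns plain lists of rows instead of DataFrames, and handles neither URLs nor nested tables. The names `extract_tables_sketch` and `_TableCollector` are hypothetical.

```python
import re
from html.parser import HTMLParser


class _TableCollector(HTMLParser):
    """Collect each <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # current row, or None when outside a <tr>
        self._cell = None  # text fragments of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def extract_tables_sketch(html, match=None):
    """Return tables found in raw HTML, optionally filtered by a regex on cell text."""
    parser = _TableCollector()
    parser.feed(html)
    parser.close()
    if match is None:
        return parser.tables
    # Case-sensitive search, as the removed docstring note says of `match`.
    pattern = re.compile(match)
    return [
        table
        for table in parser.tables
        if any(pattern.search(cell) for row in table for cell in row)
    ]


sample = (
    "<table><tr><th>City</th></tr><tr><td>Hanoi</td></tr></table>"
    "<table><tr><td>Chess</td></tr></table>"
)
all_tables = extract_tables_sketch(sample)           # both tables
city_only = extract_tables_sketch(sample, "Hanoi")   # only the first table
```

The pattern is applied to cell text only, which is why the docstring warns against putting HTML strings or tags in `match`.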