carolinacon committed on
Commit
8073bab
·
1 Parent(s): f22eb38

Added basic web search agent functionality with langgraph

Browse files
config/__init__.py ADDED
File without changes
config/prompts.yaml ADDED
@@ -0,0 +1,88 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ prompts:
2
+ base_system:
3
+ content: |
4
+ You are a general AI assistant tasked with answering complex questions.
5
+
6
+
7
+ Make sure you think step by step in order to answer the given question.
8
+
9
+ Here is a summary of the steps you took so far:
10
+ <summary>
11
+ {{summary}}
12
+ </summary>
13
+
14
+ Include citations for all the information you retrieve, ensuring you know exactly where the data comes from.
15
+ If you have the information inside your knowledge, still call a tool in order to confirm it.
16
+
17
+ **Guidelines for Conducting Research:**
18
+
19
+ * **Citations:** Always support findings with source URLs, clearly provided as in-text citations.
20
+ * **Accuracy:** Rely solely on data obtained via provided tools—never fabricate information.
21
+ * **Methodology:** Follow a structured approach:
22
+
23
+ * **Thought:** Consider necessary information and next steps.
24
+ * **Action:** Select and execute appropriate tools.
25
+ * **Observation:** Analyze obtained results.
26
+ * Repeat Thought/Action/Observation cycles as needed.
27
+ * **Final Answer:** Synthesize and present findings with citations in markdown format.
28
+
29
+ **Example Workflows:**
30
+ **Workflow 1: Search Only**
31
+
32
+ **Question:** What are recent news headlines about artificial intelligence?
33
+
34
+ * **Thought:** I need quick, recent articles about AI.
35
+ * **Action:** Use Tavily Web Search with the query "recent artificial intelligence news" and set `time_range` to "week".
36
+ * **Observation:** Retrieved 10 relevant articles from reputable news sources.
37
+ * **Final Answer:** Summarize key headlines with citations.
38
+
39
+ **Workflow 2: Search and Extract**
40
+
41
+ **Question:** Provide detailed insights into recent advancements in quantum computing.
42
+ * **Thought:** I should find recent detailed articles first.
43
+ * **Action:** Use Tavily Web Search with the query "recent advancements in quantum computing" and set `time_range` to "month".
44
+ * **Observation:** Retrieved 10 relevant results.
45
+ * **Thought:** I should extract content from the most comprehensive article.
46
+ * **Action:** Use Tavily Web Extract on the most relevant URL from search results.
47
+ * **Observation:** Extracted detailed information about quantum computing advancements.
48
+ * **Final Answer:** Provide detailed insights summarized from extracted content with citations.
49
+ **Workflow 3: Search and Crawl**
50
+
51
+ **Question:** What are the latest advancements in renewable energy technologies?
52
+
53
+ * **Thought:** I need recent articles about advancements in renewable energy.
54
+ * **Action:** Use Tavily Web Search with the query "latest advancements in renewable energy technologies" and set `time_range` to "month".
55
+ * **Observation:** Retrieved 10 articles discussing recent developments in solar panels, wind turbines, and energy storage.
56
+ * **Thought:** To gain deeper insights, I'll crawl a relevant industry-leading renewable energy site.
57
+ * **Action:** Use Tavily Web Crawl on the URL of a leading renewable energy industry website, setting `max_depth` to 2.
58
+ * **Observation:** Gathered extensive content from multiple articles linked on the site, highlighting new technologies and innovations.
59
+ * **Final Answer:** Provide a synthesized summary of findings with citations.
60
+ type: base_system
61
+ variables: ["summary"]
62
+ version: 1.0
63
+ description: "Core system prompt for all interactions"
64
+ final_answer_processor:
65
+ content: |
66
+ You are a general AI assistant. You are given a question and an answer to that question.
67
+ Process the answer such that it contains only YOUR FINAL ANSWER and respects the following guidelines.
68
+ YOUR FINAL ANSWER should be a number OR as few words as possible OR a comma separated list of numbers and/or strings.
69
+ If you are asked for a number, don't use comma to write your number neither use units such as $ or percent sign unless specified otherwise.
70
+ If you are asked for a string, don't use articles, neither abbreviations (e.g. for cities), and write the digits in plain text unless specified otherwise.
71
+ If you are asked for a comma separated list, apply the above rules depending on whether the element to be put in the list is a number or a string.
72
+ type: answer_refinement
73
+ variables: []
74
+ version: 1.0
75
+ description: "Prompt for processing the final answer in order to make it compliant with gaia answers submission rules"
76
+ summarization:
77
+ content: |
78
+ This is a summary of the conversation to date: {{summary}}
79
+
80
+
81
+ Extend the summary by taking into account the new messages above.
82
+ Try to follow this guideline. If the message consists of a tool call, add a new bullet point and specify the tool and its action.
83
+ If the message consists of a tool call result, append a summary of the result to the appropriate bullet point.
84
+ After analyzing the tool call result, specify if this has been useful or not.
85
+ type: memory_optimization
86
+ variables: ["summary"]
87
+ version: 1.0
88
+ description: "Prompt for summarization and memory optimization"
config/settings.py ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Configuration management
2
+
3
+ import os
4
+ from typing import Dict, Any
5
+ from pathlib import Path
6
+
7
+
8
class AgentConfig:
    """Centralized configuration for the agent.

    Reads overridable values from environment variables and derives
    project-relative file paths.
    """

    def __init__(self):
        # LLM configuration: model name can be overridden via the
        # MODEL_NAME environment variable.
        self.MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4.1")

        # File paths, resolved relative to the project root
        # (two levels above this file: <root>/config/settings.py).
        self.PROJECT_ROOT = Path(__file__).parent.parent
        self.PROMPTS_PATH = self.PROJECT_ROOT / "config" / "prompts.yaml"

    def to_dict(self) -> Dict[str, Any]:
        """Return all public configuration attributes as a dictionary."""
        public_items: Dict[str, Any] = {}
        for key, value in self.__dict__.items():
            if key.startswith('_'):
                continue
            public_items[key] = value
        return public_items


# Global config instance shared across the application
config = AgentConfig()
core/__init__.py ADDED
File without changes
core/agent.py ADDED
@@ -0,0 +1,66 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from langchain_core.messages import HumanMessage
2
+ from langgraph.graph.state import CompiledStateGraph
3
+
4
+
5
+ from core.state import State
6
+ from nodes.nodes import assistant, optimize_memory, response_processing
7
+ from tools.tavily_tools import llm_tools
8
+
9
+ from langgraph.graph import START, StateGraph, END
10
+ from langgraph.prebuilt import tools_condition
11
+ from langgraph.prebuilt import ToolNode
12
+
13
+
14
class GaiaAgent:
    """ReAct-style agent built on a LangGraph state graph.

    Flow: assistant -> (tools -> optimize_memory -> assistant)* ->
    response_processing -> END.
    """

    react_graph: CompiledStateGraph

    def __init__(self):
        # Graph
        builder = StateGraph(State)

        # Define nodes: these do the work
        builder.add_node("assistant", assistant)
        builder.add_node("tools", ToolNode(llm_tools))
        builder.add_node("optimize_memory", optimize_memory)
        builder.add_node("response_processing", response_processing)

        # Define edges: these determine how the control flow moves
        builder.add_edge(START, "assistant")
        builder.add_conditional_edges(
            "assistant",
            # If the latest message (result) from assistant is a tool call,
            # tools_condition routes to "tools"; otherwise it routes to
            # final answer post-processing.
            tools_condition,
            {"tools": "tools", "__end__": "response_processing"},
        )
        builder.add_edge("tools", "optimize_memory")
        builder.add_edge("optimize_memory", "assistant")
        builder.add_edge("response_processing", END)
        self.react_graph = builder.compile()

    def __call__(self, question: str) -> str:
        """Run the graph on *question* and return the final answer text.

        Bug fix: the original returned the last message object (and left
        the loop variable unbound for an empty history) despite the
        declared ``-> str`` return type; we now return the content of
        the final message.
        """
        result = self.react_graph.invoke(
            {"messages": [HumanMessage(content=question)]}
        )
        for message in result["messages"]:
            message.pretty_print()
        return result["messages"][-1].content

    def __streamed_call__(self, question: str) -> str:
        """Stream the graph's execution, printing each intermediate
        message, and return the content of the final message.

        NOTE(review): despite the dunder-style name this is not a Python
        protocol method; it must be called explicitly.
        """
        inputs = {
            "messages": [
                HumanMessage(content=question)
            ]
        }

        # stream_mode="values" yields the full state after each step;
        # we print the newest message each time.
        message = None
        for step in self.react_graph.stream(inputs, stream_mode="values"):
            message = step["messages"][-1]
            if isinstance(message, tuple):
                print(message)
            else:
                message.pretty_print()
        return message.content
core/state.py ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ from langgraph.graph import MessagesState
2
+
3
+
4
class State(MessagesState):
    """Graph state: message history plus conversation-level extras."""
    # Rolling summary of the conversation, maintained by the
    # optimize_memory node; absent/empty until the first summarization.
    summary: str
    # The original user question, captured on the first assistant turn.
    # NOTE(review): the assistant node appears to store a message object
    # here rather than a plain str — confirm against nodes/nodes.py.
    question: str
nodes/__init__.py ADDED
File without changes
nodes/nodes.py ADDED
@@ -0,0 +1,70 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, RemoveMessage
2
+ from langchain_openai import ChatOpenAI
3
+
4
+ from core.state import State
5
+ import time
6
+
7
+ from tools.tavily_tools import llm_tools
8
+ from utils.prompt_manager import prompt_mgmt
9
+
10
# Main reasoning model. NOTE(review): hard-coded here even though
# config/settings.py exposes a MODEL_NAME setting — consider reading it
# from config for consistency.
model = ChatOpenAI(model="gpt-4.1")
# Lighter model used only to post-process the final answer.
response_processing_model = ChatOpenAI(model="gpt-4.1-mini")

# Bind the Tavily tools; parallel tool calls are disabled so the graph
# handles one tool invocation per assistant turn.
model = model.bind_tools(llm_tools, parallel_tool_calls=False)
14
+
15
+
16
+ # Node
17
+ def assistant(state: State):
18
+ # set up the question
19
+ # Get summary if it exists
20
+ summary = state.get("summary", "")
21
+
22
+ # Get original question if it exists
23
+ question = state.get("question", "")
24
+ if not question:
25
+ question = state["messages"][0]
26
+
27
+ sys_msg = SystemMessage(content=prompt_mgmt.render_template("base_system", {"summary": summary}))
28
+ try:
29
+ response = model.invoke([sys_msg] + state["messages"])
30
+ except Exception as e:
31
+ if "429" in str(e):
32
+ time.sleep(5)
33
+ response = model.invoke([sys_msg] + state["messages"])
34
+ return {"messages": [response]}
35
+ raise
36
+ return {"question": question, "messages": [response]}
37
+
38
+
39
def response_processing(state: State):
    """Refine the agent's last answer into its final, compliant form.

    Builds a minimal question/answer exchange and asks the lighter
    post-processing model to reduce the answer according to the
    "final_answer_processor" prompt rules (GAIA submission format).
    """
    question = state.get("question", "")
    answer = state["messages"][-1]

    # Robustness fix: State.question is annotated as str, but the
    # assistant node stores a message object — accept either form
    # instead of unconditionally dereferencing `.content`.
    question_text = getattr(question, "content", question)
    print("Question:", question_text)
    print("Answer:", answer.content)

    gaia_messages = [
        HumanMessage(content=question_text),
        AIMessage(content=answer.content),
    ]
    gaia_sys_msg = SystemMessage(
        content=prompt_mgmt.render_template("final_answer_processor", {})
    )
    response = response_processing_model.invoke([gaia_sys_msg] + gaia_messages)

    return {"messages": [response]}
49
+
50
+
51
def optimize_memory(state: State):
    """Compress older history into a rolling summary.

    Asks the model to create (or extend) a summary of the conversation,
    then marks every message except the two most recent for removal.
    """
    prior_summary = state.get("summary", "")

    # Extend the existing summary when there is one; otherwise ask for a
    # fresh summary of the conversation so far.
    if prior_summary:
        prompt_text = prompt_mgmt.render_template(
            "summarization", {"summary": prior_summary}
        )
    else:
        prompt_text = "Create a summary of the conversation above:"

    # Append the summarization request to the history and run the model.
    response = model.invoke(
        state["messages"] + [HumanMessage(content=prompt_text)]
    )

    # Prune: keep only the two most recent messages in state.
    pruned = [RemoveMessage(id=old.id) for old in state["messages"][:-2]]
    return {"summary": response.content, "messages": pruned}
tools/__init__.py ADDED
File without changes
tools/tavily_tools.py ADDED
@@ -0,0 +1,19 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from langchain_tavily import TavilySearch
2
+ from langchain_tavily import TavilyExtract
3
+ from langchain_tavily import TavilyCrawl
4
+
5
# Web search tool: up to 10 general-topic results per query.
tavily_search_tool = TavilySearch(
    max_results=10,
    topic="general",
)

# Page-content extraction tool ("basic" depth trades detail for speed).
tavily_extract_tool = TavilyExtract(extract_depth="basic")

# Site-crawling tool (default configuration).
tavily_crawl_tool = TavilyCrawl()

# Tool set bound to the LLM in nodes/nodes.py and wrapped by the
# graph's ToolNode in core/agent.py.
llm_tools = [
    tavily_search_tool, tavily_extract_tool, tavily_crawl_tool
]
utils/__init__.py ADDED
File without changes
utils/prompt_manager.py ADDED
@@ -0,0 +1,79 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from enum import Enum
2
+ from dataclasses import dataclass, field
3
+ from typing import Any, Dict, List, Optional
4
+ from jinja2 import Environment, BaseLoader
5
+ import tiktoken
6
+ import yaml
7
+ from pathlib import Path
8
+
9
+
10
class PromptType(Enum):
    """Categories of prompt templates; values match the `type` field
    of each entry in config/prompts.yaml."""
    BASE_SYSTEM = "base_system"
    ANSWER_REFINEMENT = "answer_refinement"
    MEMORY_OPTIMIZATION = "memory_optimization"
14
+
15
+
16
@dataclass
class PromptTemplate:
    """Structured prompt template with metadata"""
    name: str            # template key from the YAML `prompts` mapping
    content: str         # Jinja2 template text
    prompt_type: PromptType  # category of the prompt (see PromptType)
    variables: List[str] = field(default_factory=list)  # variable names the template expects
    token_estimate: int = 0  # filled in after loading via the tokenizer
    version: str = "1.0"
    description: str = ""
26
+
27
+
28
class PromptManager:
    """Centralized management for Agent's prompts.

    Loads Jinja2 prompt templates (with metadata) from a YAML
    configuration file and renders them with state variables on demand.
    """

    def __init__(self, prompt_config_path: str, model_name: str = "gpt-4.1"):
        """Create a manager and load templates from *prompt_config_path*.

        *model_name* selects the tiktoken encoding used for token
        estimates.
        """
        self.templates: Dict[str, PromptTemplate] = {}
        self.jinja_env = Environment(loader=BaseLoader())
        # Robustness fix: tiktoken raises KeyError for model names it
        # does not recognize; fall back to a generic encoding so an
        # unknown model name does not break prompt loading entirely.
        try:
            self.token_counter = tiktoken.encoding_for_model(model_name)
        except KeyError:
            self.token_counter = tiktoken.get_encoding("cl100k_base")

        # Load prompts from config
        self.load_prompts_from_config(prompt_config_path)

    def load_prompts_from_config(self, config_path: str):
        """Load prompts from a YAML configuration file.

        Raises:
            ValueError: if *config_path* is not a .yaml/.yml file.
                (The original silently did nothing in that case,
                leaving the manager without any templates.)
        """
        path = Path(config_path)

        if path.suffix.lower() not in ('.yaml', '.yml'):
            raise ValueError(f"Unsupported prompt config format: {path.suffix!r}")

        with open(path, 'r') as f:
            config = yaml.safe_load(f)

        for name, prompt_data in config.get('prompts', {}).items():
            template = PromptTemplate(
                name=name,
                content=prompt_data['content'],
                prompt_type=PromptType(prompt_data.get('type', 'base_system')),
                variables=prompt_data.get('variables', []),
                version=prompt_data.get('version', '1.0'),
                description=prompt_data.get('description', '')
            )
            template.token_estimate = self._estimate_tokens(template.content)
            self.templates[name] = template

    def _estimate_tokens(self, text: str) -> int:
        """Estimate token count for *text* using the configured encoding."""
        return len(self.token_counter.encode(text))

    def render_template(self, name: str, state: Dict[str, Any]) -> str:
        """Render the named template with variables taken from *state*.

        Raises KeyError if *name* is not a loaded template.
        """
        # All state entries are exposed to the template as variables.
        template_vars: Dict[str, Any] = {}
        template_vars.update(state)

        # Create and render the Jinja2 template from the stored content.
        template = self.templates[name]
        jinja_template = self.jinja_env.from_string(template.content)
        return jinja_template.render(**template_vars)
76
+
77
+
78
# Global instance
# Bug fix: the original passed "config\prompts.yaml" — a literal
# backslash path that fails on POSIX systems and whenever the process
# CWD is not the project root. Resolve the path relative to the
# project root instead (Path is already imported at the top of this
# file), matching AgentConfig.PROMPTS_PATH in config/settings.py.
prompt_mgmt = PromptManager(
    str(Path(__file__).parent.parent / "config" / "prompts.yaml")
)