shekkari21 committed
Commit 4dbe519 · 1 Parent(s): ec96f6b

added files for creating basic agentic loop

.gitignore CHANGED
@@ -214,3 +214,6 @@ __marimo__/
 
 # Streamlit
 .streamlit/secrets.toml
+my_code.ipynb
+__pycache__/
+.venv/
README.md CHANGED
@@ -1,7 +1,21 @@
-This repository contains the code for Manning Publications' "Build an AI Agent From Scratch".
+# AI Agent Framework
 
-### Install uv (docs: https://docs.astral.sh/uv/getting-started/installation/)
-- macOS/Linux (official script):
+A flexible framework for building AI agents with tool support, MCP integration, and multi-step reasoning.
+
+## Features
+
+- 🤖 **Agent System**: Multi-step reasoning with tool execution
+- 🛠️ **Tool Framework**: Easy tool creation and integration
+- 🔌 **MCP Integration**: Load tools from MCP servers
+- 💬 **LLM Client**: Unified interface for LLM API calls via LiteLLM
+- 📦 **Modular Design**: Clean, organized package structure
+
+## Installation
+
+### Prerequisites
+
+Install `uv` (recommended package manager):
+- macOS/Linux:
 ```bash
 curl -LsSf https://astral.sh/uv/install.sh | sh
 ```
@@ -9,31 +23,99 @@ curl -LsSf https://astral.sh/uv/install.sh | sh
 ```bash
 brew install uv
 ```
-- Verify installation:
-```bash
-uv --version
-```
 
-### Create a virtual environment (uv venv)
+### Setup
+
+1. Create a virtual environment:
 ```bash
 uv venv
 source .venv/bin/activate
 ```
 
-### Install dependencies with uv
+2. Install dependencies:
 ```bash
 uv pip install -r requirements.txt
 ```
 
-### Install scratch_agents package (Required for Chapter 4+)
-- For Chapter 4 and beyond, install the scratch_agents package in editable mode:
+3. Set up environment variables:
 ```bash
-uv pip install -e .
+cp .env.example .env
+# Edit .env and add your API keys (OPENAI_API_KEY, TAVILY_API_KEY, etc.)
 ```
 
-### Environment variables
-- Copy the example env file and set your API keys:
-```bash
-cp .env.example .env
+## Quick Start
+
+```python
+from agent_framework import Agent, LlmClient, FunctionTool
+
+# Define a tool
+def calculator(expression: str) -> float:
+    """Calculate mathematical expressions."""
+    return eval(expression)
+
+# Create the agent
+agent = Agent(
+    model=LlmClient(model="gpt-5-mini"),
+    tools=[FunctionTool(calculator)],
+    instructions="You are a helpful assistant.",
+)
+
+# Run the agent
+result = await agent.run("What is 1234 * 5678?")
+print(result.output)  # "7006652"
+```
+
+## Package Structure
+
+```
+agent_framework/
+├── __init__.py      # Package exports
+├── models.py        # Core data models (Message, ToolCall, Event, ExecutionContext)
+├── tools.py         # BaseTool and FunctionTool classes
+├── llm.py           # LlmClient and request/response models
+├── agent.py         # Agent and AgentResult classes
+├── mcp.py           # MCP tool loading utilities
+└── utils.py         # Helper functions for tool definitions
 ```
-- Open `.env` and provide the necessary keys (e.g., `OPENAI_API_KEY=...`).
+
+## Usage Examples
+
+### Using the @tool Decorator
+
+```python
+from agent_framework import tool
+
+@tool
+def multiply(a: float, b: float) -> float:
+    """Multiply two numbers."""
+    return a * b
+
+# multiply is now a FunctionTool instance
+```
+
+### MCP Tool Integration
+
+```python
+from agent_framework import load_mcp_tools
+import os
+
+connection = {
+    "command": "npx",
+    "args": ["-y", "tavily-mcp@latest"],
+    "env": {"TAVILY_API_KEY": os.getenv("TAVILY_API_KEY")}
+}
+
+mcp_tools = await load_mcp_tools(connection)
+agent = Agent(
+    model=LlmClient(model="gpt-5-mini"),
+    tools=mcp_tools,
+)
+```
+
+## Documentation
+
+See `agent_framework/README.md` for detailed API documentation.
+
+## License
+
+See LICENSE file for details.
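[Editor's note, not part of the commit] The Quick Start snippet in the new README uses top-level `await`, which only works in a notebook or async REPL. In a plain script it would be wrapped in an `asyncio` entry point; here is a minimal sketch with a placeholder computation standing in for the real `agent.run` call:

```python
import asyncio

async def main():
    # Placeholder for: result = await agent.run("What is 1234 * 5678?")
    # (Agent/LlmClient are the classes added in this commit.)
    return 1234 * 5678

answer = asyncio.run(main())
print(answer)  # 7006652
```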
agent_framework/README.md ADDED
@@ -0,0 +1,128 @@
+# Agent Framework
+
+A flexible framework for building AI agents with tool support, MCP integration, and multi-step reasoning.
+
+## Structure
+
+```
+agent_framework/
+├── __init__.py      # Package exports
+├── models.py        # Core data models (Message, ToolCall, Event, ExecutionContext)
+├── tools.py         # BaseTool and FunctionTool classes
+├── llm.py           # LlmClient and request/response models
+├── agent.py         # Agent and AgentResult classes
+├── mcp.py           # MCP tool loading utilities
+└── utils.py         # Helper functions for tool definitions
+```
+
+## Quick Start
+
+```python
+from agent_framework import Agent, LlmClient, FunctionTool
+
+# Define a tool
+def calculator(expression: str) -> float:
+    """Calculate mathematical expressions."""
+    return eval(expression)
+
+# Create the agent
+agent = Agent(
+    model=LlmClient(model="gpt-5-mini"),
+    tools=[FunctionTool(calculator)],
+    instructions="You are a helpful assistant.",
+)
+
+# Run the agent
+result = await agent.run("What is 1234 * 5678?")
+print(result.output)  # "7006652"
+```
+
+## Components
+
+### Models (`models.py`)
+- `Message`: Text messages in conversations
+- `ToolCall`: LLM's request to execute a tool
+- `ToolResult`: Result from tool execution
+- `Event`: Recorded occurrence during agent execution
+- `ExecutionContext`: Central storage for execution state
+
+### Tools (`tools.py`)
+- `BaseTool`: Abstract base class for all tools
+- `FunctionTool`: Wraps Python functions as tools
+
+### LLM (`llm.py`)
+- `LlmClient`: Client for LLM API calls using LiteLLM
+- `LlmRequest`: Request object for LLM calls
+- `LlmResponse`: Response object from LLM calls
+
+### Agent (`agent.py`)
+- `Agent`: Main agent class that orchestrates reasoning and tool execution
+- `AgentResult`: Result of an agent execution
+
+### MCP (`mcp.py`)
+- `load_mcp_tools()`: Load tools from MCP servers
+
+### Utils (`utils.py`)
+- `function_to_input_schema()`: Convert function signature to JSON Schema
+- `format_tool_definition()`: Format tool definition in OpenAI format
+- `tool`: Decorator to convert functions to tools
+
+## Usage Examples
+
+### Basic Tool Usage
+
+```python
+from agent_framework import FunctionTool
+
+def my_function(x: int, y: int) -> int:
+    """Add two numbers."""
+    return x + y
+
+tool = FunctionTool(my_function)
+result = await tool.execute(context, x=5, y=3)  # 8
+```
+
+### Using the @tool Decorator
+
+```python
+from agent_framework import tool
+
+@tool
+def multiply(a: float, b: float) -> float:
+    """Multiply two numbers."""
+    return a * b
+
+# multiply is now a FunctionTool instance
+```
+
+### MCP Tool Integration
+
+```python
+from agent_framework import load_mcp_tools
+import os
+
+connection = {
+    "command": "npx",
+    "args": ["-y", "tavily-mcp@latest"],
+    "env": {"TAVILY_API_KEY": os.getenv("TAVILY_API_KEY")}
+}
+
+mcp_tools = await load_mcp_tools(connection)
+agent = Agent(
+    model=LlmClient(model="gpt-5-mini"),
+    tools=mcp_tools,
+)
+```
+
+## Installation
+
+The framework uses:
+- `pydantic` for data validation
+- `litellm` for LLM API calls
+- `mcp` for MCP server integration
+
+Install dependencies:
+```bash
+pip install pydantic litellm mcp
+```
+
agent_framework/__init__.py ADDED
@@ -0,0 +1,51 @@
+"""Agent Framework - A flexible framework for building AI agents with tool support."""
+
+from .models import (
+    Message,
+    ToolCall,
+    ToolResult,
+    ContentItem,
+    Event,
+    ExecutionContext,
+)
+from .tools import BaseTool, FunctionTool, tool
+from .llm import LlmClient, LlmRequest, LlmResponse
+from .agent import Agent, AgentResult
+from .mcp import load_mcp_tools
+from .utils import (
+    function_to_input_schema,
+    format_tool_definition,
+    function_to_tool_definition,
+    mcp_tools_to_openai_format,
+)
+
+__all__ = [
+    # Models
+    "Message",
+    "ToolCall",
+    "ToolResult",
+    "ContentItem",
+    "Event",
+    "ExecutionContext",
+    # Tools
+    "BaseTool",
+    "FunctionTool",
+    "tool",
+    # LLM
+    "LlmClient",
+    "LlmRequest",
+    "LlmResponse",
+    # Agent
+    "Agent",
+    "AgentResult",
+    # MCP
+    "load_mcp_tools",
+    # Utils
+    "function_to_input_schema",
+    "format_tool_definition",
+    "function_to_tool_definition",
+    "mcp_tools_to_openai_format",
+]
+
+__version__ = "0.1.0"
+
agent_framework/agent.py ADDED
@@ -0,0 +1,171 @@
+"""Agent class for executing multi-step reasoning with tools."""
+
+from dataclasses import dataclass
+from typing import List, Optional
+from pydantic import BaseModel
+
+from .models import (
+    ExecutionContext,
+    Event,
+    Message,
+    ToolCall,
+    ToolResult
+)
+from .tools import BaseTool
+from .llm import LlmClient, LlmRequest, LlmResponse
+
+
+@dataclass
+class AgentResult:
+    """Result of an agent execution."""
+    output: str | BaseModel
+    context: ExecutionContext
+
+
+class Agent:
+    """Agent that can reason and use tools to solve tasks."""
+
+    def __init__(
+        self,
+        model: LlmClient,
+        tools: List[BaseTool] = None,
+        instructions: str = "",
+        max_steps: int = 10,
+        name: str = "agent",
+    ):
+        self.model = model
+        self.instructions = instructions
+        self.max_steps = max_steps
+        self.name = name
+        self.tools = self._setup_tools(tools or [])
+
+    def _setup_tools(self, tools: List[BaseTool]) -> List[BaseTool]:
+        return tools
+
+    def _prepare_llm_request(self, context: ExecutionContext) -> LlmRequest:
+        """Convert execution context to LLM request."""
+        # Flatten events into content items
+        flat_contents = []
+        for event in context.events:
+            flat_contents.extend(event.content)
+
+        return LlmRequest(
+            instructions=[self.instructions] if self.instructions else [],
+            contents=flat_contents,
+            tools=self.tools,
+            tool_choice="auto" if self.tools else None,
+        )
+
+    async def think(self, llm_request: LlmRequest) -> LlmResponse:
+        """Get LLM's response/decision."""
+        return await self.model.generate(llm_request)
+
+    async def act(
+        self,
+        context: ExecutionContext,
+        tool_calls: List[ToolCall]
+    ) -> List[ToolResult]:
+        """Execute tool calls and return results."""
+        tools_dict = {tool.name: tool for tool in self.tools}
+        results = []
+
+        for tool_call in tool_calls:
+            if tool_call.name not in tools_dict:
+                results.append(ToolResult(
+                    tool_call_id=tool_call.tool_call_id,
+                    name=tool_call.name,
+                    status="error",
+                    content=[f"Tool '{tool_call.name}' not found"],
+                ))
+                continue
+
+            tool = tools_dict[tool_call.name]
+
+            try:
+                output = await tool.execute(context, **tool_call.arguments)
+                results.append(ToolResult(
+                    tool_call_id=tool_call.tool_call_id,
+                    name=tool_call.name,
+                    status="success",
+                    content=[str(output)],
+                ))
+            except Exception as e:
+                results.append(ToolResult(
+                    tool_call_id=tool_call.tool_call_id,
+                    name=tool_call.name,
+                    status="error",
+                    content=[str(e)],
+                ))
+
+        return results
+
+    async def step(self, context: ExecutionContext):
+        """Execute one step of the agent loop."""
+        # Prepare what to send to the LLM
+        llm_request = self._prepare_llm_request(context)
+
+        # Get LLM's decision
+        llm_response = await self.think(llm_request)
+
+        # Record LLM response as an event
+        response_event = Event(
+            execution_id=context.execution_id,
+            author=self.name,
+            content=llm_response.content,
+        )
+        context.add_event(response_event)
+
+        # Execute tools if the LLM requested any
+        tool_calls = [c for c in llm_response.content if isinstance(c, ToolCall)]
+        if tool_calls:
+            tool_results = await self.act(context, tool_calls)
+            tool_event = Event(
+                execution_id=context.execution_id,
+                author=self.name,
+                content=tool_results,
+            )
+            context.add_event(tool_event)
+
+        context.increment_step()
+
+    async def run(
+        self,
+        user_input: str,
+        context: ExecutionContext = None
+    ) -> AgentResult:
+        """Run the agent with user input."""
+        # Create or reuse context
+        if context is None:
+            context = ExecutionContext()
+
+        # Add user input as the first event
+        user_event = Event(
+            execution_id=context.execution_id,
+            author="user",
+            content=[Message(role="user", content=user_input)]
+        )
+        context.add_event(user_event)
+
+        # Execute steps until completion or max steps reached
+        while not context.final_result and context.current_step < self.max_steps:
+            await self.step(context)
+
+            # Check if the last event is a final response
+            last_event = context.events[-1]
+            if self._is_final_response(last_event):
+                context.final_result = self._extract_final_result(last_event)
+
+        return AgentResult(output=context.final_result, context=context)
+
+    def _is_final_response(self, event: Event) -> bool:
+        """Check if this event contains a final response."""
+        has_tool_calls = any(isinstance(c, ToolCall) for c in event.content)
+        has_tool_results = any(isinstance(c, ToolResult) for c in event.content)
+        return not has_tool_calls and not has_tool_results
+
+    def _extract_final_result(self, event: Event) -> str:
+        """Extract the final result from an event."""
+        for item in event.content:
+            if isinstance(item, Message) and item.role == "assistant":
+                return item.content
+        return None
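[Editor's note, not part of the commit] The core of `agent.py` above is the think → act → record loop: the LLM either requests a tool call (which the agent executes and records) or produces a final answer. A self-contained sketch of that loop, with a hypothetical stubbed "LLM" in place of a real model call:

```python
import asyncio

# Stub standing in for LlmClient.generate: requests one tool call,
# then answers once a tool result is present in the history.
async def fake_llm(history):
    if not any(e.get("tool_result") for e in history):
        return {"tool_call": {"name": "calculator",
                              "arguments": {"expression": "2 * 3"}}}
    return {"final": "The answer is 6."}

async def run_agent(user_input, tools, max_steps=10):
    history = [{"message": user_input}]
    for _ in range(max_steps):
        decision = await fake_llm(history)          # think
        if "final" in decision:
            return decision["final"]                # done
        call = decision["tool_call"]                # act
        output = tools[call["name"]](**call["arguments"])
        history.append({"tool_result": str(output)})  # record
    return None

result = asyncio.run(run_agent(
    "What is 2 * 3?",
    {"calculator": lambda expression: eval(expression)},
))
print(result)  # The answer is 6.
```

The `max_steps` bound mirrors `Agent.run`: without it, a model that keeps requesting tools would loop forever.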
agent_framework/llm.py ADDED
@@ -0,0 +1,118 @@
+"""LLM client and request/response models."""
+
+import json
+from typing import Any, List, Optional, Dict
+from pydantic import BaseModel, Field, ConfigDict
+from litellm import acompletion
+
+from .models import Message, ToolCall, ToolResult, ContentItem
+
+
+class LlmRequest(BaseModel):
+    """Request object for LLM calls."""
+    model_config = ConfigDict(arbitrary_types_allowed=True)
+
+    instructions: List[str] = Field(default_factory=list)
+    contents: List[ContentItem] = Field(default_factory=list)
+    tools: List[Any] = Field(default_factory=list)
+    tool_choice: Optional[str] = None
+
+
+class LlmResponse(BaseModel):
+    """Response object from LLM calls."""
+    content: List[ContentItem] = Field(default_factory=list)
+    error_message: Optional[str] = None
+    usage_metadata: Dict[str, Any] = Field(default_factory=dict)
+
+
+class LlmClient:
+    """Client for LLM API calls using LiteLLM."""
+
+    def __init__(self, model: str, **config):
+        self.model = model
+        self.config = config
+
+    async def generate(self, request: LlmRequest) -> LlmResponse:
+        """Generate a response from the LLM."""
+        try:
+            messages = self._build_messages(request)
+            tools = [t.tool_definition for t in request.tools] if request.tools else None
+
+            response = await acompletion(
+                model=self.model,
+                messages=messages,
+                tools=tools,
+                **({"tool_choice": request.tool_choice}
+                   if request.tool_choice else {}),
+                **self.config
+            )
+
+            return self._parse_response(response)
+        except Exception as e:
+            return LlmResponse(error_message=str(e))
+
+    def _build_messages(self, request: LlmRequest) -> List[dict]:
+        """Convert LlmRequest to API message format."""
+        messages = []
+
+        for instruction in request.instructions:
+            messages.append({"role": "system", "content": instruction})
+
+        for item in request.contents:
+            if isinstance(item, Message):
+                messages.append({"role": item.role, "content": item.content})
+
+            elif isinstance(item, ToolCall):
+                tool_call_dict = {
+                    "id": item.tool_call_id,
+                    "type": "function",
+                    "function": {
+                        "name": item.name,
+                        "arguments": json.dumps(item.arguments)
+                    }
+                }
+                # Append to previous assistant message if exists
+                if messages and messages[-1]["role"] == "assistant":
+                    messages[-1].setdefault("tool_calls", []).append(tool_call_dict)
+                else:
+                    messages.append({
+                        "role": "assistant",
+                        "content": None,
+                        "tool_calls": [tool_call_dict]
+                    })
+
+            elif isinstance(item, ToolResult):
+                messages.append({
+                    "role": "tool",
+                    "tool_call_id": item.tool_call_id,
+                    "content": str(item.content[0]) if item.content else ""
+                })
+
+        return messages
+
+    def _parse_response(self, response) -> LlmResponse:
+        """Convert API response to LlmResponse."""
+        choice = response.choices[0]
+        content_items = []
+
+        if choice.message.content:
+            content_items.append(Message(
+                role="assistant",
+                content=choice.message.content
+            ))
+
+        if choice.message.tool_calls:
+            for tc in choice.message.tool_calls:
+                content_items.append(ToolCall(
+                    tool_call_id=tc.id,
+                    name=tc.function.name,
+                    arguments=json.loads(tc.function.arguments)
+                ))
+
+        return LlmResponse(
+            content=content_items,
+            usage_metadata={
+                "input_tokens": response.usage.prompt_tokens,
+                "output_tokens": response.usage.completion_tokens,
+            }
+        )
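[Editor's note, not part of the commit] `_build_messages` above flattens the event history into the OpenAI chat format: a `ToolCall` becomes (part of) an assistant message with a `tool_calls` array, and a `ToolResult` becomes a `role: "tool"` message keyed by `tool_call_id`. A standalone sketch of the resulting message shapes, using plain dicts:

```python
import json

# A tool round-trip as three chat messages in the OpenAI format.
tool_call = {"tool_call_id": "call_1", "name": "calculator",
             "arguments": {"expression": "2 + 2"}}
tool_result = {"tool_call_id": "call_1", "content": ["4"]}

messages = [
    {"role": "user", "content": "What is 2 + 2?"},
    # ToolCall -> assistant message with a tool_calls entry;
    # arguments are JSON-encoded as a string, as the API expects.
    {"role": "assistant", "content": None, "tool_calls": [{
        "id": tool_call["tool_call_id"],
        "type": "function",
        "function": {"name": tool_call["name"],
                     "arguments": json.dumps(tool_call["arguments"])},
    }]},
    # ToolResult -> role "tool" message linked back via tool_call_id.
    {"role": "tool", "tool_call_id": tool_result["tool_call_id"],
     "content": str(tool_result["content"][0])},
]

print([m["role"] for m in messages])  # ['user', 'assistant', 'tool']
```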
agent_framework/mcp.py ADDED
@@ -0,0 +1,84 @@
+"""MCP (Model Context Protocol) tool integration."""
+
+import os
+from typing import Dict, List
+from mcp import ClientSession, StdioServerParameters
+from mcp.client.stdio import stdio_client
+
+from .tools import BaseTool, FunctionTool
+
+
+def _extract_text_content(result) -> str:
+    """Extract text content from MCP tool result."""
+    if not hasattr(result, 'content'):
+        return str(result)
+
+    texts = []
+    for item in result.content:
+        if hasattr(item, 'text'):
+            texts.append(item.text)
+        else:
+            texts.append(str(item))
+
+    return "\n\n".join(texts)
+
+
+async def load_mcp_tools(connection: Dict) -> List[BaseTool]:
+    """Load tools from an MCP server and convert to FunctionTools.
+
+    Args:
+        connection: Dictionary with connection parameters:
+            - command: Command to run the MCP server
+            - args: Arguments for the command
+            - env: Environment variables (optional)
+
+    Returns:
+        List of BaseTool instances wrapping MCP tools
+
+    Example:
+        connection = {
+            "command": "npx",
+            "args": ["-y", "tavily-mcp@latest"],
+            "env": {"TAVILY_API_KEY": os.getenv("TAVILY_API_KEY")}
+        }
+        tools = await load_mcp_tools(connection)
+    """
+    tools = []
+
+    async with stdio_client(StdioServerParameters(**connection)) as (read, write):
+        async with ClientSession(read, write) as session:
+            await session.initialize()
+            mcp_tools = await session.list_tools()
+
+            for mcp_tool in mcp_tools.tools:
+                func_tool = _create_mcp_tool(mcp_tool, connection)
+                tools.append(func_tool)
+
+    return tools
+
+
+def _create_mcp_tool(mcp_tool, connection: Dict) -> FunctionTool:
+    """Create a FunctionTool that wraps an MCP tool."""
+
+    async def call_mcp(**kwargs):
+        async with stdio_client(StdioServerParameters(**connection)) as (read, write):
+            async with ClientSession(read, write) as session:
+                await session.initialize()
+                result = await session.call_tool(mcp_tool.name, kwargs)
+                return _extract_text_content(result)
+
+    tool_definition = {
+        "type": "function",
+        "function": {
+            "name": mcp_tool.name,
+            "description": mcp_tool.description,
+            "parameters": mcp_tool.inputSchema,
+        }
+    }
+
+    return FunctionTool(
+        func=call_mcp,
+        name=mcp_tool.name,
+        description=mcp_tool.description,
+        tool_definition=tool_definition
+    )
agent_framework/models.py ADDED
@@ -0,0 +1,62 @@
+"""Core data models for the agent framework."""
+
+from typing import Literal, Union, List, Dict, Optional, Any
+from pydantic import BaseModel, Field
+from dataclasses import dataclass, field
+import uuid
+from datetime import datetime
+
+
+class Message(BaseModel):
+    """A text message in the conversation."""
+    type: Literal["message"] = "message"
+    role: Literal["system", "user", "assistant"]
+    content: str
+
+
+class ToolCall(BaseModel):
+    """LLM's request to execute a tool."""
+    type: Literal["tool_call"] = "tool_call"
+    tool_call_id: str
+    name: str
+    arguments: dict
+
+
+class ToolResult(BaseModel):
+    """Result from tool execution."""
+    type: Literal["tool_result"] = "tool_result"
+    tool_call_id: str
+    name: str
+    status: Literal["success", "error"]
+    content: list
+
+
+ContentItem = Union[Message, ToolCall, ToolResult]
+
+
+class Event(BaseModel):
+    """A recorded occurrence during agent execution."""
+    id: str = Field(default_factory=lambda: str(uuid.uuid4()))
+    execution_id: str
+    timestamp: float = Field(default_factory=lambda: datetime.now().timestamp())
+    author: str  # "user" or agent name
+    content: List[ContentItem] = Field(default_factory=list)
+
+
+@dataclass
+class ExecutionContext:
+    """Central storage for all execution state."""
+
+    execution_id: str = field(default_factory=lambda: str(uuid.uuid4()))
+    events: List[Event] = field(default_factory=list)
+    current_step: int = 0
+    state: Dict[str, Any] = field(default_factory=dict)
+    final_result: Optional[str | BaseModel] = None
+
+    def add_event(self, event: Event):
+        """Append an event to the execution history."""
+        self.events.append(event)
+
+    def increment_step(self):
+        """Move to the next execution step."""
+        self.current_step += 1
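[Editor's note, not part of the commit] The `ExecutionContext`/`Event` pair above is an append-only event log: every user message, LLM response, and tool result is recorded as an `Event`, and the step counter bounds the loop. A dependency-free sketch of the pattern using plain dataclasses (the real models use pydantic):

```python
import uuid
from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    execution_id: str
    author: str          # "user" or agent name
    content: list        # Message / ToolCall / ToolResult items

@dataclass
class ExecutionContext:
    execution_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    events: List[Event] = field(default_factory=list)
    current_step: int = 0

    def add_event(self, event: Event) -> None:
        self.events.append(event)

    def increment_step(self) -> None:
        self.current_step += 1

ctx = ExecutionContext()
ctx.add_event(Event(ctx.execution_id, "user",
                    [{"role": "user", "content": "hi"}]))
ctx.increment_step()
print(len(ctx.events), ctx.current_step)  # 1 1
```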
agent_framework/tools.py ADDED
@@ -0,0 +1,115 @@
+"""Tool system for the agent framework."""
+
+from abc import ABC, abstractmethod
+from typing import Dict, Any, Callable
+import inspect
+from .models import ExecutionContext
+from .utils import function_to_input_schema, format_tool_definition
+
+
+class BaseTool(ABC):
+    """Abstract base class for all tools."""
+
+    def __init__(
+        self,
+        name: str = None,
+        description: str = None,
+        tool_definition: Dict[str, Any] = None,
+    ):
+        self.name = name or self.__class__.__name__
+        self.description = description or self.__doc__ or ""
+        self._tool_definition = tool_definition
+
+    @property
+    def tool_definition(self) -> Dict[str, Any] | None:
+        return self._tool_definition
+
+    @abstractmethod
+    async def execute(self, context: ExecutionContext, **kwargs) -> Any:
+        pass
+
+    async def __call__(self, context: ExecutionContext, **kwargs) -> Any:
+        return await self.execute(context, **kwargs)
+
+
+class FunctionTool(BaseTool):
+    """Wraps a Python function as a BaseTool."""
+
+    def __init__(
+        self,
+        func: Callable,
+        name: str = None,
+        description: str = None,
+        tool_definition: Dict[str, Any] = None
+    ):
+        self.func = func
+        self.needs_context = 'context' in inspect.signature(func).parameters
+
+        self.name = name or func.__name__
+        self.description = description or (func.__doc__ or "").strip()
+        tool_definition = tool_definition or self._generate_definition()
+
+        super().__init__(
+            name=self.name,
+            description=self.description,
+            tool_definition=tool_definition
+        )
+
+    async def execute(self, context: ExecutionContext = None, **kwargs) -> Any:
+        """Execute the wrapped function.
+
+        Context is only required if the wrapped function has a 'context' parameter.
+        """
+        if self.needs_context:
+            if context is None:
+                raise ValueError(
+                    f"Tool '{self.name}' requires a context parameter. "
+                    f"Please provide an ExecutionContext instance."
+                )
+            result = self.func(context=context, **kwargs)
+        else:
+            result = self.func(**kwargs)
+
+        # Handle both sync and async functions
+        if inspect.iscoroutine(result):
+            return await result
+        return result
+
+    def _generate_definition(self) -> Dict[str, Any]:
+        """Generate tool definition from function signature."""
+        parameters = function_to_input_schema(self.func)
+        return format_tool_definition(self.name, self.description, parameters)
+
+
+def tool(
+    func: Callable = None,
+    *,
+    name: str = None,
+    description: str = None,
+    tool_definition: Dict[str, Any] = None
+):
+    """Decorator to convert a function into a FunctionTool.
+
+    Usage:
+        @tool
+        def my_function(x: int) -> int:
+            return x * 2
+
+        # Or with parameters:
+        @tool(name="custom_name", description="Custom description")
+        def my_function(x: int) -> int:
+            return x * 2
+    """
+    def decorator(f: Callable) -> FunctionTool:
+        return FunctionTool(
+            func=f,
+            name=name,
+            description=description,
+            tool_definition=tool_definition
+        )
+
+    if func is not None:
+        return decorator(func)
+    return decorator
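[Editor's note, not part of the commit] `FunctionTool.execute` above supports both sync and async functions with one trick: call the function first, then `await` only if the call returned a coroutine. A self-contained sketch of just that dispatch:

```python
import asyncio
import inspect

async def call_tool(func, **kwargs):
    result = func(**kwargs)
    # A sync function returns its value directly; an async function
    # returns a coroutine that still needs to be awaited.
    if inspect.iscoroutine(result):
        return await result
    return result

def add(x: int, y: int) -> int:
    return x + y

async def async_add(x: int, y: int) -> int:
    return x + y

sync_result = asyncio.run(call_tool(add, x=2, y=3))
async_result = asyncio.run(call_tool(async_add, x=2, y=3))
print(sync_result, async_result)  # 5 5
```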
agent_framework/utils.py ADDED
@@ -0,0 +1,79 @@
+"""Utility functions for the agent framework."""
+
+import inspect
+from typing import Dict, Any
+
+
+def function_to_input_schema(func) -> dict:
+    """Convert a function signature to JSON Schema input format."""
+    type_map = {
+        str: "string",
+        int: "integer",
+        float: "number",
+        bool: "boolean",
+        list: "array",
+        dict: "object",
+        type(None): "null",
+    }
+
+    try:
+        signature = inspect.signature(func)
+    except ValueError as e:
+        raise ValueError(
+            f"Failed to get signature for function {func.__name__}: {str(e)}"
+        )
+
+    parameters = {}
+    for param in signature.parameters.values():
+        try:
+            param_type = type_map.get(param.annotation, "string")
+        except KeyError as e:
+            raise KeyError(
+                f"Unknown type annotation {param.annotation} for parameter {param.name}: {str(e)}"
+            )
+        parameters[param.name] = {"type": param_type}
+
+    required = [
+        param.name
+        for param in signature.parameters.values()
+        if param.default == inspect.Parameter.empty
+    ]
+
+    return {
+        "type": "object",
+        "properties": parameters,
+        "required": required,
+    }
+
+
+def format_tool_definition(name: str, description: str, parameters: dict) -> dict:
+    """Format a tool definition in OpenAI function calling format."""
+    return {
+        "type": "function",
+        "function": {
+            "name": name,
+            "description": description,
+            "parameters": parameters,
+        },
+    }
+
+
+def function_to_tool_definition(func) -> dict:
+    """Convert a function to OpenAI tool definition format."""
+    return format_tool_definition(
+        func.__name__,
+        func.__doc__ or "",
+        function_to_input_schema(func)
+    )
+
+
+def mcp_tools_to_openai_format(mcp_tools) -> list[dict]:
+    """Convert MCP tool definitions to OpenAI tool format."""
+    return [
+        format_tool_definition(
+            name=tool.name,
+            description=tool.description,
+            parameters=tool.inputSchema,
+        )
+        for tool in mcp_tools.tools
+    ]
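[Editor's note, not part of the commit] To see what `function_to_input_schema` produces, here is the signature-to-schema mapping re-implemented standalone (so it runs without the package installed) and applied to a function with one required and one defaulted parameter:

```python
import inspect

# Minimal re-implementation of utils.function_to_input_schema for illustration.
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean",
            list: "array", dict: "object", type(None): "null"}

def function_to_input_schema(func) -> dict:
    signature = inspect.signature(func)
    properties = {p.name: {"type": TYPE_MAP.get(p.annotation, "string")}
                  for p in signature.parameters.values()}
    required = [p.name for p in signature.parameters.values()
                if p.default is inspect.Parameter.empty]
    return {"type": "object", "properties": properties, "required": required}

def search_web(query: str, max_results: int = 5) -> str:
    """Search the web for information."""
    return query

schema = function_to_input_schema(search_web)
print(schema)
# Only 'query' lands in "required": max_results has a default value.
```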
example.py ADDED
@@ -0,0 +1,64 @@
+"""Example usage of the agent framework."""
+
+import asyncio
+import os
+from dotenv import load_dotenv
+from agent_framework import Agent, LlmClient, FunctionTool, load_mcp_tools
+
+load_dotenv()
+
+
+# Example 1: Simple calculator tool
+def calculator(expression: str) -> float:
+    """Calculate mathematical expressions."""
+    return eval(expression)
+
+
+# Example 2: Using the @tool decorator
+from agent_framework import tool
+
+@tool
+def search_web(query: str, max_results: int = 5) -> str:
+    """Search the web for information."""
+    # This is a placeholder - in real usage, you'd call an actual search API
+    return f"Search results for: {query}"
+
+
+async def main():
+    # Create a calculator tool
+    calc_tool = FunctionTool(calculator)
+
+    # Create the agent
+    agent = Agent(
+        model=LlmClient(model="gpt-5-mini"),
+        tools=[calc_tool, search_web],
+        instructions="You are a helpful assistant that can calculate and search the web.",
+    )
+
+    # Run the agent
+    result = await agent.run("What is 1234 * 5678?")
+    print(f"Result: {result.output}")
+    print(f"Steps taken: {result.context.current_step}")
+
+    # Example with MCP tools
+    if os.getenv("TAVILY_API_KEY"):
+        connection = {
+            "command": "npx",
+            "args": ["-y", "tavily-mcp@latest"],
+            "env": {"TAVILY_API_KEY": os.getenv("TAVILY_API_KEY")}
+        }
+        mcp_tools = await load_mcp_tools(connection)
+
+        agent_with_mcp = Agent(
+            model=LlmClient(model="gpt-5-mini"),
+            tools=[calc_tool, *mcp_tools],
+            instructions="You are a helpful assistant with web search capabilities.",
+        )
+
+        result = await agent_with_mcp.run("What is the capital of France?")
+        print(f"Result: {result.output}")
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
+
my_code.ipynb CHANGED
@@ -1,926 +1 @@
- {
- "cells": [
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "bd396f3a",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "execution_count": 1,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "from dotenv import load_dotenv, find_dotenv\n",
- "\n",
- "load_dotenv(find_dotenv())\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "bdc55e33",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "ChatCompletionMessage(content='The capital of India is New Delhi.', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None)\n",
- "The capital of India is New Delhi.\n"
- ]
- }
- ],
- "source": [
- "from openai import OpenAI\n",
- "client = OpenAI()\n",
- "\n",
- "response = client.chat.completions.create(\n",
- " model = 'gpt-5-mini',\n",
- " messages = [\n",
- " {'role': 'system', 'content' : 'You are a helpful assistant !'},\n",
- " {'role': 'user', 'content': 'What is the capital of India ?'}\n",
- " ]\n",
- ")\n",
- "\n",
- "print(response.choices[0].message)\n",
- "print(response.choices[0].message.content)\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "id": "396e8826",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Hello! How can I help you today?\n"
- ]
- }
- ],
- "source": [
- "## with this we can unify all providers\n",
- "\n",
- "from litellm import completion\n",
- "response = completion(\n",
- " model = 'gpt-5-mini',\n",
- " messages = [{'role' : 'user', 'content' : 'Hello !' }]\n",
- ")\n",
- "\n",
- "print(response.choices[0].message.content)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "cb505eb4",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Nice to meet you, Akhil — how can I help you today?\n",
- "I don't know — I don't have access to personal details unless you tell me. What would you like me to call you in this chat? (I can use that name for this conversation, but I can't remember it across separate sessions unless you set it in your app/profile.)\n"
- ]
- }
- ],
- "source": [
- "from litellm import completion\n",
- "\n",
- "response1 = completion(\n",
- " model = 'gpt-5-mini',\n",
- " messages = [{'role' : 'user', 'content':'My name is Akhil'}]\n",
- ")\n",
- "\n",
- "response2 = completion(\n",
- " model = 'gpt-5-mini',\n",
- " messages = [{'role' : 'user', 'content':'what\\'s my name'}]\n",
- ")\n",
- "\n",
- "print(response1.choices[0].message.content)\n",
- "print(response2.choices[0].message.content)\n",
- "\n",
- "### This proves that each LLM call is independent. Our Model doesn't have memory"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "id": "cd3ade31",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Great — love the ambition, Akhil. If you want to be “the future of AI,” I can help you get there. How would you like me to help right now? (Pick one: roadmap, project ideas, resume/LinkedIn copy, interview prep, or a 12‑month actionable plan.)\n",
- "\n",
- "Below are a few immediately useful things you can use or ask me to expand.\n",
- "\n",
- "Quick elevator pitch / LinkedIn headline\n",
- "- Headline: Akhil — Building safe, scalable AI that augments human creativity and solves real-world problems\n",
- "- 1‑line pitch: “I build trustworthy AI systems that turn complex data into products people love — with a focus on safety, scalability, and real-world impact.”\n",
- "\n",
- "High‑level skills to prioritize\n",
- "- Foundations: probability, linear algebra, optimization\n",
- "- Core ML: supervised learning, neural networks, transfer learning, transformers\n",
- "- Systems & infra: PyTorch/TensorFlow, Docker, Kubernetes, model serving, MLOps\n",
- "- Specialized: LLMs, RL, generative models, multimodal models (vision+language)\n",
- "- Soft skills: product sense, communication, writing and presenting research\n",
- "- Ethics & safety: alignment concepts, bias mitigation, robust evaluation\n",
- "\n",
- "3 concrete projects (increasing complexity)\n",
- "1. End‑to‑end ML app: simple image classifier with deployment (Flask/FastAPI + Docker + test pipeline)\n",
- "2. LLM product prototype: retrieval-augmented chatbot for a specific domain (docs → vector DB → RAG)\n",
- "3. Research/engineering hybrid: fine-tune or distill a model for efficiency and publish a blog post + code on GitHub\n",
- "\n",
- "Practical 12‑month roadmap (high level)\n",
- "- Months 0–2: Fill gaps — math refresher, PyTorch, small projects, GitHub portfolio\n",
- "- Months 3–5: Build and deploy 2 production prototypes (one LLM-based), publish writeups\n",
- "- Months 6–9: Contribute to OSS or collaborate on a research project; attend conferences/meetups\n",
- "- Months 10–12: Target internships/roles, refine portfolio, prepare interviews, publish a substantial case study or replication\n",
- "\n",
- "Quick resources\n",
- "- Fast theory/math: “Mathematics for Machine Learning” + 3Blue1Brown playlists\n",
- "- Practical ML: Deep Learning Book (selected chapters), PyTorch docs, Hugging Face course\n",
- "- MLOps/RAG: LangChain/HF tutorials, Vector DB docs (Pinecone/Weaviate)\n",
- "\n",
- "If you want, I can:\n",
- "- Create a personalized 6‑ or 12‑month plan based on your background and time availability\n",
- "- Draft a LinkedIn summary, resume bullets, or a cover letter\n",
- "- Design a project roadmap with milestones and tech stack\n",
- "Tell me which and give me your experience level (student / early-career / senior / founder) and how many hours per week you can commit.\n",
- "Short answer: you’re Akhil — the person who just told me “I am going to be the Future of AI.” Beyond that, only you can fully answer “who am I,” but I can help you shape a clear, useful version of that identity for career, confidence, and action.\n",
- "\n",
- "Pick one of these and I’ll build it for you:\n",
- "- A crisp personal identity/mission statement (1–2 lines)\n",
- "- A short LinkedIn “About” summary\n",
- "- A 12‑month plan to become a leader in AI\n",
- "- A set of interview/resume bullets matched to your level\n",
- "\n",
- "If you want to explore it yourself first, answer 5 quick prompts (one sentence each):\n",
- "1. What technical skills do you already have (languages, frameworks, papers/projects)?\n",
- "2. What do you enjoy doing most in AI (research, building products, deploying models, safety/ethics)?\n",
- "3. What impact do you want to have (industry, research, social good, startups)?\n",
- "4. What are your top 2 strengths and top 1 weakness you want to fix?\n",
- "5. How many hours/week can you commit to learning or working toward this goal?\n",
- "\n",
- "Or, if you want an immediate example identity statement based on your earlier claim:\n",
- "- “I’m Akhil — an aspiring AI leader building safe, scalable systems that augment human creativity. My mission is to bridge cutting‑edge research and real‑world impact.”\n",
- "\n",
- "Tell me which option you want or answer the 5 prompts and I’ll draft something tailored.\n"
- ]
- }
- ],
- "source": [
- "### Managing conversation history\n",
- "\n",
- "\n",
- "from litellm import completion\n",
- "\n",
- "## Maintain a messages object\n",
- "messages = []\n",
- "\n",
- "## append your message/conversation\n",
- "messages.append({'role':'user', 'content':'My name is Akhil and I am going to be the Future of AI'})\n",
- "response3 = completion(model = 'gpt-5-mini', messages = messages)\n",
- "\n",
- "print(response3.choices[0].message.content)\n",
- "\n",
- "## append the message from assistant\n",
- "messages.append({'role':'assistant', 'content':response3.choices[0].message.content})\n",
- "\n",
- "## write a new message\n",
- "messages.append({'role':'user', 'content':'who am i'})\n",
- "response4 = completion(model = 'gpt-5-mini', messages = messages)\n",
- "\n",
- "print(response4.choices[0].message.content)\n",
- "\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "e0868cf6",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "{\"name\":\"Akhil\",\"email\":\"akhil.masters21@gmail.com\",\"phone\":\"9550303420\"}\n"
- ]
- }
- ],
- "source": [
- "### Structured output\n",
- "\n",
- "from pydantic import BaseModel\n",
- "from litellm import completion\n",
- "\n",
- "class ExtractedInfo(BaseModel):\n",
- " name : str\n",
- " email : str\n",
- " phone : str | None = None\n",
- "\n",
- "response = completion(\n",
- " model=\"gpt-5-mini\",\n",
- " messages=[{\n",
- " \"role\": \"user\", \n",
- " \"content\": \"My name is Akhil, my email is akhil.masters21@gmail.com, and my phone is 9550303420.\"\n",
- " }],\n",
- " response_format=ExtractedInfo\n",
- ")\n",
- "\n",
- "print(response.choices[0].message.content)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "03d48814",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Q: What is 2 + 2?\n",
- "A: 2 + 2 = 4.\n",
- "\n",
- "Q: What is the capital of Japan?\n",
- "A: The capital of Japan is Tokyo.\n",
- "\n",
- "Q: Who wrote Romeo and Juliet?\n",
- "A: Romeo and Juliet was written by William Shakespeare. It was likely written and first performed in the mid-1590s (published in 1597).\n",
- "\n"
- ]
- }
- ],
- "source": [
- "### Asynchronus calls\n",
- "\n",
- "import asyncio\n",
- "from litellm import acompletion\n",
- "async def get_response(prompt: str) -> str:\n",
- " response = await acompletion(\n",
- " model = 'gpt-5-mini',\n",
- " messages=[{\"role\": \"user\", \"content\": prompt}]\n",
- " )\n",
- " return response.choices[0].message.content\n",
- " \n",
- "prompts = [\n",
- " \"What is 2 + 2?\",\n",
- " \"What is the capital of Japan?\",\n",
- " \"Who wrote Romeo and Juliet?\"\n",
- "]\n",
- "\n",
- "### here \n",
- "## tasks = [get_response(What is 2 + 2?), get_response(What is the capital of Japan?)] \n",
- "## doesnt run the function, it just creates a coroutine object. Thats the difference in async.\n",
- "## functions are called in gather step\n",
- "\n",
- "tasks = [get_response(p) for p in prompts]\n",
- "results = await asyncio.gather(*tasks)\n",
- "\n",
- "for prompt, result in zip(prompts, results):\n",
- " print(f\"Q: {prompt}\")\n",
- " print(f\"A: {result}\\n\")\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "id": "3333de1d",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Q: What is 0 + 0?\n",
- "A: 0\n",
- "\n",
- "Because adding zero to zero yields zero.\n",
- "\n",
- "Q: What is 1 + 1?\n",
- "A: 1 + 1 = 2.\n",
- "\n",
- "Q: What is 2 + 2?\n",
- "A: 2 + 2 = 4.\n",
- "\n",
- "Q: What is 3 + 3?\n",
- "A: 3 + 3 = 6.\n",
- "\n",
- "Q: What is 4 + 4?\n",
- "A: 8\n",
- "\n",
- "Q: What is 5 + 5?\n",
- "A: 10\n",
- "\n",
- "Q: What is 6 + 6?\n",
- "A: 12\n",
- "\n",
- "Q: What is 7 + 7?\n",
- "A: 14\n",
- "\n",
- "Q: What is 8 + 8?\n",
- "A: 16\n",
- "\n",
- "Q: What is 9 + 9?\n",
- "A: 18\n",
- "\n",
- "Q: What is 10 + 10?\n",
- "A: 10 + 10 = 20\n",
- "\n",
- "Q: What is 11 + 11?\n",
- "A: 22\n",
- "\n",
- "Q: What is 12 + 12?\n",
- "A: 24\n",
- "\n",
- "Q: What is 13 + 13?\n",
- "A: 26\n",
- "\n",
- "Q: What is 14 + 14?\n",
- "A: 28\n",
- "\n",
- "Q: What is 15 + 15?\n",
- "A: 30\n",
- "\n",
- "Q: What is 16 + 16?\n",
- "A: 32\n",
- "\n",
- "Q: What is 17 + 17?\n",
- "A: 34\n",
- "\n",
- "Q: What is 18 + 18?\n",
- "A: 36\n",
- "\n",
- "Q: What is 19 + 19?\n",
- "A: 38\n",
- "\n"
- ]
- }
- ],
- "source": [
- "### rate limiting queries\n",
- "semaphore = asyncio.Semaphore(10)\n",
- "\n",
- "async def call_llm(prompt : str) -> str:\n",
- " async with semaphore:\n",
- " response = await acompletion(\n",
- " model=\"gpt-5-mini\",\n",
- " messages=[{\"role\": \"user\", \"content\": prompt}],\n",
- " num_retries=3 # Automatic retry with exponential backoff\n",
- " )\n",
- " return response.choices[0].message.content\n",
- "prompts = [f\"What is {i} + {i}?\" for i in range(20)]\n",
- "tasks = [call_llm(p) for p in prompts]\n",
- "results = await asyncio.gather(*tasks, return_exceptions=True)\n",
- "\n",
- "\n",
- "for prompt, result in zip(prompts, results):\n",
- " print(f\"Q: {prompt}\")\n",
- " print(f\"A: {result}\\n\")\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "id": "1caef766",
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "Generating test split: 100%|██████████| 93/93 [00:00<00:00, 1653.78 examples/s]\n",
- "Generating validation split: 100%|██████████| 53/53 [00:00<00:00, 32022.20 examples/s]"
- ]
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Number of Level 1 problems: 53\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "\n"
- ]
- }
- ],
- "source": [
- "## loading the GAIA dataset\n",
- "\n",
- "from datasets import load_dataset\n",
- "level1_problems = load_dataset(\"gaia-benchmark/GAIA\", \"2023_level1\", split=\"validation\")\n",
- "print(f\"Number of Level 1 problems: {len(level1_problems)}\")\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "id": "733c211c",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'task_id': '8e867cd7-cff9-4e6c-867a-ff5ddc2550be',\n",
- " 'Question': 'How many studio albums were published by Mercedes Sosa between 2000 and 2009 (included)? You can use the latest 2022 version of english wikipedia.',\n",
- " 'Level': '1',\n",
- " 'Final answer': '3',\n",
- " 'file_name': '',\n",
- " 'file_path': '',\n",
- " 'Annotator Metadata': {'Steps': '1. I did a search for Mercedes Sosa\\n2. I went to the Wikipedia page for her\\n3. I scrolled down to \"Studio albums\"\\n4. I counted the ones between 2000 and 2009',\n",
- " 'Number of steps': '4',\n",
- " 'How long did this take?': '5 minutes',\n",
- " 'Tools': '1. web browser\\n2. google search',\n",
- " 'Number of tools': '2'}}"
- ]
- },
- "execution_count": 17,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "level1_problems[1]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 19,
- "id": "3d5bcb22",
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "100%|██████████| 40/40 [02:23<00:00, 3.58s/it]\n"
- ]
- }
- ],
- "source": [
- "## defining a respose for gaia\n",
- "from pydantic import BaseModel\n",
- "from tqdm.asyncio import tqdm\n",
- "gaia_prompt = \"\"\"You are a general AI assistant. I will ask you a question.\n",
- "First, determine if you can solve this problem with your current capabilities and set \"is_solvable\" accordingly.\n",
- "If you can solve it, set \"is_solvable\" to true and provide your answer in \"final_answer\".\n",
- "If you cannot solve it, set \"is_solvable\" to false and explain why in \"unsolvable_reason\".\n",
- "Your final answer should be a number OR as few words as possible OR a comma separated list of numbers and/or strings.\n",
- "If you are asked for a number, don't use comma to write your number neither use units such as $ or percent sign unless specified otherwise.\n",
- "If you are asked for a string, don't use articles, neither abbreviations (e.g. for cities), and write the digits in plain text unless specified otherwise.\n",
- "If you are asked for a comma separated list, apply the above rules depending on whether the element is a number or a string.\"\"\"\n",
- "\n",
- "class GaiaOutput(BaseModel):\n",
- " is_solvable: bool\n",
- " unsolvable_reason: str = \"\"\n",
- " final_answer: str = \"\"\n",
- "\n",
- "PROVIDER_SEMAPHORES = {'openai': asyncio.Semaphore(30), 'anthropic': asyncio.Semaphore(10)}\n",
- "\n",
- "def get_provider(model: str) -> str:\n",
- " return \"anthropic\" if model.startswith(\"anthropic/\") else \"openai\"\n",
- "\n",
- "\n",
- "async def solve_problem(model: str, question: str) -> GaiaOutput:\n",
- " provider = get_provider(model)\n",
- " async with PROVIDER_SEMAPHORES[provider]:\n",
- " response = await acompletion(\n",
- " model = model,\n",
- " messages=[\n",
- " {\"role\": \"system\", \"content\": gaia_prompt},\n",
- " {\"role\": \"user\", \"content\": question},\n",
- " ],\n",
- " response_format=GaiaOutput,\n",
- " num_retries=2,\n",
- " )\n",
- " finish_reason = response.choices[0].finish_reason\n",
- " content = response.choices[0].message.content\n",
- " if finish_reason == \"refusal\" or content is None:\n",
- " return GaiaOutput(\n",
- " is_solvable=False,\n",
- " unsolvable_reason=f\"Model refused to answer (finish_reason: {finish_reason})\",\n",
- " final_answer=\"\"\n",
- " )\n",
- " return GaiaOutput.model_validate_json(content)\n",
- "\n",
- "def is_correct(prediction: str | None, answer: str) -> bool:\n",
- " \"\"\"Check exact match between prediction and answer (case-insensitive).\"\"\"\n",
- " if prediction is None:\n",
- " return False\n",
- " return prediction.strip().lower() == answer.strip().lower()\n",
- "\n",
- "async def evaluate_gaia_single(problem: dict, model: str) -> dict:\n",
- " \"\"\"Evaluate a single problem-model pair and return result.\"\"\"\n",
- " try:\n",
- " output = await solve_problem(model, problem[\"Question\"])\n",
- " return {\n",
- " \"task_id\": problem[\"task_id\"],\n",
- " \"model\": model,\n",
- " \"correct\": is_correct(output.final_answer, problem[\"Final answer\"]),\n",
- " \"is_solvable\": output.is_solvable,\n",
- " \"prediction\": output.final_answer,\n",
- " \"answer\": problem[\"Final answer\"],\n",
- " \"unsolvable_reason\": output.unsolvable_reason,\n",
- " }\n",
- " except Exception as e:\n",
- " return {\n",
- " \"task_id\": problem[\"task_id\"],\n",
- " \"model\": model,\n",
- " \"correct\": False,\n",
- " \"is_solvable\": None,\n",
- " \"prediction\": None,\n",
- " \"answer\": problem[\"Final answer\"],\n",
- " \"error\": str(e),\n",
- " }\n",
- "\n",
- "async def run_experiment(\n",
- " problems: list[dict],\n",
- " models: list[str],\n",
- ") -> dict[str, list]:\n",
- " \"\"\"Evaluate all models on all problems.\"\"\"\n",
- " tasks = [\n",
- " evaluate_gaia_single(problem, model)\n",
- " for problem in problems\n",
- " for model in models\n",
- " ]\n",
- " \n",
- " all_results = await tqdm.gather(*tasks)\n",
- " \n",
- " # Group results by model\n",
- " results = {model: [] for model in models}\n",
- " for result in all_results:\n",
- " results[result[\"model\"]].append(result)\n",
- " \n",
- " return results\n",
- "\n",
- "MODELS = [\n",
- " \"gpt-5\",\n",
- " \"gpt-5-mini\"\n",
- "]\n",
- " \n",
- "subset = level1_problems.select(range(20))\n",
- "results = await run_experiment(subset, MODELS)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 20,
- "id": "04f60efa",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'gpt-5': [{'task_id': 'e1fc63a2-da7a-432f-be78-7c4a95598703',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': '17',\n",
- " 'answer': '17',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': '8e867cd7-cff9-4e6c-867a-ff5ddc2550be',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': True,\n",
- " 'prediction': '4',\n",
- " 'answer': '3',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': 'ec09fa32-d03f-4bf8-84b0-1f16922c3ae4',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': '3',\n",
- " 'answer': '3',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': '5d0080cb-90d7-4712-bc33-848150e917d3',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': '0.1777',\n",
- " 'unsolvable_reason': 'I don’t have access to the specific paper text or its figures and can’t browse to retrieve the exact calculated volume.'},\n",
- " {'task_id': 'a1e91b78-d3d8-4675-bb8d-62741b4b68a6',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': '3',\n",
- " 'unsolvable_reason': 'I can’t access or watch the linked video to determine the number.'},\n",
- " {'task_id': '46719c30-f4c3-4cad-be07-d5cb21eee6bb',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'Mapping Human Oriented Information to Software Agents for Online Systems Usage',\n",
- " 'unsolvable_reason': 'I need to look up the 2015 paper’s author list and their publication histories, which I cannot access without web browsing or additional details.'},\n",
- " {'task_id': '4b6bb5f7-f634-410e-815d-e673ab7f8632',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': 'THE CASTLE',\n",
- " 'answer': 'THE CASTLE',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': 'cffe0e32-c9a6-4c52-9877-78ceb4aaa9fb',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'Fred',\n",
- " 'unsolvable_reason': 'The referenced document with employee profiles and gift assignments is not provided, so the giver who failed to give a gift cannot be determined.'},\n",
- " {'task_id': '2d83110e-a098-4ebb-9987-066c06fa42d0',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': 'right',\n",
- " 'answer': 'Right',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': '5cfb274c-0207-4aa7-9575-6ac0bd95d9b2',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'No',\n",
- " 'unsolvable_reason': 'Missing the spreadsheet/layout of green plots, so I cannot determine if a non-backtracking loop exists.'},\n",
- " {'task_id': '27d5d136-8563-469e-92bf-fd103c28b57c',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': '(¬A → B) ↔ (A ∨ ¬B)',\n",
- " 'answer': '(¬A → B) ↔ (A ∨ ¬B)',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': 'dc28cf18-6431-458b-83ef-64b3ce566c10',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': '2',\n",
- " 'answer': '2',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': 'b816bfce-3d80-4913-a07d-69b752ce6377',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': True,\n",
- " 'prediction': 'cute',\n",
- " 'answer': 'fluffy',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': '72e110e7-464c-453c-a309-90a95aed6538',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'Guatemala',\n",
- " 'unsolvable_reason': 'I don’t have browsing access to verify the 2020 BASE DDC 633 page and its flags.'},\n",
- " {'task_id': '42576abe-0deb-4869-8c63-225c2d75a95a',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': 'Maktay Mato Apple',\n",
- " 'answer': 'Maktay mato apple',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': 'b415aba4-4b68-4fc6-9b89-2c812e55a3e1',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'diamond',\n",
- " 'unsolvable_reason': 'I don’t have browsing tools to look up the specific 2012 Scientific Reports conference proceedings article and identify the nano-compound without external access.'},\n",
- " {'task_id': 'cca530fc-4052-43b2-b130-b30968d8aa44',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'Rd5',\n",
- " 'unsolvable_reason': 'Cannot view the chessboard image'},\n",
- " {'task_id': '935e2cff-ae78-4218-b3f5-115589b19dae',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': 'research',\n",
- " 'answer': 'research',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': '4fc2f1ae-8625-45b5-ab34-ad4433bc21f8',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': 'FunkMonk',\n",
- " 'answer': 'FunkMonk',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': '5188369a-3bbe-43d8-8b94-11558f909a08',\n",
- " 'model': 'gpt-5',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'Annie Levin',\n",
- " 'unsolvable_reason': 'I need to look up Merriam-Webster’s Word of the Day page for June 27, 2022 to see the quoted writer, but I don’t have browsing access.'}],\n",
- " 'gpt-5-mini': [{'task_id': 'e1fc63a2-da7a-432f-be78-7c4a95598703',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': '17',\n",
- " 'unsolvable_reason': 'I cannot access external websites such as Wikipedia to retrieve the exact minimum perigee value required for the calculation.'},\n",
- " {'task_id': '8e867cd7-cff9-4e6c-867a-ff5ddc2550be',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': '3',\n",
- " 'unsolvable_reason': \"I cannot access the 2022 English Wikipedia from here to verify Mercedes Sosa's discography and reliably count studio albums released between 2000 and 2009.\"},\n",
- " {'task_id': 'ec09fa32-d03f-4bf8-84b0-1f16922c3ae4',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': '3',\n",
- " 'answer': '3',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': '5d0080cb-90d7-4712-bc33-848150e917d3',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': '0.1777',\n",
- " 'unsolvable_reason': \"I cannot access external documents or the internet and do not have the paper's calculated fish bag volume memorized.\"},\n",
- " {'task_id': 'a1e91b78-d3d8-4675-bb8d-62741b4b68a6',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': '3',\n",
- " 'unsolvable_reason': 'I cannot access or view external video content (YouTube) to count bird species on screen.'},\n",
- " {'task_id': '46719c30-f4c3-4cad-be07-d5cb21eee6bb',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'Mapping Human Oriented Information to Software Agents for Online Systems Usage',\n",
- " 'unsolvable_reason': \"I cannot access external databases or the internet to look up the 2015 paper's authors and their publication histories, and I do not have that specific bibliographic information memorized.\"},\n",
- " {'task_id': '4b6bb5f7-f634-410e-815d-e673ab7f8632',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'THE CASTLE',\n",
- " 'unsolvable_reason': 'I cannot reliably recall the exact wording of the first scene heading from the official script and I cannot access external resources to check the script to provide the precise, verbatim setting.'},\n",
- " {'task_id': 'cffe0e32-c9a6-4c52-9877-78ceb4aaa9fb',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'Fred',\n",
- " 'unsolvable_reason': 'Insufficient information: the document with the employees, their likes, and assignment/gift details was not provided.'},\n",
- " {'task_id': '2d83110e-a098-4ebb-9987-066c06fa42d0',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': 'right',\n",
- " 'answer': 'Right',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': '5cfb274c-0207-4aa7-9575-6ac0bd95d9b2',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'No',\n",
- " 'unsolvable_reason': 'I cannot access the attached spreadsheet or any images. Paste the grid (use G for Earl plots and . for others) or give coordinates so I can analyze the path.'},\n",
- " {'task_id': '27d5d136-8563-469e-92bf-fd103c28b57c',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': '(¬A → B) ↔ (A ∨ ¬B)',\n",
- " 'answer': '(¬A → B) ↔ (A ∨ ¬B)',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': 'dc28cf18-6431-458b-83ef-64b3ce566c10',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': '2',\n",
- " 'answer': '2',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': 'b816bfce-3d80-4913-a07d-69b752ce6377',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'fluffy',\n",
- " 'unsolvable_reason': \"I cannot access external sources to read Emily Midkiff's June 2014 article in Fafnir and so cannot determine the quoted word.\"},\n",
- " {'task_id': '72e110e7-464c-453c-a309-90a95aed6538',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'Guatemala',\n",
- " 'unsolvable_reason': 'I cannot browse the Bielefeld University Library BASE site or view its 2020 content to inspect the article flags. Determining which country’s flag was unique requires accessing that specific webpage or an archived snapshot, which I cannot do.'},\n",
- " {'task_id': '42576abe-0deb-4869-8c63-225c2d75a95a',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': True,\n",
- " 'is_solvable': True,\n",
- " 'prediction': 'Maktay Mato Apple',\n",
- " 'answer': 'Maktay mato apple',\n",
- " 'unsolvable_reason': ''},\n",
- " {'task_id': 'b415aba4-4b68-4fc6-9b89-2c812e55a3e1',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': '',\n",
- " 'answer': 'diamond',\n",
- " 'unsolvable_reason': 'I cannot access external web resources or the specific 2012 Scientific Reports conference proceedings to identify that article and its studied compound.'},\n",
- " {'task_id': 'cca530fc-4052-43b2-b130-b30968d8aa44',\n",
- " 'model': 'gpt-5-mini',\n",
- " 'correct': False,\n",
- " 'is_solvable': False,\n",
- " 'prediction': 'image not available',\n",
- " 'answer': 'Rd5',\n",
- " 'unsolvable_reason': 'Image not provided or inaccessible; cannot determine board position and legal winning move'},
856
- " {'task_id': '935e2cff-ae78-4218-b3f5-115589b19dae',\n",
857
- " 'model': 'gpt-5-mini',\n",
858
- " 'correct': False,\n",
859
- " 'is_solvable': False,\n",
860
- " 'prediction': '',\n",
861
- " 'answer': 'research',\n",
862
- " 'unsolvable_reason': \"I cannot access or view the specific Wikipedia public logs for the Legume page from 2022; determining what 'R' stood for requires looking at those logs or contemporaneous Wikipedia discussion, which I cannot browse from here.\"},\n",
863
- " {'task_id': '4fc2f1ae-8625-45b5-ab34-ad4433bc21f8',\n",
864
- " 'model': 'gpt-5-mini',\n",
865
- " 'correct': False,\n",
866
- " 'is_solvable': False,\n",
867
- " 'prediction': '',\n",
868
- " 'answer': 'FunkMonk',\n",
869
- " 'unsolvable_reason': 'I cannot access Wikipedia or external web sources to check which dinosaur article was promoted in November 2016 and who nominated it.'},\n",
870
- " {'task_id': '5188369a-3bbe-43d8-8b94-11558f909a08',\n",
871
- " 'model': 'gpt-5-mini',\n",
872
- " 'correct': False,\n",
873
- " 'is_solvable': False,\n",
874
- " 'prediction': '',\n",
875
- " 'answer': 'Annie Levin',\n",
876
- " 'unsolvable_reason': 'I cannot access the Merriam-Webster Word of the Day archive or the web to verify the quoted writer for June 27 2022.'}]}"
877
- ]
878
- },
879
- "execution_count": 20,
880
- "metadata": {},
881
- "output_type": "execute_result"
882
- }
883
- ],
884
- "source": [
885
- "results"
886
- ]
887
- },
888
- {
889
- "cell_type": "markdown",
890
- "id": "99926f44",
891
- "metadata": {},
892
- "source": [
893
- "## Tool Usage"
894
- ]
895
- },
896
- {
897
- "cell_type": "code",
898
- "execution_count": null,
899
- "id": "ba50100c",
900
- "metadata": {},
901
- "outputs": [],
902
- "source": []
903
- }
904
- ],
905
- "metadata": {
906
- "kernelspec": {
907
- "display_name": ".venv",
908
- "language": "python",
909
- "name": "python3"
910
- },
911
- "language_info": {
912
- "codemirror_mode": {
913
- "name": "ipython",
914
- "version": 3
915
- },
916
- "file_extension": ".py",
917
- "mimetype": "text/x-python",
918
- "name": "python",
919
- "nbconvert_exporter": "python",
920
- "pygments_lexer": "ipython3",
921
- "version": "3.12.11"
922
- }
923
- },
924
- "nbformat": 4,
925
- "nbformat_minor": 5
926
- }
 
+
 
tavily_mcp_server.py ADDED
@@ -0,0 +1,38 @@
+ import os
+ from tavily import TavilyClient
+ from dotenv import load_dotenv
+ from mcp.server.fastmcp import FastMCP
+
+ load_dotenv()
+
+ tavily_client = TavilyClient(os.getenv("TAVILY_API_KEY"))
+
+ mcp = FastMCP("custom-tavily-search")
+
+ @mcp.tool()
+ def search_web(query: str, max_results: int = 5) -> str:
+     """
+     Search the web using Tavily API.
+
+     Args:
+         query: Search query string
+         max_results: Maximum number of results to return (default: 5)
+
+     Returns:
+         Search results as formatted string
+     """
+     try:
+         response = tavily_client.search(
+             query,
+             max_results=max_results,
+         )
+         results = response.get("results", [])
+         return "\n\n".join(
+             f"Title: {r['title']}\nURL: {r['url']}\nContent: {r['content']}"
+             for r in results
+         )
+     except Exception as e:
+         return f"Error searching web: {str(e)}"
+
+ if __name__ == "__main__":
+     mcp.run(transport='stdio')
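The formatting step inside `search_web` can be exercised offline by factoring out the join logic and feeding it a mock payload shaped like Tavily's `results` field. This is a sketch: `mock_response` and `format_results` are illustrative names for testing, not part of `tavily_mcp_server.py`, and the dicts are made-up data.

```python
# Mock payload mirroring the {"results": [...]} shape that search_web reads
# from tavily_client.search (hypothetical data, for illustration only).
mock_response = {
    "results": [
        {"title": "MCP intro", "url": "https://example.com/mcp",
         "content": "Model Context Protocol basics."},
        {"title": "Tavily docs", "url": "https://example.com/tavily",
         "content": "Search API overview."},
    ]
}

def format_results(response: dict) -> str:
    """Same join logic as search_web, isolated so it runs without an API key."""
    results = response.get("results", [])
    return "\n\n".join(
        f"Title: {r['title']}\nURL: {r['url']}\nContent: {r['content']}"
        for r in results
    )

print(format_results(mock_response))
```

Running the server itself still needs a `TAVILY_API_KEY` in `.env` and an MCP client that speaks the stdio transport.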