Agent System Architecture โ€” Complete Blueprint

Multi-Agent Google Workspace System  |  OpenAI Agent SDK + MCP Protocol + OpenRouter LLM  |  Generated February 2026

1. High-Level Architecture Overview

The complete request flow from user to Google API and back. Two nested Agent loops โ€” an Outer Orchestrator with 6 function_tools, and Inner Sub-Agents with MCP tools โ€” both talk to the same LLM.

flowchart TB
    subgraph USER["๐Ÿ‘ค User Layer"]
        UI["Browser / Chat UI"]
        API["FastAPI Backend\n/api/chat"]
    end

    subgraph ORCH["๐Ÿง  ORCHESTRATOR AGENT โ€” Outer Agent"]
        direction TB
        SVC["service(query)"]
        CACHE["LRU Cache\n100 entries ยท 5min TTL"]
        ENSURE["_ensure_connection()"]
        AGENT["Agent(\nname='Assistant'\nmodel=trinity-large-preview:free\ntools=6 function_tools\n)"]
        RUNNER["Runner.run(agent, query)\nโ†’ sends to OpenRouter LLM"]
    end

    subgraph TOOLS["๐Ÿ”ง 6 function_tool Wrappers โ€” Registered on Outer Agent"]
        direction LR
        FT1["google_sheets_task()"]
        FT2["google_docs_task()"]
        FT3["google_drive_task()"]
        FT4["google_calendar_task()"]
        FT5["gmail_task()"]
        FT6["google_slides_task()"]
    end

    subgraph INNER["๐Ÿ“ฆ Inner Sub-Agent โ€” e.g. GmailAgent"]
        direction TB
        INIT["GmailAgent.__init__()\nCreate AsyncOpenAI client"]
        MCPOBJ["create_google_mcp_server()\nโ†’ MCPServerStdio object"]
        SUBPROCESS["async with mcp_server:\nโ†’ Spawn MCP subprocess"]
        LISTTOOL["mcp_server.list_tools()\nโ†’ 15 Gmail tools"]
        INNERAGENT["Agent(\nname='Gmail Agent'\nmcp_servers=[mcp_server]\nmodel=trinity-large-preview:free\n)"]
        INNERRUN["Runner.run(inner_agent, query)\nโ†’ LLM calls MCP tools"]
    end

    subgraph MCP["โš™๏ธ Google MCP Server โ€” Subprocess via stdio"]
        direction TB
        MAIN["main.py --tools gmail --single-user"]
        OAUTH["OAuth Credential Store\n~/.google_workspace_mcp/credentials/"]
        GAPI["Google Gmail API\nvia googleapis"]
    end

    subgraph LLM["โ˜๏ธ OpenRouter LLM"]
        MODEL["arcee-ai/trinity-large-preview:free\nhttps://openrouter.ai/api/v1"]
    end

    UI -->|"HTTP POST /chat"| API
    API -->|"await service(query)"| SVC
    SVC --> CACHE
    CACHE -->|"miss"| ENSURE
    ENSURE --> AGENT
    AGENT --> RUNNER
    RUNNER <-->|"Turn 1: query + 6 tool defs"| MODEL
    MODEL -->|"function_call: gmail_task(query)"| FT5
    FT5 --> INIT
    INIT --> MCPOBJ
    MCPOBJ --> SUBPROCESS
    SUBPROCESS --> LISTTOOL
    LISTTOOL --> INNERAGENT
    INNERAGENT --> INNERRUN
    INNERRUN <-->|"Turn 1: query + 15 MCP tool defs"| MODEL
    MODEL -->|"function_call: search_gmail_messages"| SUBPROCESS
    SUBPROCESS -->|"stdio JSON-RPC"| MAIN
    MAIN --> OAUTH
    OAUTH -->|"auto-refresh token"| GAPI
    GAPI -->|"Gmail results"| MAIN
    MAIN -->|"stdio response"| SUBPROCESS
    SUBPROCESS -->|"tool result โ†’ Turn 2"| MODEL
    MODEL -->|"final text answer"| INNERRUN
    INNERRUN -->|"result.final_output"| FT5
    FT5 -->|"return string"| RUNNER
    RUNNER -->|"Turn 2 with tool result"| MODEL
    MODEL -->|"final formatted answer"| RUNNER
    RUNNER -->|"result.final_output"| SVC
    SVC -->|"cache + return"| API
    API -->|"JSON response"| UI
    

2. Detailed Request Sequence โ€” "Show my unread emails"

Step-by-step message flow with exact timing from real test runs (~20s total).

PhaseWhat HappensTime
Outer Turn 1LLM receives query + 6 tool schemas โ†’ picks gmail_task~5s
Inner SetupGmailAgent spawns MCP subprocess, loads 15 tools~2.5s
Inner Turn 1LLM receives query + 15 Gmail tool schemas โ†’ picks search_gmail_messages~4s
MCP ExecutionJSON-RPC over stdio โ†’ OAuth auto-refresh โ†’ Gmail API call~1s
Inner Turn 2LLM formats raw Gmail results into readable text~4s
Outer Turn 2LLM wraps sub-agent response for the user~5s
sequenceDiagram
    autonumber
    participant U as ๐Ÿ‘ค User
    participant API as FastAPI
    participant SVC as service()
    participant Cache as LRU Cache
    participant OA as Outer Agent
(Assistant) participant OR as OpenRouter LLM
(trinity-large) participant FT as gmail_task()
function_tool participant GA as GmailAgent participant MF as MCP Factory
google_mcp_config participant MS as MCPServerStdio
(subprocess) participant GS as google-mcp-server
main.py participant OAuth as OAuth Store
credentials/ participant Gmail as Google Gmail
API U->>API: POST /api/chat {message: "show unread emails"} API->>SVC: await service(query) SVC->>Cache: _get_cached_response(query) Cache-->>SVC: None (cache miss) SVC->>OA: _ensure_connection() โ†’ Agent created once Note over OA: Agent with 6 function_tools:
sheets, docs, drive,
calendar, gmail, slides SVC->>OR: Runner.run(agent, query)
Turn 1: query + 6 tool JSON schemas Note over OR: LLM analyzes query:
"show unread emails"
โ†’ matches gmail_task OR-->>SVC: ResponseFunctionToolCall
{name: "gmail_task", args: {query: "..."}} Note over SVC: SDK auto-invokes
the function_tool SVC->>FT: gmail_task(query="Search unread emails...") FT->>GA: GmailAgent() โ†’ __init__ GA->>GA: Create AsyncOpenAI client
(OpenRouter endpoint) FT->>GA: await agent.run(query) GA->>MF: create_google_mcp_server("gmail", GMAIL_TOOLS) Note over MF: MCPServerStdio(
command: python main.py
args: --tools gmail --single-user
cwd: google-mcp-server/
tool_filter: 15 gmail tools
) MF-->>GA: MCPServerStdio object (not started yet) GA->>MS: async with mcp_server: (START subprocess) MS->>GS: spawn: python main.py --tools gmail --single-user Note over GS: FastMCP server boots
loads gmail tools only
stdio transport ready
(~2.5 seconds) GS-->>MS: subprocess ready (stdio pipe open) GA->>MS: await mcp_server.list_tools() MS->>GS: JSON-RPC: tools/list GS-->>MS: 15 tool definitions MS-->>GA: [search_gmail_messages, get_gmail_message_content, ...] Note over GA: create_static_tool_filter
filters to exactly 15 allowed tools GA->>GA: Create inner Agent
(name="Gmail Agent", mcp_servers=[ms]) GA->>OR: Runner.run(inner_agent, query)
Turn 1: query + 15 MCP tool schemas Note over OR: LLM sees 15 Gmail tools
picks: search_gmail_messages
args: {query: "is:unread"} OR-->>GA: ResponseFunctionToolCall
{name: "search_gmail_messages"} Note over GA: SDK auto-invokes
MCP tool via call_tool() GA->>MS: call_tool("search_gmail_messages", {query: "is:unread"}) MS->>GS: JSON-RPC: tools/call {search_gmail_messages} GS->>OAuth: get_credential("aiwithjawadsaghir@gmail.com") OAuth-->>GS: credentials (auto-refresh if expired) GS->>Gmail: Gmail API: messages.list(q="is:unread") Gmail-->>GS: {messages: [{id, threadId}, ...]} GS-->>MS: JSON-RPC response: 10 unread messages MS-->>GA: tool result: "Found 10 messages..." GA->>OR: Turn 2: tool result + conversation history Note over OR: LLM formats the
Gmail results into
human-readable text OR-->>GA: "I found 10 unread emails..." GA-->>FT: return result.final_output MS->>MS: subprocess exits (async with ends) FT-->>SVC: "I found 10 unread emails..." SVC->>OR: Turn 2: function_tool result Note over OR: LLM wraps the sub-agent
response for the user OR-->>SVC: Final formatted response SVC->>Cache: _set_cached_response(query, response) SVC-->>API: return response string API-->>U: JSON {response: "I found 10 unread emails..."}

3. Component & Module Relationships

How the source files, classes, and external services connect. google_mcp_config.py is the shared config, Gmail_Agent.py (and 5 siblings) are inner agents, Orchestrator_Agent.py is the outer agent.

flowchart TB
    subgraph CONFIG["google_mcp_config.py โ€” Shared Config"]
        direction TB
        C1["MCP_SERVER_DIR = ../google-mcp-server/"]
        C2["MCP_PYTHON = .venv/Scripts/python.exe"]
        C3["OPENROUTER_API_KEY / BASE_URL"]
        C4["MODEL_NAME = arcee-ai/trinity-large-preview:free"]
        C5["GMAIL_TOOLS = 15 tool names"]
        C6["SHEETS_TOOLS = 14 tool names"]
        C7["DOCS_TOOLS = 19 tool names"]
        C8["DRIVE_TOOLS = 17 tool names"]
        C9["CALENDAR_TOOLS = 6 tool names"]
        C10["SLIDES_TOOLS = 9 tool names"]
        CF["create_google_mcp_server(service, tool_names)\nโ†’ MCPServerStdio"]
    end

    subgraph AGENTS["6 Specialized Agent Classes"]
        direction TB
        A1["GoogleSheetsAgent โ€” 14 tools via MCP"]
        A2["GoogleDocsAgent โ€” 19 tools via MCP"]
        A3["GoogleDriveAgent โ€” 17 tools via MCP"]
        A4["GoogleCalendarAgent โ€” 6 tools via MCP"]
        A5["GmailAgent โ€” 15 tools via MCP"]
        A6["GoogleSlidesAgent โ€” 9 tools via MCP"]
    end

    subgraph ORCH_FILE["Orchestrator_Agent.py"]
        direction TB
        O1["@function_tool google_sheets_task"]
        O2["@function_tool google_docs_task"]
        O3["@function_tool google_drive_task"]
        O4["@function_tool google_calendar_task"]
        O5["@function_tool gmail_task"]
        O6["@function_tool google_slides_task"]
        OC["_create_agent() โ†’ Agent with 6 tools"]
        OS["service(query) โ†’ response string"]
        OSS["service_streaming(query) โ†’ async generator"]
    end

    subgraph SDK["OpenAI Agent SDK"]
        direction TB
        S1["Agent โ€” wraps model + tools + instructions"]
        S2["Runner.run() โ€” multi-turn loop"]
        S3["MCPServerStdio โ€” subprocess manager"]
        S4["create_static_tool_filter โ€” whitelist tools"]
        S5["function_tool โ€” decorator for Python functions"]
        S6["OpenAIChatCompletionsModel โ€” LLM adapter"]
    end

    subgraph EXTERNAL["External Services"]
        direction LR
        E1["OpenRouter API\n(LLM inference)"]
        E2["Google APIs\n(Gmail, Drive, Docs, etc.)"]
        E3["OAuth2 Credentials\n(local file store)"]
    end

    CF --> A1 & A2 & A3 & A4 & A5 & A6
    O1 --> A1
    O2 --> A2
    O3 --> A3
    O4 --> A4
    O5 --> A5
    O6 --> A6
    OC --> O1 & O2 & O3 & O4 & O5 & O6
    OS --> OC
    A5 --> S3
    S3 --> S4
    OC --> S1
    OS --> S2
    S2 --> S6
    S6 --> E1
    S3 --> E2
    S3 --> E3
    

4. Runner.run() Turn Loop โ€” How the Agent Loops Internally

Runner.run() is a multi-turn loop. It sends the query + tool schemas to the LLM, checks if the response is text or a tool call, executes tools, and loops back. This loop runs twice โ€” once for the Outer Orchestrator (function_tools) and once inside the Inner Sub-Agent (MCP tools).

flowchart TD
    START(["User query arrives"]) --> CHECK_CACHE{"Cache hit?"}
    CHECK_CACHE -->|Yes| RETURN_CACHED["Return cached response"]
    CHECK_CACHE -->|No| OUTER_T1

    subgraph OUTER["OUTER AGENT LOOP โ€” Orchestrator"]
        OUTER_T1["Turn 1 โ†’ LLM\nSend: query + 6 tool schemas"]
        OUTER_T1 --> OUTER_RESP1{"LLM response type?"}
        OUTER_RESP1 -->|"text message"| OUTER_DONE["Return text as final_output"]
        OUTER_RESP1 -->|"function_call"| OUTER_INVOKE["SDK invokes function_tool\ne.g. gmail_task(query)"]
        OUTER_INVOKE --> INNER_START

        subgraph INNER["INNER AGENT LOOP โ€” e.g. GmailAgent"]
            INNER_START["1. create_google_mcp_server()"]
            INNER_START --> INNER_SPAWN["2. async with mcp_server:\n   Spawn subprocess\n   ~2.5s startup"]
            INNER_SPAWN --> INNER_LIST["3. list_tools()\n   Get 15 Gmail tools"]
            INNER_LIST --> INNER_AGENT["4. Create inner Agent\n   with mcp_servers=[server]"]
            INNER_AGENT --> INNER_T1["5. Runner.run(agent, query)\n   Turn 1 โ†’ LLM\n   Send: query + 15 MCP tool schemas"]
            INNER_T1 --> INNER_RESP{"LLM response?"}
            INNER_RESP -->|"text"| INNER_DONE["Return final_output"]
            INNER_RESP -->|"MCP tool call"| MCP_CALL

            subgraph MCP_EXEC["MCP Tool Execution"]
                MCP_CALL["SDK calls mcp_server.call_tool()\ne.g. search_gmail_messages"]
                MCP_CALL --> STDIO["JSON-RPC over stdio pipe\nโ†’ google-mcp-server process"]
                STDIO --> CRED["Load OAuth credentials\nauto-refresh if expired"]
                CRED --> GAPI["Call Google API\ngmail.users.messages.list"]
                GAPI --> RESULT["Return API result\nmessage IDs, subjects, etc."]
            end

            RESULT --> INNER_T2["Turn 2 โ†’ LLM\nSend: tool result + history"]
            INNER_T2 --> INNER_RESP2{"LLM response?"}
            INNER_RESP2 -->|"more tool calls"| MCP_CALL
            INNER_RESP2 -->|"text"| INNER_DONE
        end

        INNER_DONE --> OUTER_T2["Turn 2 โ†’ LLM\nSend: function_tool result"]
        OUTER_T2 --> OUTER_RESP2{"LLM response?"}
        OUTER_RESP2 -->|"more function_calls"| OUTER_INVOKE
        OUTER_RESP2 -->|"text"| OUTER_DONE
    end

    OUTER_DONE --> CACHE_SET["Cache response\n5min TTL"]
    CACHE_SET --> RESPOND(["Return to user"])
    RETURN_CACHED --> RESPOND
    

Key Files

FileRoleKey Exports
Orchestrator_Agent.py Outer Agent โ€” routes queries to specialist sub-agents service(), service_streaming(), 6ร— @function_tool
Gmail_Agent.py Inner Agent โ€” Gmail specialist with 15 MCP tools GmailAgent.run(query)
google_mcp_config.py Shared config โ€” MCP factory, tool lists, LLM settings create_google_mcp_server(), tool name constants
google-mcp-server/main.py MCP Server โ€” runs as subprocess, provides Google API tools FastMCP @server.tool() functions (80 total across 6 services)

Tool Counts Per Service

ServiceAgent ClassToolsCapabilities
GmailGmailAgent15Search, read, send, draft, labels, filters, threads, attachments
DocsGoogleDocsAgent19Create, edit, find-replace, tables, images, PDF export, comments
DriveGoogleDriveAgent17Search, upload, share, permissions, copy, download, ownership
SheetsGoogleSheetsAgent14Read, write, format, conditional formatting, comments
SlidesGoogleSlidesAgent9Create, update, thumbnails, comments
CalendarGoogleCalendarAgent6List calendars, CRUD events, free/busy
Total80