Agent System Architecture — Complete Blueprint

Multi-Agent Google Workspace System | OpenAI Agent SDK + MCP Protocol + OpenRouter LLM | Generated February 2026

1. High-Level Architecture Overview

The complete request flow from user to Google API and back. Two nested Agent loops — an Outer Orchestrator with 6 function_tools, and Inner Sub-Agents with MCP tools — both talk to the same LLM.

flowchart TB
    subgraph USER["👤 User Layer"]
        UI["Browser / Chat UI"]
        API["FastAPI Backend\n/api/chat"]
    end

    subgraph ORCH["🧠 ORCHESTRATOR AGENT — Outer Agent"]
        direction TB
        SVC["service(query)"]
        CACHE["LRU Cache\n100 entries · 5min TTL"]
        ENSURE["_ensure_connection()"]
        AGENT["Agent(\nname='Assistant'\nmodel=trinity-large-preview:free\ntools=6 function_tools\n)"]
        RUNNER["Runner.run(agent, query)\n→ sends to OpenRouter LLM"]
    end

    subgraph TOOLS["🔧 6 function_tool Wrappers — Registered on Outer Agent"]
        direction LR
        FT1["google_sheets_task()"]
        FT2["google_docs_task()"]
        FT3["google_drive_task()"]
        FT4["google_calendar_task()"]
        FT5["gmail_task()"]
        FT6["google_slides_task()"]
    end

    subgraph INNER["📦 Inner Sub-Agent — e.g. GmailAgent"]
        direction TB
        INIT["GmailAgent.__init__()\nCreate AsyncOpenAI client"]
        MCPOBJ["create_google_mcp_server()\n→ MCPServerStdio object"]
        SUBPROCESS["async with mcp_server:\n→ Spawn MCP subprocess"]
        LISTTOOL["mcp_server.list_tools()\n→ 15 Gmail tools"]
        INNERAGENT["Agent(\nname='Gmail Agent'\nmcp_servers=[mcp_server]\nmodel=trinity-large-preview:free\n)"]
        INNERRUN["Runner.run(inner_agent, query)\n→ LLM calls MCP tools"]
    end

    subgraph MCP["⚙️ Google MCP Server — Subprocess via stdio"]
        direction TB
        MAIN["main.py --tools gmail --single-user"]
        OAUTH["OAuth Credential Store\n~/.google_workspace_mcp/credentials/"]
        GAPI["Google Gmail API\nvia googleapis"]
    end

    subgraph LLM["☁️ OpenRouter LLM"]
        MODEL["arcee-ai/trinity-large-preview:free\nhttps://openrouter.ai/api/v1"]
    end

    UI -->|"HTTP POST /chat"| API
    API -->|"await service(query)"| SVC
    SVC --> CACHE
    CACHE -->|"miss"| ENSURE
    ENSURE --> AGENT
    AGENT --> RUNNER
    RUNNER <-->|"Turn 1: query + 6 tool defs"| MODEL
    MODEL -->|"function_call: gmail_task(query)"| FT5
    FT5 --> INIT
    INIT --> MCPOBJ
    MCPOBJ --> SUBPROCESS
    SUBPROCESS --> LISTTOOL
    LISTTOOL --> INNERAGENT
    INNERAGENT --> INNERRUN
    INNERRUN <-->|"Turn 1: query + 15 MCP tool defs"| MODEL
    MODEL -->|"function_call: search_gmail_messages"| SUBPROCESS
    SUBPROCESS -->|"stdio JSON-RPC"| MAIN
    MAIN --> OAUTH
    OAUTH -->|"auto-refresh token"| GAPI
    GAPI -->|"Gmail results"| MAIN
    MAIN -->|"stdio response"| SUBPROCESS
    SUBPROCESS -->|"tool result → Turn 2"| MODEL
    MODEL -->|"final text answer"| INNERRUN
    INNERRUN -->|"result.final_output"| FT5
    FT5 -->|"return string"| RUNNER
    RUNNER -->|"Turn 2 with tool result"| MODEL
    MODEL -->|"final formatted answer"| RUNNER
    RUNNER -->|"result.final_output"| SVC
    SVC -->|"cache + return"| API
    API -->|"JSON response"| UI

2. Detailed Request Sequence — "Show my unread emails"

Step-by-step message flow with exact timing from real test runs (~20s total).

Phase	What Happens	Time
Outer Turn 1	LLM receives query + 6 tool schemas → picks `gmail_task`	~5s
Inner Setup	GmailAgent spawns MCP subprocess, loads 15 tools	~2.5s
Inner Turn 1	LLM receives query + 15 Gmail tool schemas → picks `search_gmail_messages`	~4s
MCP Execution	JSON-RPC over stdio → OAuth auto-refresh → Gmail API call	~1s
Inner Turn 2	LLM formats raw Gmail results into readable text	~4s
Outer Turn 2	LLM wraps sub-agent response for the user	~5s

sequenceDiagram
    autonumber
    participant U as 👤 User
    participant API as FastAPI
    participant SVC as service()
    participant Cache as LRU Cache
    participant OA as Outer Agent
(Assistant)
    participant OR as OpenRouter LLM
(trinity-large)
    participant FT as gmail_task()
function_tool
    participant GA as GmailAgent
    participant MF as MCP Factory
google_mcp_config
    participant MS as MCPServerStdio
(subprocess)
    participant GS as google-mcp-server
main.py
    participant OAuth as OAuth Store
credentials/
    participant Gmail as Google Gmail
API

    U->>API: POST /api/chat {message: "show unread emails"}
    API->>SVC: await service(query)
    SVC->>Cache: _get_cached_response(query)
    Cache-->>SVC: None (cache miss)
    SVC->>OA: _ensure_connection() → Agent created once

    Note over OA: Agent with 6 function_tools:
sheets, docs, drive,
calendar, gmail, slides

    SVC->>OR: Runner.run(agent, query)
Turn 1: query + 6 tool JSON schemas
    Note over OR: LLM analyzes query:
"show unread emails"
→ matches gmail_task

    OR-->>SVC: ResponseFunctionToolCall
{name: "gmail_task", args: {query: "..."}}

    Note over SVC: SDK auto-invokes
the function_tool

    SVC->>FT: gmail_task(query="Search unread emails...")
    FT->>GA: GmailAgent() → __init__
    GA->>GA: Create AsyncOpenAI client
(OpenRouter endpoint)

    FT->>GA: await agent.run(query)
    GA->>MF: create_google_mcp_server("gmail", GMAIL_TOOLS)

    Note over MF: MCPServerStdio(
command: python main.py
args: --tools gmail --single-user
cwd: google-mcp-server/
tool_filter: 15 gmail tools
)

    MF-->>GA: MCPServerStdio object (not started yet)

    GA->>MS: async with mcp_server: (START subprocess)
    MS->>GS: spawn: python main.py --tools gmail --single-user
    Note over GS: FastMCP server boots
loads gmail tools only
stdio transport ready
(~2.5 seconds)
    GS-->>MS: subprocess ready (stdio pipe open)

    GA->>MS: await mcp_server.list_tools()
    MS->>GS: JSON-RPC: tools/list
    GS-->>MS: 15 tool definitions
    MS-->>GA: [search_gmail_messages, get_gmail_message_content, ...]

    Note over GA: create_static_tool_filter
filters to exactly 15 allowed tools

    GA->>GA: Create inner Agent
(name="Gmail Agent", mcp_servers=[ms])

    GA->>OR: Runner.run(inner_agent, query)
Turn 1: query + 15 MCP tool schemas
    Note over OR: LLM sees 15 Gmail tools
picks: search_gmail_messages
args: {query: "is:unread"}

    OR-->>GA: ResponseFunctionToolCall
{name: "search_gmail_messages"}

    Note over GA: SDK auto-invokes
MCP tool via call_tool()

    GA->>MS: call_tool("search_gmail_messages", {query: "is:unread"})
    MS->>GS: JSON-RPC: tools/call {search_gmail_messages}
    GS->>OAuth: get_credential("aiwithjawadsaghir@gmail.com")
    OAuth-->>GS: credentials (auto-refresh if expired)
    GS->>Gmail: Gmail API: messages.list(q="is:unread")
    Gmail-->>GS: {messages: [{id, threadId}, ...]}
    GS-->>MS: JSON-RPC response: 10 unread messages
    MS-->>GA: tool result: "Found 10 messages..."

    GA->>OR: Turn 2: tool result + conversation history
    Note over OR: LLM formats the
Gmail results into
human-readable text

    OR-->>GA: "I found 10 unread emails..."
    GA-->>FT: return result.final_output
    MS->>MS: subprocess exits (async with ends)

    FT-->>SVC: "I found 10 unread emails..."
    SVC->>OR: Turn 2: function_tool result
    Note over OR: LLM wraps the sub-agent
response for the user

    OR-->>SVC: Final formatted response
    SVC->>Cache: _set_cached_response(query, response)
    SVC-->>API: return response string
    API-->>U: JSON {response: "I found 10 unread emails..."}

3. Component & Module Relationships

How the source files, classes, and external services connect. google_mcp_config.py is the shared config, Gmail_Agent.py (and 5 siblings) are inner agents, Orchestrator_Agent.py is the outer agent.

flowchart TB
    subgraph CONFIG["google_mcp_config.py — Shared Config"]
        direction TB
        C1["MCP_SERVER_DIR = ../google-mcp-server/"]
        C2["MCP_PYTHON = .venv/Scripts/python.exe"]
        C3["OPENROUTER_API_KEY / BASE_URL"]
        C4["MODEL_NAME = arcee-ai/trinity-large-preview:free"]
        C5["GMAIL_TOOLS = 15 tool names"]
        C6["SHEETS_TOOLS = 14 tool names"]
        C7["DOCS_TOOLS = 19 tool names"]
        C8["DRIVE_TOOLS = 17 tool names"]
        C9["CALENDAR_TOOLS = 6 tool names"]
        C10["SLIDES_TOOLS = 9 tool names"]
        CF["create_google_mcp_server(service, tool_names)\n→ MCPServerStdio"]
    end

    subgraph AGENTS["6 Specialized Agent Classes"]
        direction TB
        A1["GoogleSheetsAgent — 14 tools via MCP"]
        A2["GoogleDocsAgent — 19 tools via MCP"]
        A3["GoogleDriveAgent — 17 tools via MCP"]
        A4["GoogleCalendarAgent — 6 tools via MCP"]
        A5["GmailAgent — 15 tools via MCP"]
        A6["GoogleSlidesAgent — 9 tools via MCP"]
    end

    subgraph ORCH_FILE["Orchestrator_Agent.py"]
        direction TB
        O1["@function_tool google_sheets_task"]
        O2["@function_tool google_docs_task"]
        O3["@function_tool google_drive_task"]
        O4["@function_tool google_calendar_task"]
        O5["@function_tool gmail_task"]
        O6["@function_tool google_slides_task"]
        OC["_create_agent() → Agent with 6 tools"]
        OS["service(query) → response string"]
        OSS["service_streaming(query) → async generator"]
    end

    subgraph SDK["OpenAI Agent SDK"]
        direction TB
        S1["Agent — wraps model + tools + instructions"]
        S2["Runner.run() — multi-turn loop"]
        S3["MCPServerStdio — subprocess manager"]
        S4["create_static_tool_filter — whitelist tools"]
        S5["function_tool — decorator for Python functions"]
        S6["OpenAIChatCompletionsModel — LLM adapter"]
    end

    subgraph EXTERNAL["External Services"]
        direction LR
        E1["OpenRouter API\n(LLM inference)"]
        E2["Google APIs\n(Gmail, Drive, Docs, etc.)"]
        E3["OAuth2 Credentials\n(local file store)"]
    end

    CF --> A1 & A2 & A3 & A4 & A5 & A6
    O1 --> A1
    O2 --> A2
    O3 --> A3
    O4 --> A4
    O5 --> A5
    O6 --> A6
    OC --> O1 & O2 & O3 & O4 & O5 & O6
    OS --> OC
    A5 --> S3
    S3 --> S4
    OC --> S1
    OS --> S2
    S2 --> S6
    S6 --> E1
    S3 --> E2
    S3 --> E3

4. Runner.run() Turn Loop — How the Agent Loops Internally

Runner.run() is a multi-turn loop. It sends the query + tool schemas to the LLM, checks if the response is text or a tool call, executes tools, and loops back. This loop runs twice — once for the Outer Orchestrator (function_tools) and once inside the Inner Sub-Agent (MCP tools).

flowchart TD
    START(["User query arrives"]) --> CHECK_CACHE{"Cache hit?"}
    CHECK_CACHE -->|Yes| RETURN_CACHED["Return cached response"]
    CHECK_CACHE -->|No| OUTER_T1

    subgraph OUTER["OUTER AGENT LOOP — Orchestrator"]
        OUTER_T1["Turn 1 → LLM\nSend: query + 6 tool schemas"]
        OUTER_T1 --> OUTER_RESP1{"LLM response type?"}
        OUTER_RESP1 -->|"text message"| OUTER_DONE["Return text as final_output"]
        OUTER_RESP1 -->|"function_call"| OUTER_INVOKE["SDK invokes function_tool\ne.g. gmail_task(query)"]
        OUTER_INVOKE --> INNER_START

        subgraph INNER["INNER AGENT LOOP — e.g. GmailAgent"]
            INNER_START["1. create_google_mcp_server()"]
            INNER_START --> INNER_SPAWN["2. async with mcp_server:\n   Spawn subprocess\n   ~2.5s startup"]
            INNER_SPAWN --> INNER_LIST["3. list_tools()\n   Get 15 Gmail tools"]
            INNER_LIST --> INNER_AGENT["4. Create inner Agent\n   with mcp_servers=[server]"]
            INNER_AGENT --> INNER_T1["5. Runner.run(agent, query)\n   Turn 1 → LLM\n   Send: query + 15 MCP tool schemas"]
            INNER_T1 --> INNER_RESP{"LLM response?"}
            INNER_RESP -->|"text"| INNER_DONE["Return final_output"]
            INNER_RESP -->|"MCP tool call"| MCP_CALL

            subgraph MCP_EXEC["MCP Tool Execution"]
                MCP_CALL["SDK calls mcp_server.call_tool()\ne.g. search_gmail_messages"]
                MCP_CALL --> STDIO["JSON-RPC over stdio pipe\n→ google-mcp-server process"]
                STDIO --> CRED["Load OAuth credentials\nauto-refresh if expired"]
                CRED --> GAPI["Call Google API\ngmail.users.messages.list"]
                GAPI --> RESULT["Return API result\nmessage IDs, subjects, etc."]
            end

            RESULT --> INNER_T2["Turn 2 → LLM\nSend: tool result + history"]
            INNER_T2 --> INNER_RESP2{"LLM response?"}
            INNER_RESP2 -->|"more tool calls"| MCP_CALL
            INNER_RESP2 -->|"text"| INNER_DONE
        end

        INNER_DONE --> OUTER_T2["Turn 2 → LLM\nSend: function_tool result"]
        OUTER_T2 --> OUTER_RESP2{"LLM response?"}
        OUTER_RESP2 -->|"more function_calls"| OUTER_INVOKE
        OUTER_RESP2 -->|"text"| OUTER_DONE
    end

    OUTER_DONE --> CACHE_SET["Cache response\n5min TTL"]
    CACHE_SET --> RESPOND(["Return to user"])
    RETURN_CACHED --> RESPOND

Key Files

File	Role	Key Exports
Orchestrator_Agent.py	Outer Agent — routes queries to specialist sub-agents	`service()`, `service_streaming()`, 6× `@function_tool`
Gmail_Agent.py	Inner Agent — Gmail specialist with 15 MCP tools	`GmailAgent.run(query)`
google_mcp_config.py	Shared config — MCP factory, tool lists, LLM settings	`create_google_mcp_server()`, tool name constants
google-mcp-server/main.py	MCP Server — runs as subprocess, provides Google API tools	FastMCP `@server.tool()` functions (80 total across 6 services)

Tool Counts Per Service

Service	Agent Class	Tools	Capabilities
Gmail	GmailAgent	15	Search, read, send, draft, labels, filters, threads, attachments
Docs	GoogleDocsAgent	19	Create, edit, find-replace, tables, images, PDF export, comments
Drive	GoogleDriveAgent	17	Search, upload, share, permissions, copy, download, ownership
Sheets	GoogleSheetsAgent	14	Read, write, format, conditional formatting, comments
Slides	GoogleSlidesAgent	9	Create, update, thumbnails, comments
Calendar	GoogleCalendarAgent	6	List calendars, CRUD events, free/busy
Total		80