| # Architecture Overview |
|
|
| This document provides a comprehensive overview of the DeerFlow backend architecture. |
|
|
| ## System Architecture |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                             Client (Browser)                             │ |
| └──────────────────────────────────┬───────────────────────────────────────┘ |
|                                    │ |
|                                    ▼ |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                            Nginx (Port 2026)                             │ |
| │                    Unified Reverse Proxy Entry Point                     │ |
| │  ┌────────────────────────────────────────────────────────────────────┐  │ |
| │  │  /api/langgraph/*  →  LangGraph Server (2024)                      │  │ |
| │  │  /api/*            →  Gateway API (8001)                           │  │ |
| │  │  /*                →  Frontend (3000)                              │  │ |
| │  └────────────────────────────────────────────────────────────────────┘  │ |
| └──────────────────────────────────┬───────────────────────────────────────┘ |
|                                    │ |
|            ┌───────────────────────┼───────────────────────┐ |
|            │                       │                       │ |
|            ▼                       ▼                       ▼ |
| ┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐ |
| │  LangGraph Server   │ │     Gateway API     │ │      Frontend       │ |
| │     (Port 2024)     │ │     (Port 8001)     │ │     (Port 3000)     │ |
| │                     │ │                     │ │                     │ |
| │ - Agent Runtime     │ │ - Models API        │ │ - Next.js App       │ |
| │ - Thread Mgmt       │ │ - MCP Config        │ │ - React UI          │ |
| │ - SSE Streaming     │ │ - Skills Mgmt       │ │ - Chat Interface    │ |
| │ - Checkpointing     │ │ - File Uploads      │ │                     │ |
| │                     │ │ - Artifacts         │ │                     │ |
| └─────────────────────┘ └─────────────────────┘ └─────────────────────┘ |
|            │                       │ |
|            │     ┌─────────────────┘ |
|            │     │ |
|            ▼     ▼ |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                           Shared Configuration                           │ |
| │  ┌─────────────────────────┐  ┌────────────────────────────────────────┐ │ |
| │  │       config.yaml       │  │         extensions_config.json         │ │ |
| │  │  - Models               │  │  - MCP Servers                         │ │ |
| │  │  - Tools                │  │  - Skills State                        │ │ |
| │  │  - Sandbox              │  │                                        │ │ |
| │  │  - Summarization        │  │                                        │ │ |
| │  └─────────────────────────┘  └────────────────────────────────────────┘ │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
| ``` |
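The three proxy rules overlap (`/api/langgraph/*` is also matched by `/api/*` and `/*`), so the routing only works if the most specific prefix wins. The sketch below illustrates that precedence in Python; it is not the Nginx configuration itself, and `ROUTES` and `resolve_upstream` are hypothetical names introduced for illustration.

```python
# Illustrative only: mirrors the three routing rules shown in the diagram.
# ROUTES and resolve_upstream are hypothetical names, not DeerFlow code.
ROUTES = [
    ("/api/langgraph/", "LangGraph Server", 2024),
    ("/api/", "Gateway API", 8001),
    ("/", "Frontend", 3000),
]


def resolve_upstream(path: str) -> tuple[str, int]:
    """Return (service, port) for the longest route prefix that matches path."""
    matches = [route for route in ROUTES if path.startswith(route[0])]
    if not matches:
        raise ValueError(f"request path must start with '/': {path!r}")
    _prefix, service, port = max(matches, key=lambda route: len(route[0]))
    return service, port
```

Under these assumptions, `resolve_upstream("/api/models")` selects the Gateway API even though the catch-all `/` rule also matches.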
|
|
| ## Component Details |
|
|
| ### LangGraph Server |
|
|
| The LangGraph server is the core agent runtime, built on LangGraph for robust multi-agent workflow orchestration. |
|
|
| **Entry Point**: `src/agents/lead_agent/agent.py:make_lead_agent` |
|
|
| **Key Responsibilities**: |
| - Agent creation and configuration |
| - Thread state management |
| - Middleware chain execution |
| - Tool execution orchestration |
| - SSE streaming for real-time responses |
|
|
| **Configuration**: `langgraph.json` |
|
|
| ```json |
| { |
|   "agent": { |
|     "type": "agent", |
|     "path": "src.agents:make_lead_agent" |
|   } |
| } |
| ``` |
|
|
| ### Gateway API |
|
|
| The Gateway API is a FastAPI application that provides REST endpoints for non-agent operations. |
|
|
| **Entry Point**: `src/gateway/app.py` |
|
|
| **Routers**: |
| - `models.py` - `/api/models` - Model listing and details |
| - `mcp.py` - `/api/mcp` - MCP server configuration |
| - `skills.py` - `/api/skills` - Skills management |
| - `uploads.py` - `/api/threads/{id}/uploads` - File upload |
| - `artifacts.py` - `/api/threads/{id}/artifacts` - Artifact serving |
|
|
| ### Agent Architecture |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                         make_lead_agent(config)                          │ |
| └────────────────────────────────────┬─────────────────────────────────────┘ |
|                                      │ |
|                                      ▼ |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                             Middleware Chain                             │ |
| │  ┌────────────────────────────────────────────────────────────────────┐  │ |
| │  │ 1. ThreadDataMiddleware    - Initialize workspace/uploads/outputs  │  │ |
| │  │ 2. UploadsMiddleware       - Process uploaded files                │  │ |
| │  │ 3. SandboxMiddleware       - Acquire sandbox environment           │  │ |
| │  │ 4. SummarizationMiddleware - Context reduction (if enabled)        │  │ |
| │  │ 5. TitleMiddleware         - Auto-generate titles                  │  │ |
| │  │ 6. TodoListMiddleware      - Task tracking (if plan_mode)          │  │ |
| │  │ 7. ViewImageMiddleware     - Vision model support                  │  │ |
| │  │ 8. ClarificationMiddleware - Handle clarifications                 │  │ |
| │  └────────────────────────────────────────────────────────────────────┘  │ |
| └────────────────────────────────────┬─────────────────────────────────────┘ |
|                                      │ |
|                                      ▼ |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                                Agent Core                                │ |
| │  ┌──────────────────┐  ┌──────────────────┐  ┌────────────────────────┐  │ |
| │  │      Model       │  │      Tools       │  │     System Prompt      │  │ |
| │  │  (from factory)  │  │  (configured +   │  │     (with skills)      │  │ |
| │  │                  │  │  MCP + builtin)  │  │                        │  │ |
| │  └──────────────────┘  └──────────────────┘  └────────────────────────┘  │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
| ``` |
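Two entries in the chain are conditional ("if enabled", "if plan_mode"), so the effective chain depends on configuration while the relative order stays fixed. A minimal sketch of that assembly follows; `build_middleware_names` and its two flags are hypothetical, and it returns class names as strings rather than real middleware instances.

```python
# Hypothetical helper: shows only the ordering and the two conditions
# stated in the diagram, not how DeerFlow actually builds the chain.
def build_middleware_names(summarization_enabled: bool, plan_mode: bool) -> list[str]:
    """Return middleware names in execution order for the given flags."""
    chain = ["ThreadDataMiddleware", "UploadsMiddleware", "SandboxMiddleware"]
    if summarization_enabled:
        chain.append("SummarizationMiddleware")
    chain.append("TitleMiddleware")
    if plan_mode:
        chain.append("TodoListMiddleware")
    chain += ["ViewImageMiddleware", "ClarificationMiddleware"]
    return chain
```

With both flags off the chain has six entries; with both on it matches the eight numbered steps above.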
|
|
| ### Thread State |
|
|
| `ThreadState` extends LangGraph's `AgentState` with additional fields: |
|
|
| ```python |
| class ThreadState(AgentState): |
|     # Core state from AgentState |
|     messages: list[BaseMessage] |
|
|     # DeerFlow extensions |
|     sandbox: dict          # Sandbox environment info |
|     artifacts: list[str]   # Generated file paths |
|     thread_data: dict      # {workspace, uploads, outputs} paths |
|     title: str | None      # Auto-generated conversation title |
|     todos: list[dict]      # Task tracking (plan mode) |
|     viewed_images: dict    # Vision model image data |
| ``` |
|
|
| ### Sandbox System |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                           Sandbox Architecture                           │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|
|                ┌─────────────────────────┐ |
|                │     SandboxProvider     │  (Abstract) |
|                │  - acquire()            │ |
|                │  - get()                │ |
|                │  - release()            │ |
|                └────────────┬────────────┘ |
|                             │ |
|              ┌──────────────┼──────────────┐ |
|              │                             │ |
|              ▼                             ▼ |
| ┌─────────────────────────┐   ┌─────────────────────────┐ |
| │  LocalSandboxProvider   │   │   AioSandboxProvider    │ |
| │ (src/sandbox/local.py)  │   │    (src/community/)     │ |
| │                         │   │                         │ |
| │  - Singleton instance   │   │  - Docker-based         │ |
| │  - Direct execution     │   │  - Isolated containers  │ |
| │  - Development use      │   │  - Production use       │ |
| └─────────────────────────┘   └─────────────────────────┘ |
|
|                ┌─────────────────────────┐ |
|                │         Sandbox         │  (Abstract) |
|                │  - execute_command()    │ |
|                │  - read_file()          │ |
|                │  - write_file()         │ |
|                │  - list_dir()           │ |
|                └─────────────────────────┘ |
| ``` |
|
|
| **Virtual Path Mapping**: |
|
|
| | Virtual Path | Physical Path | |
| |-------------|---------------| |
| | `/mnt/user-data/workspace` | `backend/.deer-flow/threads/{thread_id}/user-data/workspace` | |
| | `/mnt/user-data/uploads` | `backend/.deer-flow/threads/{thread_id}/user-data/uploads` | |
| | `/mnt/user-data/outputs` | `backend/.deer-flow/threads/{thread_id}/user-data/outputs` | |
| | `/mnt/skills` | `deer-flow/skills/` | |
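
The first three rows of the table follow one pattern, which makes the translation easy to express in code. The sketch below maps a virtual path under `/mnt/user-data` to its per-thread physical path and rejects `..` segments; `to_physical_path` and the constants are hypothetical names, and the `/mnt/skills` row is deliberately left out because it is not per-thread.

```python
from pathlib import PurePosixPath

# Hypothetical constants derived from the mapping table above.
VIRTUAL_ROOT = PurePosixPath("/mnt/user-data")
PHYSICAL_ROOT = PurePosixPath("backend/.deer-flow/threads")
ALLOWED_AREAS = {"workspace", "uploads", "outputs"}


def to_physical_path(virtual_path: str, thread_id: str) -> PurePosixPath:
    """Translate a virtual sandbox path to the thread's physical path."""
    path = PurePosixPath(virtual_path)
    # PurePosixPath does not collapse "..", so it is still visible in parts.
    if ".." in path.parts:
        raise ValueError("path traversal is not allowed")
    try:
        relative = path.relative_to(VIRTUAL_ROOT)
    except ValueError:
        raise ValueError(f"not under {VIRTUAL_ROOT}: {virtual_path}") from None
    if not relative.parts or relative.parts[0] not in ALLOWED_AREAS:
        raise ValueError(f"unknown user-data area: {virtual_path}")
    return PHYSICAL_ROOT / thread_id / "user-data" / relative
```

A real implementation would also need to validate `thread_id` itself; this sketch assumes it is already trusted.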
|
|
| ### Tool System |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                               Tool Sources                               │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|
| ┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐ |
| │   Built-in Tools    │ │  Configured Tools   │ │      MCP Tools      │ |
| │    (src/tools/)     │ │    (config.yaml)    │ │  (extensions.json)  │ |
| ├─────────────────────┤ ├─────────────────────┤ ├─────────────────────┤ |
| │ - present_file      │ │ - web_search        │ │ - github            │ |
| │ - ask_clarification │ │ - web_fetch         │ │ - filesystem        │ |
| │ - view_image        │ │ - bash              │ │ - postgres          │ |
| │                     │ │ - read_file         │ │ - brave-search      │ |
| │                     │ │ - write_file        │ │ - puppeteer         │ |
| │                     │ │ - str_replace       │ │ - ...               │ |
| │                     │ │ - ls                │ │                     │ |
| └─────────────────────┘ └─────────────────────┘ └─────────────────────┘ |
|            │                       │                       │ |
|            └───────────────────────┼───────────────────────┘ |
|                                    │ |
|                                    ▼ |
|                       ┌─────────────────────────┐ |
|                       │  get_available_tools()  │ |
|                       │  (src/tools/__init__)   │ |
|                       └─────────────────────────┘ |
| ``` |
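The diagram shows three sources feeding one combined list. As a rough sketch of that merge, the function below concatenates tool names in the order built-in, configured, MCP; the name `collect_tools` and the policy of keeping the first occurrence of a duplicate name are assumptions, since the original does not say how `get_available_tools()` handles collisions.

```python
# Hypothetical merge helper; works on tool names only, for illustration.
def collect_tools(builtin: list[str], configured: list[str], mcp: list[str]) -> list[str]:
    """Merge tool names from the three sources, keeping first occurrences."""
    seen: set[str] = set()
    merged: list[str] = []
    for group in (builtin, configured, mcp):
        for name in group:
            if name not in seen:  # assumption: earlier sources take precedence
                seen.add(name)
                merged.append(name)
    return merged
```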
|
|
| ### Model Factory |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                              Model Factory                               │ |
| │                         (src/models/factory.py)                          │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|
| config.yaml: |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │  models:                                                                 │ |
| │    - name: gpt-4                                                         │ |
| │      display_name: GPT-4                                                 │ |
| │      use: langchain_openai:ChatOpenAI                                    │ |
| │      model: gpt-4                                                        │ |
| │      api_key: $OPENAI_API_KEY                                            │ |
| │      max_tokens: 4096                                                    │ |
| │      supports_thinking: false                                            │ |
| │      supports_vision: true                                               │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|                                      │ |
|                                      ▼ |
|                         ┌─────────────────────────┐ |
|                         │   create_chat_model()   │ |
|                         │  - name: str            │ |
|                         │  - thinking_enabled     │ |
|                         └────────────┬────────────┘ |
|                                      │ |
|                                      ▼ |
|                         ┌─────────────────────────┐ |
|                         │     resolve_class()     │ |
|                         │   (reflection system)   │ |
|                         └────────────┬────────────┘ |
|                                      │ |
|                                      ▼ |
|                         ┌─────────────────────────┐ |
|                         │      BaseChatModel      │ |
|                         │  (LangChain instance)   │ |
|                         └─────────────────────────┘ |
| ``` |
|
|
| **Supported Providers**: |
| - OpenAI (`langchain_openai:ChatOpenAI`) |
| - Anthropic (`langchain_anthropic:ChatAnthropic`) |
| - DeepSeek (`langchain_deepseek:ChatDeepSeek`) |
| - Custom via LangChain integrations |
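
Each provider is identified by a `module:Class` string in the `use` field, which the factory resolves by reflection. The sketch below shows one common way to do that with `importlib`; the function name comes from the diagram, but this body is an assumption and not necessarily how `src/models/factory.py` implements it.

```python
import importlib


def resolve_class(spec: str) -> type:
    """Resolve a 'package.module:ClassName' string to the class object.

    Assumed behaviour for illustration; the real resolver may differ.
    """
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:Class', got {spec!r}")
    module = importlib.import_module(module_name)
    obj = getattr(module, attr)
    if not isinstance(obj, type):
        raise TypeError(f"{spec!r} does not refer to a class")
    return obj
```

With the matching package installed, `resolve_class("langchain_openai:ChatOpenAI")` would return the class to instantiate with the remaining fields of the model entry. The same mechanism is what makes "custom via LangChain integrations" possible without code changes.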
|
|
| ### MCP Integration |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                             MCP Integration                              │ |
| │                           (src/mcp/manager.py)                           │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|
| extensions_config.json: |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │  {                                                                       │ |
| │    "mcpServers": {                                                       │ |
| │      "github": {                                                         │ |
| │        "enabled": true,                                                  │ |
| │        "type": "stdio",                                                  │ |
| │        "command": "npx",                                                 │ |
| │        "args": ["-y", "@modelcontextprotocol/server-github"],            │ |
| │        "env": {"GITHUB_TOKEN": "$GITHUB_TOKEN"}                          │ |
| │      }                                                                   │ |
| │    }                                                                     │ |
| │  }                                                                       │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|                                      │ |
|                                      ▼ |
|                         ┌─────────────────────────┐ |
|                         │  MultiServerMCPClient   │ |
|                         │ (langchain-mcp-adapters)│ |
|                         └────────────┬────────────┘ |
|                                      │ |
|                  ┌───────────────────┼───────────────────┐ |
|                  │                   │                   │ |
|                  ▼                   ▼                   ▼ |
|            ┌───────────┐       ┌───────────┐       ┌───────────┐ |
|            │   stdio   │       │    SSE    │       │   HTTP    │ |
|            │ transport │       │ transport │       │ transport │ |
|            └───────────┘       └───────────┘       └───────────┘ |
| ``` |
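Before the server definitions reach the MCP client, two things have to happen to the JSON above: disabled servers are dropped and `$NAME` placeholders are replaced with environment values. The sketch below shows one way to do both; `resolve_env` and `enabled_servers` are hypothetical helpers, and treating any string that starts with `$` as a placeholder is an assumption.

```python
import os


def resolve_env(value, environ=None):
    """Recursively replace '$NAME' strings with values from the environment."""
    env = os.environ if environ is None else environ
    if isinstance(value, str) and value.startswith("$"):
        name = value[1:]
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]
    if isinstance(value, dict):
        return {key: resolve_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env(item, env) for item in value]
    return value


def enabled_servers(config: dict, environ=None) -> dict:
    """Return resolved definitions for servers whose 'enabled' flag is true."""
    servers = config.get("mcpServers", {})
    return {
        name: resolve_env(spec, environ)
        for name, spec in servers.items()
        if spec.get("enabled", False)
    }
```

Passing `environ` explicitly keeps the helper testable; in normal use it falls back to `os.environ`, so secrets never need to be written into `extensions_config.json`.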
|
|
| ### Skills System |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                              Skills System                               │ |
| │                          (src/skills/loader.py)                          │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|
| Directory Structure: |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │  skills/                                                                 │ |
| │  ├── public/                     # Public skills (committed)             │ |
| │  │   ├── pdf-processing/                                                 │ |
| │  │   │   └── SKILL.md                                                    │ |
| │  │   ├── frontend-design/                                                │ |
| │  │   │   └── SKILL.md                                                    │ |
| │  │   └── ...                                                             │ |
| │  └── custom/                     # Custom skills (gitignored)            │ |
| │      └── user-installed/                                                 │ |
| │          └── SKILL.md                                                    │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|
| SKILL.md Format: |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │  ---                                                                     │ |
| │  name: PDF Processing                                                    │ |
| │  description: Handle PDF documents efficiently                           │ |
| │  license: MIT                                                            │ |
| │  allowed-tools:                                                          │ |
| │    - read_file                                                           │ |
| │    - write_file                                                          │ |
| │    - bash                                                                │ |
| │  ---                                                                     │ |
| │                                                                          │ |
| │  # Skill Instructions                                                    │ |
| │  Content injected into system prompt...                                  │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
| ``` |
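A `SKILL.md` file therefore has two parts: front matter between `---` lines and a Markdown body that ends up in the system prompt. The sketch below splits the two using only the standard library; `parse_skill` is a hypothetical name, and the parser handles just the flat `key: value` and `- item` shapes shown above, whereas a real loader would more likely use a YAML library.

```python
# Hypothetical parser for the SKILL.md shape shown above (not full YAML).
def parse_skill(text: str) -> tuple[dict, str]:
    """Split SKILL.md text into (front matter dict, instruction body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' line")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("front matter is not closed by '---'") from None

    meta: dict = {}
    current_key = None
    for raw in lines[1:end]:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- "):
            # List item: belongs to the most recent key with an empty value.
            if current_key is None or not isinstance(meta[current_key], list):
                raise ValueError(f"list item without a list key: {line!r}")
            meta[current_key].append(line[2:].strip())
        else:
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"expected 'key: value', got {line!r}")
            current_key = key.strip()
            meta[current_key] = value.strip() or []
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

The returned body is the text that would be injected into the system prompt, while `allowed-tools` stays available as a list for tool filtering.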
|
|
| ### Request Flow |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                           Request Flow Example                           │ |
| │                       User sends message to agent                        │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
|
| 1. Client → Nginx |
|    POST /api/langgraph/threads/{thread_id}/runs |
|    {"input": {"messages": [{"role": "user", "content": "Hello"}]}} |
|
| 2. Nginx → LangGraph Server (2024) |
|    Proxied to LangGraph server |
|
| 3. LangGraph Server |
|    a. Load/create thread state |
|    b. Execute middleware chain: |
|       - ThreadDataMiddleware: Set up paths |
|       - UploadsMiddleware: Inject file list |
|       - SandboxMiddleware: Acquire sandbox |
|       - SummarizationMiddleware: Check token limits |
|       - TitleMiddleware: Generate title if needed |
|       - TodoListMiddleware: Load todos (if plan mode) |
|       - ViewImageMiddleware: Process images |
|       - ClarificationMiddleware: Check for clarifications |
|
|    c. Execute agent: |
|       - Model processes messages |
|       - May call tools (bash, web_search, etc.) |
|       - Tools execute via sandbox |
|       - Results added to messages |
|
|    d. Stream response via SSE |
|
| 4. Client receives streaming response |
| ``` |
|
|
| ## Data Flow |
|
|
| ### File Upload Flow |
|
|
| ``` |
| 1. Client uploads file |
|    POST /api/threads/{thread_id}/uploads |
|    Content-Type: multipart/form-data |
|
| 2. Gateway receives file |
|    - Validates file |
|    - Stores in .deer-flow/threads/{thread_id}/user-data/uploads/ |
|    - If document: converts to Markdown via markitdown |
|
| 3. Returns response |
|    { |
|      "files": [{ |
|        "filename": "doc.pdf", |
|        "path": ".deer-flow/.../uploads/doc.pdf", |
|        "virtual_path": "/mnt/user-data/uploads/doc.pdf", |
|        "artifact_url": "/api/threads/.../artifacts/mnt/.../doc.pdf" |
|      }] |
|    } |
|
| 4. Next agent run |
|    - UploadsMiddleware lists files |
|    - Injects file list into messages |
|    - Agent can access via virtual_path |
| ``` |
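Step 2 ("Validates file") matters because the client controls the filename. One plausible check, sketched below, is to keep only the final path component before deriving the virtual path; `describe_upload` is a hypothetical helper, and it returns only the two response fields whose format is fully shown above.

```python
from pathlib import PurePosixPath

# Virtual uploads directory as listed in the path mapping table.
VIRTUAL_UPLOADS = PurePosixPath("/mnt/user-data/uploads")


def describe_upload(filename: str) -> dict[str, str]:
    """Sanitize a client-supplied filename and derive its virtual path."""
    # Normalise Windows separators, then keep only the last component so
    # names like "../../x" or "a/b.pdf" cannot escape the uploads directory.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    if safe_name in ("", ".", ".."):
        raise ValueError(f"invalid upload filename: {filename!r}")
    return {
        "filename": safe_name,
        "virtual_path": str(VIRTUAL_UPLOADS / safe_name),
    }
```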
|
|
| ### Configuration Reload |
|
|
| ``` |
| 1. Client updates MCP config |
|    PUT /api/mcp/config |
|
| 2. Gateway writes extensions_config.json |
|    - Updates mcpServers section |
|    - File mtime changes |
|
| 3. MCP Manager detects change |
|    - get_cached_mcp_tools() checks mtime |
|    - If changed: reinitializes MCP client |
|    - Loads updated server configurations |
|
| 4. Next agent run uses new tools |
| ``` |
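The reload in step 3 is a standard mtime-invalidated cache: the expensive load runs again only when the file's modification time differs from the one recorded at the last load. A minimal sketch of the pattern follows; `MtimeCache` is a hypothetical class, not the code behind `get_cached_mcp_tools()`.

```python
import os
from typing import Any, Callable


class MtimeCache:
    """Cache a value derived from a file and reload it when the file changes."""

    def __init__(self, path: str, loader: Callable[[str], Any]) -> None:
        self._path = path
        self._loader = loader
        self._mtime = None  # mtime recorded at the last successful load
        self._value = None

    def get(self) -> Any:
        mtime = os.path.getmtime(self._path)
        if self._mtime is None or mtime != self._mtime:
            # First access, or the file was rewritten since the last load.
            self._value = self._loader(self._path)
            self._mtime = mtime
        return self._value
```

Because the Gateway and the LangGraph server run as separate processes, checking the file on each access is a simple way to propagate changes without any direct signalling between them.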
|
|
| ## Security Considerations |
|
|
| ### Sandbox Isolation |
|
|
| - Agent code executes within sandbox boundaries |
| - Local sandbox: Direct execution (development only) |
| - Docker sandbox: Container isolation (production recommended) |
| - Path traversal prevention in file operations |
|
|
| ### API Security |
|
|
| - Thread isolation: Each thread has separate data directories |
| - File validation: Uploads checked for path safety |
| - Environment variable resolution: Config files reference secrets as `$VAR` placeholders (e.g. `$OPENAI_API_KEY`) instead of storing them |
|
|
| ### MCP Security |
|
|
| - Each MCP server runs in its own process |
| - Environment variables resolved at runtime |
| - Servers can be enabled/disabled independently |
|
|
| ## Performance Considerations |
|
|
| ### Caching |
|
|
| - MCP tools cached with file mtime invalidation |
| - Configuration loaded once, reloaded on file change |
| - Skills parsed once at startup, cached in memory |
|
|
| ### Streaming |
|
|
| - SSE used for real-time response streaming |
| - Reduces time to first token |
| - Enables progress visibility for long operations |
|
|
| ### Context Management |
|
|
| - Summarization middleware reduces context when configured limits are approached |
| - Configurable triggers: tokens, messages, or fraction |
| - Preserves recent messages while summarizing older ones |
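
The three trigger kinds and the "keep recent, summarize older" split can be sketched as two small functions. All names and parameters here (`should_summarize`, `split_for_summary`, `max_tokens`, `max_messages`, `max_fraction`, `context_window`, `keep_recent`) are hypothetical and do not reflect the actual `config.yaml` keys.

```python
# Hypothetical sketch of the trigger check and the message split.
def should_summarize(
    token_count: int,
    message_count: int,
    *,
    max_tokens=None,
    max_messages=None,
    max_fraction=None,
    context_window=None,
) -> bool:
    """Return True if any configured trigger has been reached."""
    if max_tokens is not None and token_count >= max_tokens:
        return True
    if max_messages is not None and message_count >= max_messages:
        return True
    if max_fraction is not None and context_window:
        # Fraction trigger: share of the model's context window in use.
        return token_count / context_window >= max_fraction
    return False


def split_for_summary(messages: list, keep_recent: int) -> tuple[list, list]:
    """Split messages into (older ones to summarize, recent ones to keep)."""
    if keep_recent <= 0:
        return list(messages), []
    return list(messages[:-keep_recent]), list(messages[-keep_recent:])
```

The older slice would be replaced by a single summary message, while the recent slice is passed through unchanged.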
|
|