| # LangGraph Agent System Architecture | |
| This document describes the architecture of the multi-agent system implemented using LangGraph 0.4.8+ and Langfuse 3.0.0. | |
| ## System Overview | |
| The system combines memory, query routing, specialized agents, and answer verification, as shown in the system diagram. | |
| ## Core Components | |
| ### 1. Memory Layer | |
| - **Short-Term Memory**: Graph state managed by LangGraph checkpointing | |
| - **Checkpointer**: SQLite-based persistence for conversation continuity | |
| - **Long-Term Memory**: Supabase vector store with pgvector for Q&A storage | |
| ### 2. Plan + ReAct Loop | |
| - Initial query analysis and planning | |
| - Contextual prompt injection with system requirements | |
| - Memory retrieval for similar past questions | |
| ### 3. Agent Router | |
| - Intelligent routing based on query analysis | |
| - Routes to specialized agents: Retrieval, Execution, or Critic | |
| - Uses low-temperature LLM for consistent routing decisions | |
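One way the routing step could turn the router LLM's reply into an agent choice is sketched below. The three agent labels come from the list above; `parse_route`, the `retrieval` default, and the "earliest mention wins" rule are illustrative assumptions rather than the project's actual code.

```python
AGENTS = ("retrieval", "execution", "critic")


def parse_route(llm_reply: str, default: str = "retrieval") -> str:
    """Return the known agent name mentioned earliest in the router LLM's reply."""
    text = llm_reply.strip().lower()
    # Collect (position, name) for every agent name that appears in the reply
    hits = [(text.find(name), name) for name in AGENTS if name in text]
    # Fall back to the default when the reply names no known agent
    return min(hits)[1] if hits else default
```

Falling back to a default keeps the graph moving when the model replies with something unexpected, which complements the low-temperature setting.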
| ### 4. Specialized Agents | |
| #### Retrieval Agent | |
| - Information gathering from external sources | |
| - Tools: Wikipedia, Arxiv, Tavily web search, vector store retrieval | |
| - Handles attachment downloading for GAIA tasks | |
| - Context-aware with memory integration | |
| #### Execution Agent | |
| - Computational tasks and code execution | |
| - Integrates with the existing `code_agent.py` sandbox | |
| - Python code execution with pandas, cv2, standard libraries | |
| - Step-by-step problem breakdown | |
| #### Critic Agent | |
| - Response quality evaluation and review | |
| - Accuracy, completeness, and logical consistency checks | |
| - Scoring system with pass/fail determination | |
| - Constructive feedback generation | |
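The scoring and pass/fail step might look roughly like the sketch below. The `Critique` class, the unweighted mean, and the 0.7 threshold are assumptions chosen for illustration; the source does not specify how scores are combined.

```python
from dataclasses import dataclass


@dataclass
class Critique:
    accuracy: float      # each score is expected to lie in [0, 1]
    completeness: float
    consistency: float
    feedback: str = ""

    @property
    def score(self) -> float:
        # Unweighted mean of the three checks (an assumed aggregation rule)
        return (self.accuracy + self.completeness + self.consistency) / 3

    def passed(self, threshold: float = 0.7) -> bool:
        # The threshold is a hypothetical default, not a documented value
        return self.score >= threshold
```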
| ### 5. Verification & Fallback | |
| - Final quality control with system prompt compliance | |
| - Format verification for exact-match requirements | |
| - Retry logic with maximum attempt limits | |
| - Graceful fallback pipeline for failed attempts | |
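The retry-then-fallback behavior described above can be sketched as a small loop. This is a minimal illustration: `answer_with_retries`, its parameters, and the fallback message are hypothetical names, and the real verification node may work differently.

```python
from typing import Callable


def answer_with_retries(
    generate: Callable[[int], str],
    is_valid: Callable[[str], bool],
    max_attempts: int = 3,
    fallback: str = "Unable to produce a verified answer.",
) -> str:
    """Call generate(attempt) until is_valid accepts the answer or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        # Strip whitespace so exact-match format checks are not tripped by padding
        answer = generate(attempt).strip()
        if is_valid(answer):
            return answer
    # Every attempt failed verification: return the fallback response
    return fallback
```

Passing the attempt number to `generate` lets the caller vary the prompt on later attempts, for example by including feedback from the failed check.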
| ### 6. Observability (Langfuse) | |
| - End-to-end tracing of all agent interactions | |
| - Performance monitoring and debugging | |
| - User session tracking | |
| - Error logging and analysis | |
| ## Data Flow | |
| 1. **User Query** → Plan Node (system prompt injection) | |
| 2. **Plan Node** → Router (agent selection) | |
| 3. **Router** → Specialized Agent (task execution) | |
| 4. **Agent** → Tools (if needed) → Agent (results) | |
| 5. **Agent** → Verification (quality check) | |
| 6. **Verification** → Output or Retry/Fallback | |
| ## Key Features | |
| ### Memory Management | |
| - Caching of similarity searches (TTL-based) | |
| - Duplicate detection and prevention | |
| - Task-based attachment tracking | |
| - Session-specific cache management | |
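A TTL-based cache for similarity searches could be as simple as the sketch below. `TTLCache`, the 300-second default, and the injectable `clock` are assumptions made for illustration; the actual `memory_manager` implementation is not shown in this document.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired: drop the entry and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self.clock(), value)

    def clear(self) -> None:
        # Corresponds to clearing a session's cache
        self._store.clear()
```

Keying the cache on the query text means a repeated question within the TTL window skips the vector store lookup entirely.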
| ### Quality Control | |
| - Multi-level verification (agent → critic → verification) | |
| - Retry mechanism with attempt limits | |
| - Format compliance checking | |
| - Fallback responses for failures | |
| ### Tracing & Observability | |
| - Langfuse integration for complete observability | |
| - Agent-level span tracking | |
| - Error monitoring and debugging | |
| - Performance metrics collection | |
| ### Tool Integration | |
| - Modular tool system for each agent | |
| - Sandboxed code execution environment | |
| - External API integration (search, knowledge bases) | |
| - Attachment handling for complex tasks | |
| ## Configuration | |
| ### Environment Variables | |
| See `env.template` for required configuration: | |
| - LLM API keys (Groq, OpenAI, Google, HuggingFace) | |
| - Search tools (Tavily) | |
| - Vector store (Supabase) | |
| - Observability (Langfuse) | |
| - GAIA API endpoints | |
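A startup check for missing configuration might look like the sketch below. The variable names are hypothetical placeholders; the authoritative list is whatever `env.template` defines.

```python
import os
from typing import Iterable, List, Mapping

# Hypothetical names for illustration; consult env.template for the real ones
REQUIRED_VARS = (
    "TAVILY_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
)


def missing_vars(required: Iterable[str] = REQUIRED_VARS,
                 env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]
```

Calling `missing_vars()` before building the graph surfaces configuration gaps early instead of as tool failures partway through a run.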
| ### System Prompts | |
| Located in the `prompts/` directory: | |
| - `system_prompt.txt`: Main system requirements | |
| - `router_prompt.txt`: Agent routing instructions | |
| - `retrieval_prompt.txt`: Information gathering guidelines | |
| - `execution_prompt.txt`: Code execution instructions | |
| - `critic_prompt.txt`: Quality evaluation criteria | |
| - `verification_prompt.txt`: Final formatting rules | |
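Loading these files at startup could be done along the lines of the sketch below. `load_prompts` is a hypothetical helper, and keying the result by file stem is an assumption about how the prompts are looked up.

```python
from pathlib import Path
from typing import Dict


def load_prompts(prompt_dir: str = "prompts") -> Dict[str, str]:
    """Map each *.txt file's stem (e.g. 'router_prompt') to its stripped contents."""
    return {
        path.stem: path.read_text(encoding="utf-8").strip()
        for path in sorted(Path(prompt_dir).glob("*.txt"))
    }
```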
| ## Usage | |
| ### Basic Usage | |
| ```python | |
| from src import run_agent_system | |
| result = run_agent_system( | |
|     query="Your question here", | |
|     user_id="user123", | |
|     session_id="session456" | |
| ) | |
| ``` | |
| ### With Memory Management | |
| ```python | |
| from src import memory_manager | |
| query = "Your question here" | |
| # Look up previously answered questions similar to this query | |
| similar = memory_manager.get_similar_qa(query) | |
| # Clear the cache for the current session | |
| memory_manager.clear_session_cache() | |
| ``` | |
| ### Direct Graph Access | |
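| ```python | |
| from src import create_agent_graph | |
| # Assumes `checkpointer` (e.g. a SQLite saver) and `initial_state` (a dict matching the graph's state schema) are already defined | |
| workflow = create_agent_graph() | |
| app = workflow.compile(checkpointer=checkpointer) | |
| # With a checkpointer, the config must identify the conversation thread; the ID here is an example value | |
| config = {"configurable": {"thread_id": "session456"}} | |
| result = app.invoke(initial_state, config=config) | |
| ``` | |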
| ## Dependencies | |
| ### Core Framework | |
| - `langgraph>=0.4.8`: Graph-based agent orchestration | |
| - `langgraph-checkpoint-sqlite>=2.0.0`: Persistence layer | |
| - `langchain>=0.3.0`: LLM and tool abstractions | |
| ### Observability | |
| - `langfuse==3.0.0`: Tracing and monitoring | |
| ### Memory & Storage | |
| - `supabase>=2.8.0`: Vector database backend | |
| - `pgvector>=0.3.0`: Vector similarity search | |
| ### Tools & APIs | |
| - `tavily-python>=0.5.0`: Web search | |
| - `arxiv>=2.1.0`: Academic paper search | |
| - `wikipedia>=1.4.0`: Knowledge base access | |
| ## Error Handling | |
| The system handles errors at several levels: | |
| - Graceful degradation when services are unavailable | |
| - Fallback responses for critical failures | |
| - Retry logic with exponential backoff | |
| - Detailed error logging for debugging | |
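Retry with exponential backoff, as listed above, can be sketched as follows. `retry_with_backoff`, its defaults, and the injectable `sleep` are illustrative assumptions; the project's actual retry parameters are not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, waiting base_delay * 2**n seconds after the n-th failed attempt."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

With the defaults, the waits between attempts are 0.5 s, 1 s, and 2 s. In practice the `except` clause would usually be narrowed to transient errors such as timeouts.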
| ## Performance Considerations | |
| - Vector store caching reduces duplicate searches | |
| - Checkpoint-based state management for conversation continuity | |
| - Efficient tool routing based on query analysis | |
| - Memory cleanup for long-running sessions | |
| ## Future Enhancements | |
| - Additional specialized agents (e.g., Image Analysis, Code Review) | |
| - Enhanced memory clustering and retrieval algorithms | |
| - Real-time collaboration between agents | |
| - Advanced tool composition and chaining |