# LangGraph Agent System Architecture

This document describes the architecture of the multi-agent system implemented using LangGraph 0.4.8+ and Langfuse 3.0.0.

## System Overview

The system implements a sophisticated agent architecture with memory, routing, specialized agents, and verification, as shown in the system diagram.

## Core Components

### 1. Memory Layer

- **Short-Term Memory**: Graph state managed by LangGraph checkpointing
- **Checkpointer**: SQLite-based persistence for conversation continuity
- **Long-Term Memory**: Supabase vector store with pgvector for Q&A storage

### 2. Plan + ReAct Loop

- Initial query analysis and planning
- Contextual prompt injection with system requirements
- Memory retrieval for similar past questions

### 3. Agent Router

- Intelligent routing based on query analysis
- Routes to specialized agents: Retrieval, Execution, or Critic
- Uses a low-temperature LLM for consistent routing decisions

### 4. Specialized Agents

#### Retrieval Agent

- Information gathering from external sources
- Tools: Wikipedia, Arxiv, Tavily web search, vector store retrieval
- Handles attachment downloading for GAIA tasks
- Context-aware with memory integration

#### Execution Agent

- Computational tasks and code execution
- Integrates with the existing `code_agent.py` sandbox
- Python code execution with pandas, cv2, and standard libraries
- Step-by-step problem breakdown

#### Critic Agent

- Response quality evaluation and review
- Accuracy, completeness, and logical consistency checks
- Scoring system with pass/fail determination
- Constructive feedback generation

### 5. Verification & Fallback

- Final quality control with system prompt compliance
- Format verification for exact-match requirements
- Retry logic with maximum attempt limits
- Graceful fallback pipeline for failed attempts

### 6. Observability (Langfuse)
- End-to-end tracing of all agent interactions
- Performance monitoring and debugging
- User session tracking
- Error logging and analysis

## Data Flow

1. **User Query** → Plan Node (system prompt injection)
2. **Plan Node** → Router (agent selection)
3. **Router** → Specialized Agent (task execution)
4. **Agent** → Tools (if needed) → Agent (results)
5. **Agent** → Verification (quality check)
6. **Verification** → Output or Retry/Fallback

## Key Features

### Memory Management

- Caching of similarity searches (TTL-based)
- Duplicate detection and prevention
- Task-based attachment tracking
- Session-specific cache management

### Quality Control

- Multi-level verification (agent → critic → verification)
- Retry mechanism with attempt limits
- Format compliance checking
- Fallback responses for failures

### Tracing & Observability

- Langfuse integration for complete observability
- Agent-level span tracking
- Error monitoring and debugging
- Performance metrics collection

### Tool Integration

- Modular tool system for each agent
- Sandboxed code execution environment
- External API integration (search, knowledge bases)
- Attachment handling for complex tasks

## Configuration

### Environment Variables

See `env.template` for required configuration:

- LLM API keys (Groq, OpenAI, Google, HuggingFace)
- Search tools (Tavily)
- Vector store (Supabase)
- Observability (Langfuse)
- GAIA API endpoints

### System Prompts

Located in the `prompts/` directory:

- `system_prompt.txt`: Main system requirements
- `router_prompt.txt`: Agent routing instructions
- `retrieval_prompt.txt`: Information gathering guidelines
- `execution_prompt.txt`: Code execution instructions
- `critic_prompt.txt`: Quality evaluation criteria
- `verification_prompt.txt`: Final formatting rules

## Usage

### Basic Usage

```python
from src import run_agent_system

result = run_agent_system(
    query="Your question here",
    user_id="user123",
    session_id="session456",
)
```

### With Memory Management
```python
from src import memory_manager

# Check if the query is similar to previous ones
similar = memory_manager.get_similar_qa(query)

# Clear the session cache
memory_manager.clear_session_cache()
```

### Direct Graph Access

```python
from src import create_agent_graph

workflow = create_agent_graph()
app = workflow.compile(checkpointer=checkpointer)
result = app.invoke(initial_state, config=config)
```

## Dependencies

### Core Framework

- `langgraph>=0.4.8`: Graph-based agent orchestration
- `langgraph-checkpoint-sqlite>=2.0.0`: Persistence layer
- `langchain>=0.3.0`: LLM and tool abstractions

### Observability

- `langfuse==3.0.0`: Tracing and monitoring

### Memory & Storage

- `supabase>=2.8.0`: Vector database backend
- `pgvector>=0.3.0`: Vector similarity search

### Tools & APIs

- `tavily-python>=0.5.0`: Web search
- `arxiv>=2.1.0`: Academic paper search
- `wikipedia>=1.4.0`: Knowledge base access

## Error Handling

The system implements comprehensive error handling:

- Graceful degradation when services are unavailable
- Fallback responses for critical failures
- Retry logic with exponential backoff
- Detailed error logging for debugging

## Performance Considerations

- Vector store caching reduces duplicate searches
- Checkpoint-based state management for conversation continuity
- Efficient tool routing based on query analysis
- Memory cleanup for long-running sessions

## Future Enhancements

- Additional specialized agents (e.g., Image Analysis, Code Review)
- Enhanced memory clustering and retrieval algorithms
- Real-time collaboration between agents
- Advanced tool composition and chaining
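## Sketch: TTL-Based Similarity Cache

The Key Features section mentions TTL-based caching of similarity searches. A minimal, standard-library-only sketch of that idea follows; the class and method names are illustrative and are not the project's actual `memory_manager` API.

```python
import time


class TTLCache:
    """Tiny TTL cache for similarity-search results (illustrative only)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, query: str):
        entry = self._store.get(query)
        if entry is None:
            return None
        stored_at, value = entry
        if time.monotonic() - stored_at > self.ttl:
            # Entry expired: drop it so the next lookup misses.
            del self._store[query]
            return None
        return value

    def put(self, query: str, value) -> None:
        self._store[query] = (time.monotonic(), value)


cache = TTLCache(ttl_seconds=60)
cache.put("capital of France?", ["Paris"])
print(cache.get("capital of France?"))  # cached hit within the TTL
```

In the real system, the cached value would be the vector-store similarity result, so repeated near-identical queries within a session skip the round trip to Supabase.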
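## Sketch: Retry with Exponential Backoff

The Error Handling section mentions retry logic with exponential backoff. A self-contained sketch of the pattern is below; `flaky_call` and the delay parameters are illustrative, not taken from the codebase.

```python
import time


def retry_with_backoff(fn, max_attempts: int = 4, base_delay: float = 0.5):
    """Call fn(), retrying on exception with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the fallback pipeline
            time.sleep(base_delay * (2 ** attempt))  # e.g. 0.5s, 1s, 2s, ...


# Illustrative flaky call that succeeds on the third attempt.
calls = {"n": 0}


def flaky_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


print(retry_with_backoff(flaky_call, base_delay=0.01))  # "ok" after two retries
```

Re-raising on the final attempt is what hands control to the graceful fallback pipeline described above, rather than swallowing the error.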