# LangGraph Multi-Agent System

A sophisticated multi-agent system built with LangGraph that follows best practices for state management, tracing, and iterative workflows.

## Architecture Overview

The system implements an iterative research/code loop with specialized agents:

```
User Query → Lead Agent → Research Agent → Code Agent → Lead Agent (loop) → Answer Formatter → Final Answer
```

### Key Components

1. **Lead Agent** (`agents/lead_agent.py`)
   - Orchestrates the entire workflow
   - Makes routing decisions between the research and code agents
   - Manages the iterative loop with a maximum of 3 iterations
   - Synthesizes information from specialists into draft answers
2. **Research Agent** (`agents/research_agent.py`)
   - Handles information gathering from multiple sources
   - Uses web search (Tavily), Wikipedia, and ArXiv tools
   - Provides structured research results with citations
3. **Code Agent** (`agents/code_agent.py`)
   - Performs mathematical calculations and code execution
   - Uses calculator tools for basic operations
   - Executes Python code in a sandboxed environment
   - Handles Hugging Face Hub statistics
4. **Answer Formatter** (`agents/answer_formatter.py`)
   - Ensures GAIA benchmark compliance
   - Extracts final answers according to exact-match rules
   - Handles different answer types (numbers, strings, lists)
5. **Memory System** (`memory_system.py`)
   - Vector store integration for long-term learning
   - Session-based caching for performance
   - Similar question retrieval for context

## Core Features

### State Management

- **Immutable State**: Uses LangGraph's Command pattern for pure functions
- **Typed Schema**: An AgentState TypedDict ensures type safety
- **Accumulation**: Research notes and code outputs accumulate across iterations

### Observability (Langfuse v3)

- **OTEL-Native Integration**: Uses Langfuse v3 with OpenTelemetry for automatic trace correlation
- **Single Callback Handler**: One global handler passes traces seamlessly through LangGraph
- **Predictable Span Naming**: `agent/`, `tool/`, `llm/` patterns for cost/latency dashboards
- **Session Stitching**: User and session tracking for conversation continuity
- **Background Flushing**: Non-blocking trace export for optimal performance

### Tools Integration

- **Web Search**: Tavily API for current information
- **Knowledge Bases**: Wikipedia and ArXiv for encyclopedic/academic content
- **Computation**: Calculator tools and Python execution
- **Hub Statistics**: Hugging Face model information

## Setup

### Environment Variables

Create an `env.local` file with:

```bash
# LLM API
GROQ_API_KEY=your_groq_api_key

# Search Tools
TAVILY_API_KEY=your_tavily_api_key

# Observability
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com

# Memory (Optional)
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_KEY=your_supabase_service_key
```

### Dependencies

The system requires:

- `langgraph>=0.4.8`
- `langchain>=0.3.0`
- `langchain-groq`
- `langfuse>=3.0.0`
- `python-dotenv`
- `tavily-python`

## Usage

### Basic Usage

```python
import asyncio

from langgraph_agent_system import run_agent_system

async def main():
    result = await run_agent_system(
        query="What is the capital of Maharashtra?",
        user_id="user_123",
        session_id="session_456"
    )
    print(f"Answer: {result}")

asyncio.run(main())
```
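State is threaded through the graph as a typed dictionary. The real schema lives in the source; as a rough sketch of what an `AgentState` with accumulating fields might look like (field names mirror the `initial_state` example in this README, but the choice of which fields carry `operator.add` reducers is an assumption):

```python
import operator
from typing import Annotated, TypedDict

class AgentState(TypedDict):
    # Annotated reducers tell LangGraph to merge node updates into the
    # existing value instead of overwriting it; exactly which fields
    # accumulate in the real system is an assumption here.
    messages: Annotated[list, operator.add]
    research_notes: Annotated[str, operator.add]
    code_outputs: Annotated[str, operator.add]
    draft_answer: str
    loop_counter: int
    done: bool
    next: str
    final_answer: str
    user_id: str
    session_id: str
```

Because each node returns only the fields it updates, annotated fields grow across iterations while the rest are simply replaced.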
### Testing

Run the test suite to verify functionality:

```bash
python test_new_multi_agent_system.py
```

Test the Langfuse v3 observability integration:

```bash
python test_observability.py
```

### Direct Graph Access

```python
import asyncio

from langchain_core.messages import HumanMessage
from langgraph_agent_system import create_agent_graph

# Create and compile the workflow
workflow = create_agent_graph()
app = workflow.compile()

# Run with an initial state
initial_state = {
    "messages": [HumanMessage(content="Your question")],
    "draft_answer": "",
    "research_notes": "",
    "code_outputs": "",
    "loop_counter": 0,
    "done": False,
    "next": "research",
    "final_answer": "",
    "user_id": "user_123",
    "session_id": "session_456"
}

final_state = asyncio.run(app.ainvoke(initial_state))
print(final_state["final_answer"])
```

## Workflow Details

### Iterative Loop

1. The **Lead Agent** analyzes the query and decides on the next action
2. If research is needed → the **Research Agent** gathers information
3. If computation is needed → the **Code Agent** performs calculations
4. Back to the **Lead Agent** for synthesis and the next decision
5. When sufficient information is gathered → the **Answer Formatter** creates the final answer
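In LangGraph terms, the loop above is a conditional edge keyed on the state. A simplified, illustrative routing function — the node names and the exact termination fields are assumptions, not the system's actual code:

```python
MAX_ITERATIONS = 3  # hard limit described under Performance Considerations

def route_from_lead(state: dict) -> str:
    """Decide where the Lead Agent hands off next (illustrative sketch)."""
    # Terminate when the Lead Agent marks the task done or the loop cap is hit.
    if state.get("done") or state.get("loop_counter", 0) >= MAX_ITERATIONS:
        return "answer_formatter"
    # "next" is set by the Lead Agent to either "research" or "code".
    return state.get("next", "research")
```

The real graph would register a function like this via `add_conditional_edges` on the lead node.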
### Routing Logic

The Lead Agent uses the following criteria:

- **Research**: factual information, current events, citations needed
- **Code**: mathematical calculations, data analysis, programming tasks
- **Formatter**: sufficient information gathered OR the maximum number of iterations reached

### GAIA Compliance

The Answer Formatter ensures exact-match requirements:

- **Numbers**: no commas, units, or extra symbols
- **Strings**: remove unnecessary articles and formatting
- **Lists**: comma-and-space separation
- **No surrounding text**: no "Answer:", quotes, or brackets

## Best Practices Implemented

### LangGraph Patterns

- ✅ Pure functions (AgentState → Command)
- ✅ Immutable state with explicit updates
- ✅ Typed state schema with operator annotations
- ✅ Clear routing separated from business logic

### Langfuse v3 Observability

- ✅ OTEL-native SDK with automatic trace correlation
- ✅ Single global callback handler for seamless LangGraph integration
- ✅ Predictable span naming (`agent/`, `tool/`, `llm/`)
- ✅ Session and user tracking with environment tagging
- ✅ Background trace flushing for performance
- ✅ Graceful degradation when observability is unavailable

### Memory Management

- ✅ TTL-based caching for performance
- ✅ Vector store integration for learning
- ✅ Duplicate detection and prevention
- ✅ Session cleanup for long-running instances

## Error Handling

The system implements graceful degradation:

- **Tool failures**: continue with the available tools
- **API timeouts**: retry with backoff
- **Memory errors**: degrade to LLM-only mode
- **Agent failures**: return informative error messages

## Performance Considerations

- **Caching**: vector store searches are cached for 5 minutes
- **Parallelization**: tools can be executed in parallel
- **Memory limits**: sandboxed execution has resource constraints
- **Loop termination**: a hard limit of 3 iterations prevents infinite loops

## Extending the System
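Each agent in this system is ultimately a pure function over the state. As a shape reference before adding your own, here is a minimal sketch — the real agents return LangGraph `Command` objects, whereas this simplified version returns a plain state update, and the agent name and logic are hypothetical:

```python
def summarizer_agent(state: dict) -> dict:
    """Hypothetical new agent: condenses accumulated research notes."""
    notes = state.get("research_notes", "")
    draft = notes[:500]  # stand-in for a real LLM call
    # Return only the fields this node updates; LangGraph merges them
    # into the shared state according to the schema's reducers.
    return {
        "draft_answer": draft,
        "loop_counter": state.get("loop_counter", 0) + 1,
    }
```

Once implemented, register the node with `workflow.add_node("summarizer", summarizer_agent)` inside `create_agent_graph()` and teach the Lead Agent's routing about it.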
### Adding New Agents

1. Create an agent file in the `agents/` directory
2. Implement an agent function returning a Command
3. Add it to the workflow in `create_agent_graph()`
4. Update the routing logic in the Lead Agent

### Adding New Tools

1. Implement a tool following the LangChain Tool interface
2. Add it to the appropriate agent's tool list
3. Update the agent prompts to describe the new capabilities

### Custom Memory Backends

1. Extend the MemoryManager class
2. Implement the required interface methods
3. Update the initialization in `memory_system.py`

## Troubleshooting

### Common Issues

- **Missing API keys**: check the `env.local` file setup
- **Tool failures**: verify network connectivity and API quotas
- **Memory errors**: check the Supabase configuration (optional)
- **Import errors**: ensure all dependencies are installed

### Debug Mode

Set an environment variable for detailed logging:

```bash
export LANGFUSE_DEBUG=true
```

This implementation follows the specified plan while incorporating LangGraph and Langfuse best practices for a robust, observable, and maintainable multi-agent system.