# LangGraph Multi-Agent System
A sophisticated multi-agent system built with LangGraph that follows best practices for state management, tracing, and iterative workflows.
## Architecture Overview
The system implements an iterative research/code loop with specialized agents:
```
User Query → Lead Agent → Research Agent → Code Agent → Lead Agent (loop) → Answer Formatter → Final Answer
```
### Key Components
1. **Lead Agent** (`agents/lead_agent.py`)
- Orchestrates the entire workflow
- Makes routing decisions between research and code agents
- Manages the iterative loop with a maximum of 3 iterations
- Synthesizes information from specialists into draft answers
2. **Research Agent** (`agents/research_agent.py`)
- Handles information gathering from multiple sources
- Uses web search (Tavily), Wikipedia, and ArXiv tools
- Provides structured research results with citations
3. **Code Agent** (`agents/code_agent.py`)
- Performs mathematical calculations and code execution
- Uses calculator tools for basic operations
- Executes Python code in a sandboxed environment
- Handles Hugging Face Hub statistics
4. **Answer Formatter** (`agents/answer_formatter.py`)
- Ensures GAIA benchmark compliance
- Extracts final answers according to exact-match rules
- Handles different answer types (numbers, strings, lists)
5. **Memory System** (`memory_system.py`)
- Vector store integration for long-term learning
- Session-based caching for performance
- Similar question retrieval for context
## Core Features
### State Management
- **Immutable State**: Uses LangGraph's Command pattern for pure functions
- **Typed Schema**: AgentState TypedDict ensures type safety
- **Accumulation**: Research notes and code outputs accumulate across iterations
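As a rough sketch, the typed schema might look like the following, using `Annotated` with `operator.add` so that the accumulating fields append across iterations instead of being overwritten (field names are taken from the initial-state example later in this README; the authoritative definition lives in the project source):

```python
import operator
from typing import Annotated, List, TypedDict

class AgentState(TypedDict):
    """Illustrative state schema; the real one lives in the project source."""
    # Accumulating fields: the operator.add annotation tells LangGraph to
    # append each node's update rather than replace the previous value.
    messages: Annotated[List[str], operator.add]
    research_notes: Annotated[str, operator.add]
    code_outputs: Annotated[str, operator.add]
    # Plain fields are overwritten on each update.
    draft_answer: str
    loop_counter: int
    done: bool
    next: str
    final_answer: str
    user_id: str
    session_id: str
```

Because a `TypedDict` is a plain dict at runtime, nodes can return partial updates and LangGraph merges them according to the annotations.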
### Observability (Langfuse v3)
- **OTEL-Native Integration**: Uses Langfuse v3 with OpenTelemetry for automatic trace correlation
- **Single Callback Handler**: One global handler passes traces seamlessly through LangGraph
- **Predictable Span Naming**: `agent/<role>`, `tool/<name>`, `llm/<model>` patterns for cost/latency dashboards
- **Session Stitching**: User and session tracking for conversation continuity
- **Background Flushing**: Non-blocking trace export for optimal performance
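The span naming convention amounts to a tiny helper along these lines (a hypothetical illustration; in the real system the names are set wherever spans are created via the Langfuse callback handler):

```python
def span_name(kind: str, identifier: str) -> str:
    """Build a predictable span name such as agent/lead or tool/tavily.

    Hypothetical helper showing the agent/<role>, tool/<name>, llm/<model>
    convention used for cost/latency dashboards.
    """
    allowed = {"agent", "tool", "llm"}
    if kind not in allowed:
        raise ValueError(f"unknown span kind: {kind!r}")
    return f"{kind}/{identifier}"
```

Keeping the prefixes fixed means dashboard queries can group by `agent/*`, `tool/*`, or `llm/*` without per-node configuration.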
### Tools Integration
- **Web Search**: Tavily API for current information
- **Knowledge Bases**: Wikipedia and ArXiv for encyclopedic/academic content
- **Computation**: Calculator tools and Python execution
- **Hub Statistics**: Hugging Face model information
## Setup
### Environment Variables
Create an `env.local` file with:
```bash
# LLM API
GROQ_API_KEY=your_groq_api_key
# Search Tools
TAVILY_API_KEY=your_tavily_api_key
# Observability
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com
# Memory (Optional)
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_KEY=your_supabase_service_key
```
### Dependencies
The system requires:
- `langgraph>=0.4.8`
- `langchain>=0.3.0`
- `langchain-groq`
- `langfuse>=3.0.0`
- `python-dotenv`
- `tavily-python`
## Usage
### Basic Usage
```python
import asyncio

from langgraph_agent_system import run_agent_system

async def main():
    result = await run_agent_system(
        query="What is the capital of Maharashtra?",
        user_id="user_123",
        session_id="session_456",
    )
    print(f"Answer: {result}")

asyncio.run(main())
```
### Testing
Run the test suite to verify functionality:
```bash
python test_new_multi_agent_system.py
```
Test Langfuse v3 observability integration:
```bash
python test_observability.py
```
### Direct Graph Access
```python
import asyncio

from langchain_core.messages import HumanMessage

from langgraph_agent_system import create_agent_graph

# Create and compile the workflow
workflow = create_agent_graph()
app = workflow.compile()

# Run with initial state
initial_state = {
    "messages": [HumanMessage(content="Your question")],
    "draft_answer": "",
    "research_notes": "",
    "code_outputs": "",
    "loop_counter": 0,
    "done": False,
    "next": "research",
    "final_answer": "",
    "user_id": "user_123",
    "session_id": "session_456",
}

final_state = asyncio.run(app.ainvoke(initial_state))
print(final_state["final_answer"])
```
## Workflow Details
### Iterative Loop
1. **Lead Agent** analyzes the query and decides on next action
2. If research is needed → **Research Agent** gathers information
3. If computation is needed → **Code Agent** performs calculations
4. Back to **Lead Agent** for synthesis and next decision
5. When sufficient information β **Answer Formatter** creates final answer
### Routing Logic
The Lead Agent uses the following criteria:
- **Research**: Factual information, current events, citations needed
- **Code**: Mathematical calculations, data analysis, programming tasks
- **Formatter**: Sufficient information gathered OR max iterations reached
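In the real system the Lead Agent makes this decision with an LLM; as a pure-function sketch of just the control flow (the `needs_research` / `needs_computation` flags are stand-ins for the LLM's judgment, not actual state fields), the routing looks roughly like:

```python
def route_next(state: dict, max_iterations: int = 3) -> str:
    """Illustrative routing sketch mirroring the criteria above."""
    # Sufficient information gathered OR max iterations reached -> formatter
    if state["done"] or state["loop_counter"] >= max_iterations:
        return "formatter"
    # Factual information, current events, citations -> research
    if state.get("needs_research"):
        return "research"
    # Calculations, data analysis, programming tasks -> code
    if state.get("needs_computation"):
        return "code"
    return "formatter"
```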
### GAIA Compliance
The Answer Formatter ensures exact-match requirements:
- **Numbers**: No commas, units, or extra symbols
- **Strings**: Remove unnecessary articles and formatting
- **Lists**: Comma and space separation
- **No surrounding text**: No "Answer:", quotes, or brackets
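A minimal sketch of what such normalization could look like (the project's actual Answer Formatter may apply more rules; this only mirrors the four listed above):

```python
import re

def normalize_answer(value) -> str:
    """Hedged sketch of GAIA-style exact-match normalization."""
    if isinstance(value, (list, tuple)):
        # Lists: comma-and-space separation, each element normalized
        return ", ".join(normalize_answer(v) for v in value)
    if isinstance(value, int):
        # Numbers: plain digits, no commas or units
        return str(value)
    if isinstance(value, float):
        return str(int(value)) if value.is_integer() else str(value)
    # Strings: strip quotes, an "Answer:" label, and a leading article
    text = str(value).strip().strip('"\'')
    text = re.sub(r"^answer:\s*", "", text, flags=re.IGNORECASE)
    text = re.sub(r"^(?:the|a|an)\s+", "", text, flags=re.IGNORECASE)
    return text
```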
## Best Practices Implemented
### LangGraph Patterns
- ✅ Pure functions (AgentState → Command)
- ✅ Immutable state with explicit updates
- ✅ Typed state schema with operator annotations
- ✅ Clear routing separated from business logic
### Langfuse v3 Observability
- ✅ OTEL-native SDK with automatic trace correlation
- ✅ Single global callback handler for seamless LangGraph integration
- ✅ Predictable span naming (`agent/<role>`, `tool/<name>`, `llm/<model>`)
- ✅ Session and user tracking with environment tagging
- ✅ Background trace flushing for performance
- ✅ Graceful degradation when observability is unavailable
### Memory Management
- ✅ TTL-based caching for performance
- ✅ Vector store integration for learning
- ✅ Duplicate detection and prevention
- ✅ Session cleanup for long-running instances
## Error Handling
The system implements graceful degradation:
- **Tool failures**: Continue with available tools
- **API timeouts**: Retry with backoff
- **Memory errors**: Degrade to LLM-only mode
- **Agent failures**: Return informative error messages
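The retry-with-backoff pattern can be sketched as a small wrapper like this (illustrative only; the system's actual retry parameters and exception handling may differ):

```python
import time

def with_backoff(call, attempts: int = 3, base_delay: float = 0.5):
    """Retry a flaky call with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Delays grow as base_delay * 2^attempt: 0.5s, 1s, 2s, ...
            time.sleep(base_delay * (2 ** attempt))
```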
## Performance Considerations
- **Caching**: Vector store searches cached for 5 minutes
- **Parallelization**: Tools can be executed in parallel
- **Memory limits**: Sandbox execution has resource constraints
- **Loop termination**: Hard limit of 3 iterations prevents infinite loops
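The 5-minute result cache amounts to a small TTL map; a minimal sketch (the project may use a different implementation, but the 300-second default matches the window above):

```python
import time

class TTLCache:
    """Minimal TTL cache sketch for vector-store search results."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() > expiry:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```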
## Extending the System
### Adding New Agents
1. Create agent file in `agents/` directory
2. Implement agent function returning Command
3. Add to workflow in `create_agent_graph()`
4. Update routing logic in Lead Agent
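A skeleton for step 2 might look like the following. A plain dict stands in for LangGraph's `Command` object so the sketch runs without the library installed; in real code you would `return Command(update={...}, goto="lead")`:

```python
def example_agent(state: dict) -> dict:
    """Hypothetical agent node returning an update plus a routing target."""
    notes = f"processed query for user {state['user_id']}"
    return {
        "update": {"research_notes": notes},  # state fields to merge
        "goto": "lead",                       # next node to route to
    }
```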
### Adding New Tools
1. Implement tool following LangChain Tool interface
2. Add to appropriate agent's tool list
3. Update agent prompts to describe new capabilities
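The shape of a candidate tool is just a typed, documented function; in LangChain you would typically wrap it with the `@tool` decorator so the name and docstring become its schema, then append it to an agent's tool list. A hypothetical example:

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in a string.

    Example tool body only; the function name and behavior here are
    made up for illustration, not part of the project.
    """
    return len(text.split())
```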
### Custom Memory Backends
1. Extend MemoryManager class
2. Implement required interface methods
3. Update initialization in memory_system.py
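The extension pattern can be sketched as follows. The base-class methods shown are assumptions standing in for the real `MemoryManager` interface in `memory_system.py`, and word-overlap scoring stands in for a proper vector search:

```python
class MemoryManager:
    """Stand-in base class; the real one lives in memory_system.py."""

    def store(self, question: str, answer: str) -> None:
        raise NotImplementedError

    def find_similar(self, question: str, limit: int = 3):
        raise NotImplementedError

class InMemoryBackend(MemoryManager):
    """Toy backend showing the extension pattern."""

    def __init__(self):
        self._records = []  # list of (question, answer) pairs

    def store(self, question, answer):
        self._records.append((question, answer))

    def find_similar(self, question, limit=3):
        # Naive word-overlap similarity in place of a vector search.
        words = set(question.lower().split())
        scored = sorted(
            self._records,
            key=lambda qa: -len(words & set(qa[0].lower().split())),
        )
        return scored[:limit]
```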
## Troubleshooting
### Common Issues
- **Missing API keys**: Check env.local file setup
- **Tool failures**: Verify network connectivity and API quotas
- **Memory errors**: Check Supabase configuration (optional)
- **Import errors**: Ensure all dependencies are installed
### Debug Mode
Set environment variable for detailed logging:
```bash
export LANGFUSE_DEBUG=true
```
This implementation follows the specified plan while incorporating LangGraph and Langfuse best practices for a robust, observable, and maintainable multi-agent system.