# LangGraph Multi-Agent System

A sophisticated multi-agent system built with LangGraph that follows best practices for state management, tracing, and iterative workflows.

## Architecture Overview

The system implements an iterative research/code loop with specialized agents:

```
User Query β†’ Lead Agent β†’ Research Agent β†’ Code Agent β†’ Lead Agent (loop) β†’ Answer Formatter β†’ Final Answer
```

### Key Components

1. **Lead Agent** (`agents/lead_agent.py`)
   - Orchestrates the entire workflow
   - Makes routing decisions between research and code agents
   - Manages the iterative loop with a maximum of 3 iterations
   - Synthesizes information from specialists into draft answers

2. **Research Agent** (`agents/research_agent.py`)
   - Handles information gathering from multiple sources
   - Uses web search (Tavily), Wikipedia, and ArXiv tools
   - Provides structured research results with citations

3. **Code Agent** (`agents/code_agent.py`)
   - Performs mathematical calculations and code execution
   - Uses calculator tools for basic operations
   - Executes Python code in a sandboxed environment
   - Handles Hugging Face Hub statistics

4. **Answer Formatter** (`agents/answer_formatter.py`)
   - Ensures GAIA benchmark compliance
   - Extracts final answers according to exact-match rules
   - Handles different answer types (numbers, strings, lists)

5. **Memory System** (`memory_system.py`)
   - Vector store integration for long-term learning
   - Session-based caching for performance
   - Similar question retrieval for context

## Core Features

### State Management
- **Immutable State**: Uses LangGraph's Command pattern for pure functions
- **Typed Schema**: AgentState TypedDict ensures type safety
- **Accumulation**: Research notes and code outputs accumulate across iterations
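As a sketch, the typed schema described above might look like the following (field names are taken from the initial-state example later in this README; the exact definition lives in the source, and `Annotated[..., operator.add]` is LangGraph's convention for fields that accumulate across iterations):

```python
import operator
from typing import Annotated, TypedDict

class AgentState(TypedDict):
    """Typed state schema; accumulating fields carry operator.add annotations."""
    messages: Annotated[list, operator.add]        # conversation history accumulates
    research_notes: Annotated[str, operator.add]   # appended by the Research Agent
    code_outputs: Annotated[str, operator.add]     # appended by the Code Agent
    draft_answer: str
    loop_counter: int
    done: bool
    next: str
    final_answer: str
    user_id: str
    session_id: str
```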

### Observability (Langfuse v3)
- **OTEL-Native Integration**: Uses Langfuse v3 with OpenTelemetry for automatic trace correlation
- **Single Callback Handler**: One global handler passes traces seamlessly through LangGraph 
- **Predictable Span Naming**: `agent/<role>`, `tool/<name>`, `llm/<model>` patterns for cost/latency dashboards
- **Session Stitching**: User and session tracking for conversation continuity
- **Background Flushing**: Non-blocking trace export for optimal performance
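A minimal configuration sketch of the single-handler setup, assuming Langfuse v3 credentials are present in the environment (the session/user metadata keys follow the Langfuse v3 convention; the compiled `app` and IDs are placeholders):

```python
from langfuse.langchain import CallbackHandler  # Langfuse v3 LangChain integration

# Credentials are read from LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST.
handler = CallbackHandler()

# One global handler passed through LangGraph's config correlates every node,
# tool call, and LLM call into a single trace.
config = {
    "callbacks": [handler],
    "metadata": {
        "langfuse_session_id": "session_456",  # placeholder session ID
        "langfuse_user_id": "user_123",        # placeholder user ID
    },
}
# final_state = await app.ainvoke(initial_state, config=config)
```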

### Tools Integration
- **Web Search**: Tavily API for current information
- **Knowledge Bases**: Wikipedia and ArXiv for encyclopedic/academic content
- **Computation**: Calculator tools and Python execution
- **Hub Statistics**: Hugging Face model information

## Setup

### Environment Variables
Create an `env.local` file with:

```bash
# LLM API
GROQ_API_KEY=your_groq_api_key

# Search Tools
TAVILY_API_KEY=your_tavily_api_key

# Observability
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com

# Memory (Optional)
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_KEY=your_supabase_service_key
```

### Dependencies
The system requires:
- `langgraph>=0.4.8`
- `langchain>=0.3.0`
- `langchain-groq`
- `langfuse>=3.0.0`
- `python-dotenv`
- `tavily-python`

## Usage

### Basic Usage

```python
import asyncio
from langgraph_agent_system import run_agent_system

async def main():
    result = await run_agent_system(
        query="What is the capital of Maharashtra?",
        user_id="user_123",
        session_id="session_456"
    )
    print(f"Answer: {result}")

asyncio.run(main())
```

### Testing

Run the test suite to verify functionality:

```bash
python test_new_multi_agent_system.py
```

Test Langfuse v3 observability integration:

```bash
python test_observability.py
```

### Direct Graph Access

```python
import asyncio
from langchain_core.messages import HumanMessage
from langgraph_agent_system import create_agent_graph

# Create and compile the workflow
workflow = create_agent_graph()
app = workflow.compile()

# Run with initial state
initial_state = {
    "messages": [HumanMessage(content="Your question")],
    "draft_answer": "",
    "research_notes": "",
    "code_outputs": "",
    "loop_counter": 0,
    "done": False,
    "next": "research",
    "final_answer": "",
    "user_id": "user_123",
    "session_id": "session_456",
}

async def main():
    final_state = await app.ainvoke(initial_state)
    print(final_state["final_answer"])

asyncio.run(main())
```

## Workflow Details

### Iterative Loop
1. **Lead Agent** analyzes the query and decides on next action
2. If research needed β†’ **Research Agent** gathers information
3. If computation needed β†’ **Code Agent** performs calculations
4. Back to **Lead Agent** for synthesis and next decision
5. When sufficient information β†’ **Answer Formatter** creates final answer

### Routing Logic
The Lead Agent uses the following criteria:
- **Research**: Factual information, current events, citations needed
- **Code**: Mathematical calculations, data analysis, programming tasks
- **Formatter**: Sufficient information gathered OR max iterations reached
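The criteria above can be sketched as a plain decision function. This is a simplified illustration only (the real Lead Agent uses an LLM to decide, and the `needs_computation` flag is hypothetical); what it does show faithfully is the three-iteration hard cap:

```python
MAX_ITERATIONS = 3  # matches the loop limit described above

def route(state: dict) -> str:
    """Return the next node name following the Lead Agent's routing criteria."""
    # Formatter: sufficient information gathered OR max iterations reached.
    if state["done"] or state["loop_counter"] >= MAX_ITERATIONS:
        return "formatter"
    # Code: mathematical calculations, data analysis, programming tasks.
    if state.get("needs_computation"):  # hypothetical flag for illustration
        return "code"
    # Research: everything that still lacks factual grounding.
    return "research"
```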

### GAIA Compliance
The Answer Formatter ensures exact-match requirements:
- **Numbers**: No commas, units, or extra symbols
- **Strings**: Remove unnecessary articles and formatting
- **Lists**: Comma and space separation
- **No surrounding text**: No "Answer:", quotes, or brackets
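These exact-match rules can be illustrated with a small normalizer. This is a hypothetical sketch of the behavior, not the actual Answer Formatter implementation:

```python
import re

def normalize_answer(raw: str) -> str:
    """Illustrative exact-match cleanup: strip wrappers, separators, articles."""
    text = raw.strip()
    # No surrounding text: drop "Answer:" prefixes, quotes, and brackets.
    text = re.sub(r"^(final\s+)?answer\s*:\s*", "", text, flags=re.IGNORECASE)
    text = text.strip("\"'[]() ")
    if re.fullmatch(r"-?[\d,]+(\.\d+)?\s*\w*%?", text):
        # Numbers: remove thousands separators, units, and extra symbols.
        text = re.sub(r"[^\d.\-]", "", text)
    else:
        # Strings: remove unnecessary leading articles.
        text = re.sub(r"^(the|a|an)\s+", "", text, flags=re.IGNORECASE)
    return text
```

Lists such as `1, 2, 3` already use comma-and-space separation and pass through unchanged.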

## Best Practices Implemented

### LangGraph Patterns
- βœ… Pure functions (AgentState β†’ Command)
- βœ… Immutable state with explicit updates
- βœ… Typed state schema with operator annotations
- βœ… Clear routing separated from business logic

### Langfuse v3 Observability
- βœ… OTEL-native SDK with automatic trace correlation
- βœ… Single global callback handler for seamless LangGraph integration
- βœ… Predictable span naming (`agent/<role>`, `tool/<name>`, `llm/<model>`)
- βœ… Session and user tracking with environment tagging
- βœ… Background trace flushing for performance
- βœ… Graceful degradation when observability unavailable

### Memory Management
- βœ… TTL-based caching for performance
- βœ… Vector store integration for learning
- βœ… Duplicate detection and prevention
- βœ… Session cleanup for long-running instances

## Error Handling

The system implements graceful degradation:
- **Tool failures**: Continue with available tools
- **API timeouts**: Retry with backoff
- **Memory errors**: Degrade to LLM-only mode
- **Agent failures**: Return informative error messages
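The retry-with-backoff behavior can be sketched as a generic helper (hypothetical relative to the actual code, which wraps its API calls similarly):

```python
import time

def with_retries(fn, attempts=3, base_delay=0.5, exceptions=(Exception,)):
    """Call fn(), retrying on failure with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```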

## Performance Considerations

- **Caching**: Vector store searches cached for 5 minutes
- **Parallelization**: Tools can be executed in parallel
- **Memory limits**: Sandbox execution has resource constraints
- **Loop termination**: Hard limit of 3 iterations prevents infinite loops
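The 5-minute search cache can be sketched as a simple TTL dictionary (illustrative only; the real caching lives in `memory_system.py`):

```python
import time

class TTLCache:
    """Tiny TTL cache: entries expire after ttl seconds (300 = 5 minutes)."""

    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() > expiry:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```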

## Extending the System

### Adding New Agents
1. Create agent file in `agents/` directory
2. Implement agent function returning Command
3. Add to workflow in `create_agent_graph()`
4. Update routing logic in Lead Agent

### Adding New Tools
1. Implement tool following LangChain Tool interface
2. Add to appropriate agent's tool list
3. Update agent prompts to describe new capabilities

### Custom Memory Backends
1. Extend MemoryManager class
2. Implement required interface methods
3. Update initialization in memory_system.py
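The required interface might look like the following abstract base (method names here are hypothetical, since the actual MemoryManager interface is defined in `memory_system.py`):

```python
from abc import ABC, abstractmethod

class MemoryManager(ABC):
    """Illustrative interface a custom memory backend would implement."""

    @abstractmethod
    def store(self, question: str, answer: str) -> None:
        """Persist a solved question/answer pair."""

    @abstractmethod
    def find_similar(self, question: str, k: int = 3) -> list:
        """Return up to k previously seen similar questions."""

class InMemoryBackend(MemoryManager):
    """Trivial backend useful for tests: exact-match lookup only."""

    def __init__(self):
        self._pairs = []

    def store(self, question, answer):
        self._pairs.append((question, answer))

    def find_similar(self, question, k=3):
        return [p for p in self._pairs if p[0] == question][:k]
```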

## Troubleshooting

### Common Issues
- **Missing API keys**: Check the `env.local` file setup
- **Tool failures**: Verify network connectivity and API quotas
- **Memory errors**: Check Supabase configuration (optional)
- **Import errors**: Ensure all dependencies are installed

### Debug Mode
Set environment variable for detailed logging:
```bash
export LANGFUSE_DEBUG=true
```

This implementation follows the specified plan while incorporating LangGraph and Langfuse best practices for a robust, observable, and maintainable multi-agent system.