# LangGraph Agent System Architecture

This document describes the architecture of the multi-agent system implemented using LangGraph 0.4.8+ and Langfuse 3.0.0.

## System Overview

The system combines a memory layer, a planning loop, an agent router, specialized agents, and a verification stage, as shown in the system diagram.

## Core Components

### 1. Memory Layer
- **Short-Term Memory**: Graph state managed by LangGraph checkpointing
- **Checkpointer**: SQLite-based persistence for conversation continuity  
- **Long-Term Memory**: Supabase vector store with pgvector for Q&A storage
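
To illustrate the checkpointing idea, here is a minimal sketch of per-thread state persistence using only the standard-library `sqlite3` module. It is not LangGraph's own checkpointer; the `StateStore` class, table name, and column names are hypothetical and chosen for this example only.

```python
import json
import sqlite3
from typing import Optional


class StateStore:
    """Toy checkpoint store: one JSON-encoded state per conversation thread."""

    def __init__(self, path: str = ":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )

    def save(self, thread_id: str, state: dict) -> None:
        # INSERT OR REPLACE keeps only the newest state for each thread.
        self._conn.execute(
            "INSERT OR REPLACE INTO checkpoints (thread_id, state) VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self._conn.commit()

    def load(self, thread_id: str) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None
```

Passing a file path instead of `":memory:"` would let a conversation resume after a process restart, which is the property the Checkpointer bullet describes.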

### 2. Plan + ReAct Loop
- Initial query analysis and planning
- Contextual prompt injection with system requirements
- Memory retrieval for similar past questions

### 3. Agent Router
- Intelligent routing based on query analysis
- Routes to specialized agents: Retrieval, Execution, or Critic
- Uses low-temperature LLM for consistent routing decisions
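
The routing decision can be sketched as a function from a query to an agent name. In the system described here an LLM at low temperature makes that choice; the keyword heuristic below is only a deterministic stand-in so the example runs on its own. The function name `route_query` and the hint lists are assumptions for illustration.

```python
from typing import Literal

AgentName = Literal["retrieval", "execution", "critic"]

# Hypothetical keyword hints; the real router asks a low-temperature LLM instead.
EXECUTION_HINTS = ("calculate", "compute", "run", "code", "plot")
CRITIC_HINTS = ("review", "verify", "check", "evaluate")


def route_query(query: str) -> AgentName:
    """Pick a specialized agent for a query (stand-in for the LLM router)."""
    words = query.lower().split()
    if any(hint in words for hint in EXECUTION_HINTS):
        return "execution"
    if any(hint in words for hint in CRITIC_HINTS):
        return "critic"
    # Default: gather information first.
    return "retrieval"
```

A deterministic rule plays the same role that low temperature plays for the LLM: the same query should always reach the same agent.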

### 4. Specialized Agents

#### Retrieval Agent
- Information gathering from external sources
- Tools: Wikipedia, Arxiv, Tavily web search, vector store retrieval
- Handles attachment downloading for GAIA tasks
- Context-aware with memory integration

#### Execution Agent  
- Computational tasks and code execution
- Integrates with existing `code_agent.py` sandbox
- Python code execution with pandas, cv2, standard libraries
- Step-by-step problem breakdown

#### Critic Agent
- Response quality evaluation and review
- Accuracy, completeness, and logical consistency checks
- Scoring system with pass/fail determination
- Constructive feedback generation
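
One way to picture the scoring system is a small record holding a score per criterion plus a pass/fail rule. This is a sketch under assumptions: the `CriticVerdict` name, the unweighted mean, and the 0.7 threshold are illustrative and not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class CriticVerdict:
    accuracy: float       # each score is expected to lie in [0, 1]
    completeness: float
    consistency: float
    feedback: str = ""

    @property
    def score(self) -> float:
        """Unweighted mean of the three criteria (an assumed aggregation)."""
        return (self.accuracy + self.completeness + self.consistency) / 3

    def passed(self, threshold: float = 0.7) -> bool:
        """Pass/fail determination against an assumed threshold."""
        return self.score >= threshold
```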

### 5. Verification & Fallback
- Final quality control with system prompt compliance
- Format verification for exact-match requirements
- Retry logic with maximum attempt limits
- Graceful fallback pipeline for failed attempts
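
The retry-then-fallback behavior can be sketched as a bounded loop. The names `answer_with_verification` and `FALLBACK_ANSWER`, and the default of three attempts, are hypothetical; the actual limits and fallback pipeline are defined by the project.

```python
from typing import Callable

FALLBACK_ANSWER = "Unable to produce a verified answer."  # hypothetical fallback text


def answer_with_verification(
    generate: Callable[[int], str],
    verify: Callable[[str], bool],
    max_attempts: int = 3,
) -> str:
    """Call generate(attempt) until verify accepts the answer or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        candidate = generate(attempt)
        if verify(candidate):
            return candidate
    # Every attempt failed verification: degrade gracefully.
    return FALLBACK_ANSWER
```

For an exact-match task, `verify` might check the output format, for example `str.isdigit` when a bare number is expected.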

### 6. Observability (Langfuse)
- End-to-end tracing of all agent interactions
- Performance monitoring and debugging
- User session tracking
- Error logging and analysis

## Data Flow

1. **User Query** → Plan Node (system prompt injection)
2. **Plan Node** → Router (agent selection)
3. **Router** → Specialized Agent (task execution)
4. **Agent** → Tools (if needed) → Agent (results)
5. **Agent** → Verification (quality check)
6. **Verification** → Output or Retry/Fallback
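
The six steps above can be sketched as a chain of functions that each take and return a state dictionary. Plain Python functions stand in for the LangGraph nodes here, and every name and state key (`plan`, `route`, `AGENTS`, `draft`, `status`) is an assumption made for illustration.

```python
from typing import Callable, Dict

State = Dict[str, object]


def plan(state: State) -> State:
    # Step 1: inject the system prompt ahead of the user query.
    return {**state, "prompt": f"[system] {state['query']}"}


def route(state: State) -> State:
    # Step 2: choose an agent (fixed here; the real router uses an LLM).
    return {**state, "agent": "retrieval"}


AGENTS: Dict[str, Callable[[State], State]] = {
    # Steps 3-4: a real agent would call its tools and fold the results into a draft.
    "retrieval": lambda s: {**s, "draft": f"draft answer for: {s['query']}"},
}


def verify(state: State) -> State:
    # Steps 5-6: accept non-empty drafts, otherwise signal a retry.
    ok = bool(state.get("draft"))
    return {**state, "status": "output" if ok else "retry"}


def run_pipeline(query: str) -> State:
    state: State = {"query": query}
    state = plan(state)
    state = route(state)
    state = AGENTS[str(state["agent"])](state)
    return verify(state)
```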

## Key Features

### Memory Management
- Caching of similarity searches (TTL-based)
- Duplicate detection and prevention
- Task-based attachment tracking
- Session-specific cache management
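
A TTL-based cache for similarity searches could look roughly like the sketch below. The `TTLCache` class is hypothetical and not the project's `memory_manager`; the injectable `clock` parameter exists only to make expiry easy to test.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Cache values for ttl_seconds; expired entries are dropped on read."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            # Entry is older than the TTL: evict it and report a miss.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)

    def clear(self) -> None:
        """Drop everything, e.g. when a session ends."""
        self._entries.clear()
```

Keying the cache on the query text means a repeated question within the TTL window skips the vector store round trip.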

### Quality Control
- Multi-level verification (agent → critic → verification)
- Retry mechanism with attempt limits
- Format compliance checking
- Fallback responses for failures

### Tracing & Observability
- Langfuse integration for complete observability
- Agent-level span tracking
- Error monitoring and debugging
- Performance metrics collection

### Tool Integration
- Modular tool system for each agent
- Sandboxed code execution environment
- External API integration (search, knowledge bases)
- Attachment handling for complex tasks

## Configuration

### Environment Variables
See `env.template` for required configuration:
- LLM API keys (Groq, OpenAI, Google, HuggingFace)
- Search tools (Tavily)
- Vector store (Supabase)
- Observability (Langfuse)
- GAIA API endpoints
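
A startup check for missing configuration might look like the sketch below. The variable names in `REQUIRED_VARS` are assumptions for illustration; `env.template` remains the authoritative list.

```python
import os
from typing import Iterable, List, Mapping

# Assumed names for illustration only; consult env.template for the real ones.
REQUIRED_VARS = (
    "TAVILY_API_KEY",
    "SUPABASE_URL",
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
)


def missing_vars(
    names: Iterable[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in names if not env.get(name)]
```

Calling `missing_vars()` before building the graph lets the application fail early with a clear message instead of partway through a run.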

### System Prompts
Located in `prompts/` directory:
- `system_prompt.txt`: Main system requirements
- `router_prompt.txt`: Agent routing instructions
- `retrieval_prompt.txt`: Information gathering guidelines
- `execution_prompt.txt`: Code execution instructions
- `critic_prompt.txt`: Quality evaluation criteria
- `verification_prompt.txt`: Final formatting rules

## Usage

### Basic Usage
```python
from src import run_agent_system

result = run_agent_system(
    query="Your question here",
    user_id="user123",
    session_id="session456"
)
```

### With Memory Management
```python
from src import memory_manager

# Check if query is similar to previous ones
query = "Your question here"
similar = memory_manager.get_similar_qa(query)

# Clear session cache
memory_manager.clear_session_cache()
```

### Direct Graph Access
```python
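import sqlite3

from langgraph.checkpoint.sqlite import SqliteSaver
from src import create_agent_graph

# SQLite-backed checkpointer; the database file name is an example
checkpointer = SqliteSaver(sqlite3.connect("checkpoints.db", check_same_thread=False))
workflow = create_agent_graph()
app = workflow.compile(checkpointer=checkpointer)
config = {"configurable": {"thread_id": "session456"}}  # one thread per conversation
result = app.invoke(initial_state, config=config)  # initial_state: dict matching the graph's state schema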
```

## Dependencies

### Core Framework
- `langgraph>=0.4.8`: Graph-based agent orchestration
- `langgraph-checkpoint-sqlite>=2.0.0`: Persistence layer
- `langchain>=0.3.0`: LLM and tool abstractions

### Observability
- `langfuse==3.0.0`: Tracing and monitoring

### Memory & Storage
- `supabase>=2.8.0`: Vector database backend
- `pgvector>=0.3.0`: Vector similarity search

### Tools & APIs
- `tavily-python>=0.5.0`: Web search
- `arxiv>=2.1.0`: Academic paper search
- `wikipedia>=1.4.0`: Knowledge base access

## Error Handling

The system implements comprehensive error handling:
- Graceful degradation when services are unavailable
- Fallback responses for critical failures
- Retry logic with exponential backoff
- Detailed error logging for debugging
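
Retry with exponential backoff can be sketched as follows. The helper name `retry_with_backoff` and its default values are assumptions; the `sleep` parameter is injectable so the delays can be observed without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry call(), doubling the delay after each failure; re-raise the last error."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

In practice the `except` clause would be narrowed to the transient errors of the service being called, so that programming errors are not retried.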

## Performance Considerations

- Vector store caching reduces duplicate searches
- Checkpoint-based state management for conversation continuity
- Efficient tool routing based on query analysis
- Memory cleanup for long-running sessions

## Future Enhancements

- Additional specialized agents (e.g., Image Analysis, Code Review)
- Enhanced memory clustering and retrieval algorithms
- Real-time collaboration between agents
- Advanced tool composition and chaining