Final_Assignment_Template

Sleeping

File size: 5,525 Bytes

fe36046

# LangGraph Agent System Architecture

This document describes the architecture of the multi-agent system implemented using LangGraph 0.4.8+ and Langfuse 3.0.0.

## System Overview

The system implements a sophisticated agent architecture with memory, routing, specialized agents, and verification as shown in the system diagram.

## Core Components

### 1. Memory Layer
- **Short-Term Memory**: Graph state managed by LangGraph checkpointing
- **Checkpointer**: SQLite-based persistence for conversation continuity  
- **Long-Term Memory**: Supabase vector store with pgvector for Q&A storage

### 2. Plan + ReAct Loop
- Initial query analysis and planning
- Contextual prompt injection with system requirements
- Memory retrieval for similar past questions

### 3. Agent Router
- Intelligent routing based on query analysis
- Routes to specialized agents: Retrieval, Execution, or Critic
- Uses low-temperature LLM for consistent routing decisions

### 4. Specialized Agents

#### Retrieval Agent
- Information gathering from external sources
- Tools: Wikipedia, Arxiv, Tavily web search, vector store retrieval
- Handles attachment downloading for GAIA tasks
- Context-aware with memory integration

#### Execution Agent  
- Computational tasks and code execution
- Integrates with existing `code_agent.py` sandbox
- Python code execution with pandas, cv2, standard libraries
- Step-by-step problem breakdown

#### Critic Agent
- Response quality evaluation and review
- Accuracy, completeness, and logical consistency checks
- Scoring system with pass/fail determination
- Constructive feedback generation

### 5. Verification & Fallback
- Final quality control with system prompt compliance
- Format verification for exact-match requirements
- Retry logic with maximum attempt limits
- Graceful fallback pipeline for failed attempts

### 6. Observability (Langfuse)
- End-to-end tracing of all agent interactions
- Performance monitoring and debugging
- User session tracking
- Error logging and analysis

## Data Flow

1. **User Query** → Plan Node (system prompt injection)
2. **Plan Node** → Router (agent selection)
3. **Router** → Specialized Agent (task execution)
4. **Agent** → Tools (if needed) → Agent (results)
5. **Agent** → Verification (quality check)
6. **Verification** → Output or Retry/Fallback

## Key Features

### Memory Management
- Caching of similarity searches (TTL-based)
- Duplicate detection and prevention
- Task-based attachment tracking
- Session-specific cache management

### Quality Control
- Multi-level verification (agent → critic → verification)
- Retry mechanism with attempt limits
- Format compliance checking
- Fallback responses for failures

### Tracing & Observability
- Langfuse integration for complete observability
- Agent-level span tracking
- Error monitoring and debugging
- Performance metrics collection

### Tool Integration
- Modular tool system for each agent
- Sandboxed code execution environment
- External API integration (search, knowledge bases)
- Attachment handling for complex tasks

## Configuration

### Environment Variables
See `env.template` for required configuration:
- LLM API keys (Groq, OpenAI, Google, HuggingFace)
- Search tools (Tavily)
- Vector store (Supabase)
- Observability (Langfuse)
- GAIA API endpoints

### System Prompts
Located in `prompts/` directory:
- `system_prompt.txt`: Main system requirements
- `router_prompt.txt`: Agent routing instructions
- `retrieval_prompt.txt`: Information gathering guidelines
- `execution_prompt.txt`: Code execution instructions
- `critic_prompt.txt`: Quality evaluation criteria
- `verification_prompt.txt`: Final formatting rules

## Usage

### Basic Usage
```python
from src import run_agent_system

result = run_agent_system(
    query="Your question here",
    user_id="user123",
    session_id="session456"
)
```

### With Memory Management
```python
from src import memory_manager

# Check if query is similar to previous ones
similar = memory_manager.get_similar_qa(query)

# Clear session cache
memory_manager.clear_session_cache()
```

### Direct Graph Access
```python
from src import create_agent_graph

workflow = create_agent_graph()
app = workflow.compile(checkpointer=checkpointer)
result = app.invoke(initial_state, config=config)
```

## Dependencies

### Core Framework
- `langgraph>=0.4.8`: Graph-based agent orchestration
- `langgraph-checkpoint-sqlite>=2.0.0`: Persistence layer
- `langchain>=0.3.0`: LLM and tool abstractions

### Observability
- `langfuse==3.0.0`: Tracing and monitoring

### Memory & Storage
- `supabase>=2.8.0`: Vector database backend
- `pgvector>=0.3.0`: Vector similarity search

### Tools & APIs
- `tavily-python>=0.5.0`: Web search
- `arxiv>=2.1.0`: Academic paper search
- `wikipedia>=1.4.0`: Knowledge base access

## Error Handling

The system implements comprehensive error handling:
- Graceful degradation when services are unavailable
- Fallback responses for critical failures
- Retry logic with exponential backoff
- Detailed error logging for debugging

## Performance Considerations

- Vector store caching reduces duplicate searches
- Checkpoint-based state management for conversation continuity
- Efficient tool routing based on query analysis
- Memory cleanup for long-running sessions

## Future Enhancements

- Additional specialized agents (e.g., Image Analysis, Code Review)
- Enhanced memory clustering and retrieval algorithms
- Real-time collaboration between agents
- Advanced tool composition and chaining