# LangGraph Multi-Agent System
A sophisticated multi-agent system built with LangGraph that follows best practices for state management, tracing, and iterative workflows.
## Architecture Overview
The system implements an iterative research/code loop with specialized agents:
```
User Query → Lead Agent → Research Agent → Code Agent → Lead Agent (loop) → Answer Formatter → Final Answer
```
### Key Components
1. **Lead Agent** (`agents/lead_agent.py`)
- Orchestrates the entire workflow
- Makes routing decisions between research and code agents
- Manages the iterative loop with a maximum of 3 iterations
- Synthesizes information from specialists into draft answers
2. **Research Agent** (`agents/research_agent.py`)
- Handles information gathering from multiple sources
- Uses web search (Tavily), Wikipedia, and ArXiv tools
- Provides structured research results with citations
3. **Code Agent** (`agents/code_agent.py`)
- Performs mathematical calculations and code execution
- Uses calculator tools for basic operations
- Executes Python code in a sandboxed environment
- Handles Hugging Face Hub statistics
4. **Answer Formatter** (`agents/answer_formatter.py`)
- Ensures GAIA benchmark compliance
- Extracts final answers according to exact-match rules
- Handles different answer types (numbers, strings, lists)
5. **Memory System** (`memory_system.py`)
- Vector store integration for long-term learning
- Session-based caching for performance
- Similar question retrieval for context
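The Lead Agent's routing role (item 1 above) can be sketched as a pure function that reads state and returns a `Command`. This is a minimal, self-contained sketch: `Command` here is a dataclass stand-in for `langgraph.types.Command`, and the state keys and heuristic are assumptions, not the actual logic in `agents/lead_agent.py`.

```python
from dataclasses import dataclass, field


@dataclass
class Command:  # stand-in for langgraph.types.Command
    goto: str
    update: dict = field(default_factory=dict)


MAX_ITERATIONS = 3  # hard loop limit from the workflow


def lead_agent(state: dict) -> Command:
    """Pure function: inspect state, return a routing Command with updates."""
    loops = state.get("loop_counter", 0)
    if state.get("done") or loops >= MAX_ITERATIONS:
        return Command(goto="answer_formatter")
    # Toy heuristic: gather research first, then move to computation
    goto = "research_agent" if not state.get("research_notes") else "code_agent"
    return Command(goto=goto, update={"loop_counter": loops + 1})
```

Returning a `Command` (rather than mutating state in place) is what keeps each node a pure function over the shared state.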
## Core Features
### State Management
- **Immutable State**: Uses LangGraph's Command pattern for pure functions
- **Typed Schema**: AgentState TypedDict ensures type safety
- **Accumulation**: Research notes and code outputs accumulate across iterations
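One plausible shape for the typed schema, matching the state keys used in the Direct Graph Access example below; the repo's actual `AgentState` may differ. The `operator.add` annotation is what tells LangGraph to accumulate a field across node updates instead of overwriting it.

```python
import operator
from typing import Annotated, TypedDict


class AgentState(TypedDict):
    messages: Annotated[list, operator.add]  # accumulated, never overwritten
    draft_answer: str
    research_notes: str
    code_outputs: str
    loop_counter: int
    done: bool
    next: str
    final_answer: str
    user_id: str
    session_id: str
```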
### Observability (Langfuse v3)
- **OTEL-Native Integration**: Uses Langfuse v3 with OpenTelemetry for automatic trace correlation
- **Single Callback Handler**: One global handler passes traces seamlessly through LangGraph
- **Predictable Span Naming**: `agent/<role>`, `tool/<name>`, `llm/<model>` patterns for cost/latency dashboards
- **Session Stitching**: User and session tracking for conversation continuity
- **Background Flushing**: Non-blocking trace export for optimal performance
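A minimal wiring sketch, assuming Langfuse v3's LangChain callback handler (`langfuse.langchain.CallbackHandler`) and that the credentials from the Setup section are already in the environment; the repo's actual setup may differ.

```python
# Assumed Langfuse v3 wiring: one global CallbackHandler reused across
# all graph invocations (requires the langfuse package plus the env vars
# listed under Setup).
from langfuse.langchain import CallbackHandler

langfuse_handler = CallbackHandler()

# Passed once per invocation; LangGraph propagates it to every node:
# final_state = await app.ainvoke(
#     initial_state,
#     config={"callbacks": [langfuse_handler]},
# )
```

Creating the handler once and reusing it is what keeps spans from the same session stitched into a single trace.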
### Tools Integration
- **Web Search**: Tavily API for current information
- **Knowledge Bases**: Wikipedia and ArXiv for encyclopedic/academic content
- **Computation**: Calculator tools and Python execution
- **Hub Statistics**: Hugging Face model information
## Setup
### Environment Variables
Create an `env.local` file with:
```bash
# LLM API
GROQ_API_KEY=your_groq_api_key
# Search Tools
TAVILY_API_KEY=your_tavily_api_key
# Observability
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com
# Memory (Optional)
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_KEY=your_supabase_service_key
```
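Note that python-dotenv defaults to loading `.env`, so the file name must be passed explicitly (e.g. `load_dotenv("env.local")`). As an illustration of what that call does, here is a dependency-free sketch of the same parsing; the real loader handles quoting and interpolation that this one does not.

```python
import os


def load_env_file(path: str = "env.local") -> None:
    """Load KEY=value lines into os.environ, skipping comments and blanks."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            # setdefault: real environment variables win over the file
            os.environ.setdefault(key.strip(), value.strip())
```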
### Dependencies
The system requires:
- `langgraph>=0.4.8`
- `langchain>=0.3.0`
- `langchain-groq`
- `langfuse>=3.0.0`
- `python-dotenv`
- `tavily-python`
## Usage
### Basic Usage
```python
import asyncio
from langgraph_agent_system import run_agent_system
async def main():
    result = await run_agent_system(
        query="What is the capital of Maharashtra?",
        user_id="user_123",
        session_id="session_456",
    )
    print(f"Answer: {result}")

asyncio.run(main())
```
### Testing
Run the test suite to verify functionality:
```bash
python test_new_multi_agent_system.py
```
Test Langfuse v3 observability integration:
```bash
python test_observability.py
```
### Direct Graph Access
```python
from langchain_core.messages import HumanMessage
from langgraph_agent_system import create_agent_graph

# Create and compile the workflow
workflow = create_agent_graph()
app = workflow.compile()

# Run with initial state (inside an async context)
initial_state = {
    "messages": [HumanMessage(content="Your question")],
    "draft_answer": "",
    "research_notes": "",
    "code_outputs": "",
    "loop_counter": 0,
    "done": False,
    "next": "research",
    "final_answer": "",
    "user_id": "user_123",
    "session_id": "session_456",
}
final_state = await app.ainvoke(initial_state)
print(final_state["final_answer"])
```
## Workflow Details
### Iterative Loop
1. **Lead Agent** analyzes the query and decides on next action
2. If research needed → **Research Agent** gathers information
3. If computation needed → **Code Agent** performs calculations
4. Back to **Lead Agent** for synthesis and next decision
5. When sufficient information → **Answer Formatter** creates final answer
### Routing Logic
The Lead Agent uses the following criteria:
- **Research**: Factual information, current events, citations needed
- **Code**: Mathematical calculations, data analysis, programming tasks
- **Formatter**: Sufficient information gathered OR max iterations reached
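The criteria above reduce to a conditional-edge function that maps the Lead Agent's decision to a node name. A sketch, with node names assumed rather than taken from the repo:

```python
def route_from_lead(state: dict) -> str:
    """Return the name of the next node based on the Lead Agent's decision."""
    # Formatter: enough information gathered OR the hard iteration limit hit
    if state.get("done") or state.get("loop_counter", 0) >= 3:
        return "answer_formatter"
    # Otherwise follow the Lead Agent's stated choice, defaulting to research
    return {"research": "research_agent", "code": "code_agent"}.get(
        state.get("next", "research"), "research_agent"
    )
```

Inside `create_agent_graph()`, a function like this would be registered with `workflow.add_conditional_edges(...)`, keeping routing separate from the agents' business logic.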
### GAIA Compliance
The Answer Formatter ensures exact-match requirements:
- **Numbers**: No commas, units, or extra symbols
- **Strings**: Remove unnecessary articles and formatting
- **Lists**: Comma and space separation
- **No surrounding text**: No "Answer:", quotes, or brackets
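A hypothetical normalizer illustrating the exact-match rules above; the actual Answer Formatter is LLM-driven and handles more cases than this sketch.

```python
import re


def normalize_answer(raw: str) -> str:
    """Apply the GAIA exact-match rules: strip prefixes/quotes, clean numbers,
    space-separate lists, drop leading articles."""
    ans = raw.strip().strip("\"'")
    ans = re.sub(r"^(final\s+)?answer\s*:\s*", "", ans, flags=re.IGNORECASE)
    if re.fullmatch(r"-?[\d,]+(\.\d+)?", ans):   # number: drop commas
        return ans.replace(",", "")
    if "," in ans:                               # list: comma + single space
        return ", ".join(part.strip() for part in ans.split(","))
    return re.sub(r"^(the|a|an)\s+", "", ans, flags=re.IGNORECASE)
```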
## Best Practices Implemented
### LangGraph Patterns
- ✅ Pure functions (AgentState → Command)
- ✅ Immutable state with explicit updates
- ✅ Typed state schema with operator annotations
- ✅ Clear routing separated from business logic
### Langfuse v3 Observability
- ✅ OTEL-native SDK with automatic trace correlation
- ✅ Single global callback handler for seamless LangGraph integration
- ✅ Predictable span naming (`agent/<role>`, `tool/<name>`, `llm/<model>`)
- ✅ Session and user tracking with environment tagging
- ✅ Background trace flushing for performance
- ✅ Graceful degradation when observability unavailable
### Memory Management
- ✅ TTL-based caching for performance
- ✅ Vector store integration for learning
- ✅ Duplicate detection and prevention
- ✅ Session cleanup for long-running instances
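The TTL-based caching above can be sketched with a small dictionary-backed cache; the real `MemoryManager` likely layers this on top of the vector store, and the 5-minute default mirrors the figure given under Performance Considerations.

```python
import time


class TTLCache:
    """Expiring key/value cache: entries vanish after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0):  # 5-minute default
        self.ttl = ttl_seconds
        self._store: dict = {}

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict lazily on read
            return default
        return value
```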
## Error Handling
The system implements graceful degradation:
- **Tool failures**: Continue with available tools
- **API timeouts**: Retry with backoff
- **Memory errors**: Degrade to LLM-only mode
- **Agent failures**: Return informative error messages
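The retry-with-backoff behavior for API timeouts can be sketched as a small wrapper; the attempt count, delays, and exception types here are illustrative, not the system's actual settings.

```python
import time


def with_backoff(fn, attempts: int = 3, base_delay: float = 0.5,
                 exceptions: tuple = (TimeoutError,)):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise                                # out of retries
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```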
## Performance Considerations
- **Caching**: Vector store searches cached for 5 minutes
- **Parallelization**: Tools can be executed in parallel
- **Memory limits**: Sandbox execution has resource constraints
- **Loop termination**: Hard limit of 3 iterations prevents infinite loops
## Extending the System
### Adding New Agents
1. Create agent file in `agents/` directory
2. Implement agent function returning Command
3. Add to workflow in `create_agent_graph()`
4. Update routing logic in Lead Agent
### Adding New Tools
1. Implement tool following LangChain Tool interface
2. Add to appropriate agent's tool list
3. Update agent prompts to describe new capabilities
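A hypothetical example of step 1: a calculator-style tool body with the docstring an agent would read, ready to be wrapped with LangChain's `@tool` decorator (`from langchain_core.tools import tool`). The decorator line is commented out so the sketch stays dependency-free; the name and behavior are illustrative.

```python
import ast
import operator as op

_OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
        ast.Div: op.truediv, ast.Pow: op.pow, ast.USub: op.neg}


# @tool  # from langchain_core.tools import tool
def calculator(expression: str) -> float:
    """Safely evaluate a basic arithmetic expression like '2 * (3 + 4)'."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp):
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp):
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expression, mode="eval").body)
```

Walking the AST instead of calling `eval()` keeps arbitrary code out of the tool, in line with the sandboxing described for the Code Agent.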
### Custom Memory Backends
1. Extend MemoryManager class
2. Implement required interface methods
3. Update initialization in memory_system.py
## Troubleshooting
### Common Issues
- **Missing API keys**: Check env.local file setup
- **Tool failures**: Verify network connectivity and API quotas
- **Memory errors**: Check Supabase configuration (optional)
- **Import errors**: Ensure all dependencies are installed
### Debug Mode
Set environment variable for detailed logging:
```bash
export LANGFUSE_DEBUG=true
```
This implementation follows the specified plan while incorporating LangGraph and Langfuse best practices for a robust, observable, and maintainable multi-agent system.