LangGraph Multi-Agent System

A sophisticated multi-agent system built with LangGraph that follows best practices for state management, tracing, and iterative workflows.

Architecture Overview

The system implements an iterative research/code loop with specialized agents:

User Query → Lead Agent → Research Agent → Code Agent → Lead Agent (loop) → Answer Formatter → Final Answer

Key Components

  1. Lead Agent (agents/lead_agent.py)

    • Orchestrates the entire workflow
    • Makes routing decisions between research and code agents
    • Manages the iterative loop with a maximum of 3 iterations
    • Synthesizes information from specialists into draft answers
  2. Research Agent (agents/research_agent.py)

    • Handles information gathering from multiple sources
    • Uses web search (Tavily), Wikipedia, and ArXiv tools
    • Provides structured research results with citations
  3. Code Agent (agents/code_agent.py)

    • Performs mathematical calculations and code execution
    • Uses calculator tools for basic operations
    • Executes Python code in a sandboxed environment
    • Handles Hugging Face Hub statistics
  4. Answer Formatter (agents/answer_formatter.py)

    • Ensures GAIA benchmark compliance
    • Extracts final answers according to exact-match rules
    • Handles different answer types (numbers, strings, lists)
  5. Memory System (memory_system.py)

    • Vector store integration for long-term learning
    • Session-based caching for performance
    • Similar question retrieval for context

Core Features

State Management

  • Immutable State: Uses LangGraph's Command pattern for pure functions
  • Typed Schema: AgentState TypedDict ensures type safety
  • Accumulation: Research notes and code outputs accumulate across iterations
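The typed schema might look roughly like the sketch below. Field names are taken from the Direct Graph Access example later in this README; the exact `AgentState` definition in the repo may differ, so treat this as illustrative:

```python
import operator
from typing import Annotated, TypedDict

class AgentState(TypedDict):
    # Accumulating fields: LangGraph merges each node's update into the
    # existing value via operator.add (string concatenation here)
    research_notes: Annotated[str, operator.add]
    code_outputs: Annotated[str, operator.add]
    # Plain fields: each node's update simply replaces the old value
    draft_answer: str
    final_answer: str
    loop_counter: int
    done: bool
    next: str
```

The `Annotated[..., operator.add]` reducers are what let research notes and code outputs accumulate across iterations while everything else is overwritten.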

Observability (Langfuse v3)

  • OTEL-Native Integration: Uses Langfuse v3 with OpenTelemetry for automatic trace correlation
  • Single Callback Handler: One global handler passes traces seamlessly through LangGraph
  • Predictable Span Naming: agent/<role>, tool/<name>, llm/<model> patterns for cost/latency dashboards
  • Session Stitching: User and session tracking for conversation continuity
  • Background Flushing: Non-blocking trace export for optimal performance
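The predictable span-naming convention can be enforced with a small helper like the following. This function is hypothetical (it is not claimed to exist in the repo); it only illustrates the `agent/<role>`, `tool/<name>`, `llm/<model>` pattern:

```python
def span_name(kind: str, name: str) -> str:
    """Build a predictable span name such as 'agent/lead' or 'tool/tavily'.

    Consistent prefixes make it easy to group cost/latency dashboards
    by span kind in Langfuse.
    """
    allowed = {"agent", "tool", "llm"}
    if kind not in allowed:
        raise ValueError(f"unknown span kind: {kind!r}")
    return f"{kind}/{name}"
```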

Tools Integration

  • Web Search: Tavily API for current information
  • Knowledge Bases: Wikipedia and ArXiv for encyclopedic/academic content
  • Computation: Calculator tools and Python execution
  • Hub Statistics: Hugging Face model information
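The repo's actual calculator tool is not shown here, but a safe arithmetic evaluator along these lines could back it (in the real system it would be wrapped with LangChain's tool interface and added to the Code Agent's tool list). This sketch parses the expression with `ast` instead of calling `eval`, so only basic arithmetic is possible:

```python
import ast
import operator as op

# Binary operators the safe calculator is allowed to apply
_OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
        ast.Div: op.truediv, ast.Pow: op.pow}

def calculator(expression: str) -> float:
    """Safely evaluate a basic arithmetic expression like '2 * (3 + 4)'."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expression, mode="eval"))
```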

Setup

Environment Variables

Create an env.local file with:

```bash
# LLM API
GROQ_API_KEY=your_groq_api_key

# Search Tools
TAVILY_API_KEY=your_tavily_api_key

# Observability
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com

# Memory (Optional)
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_KEY=your_supabase_service_key
```
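The project depends on python-dotenv, whose `load_dotenv("env.local")` handles loading this file. For illustration, a minimal stdlib-only sketch of what that loading step does:

```python
import os

def load_env_file(path: str = "env.local") -> dict:
    """Parse KEY=value lines (skipping comments and blanks) into os.environ.

    A simplified stand-in for python-dotenv's load_dotenv().
    """
    loaded = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            loaded[key.strip()] = value.strip()
    os.environ.update(loaded)
    return loaded
```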

Dependencies

The system requires:

  • langgraph>=0.4.8
  • langchain>=0.3.0
  • langchain-groq
  • langfuse>=3.0.0
  • python-dotenv
  • tavily-python

Usage

Basic Usage

```python
import asyncio

from langgraph_agent_system import run_agent_system

async def main():
    result = await run_agent_system(
        query="What is the capital of Maharashtra?",
        user_id="user_123",
        session_id="session_456"
    )
    print(f"Answer: {result}")

asyncio.run(main())
```

Testing

Run the test suite to verify functionality:

```bash
python test_new_multi_agent_system.py
```

Test Langfuse v3 observability integration:

```bash
python test_observability.py
```

Direct Graph Access

```python
import asyncio

from langchain_core.messages import HumanMessage

from langgraph_agent_system import create_agent_graph

# Create and compile the workflow
workflow = create_agent_graph()
app = workflow.compile()

# Run with initial state
initial_state = {
    "messages": [HumanMessage(content="Your question")],
    "draft_answer": "",
    "research_notes": "",
    "code_outputs": "",
    "loop_counter": 0,
    "done": False,
    "next": "research",
    "final_answer": "",
    "user_id": "user_123",
    "session_id": "session_456"
}

# ainvoke is a coroutine; outside an async context, drive it with asyncio.run
final_state = asyncio.run(app.ainvoke(initial_state))
print(final_state["final_answer"])
```

Workflow Details

Iterative Loop

  1. The Lead Agent analyzes the query and decides on the next action
  2. If research is needed → the Research Agent gathers information
  3. If computation is needed → the Code Agent performs calculations
  4. Control returns to the Lead Agent for synthesis and the next decision
  5. When enough information has been gathered → the Answer Formatter produces the final answer

Routing Logic

The Lead Agent uses the following criteria:

  • Research: Factual information, current events, citations needed
  • Code: Mathematical calculations, data analysis, programming tasks
  • Formatter: Sufficient information gathered OR max iterations reached
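In the real system this decision is made by the Lead Agent's LLM call; a deterministic stand-in makes the termination guarantee explicit:

```python
MAX_ITERATIONS = 3  # matches the hard loop limit described above

def route(state: dict) -> str:
    """Pick the next node: 'research', 'code', or 'formatter'.

    Simplified stand-in for the Lead Agent's routing decision. The
    formatter is always reached once the work is done or the
    iteration budget is exhausted, so the loop cannot run forever.
    """
    if state.get("done") or state["loop_counter"] >= MAX_ITERATIONS:
        return "formatter"
    return state.get("next", "research")
```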

GAIA Compliance

The Answer Formatter ensures exact-match requirements:

  • Numbers: No commas, units, or extra symbols
  • Strings: Remove unnecessary articles and formatting
  • Lists: Comma and space separation
  • No surrounding text: No "Answer:", quotes, or brackets
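A simplified sketch of what such a normalizer might do (the repo's Answer Formatter is LLM-assisted and more thorough; function name and exact rules here are illustrative):

```python
import re

_ARTICLES = {"a", "an", "the"}

def format_final_answer(value) -> str:
    """Normalize an answer toward GAIA exact-match rules (simplified)."""
    if isinstance(value, list):
        # Lists: comma-and-space separated, each element normalized
        return ", ".join(format_final_answer(v) for v in value)
    if isinstance(value, (int, float)):
        # Numbers: plain digits, no thousands separators or units
        return str(value)
    text = str(value).strip()
    # Drop a leading "Answer:" prefix, then surrounding quotes
    text = re.sub(r"^answer\s*:\s*", "", text, flags=re.IGNORECASE)
    text = text.strip().strip('"\'')
    # Remove articles per the exact-match string rules
    words = [w for w in text.split() if w.lower() not in _ARTICLES]
    return " ".join(words)
```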

Best Practices Implemented

LangGraph Patterns

  • ✅ Pure functions (AgentState → Command)
  • ✅ Immutable state with explicit updates
  • ✅ Typed state schema with operator annotations
  • ✅ Clear routing separated from business logic

Langfuse v3 Observability

  • ✅ OTEL-native SDK with automatic trace correlation
  • ✅ Single global callback handler for seamless LangGraph integration
  • ✅ Predictable span naming (agent/<role>, tool/<name>, llm/<model>)
  • ✅ Session and user tracking with environment tagging
  • ✅ Background trace flushing for performance
  • ✅ Graceful degradation when observability unavailable

Memory Management

  • ✅ TTL-based caching for performance
  • ✅ Vector store integration for learning
  • ✅ Duplicate detection and prevention
  • ✅ Session cleanup for long-running instances

Error Handling

The system implements graceful degradation:

  • Tool failures: Continue with available tools
  • API timeouts: Retry with backoff
  • Memory errors: Degrade to LLM-only mode
  • Agent failures: Return informative error messages
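The retry-with-backoff behavior can be sketched as a small helper (the repo's actual retry logic is not shown here; this is one common shape):

```python
import time

def with_retry(fn, attempts: int = 3, base_delay: float = 0.5):
    """Call fn(), retrying on any exception with exponential backoff.

    Sleeps base_delay, then 2x, then 4x, ... between attempts;
    the last failure is re-raised so callers can degrade gracefully.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))
```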

Performance Considerations

  • Caching: Vector store searches cached for 5 minutes
  • Parallelization: Tools can be executed in parallel
  • Memory limits: Sandbox execution has resource constraints
  • Loop termination: Hard limit of 3 iterations prevents infinite loops
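The 5-minute caching of vector-store searches amounts to a small TTL cache; a minimal sketch (class name hypothetical):

```python
import time

class TTLCache:
    """Tiny time-based cache; 300 s matches the 5-minute window above."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() > expiry:
            del self._store[key]  # expired: drop and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```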

Extending the System

Adding New Agents

  1. Create agent file in agents/ directory
  2. Implement agent function returning Command
  3. Add to workflow in create_agent_graph()
  4. Update routing logic in Lead Agent
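The steps above boil down to writing one pure function from state to a Command. The sketch below uses a local dataclass as a stand-in for `langgraph.types.Command` so it runs without dependencies; the agent name and logic are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Command:
    """Stand-in for langgraph.types.Command: next node plus state updates."""
    goto: str
    update: dict = field(default_factory=dict)

def summarizer_agent(state: dict) -> Command:
    """Hypothetical new agent: condenses research notes into a draft.

    Pure function of state; all effects are expressed in the returned
    Command, which is the pattern the existing agents follow.
    """
    summary = state.get("research_notes", "")[:200]
    return Command(goto="lead", update={"draft_answer": summary})
```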

Adding New Tools

  1. Implement tool following LangChain Tool interface
  2. Add to appropriate agent's tool list
  3. Update agent prompts to describe new capabilities

Custom Memory Backends

  1. Extend MemoryManager class
  2. Implement required interface methods
  3. Update initialization in memory_system.py
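Assuming the required interface covers storing Q&A pairs and retrieving similar questions (the exact MemoryManager method names in the repo may differ), a custom backend might look like:

```python
from abc import ABC, abstractmethod

class MemoryBackend(ABC):
    """Hypothetical interface a custom memory backend would implement."""

    @abstractmethod
    def store(self, question: str, answer: str) -> None: ...

    @abstractmethod
    def search_similar(self, question: str, limit: int = 3) -> list: ...

class InMemoryBackend(MemoryBackend):
    """Trivial backend for tests: substring match stands in for
    vector similarity search."""

    def __init__(self):
        self._items = []

    def store(self, question, answer):
        self._items.append((question, answer))

    def search_similar(self, question, limit=3):
        hits = [qa for qa in self._items if question.lower() in qa[0].lower()]
        return hits[:limit]
```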

Troubleshooting

Common Issues

  • Missing API keys: Check env.local file setup
  • Tool failures: Verify network connectivity and API quotas
  • Memory errors: Check Supabase configuration (optional)
  • Import errors: Ensure all dependencies are installed

Debug Mode

Set environment variable for detailed logging:

```bash
export LANGFUSE_DEBUG=true
```

This implementation follows the specified plan while incorporating LangGraph and Langfuse best practices for a robust, observable, and maintainable multi-agent system.