LangGraph Agent System Architecture

This document describes the architecture of the multi-agent system implemented using LangGraph 0.4.8+ and Langfuse 3.0.0.

System Overview

The system combines a memory layer, a planning step, an agent router, three specialized agents, and a final verification stage, as shown in the system diagram.

Core Components

1. Memory Layer

  • Short-Term Memory: Graph state managed by LangGraph checkpointing
  • Checkpointer: SQLite-based persistence for conversation continuity
  • Long-Term Memory: Supabase vector store with pgvector for Q&A storage
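As a rough illustration of what SQLite-based checkpointing provides, the sketch below saves and restores state per conversation thread using only the standard library. `TinyCheckpointer` and its table layout are hypothetical stand-ins chosen for this example; they are not the LangGraph checkpointer API or this project's implementation.

```python
import json
import sqlite3


class TinyCheckpointer:
    """Hypothetical stand-in for a SQLite checkpointer (not LangGraph's API)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            "thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )

    def save(self, thread_id, state):
        # Overwrite the latest state stored for this conversation thread.
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints (thread_id, state) VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.conn.commit()

    def load(self, thread_id):
        # Return the stored state, or None for an unknown thread.
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None
```

Keying the table on a thread identifier is what lets a later call resume the same conversation; passing a file path instead of `:memory:` would make the state survive a process restart.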

2. Plan + ReAct Loop

  • Initial query analysis and planning
  • Contextual prompt injection with system requirements
  • Memory retrieval for similar past questions

3. Agent Router

  • Intelligent routing based on query analysis
  • Routes to specialized agents: Retrieval, Execution, or Critic
  • Uses low-temperature LLM for consistent routing decisions
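The routing step can be pictured as a classifier whose output is validated against the known agent names. In the minimal sketch below, `route` and `keyword_classifier` are hypothetical names, and the keyword rules merely stand in for the low-temperature LLM call so the example runs without any API key.

```python
from typing import Callable

AGENTS = ("retrieval", "execution", "critic")


def route(query: str, classify: Callable[[str], str], default: str = "retrieval") -> str:
    """Return a valid agent name, falling back to `default` on unexpected output."""
    label = classify(query).strip().lower()
    return label if label in AGENTS else default


def keyword_classifier(query: str) -> str:
    # Stand-in for the LLM router (assumption): deterministic keyword rules.
    q = query.lower()
    if any(word in q for word in ("calculate", "compute", "code")):
        return "execution"
    if any(word in q for word in ("review", "evaluate")):
        return "critic"
    return "retrieval"
```

Validating the label before dispatching guards against a model that answers with free text instead of one of the three agent names.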

4. Specialized Agents

Retrieval Agent

  • Information gathering from external sources
  • Tools: Wikipedia, Arxiv, Tavily web search, vector store retrieval
  • Handles attachment downloading for GAIA tasks
  • Context-aware with memory integration

Execution Agent

  • Computational tasks and code execution
  • Integrates with existing code_agent.py sandbox
  • Python code execution with pandas, cv2, standard libraries
  • Step-by-step problem breakdown

Critic Agent

  • Response quality evaluation and review
  • Accuracy, completeness, and logical consistency checks
  • Scoring system with pass/fail determination
  • Constructive feedback generation
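One way to picture the scoring step is shown below. The 0 to 1 score scale, the 0.7 threshold, and the names `CriticVerdict` and `review` are assumptions made for illustration; the actual criteria live in the critic prompt.

```python
from dataclasses import dataclass, field


@dataclass
class CriticVerdict:
    scores: dict            # criterion name -> score in [0, 1] (assumed scale)
    threshold: float = 0.7  # assumed pass mark
    feedback: list = field(default_factory=list)

    @property
    def overall(self) -> float:
        # Unweighted mean of the per-criterion scores.
        return sum(self.scores.values()) / len(self.scores)

    @property
    def passed(self) -> bool:
        return self.overall >= self.threshold


def review(scores: dict, threshold: float = 0.7) -> CriticVerdict:
    # Generate one feedback line per criterion that falls below the threshold.
    feedback = [
        f"Improve {name}: scored {value:.2f}"
        for name, value in scores.items()
        if value < threshold
    ]
    return CriticVerdict(scores=scores, threshold=threshold, feedback=feedback)
```

For example, `review({"accuracy": 0.4, "completeness": 0.9, "consistency": 0.5})` fails overall and produces feedback for the two weak criteria.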

5. Verification & Fallback

  • Final quality control with system prompt compliance
  • Format verification for exact-match requirements
  • Retry logic with maximum attempt limits
  • Graceful fallback pipeline for failed attempts
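The retry and fallback behaviour can be sketched as a bounded loop around a format check. The function names, the fallback text, and the specific format rule (non-empty, single line, no trailing period) are assumptions for this example; the real rules are defined by the verification prompt.

```python
def verify_format(answer: str) -> bool:
    # Assumed rule for illustration: non-empty, single line, no trailing period.
    text = answer.strip()
    return bool(text) and "\n" not in text and not text.endswith(".")


def answer_with_retries(generate, verify=verify_format, max_attempts=3,
                        fallback="Unable to determine a verified answer"):
    """Call `generate(attempt)` until `verify` accepts the result or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        candidate = generate(attempt)
        if verify(candidate):
            return candidate.strip(), attempt
    # Every attempt failed verification: return the fallback response instead.
    return fallback, max_attempts
```

Returning the attempt count alongside the answer makes it easy to log how often the fallback path is taken.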

6. Observability (Langfuse)

  • End-to-end tracing of all agent interactions
  • Performance monitoring and debugging
  • User session tracking
  • Error logging and analysis

Data Flow

  1. User Query → Plan Node (system prompt injection)
  2. Plan Node → Router (agent selection)
  3. Router → Specialized Agent (task execution)
  4. Agent → Tools (if needed) → Agent (results)
  5. Agent → Verification (quality check)
  6. Verification → Output or Retry/Fallback
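The flow above can be approximated by a small state machine in plain Python, where each node mutates a shared state dict and returns the name of the next node. This is only a sketch of the control flow: the node bodies are stubs, the tool step is omitted, and `MAX_ATTEMPTS`, `FALLBACK`, and the digit-based routing rule are assumptions rather than the project's LangGraph code.

```python
MAX_ATTEMPTS = 3
FALLBACK = "Unable to produce a verified answer"


def plan(state):
    # Step 1: inject the system requirements ahead of the user query.
    state["prompt"] = f"[system requirements]\n{state['query']}"
    return "router"


def router(state):
    # Step 2: stub routing rule (assumption): digits suggest a computation.
    has_digit = any(ch.isdigit() for ch in state["query"])
    state["agent"] = "execution" if has_digit else "retrieval"
    return state["agent"]


def retrieval(state):
    query = state["query"].strip()
    state["draft"] = f"retrieved: {query}" if query else ""
    return "verification"


def execution(state):
    state["draft"] = f"computed: {state['query'].strip()}"
    return "verification"


def verification(state):
    # Steps 5 and 6: accept the draft, retry via the router, or fall back.
    state["attempts"] += 1
    if state["draft"]:
        state["output"] = state["draft"]
        return None
    if state["attempts"] < MAX_ATTEMPTS:
        return "router"
    state["output"] = FALLBACK
    return None


NODES = {"plan": plan, "router": router, "retrieval": retrieval,
         "execution": execution, "verification": verification}


def run(query):
    state = {"query": query, "attempts": 0, "draft": "", "trace": []}
    node = "plan"
    while node is not None:
        state["trace"].append(node)
        node = NODES[node](state)
    return state
```

Calling `run("What is 2 + 2?")` visits plan, router, execution, and verification in that order, while an empty query exhausts the retries and ends with the fallback text.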

Key Features

Memory Management

  • Caching of similarity searches (TTL-based)
  • Duplicate detection and prevention
  • Task-based attachment tracking
  • Session-specific cache management
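A TTL-based cache for similarity searches might look like the following minimal sketch. `TTLCache`, its default lifetime, and the injectable clock are illustrative assumptions, not the project's `memory_manager` internals.

```python
import time


class TTLCache:
    """Tiny time-to-live cache; entries expire `ttl_seconds` after being set."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Drop stale entries lazily, on access.
            del self._store[key]
            return None
        return value

    def clear(self):
        # Comparable in spirit to clearing a session-specific cache.
        self._store.clear()
```

Using a monotonic clock avoids spurious expiry when the system wall clock is adjusted.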

Quality Control

  • Multi-level verification (agent → critic → verification)
  • Retry mechanism with attempt limits
  • Format compliance checking
  • Fallback responses for failures

Tracing & Observability

  • Langfuse integration for complete observability
  • Agent-level span tracking
  • Error monitoring and debugging
  • Performance metrics collection

Tool Integration

  • Modular tool system for each agent
  • Sandboxed code execution environment
  • External API integration (search, knowledge bases)
  • Attachment handling for complex tasks

Configuration

Environment Variables

See env.template for required configuration:

  • LLM API keys (Groq, OpenAI, Google, HuggingFace)
  • Search tools (Tavily)
  • Vector store (Supabase)
  • Observability (Langfuse)
  • GAIA API endpoints
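A startup check along these lines can report missing configuration early. The variable names below are assumptions picked for illustration; env.template remains the authoritative list.

```python
import os

# Assumed variable names for illustration; consult env.template for the real ones.
REQUIRED_VARS = (
    "GROQ_API_KEY",
    "TAVILY_API_KEY",
    "SUPABASE_URL",
    "LANGFUSE_PUBLIC_KEY",
)


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or blank."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name, "").strip()]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment; passing a dict makes the check easy to test.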

System Prompts

Located in the prompts/ directory:

  • system_prompt.txt: Main system requirements
  • router_prompt.txt: Agent routing instructions
  • retrieval_prompt.txt: Information gathering guidelines
  • execution_prompt.txt: Code execution instructions
  • critic_prompt.txt: Quality evaluation criteria
  • verification_prompt.txt: Final formatting rules
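Loading one of these files could be as simple as the sketch below. `load_prompt` is a hypothetical helper, and the only assumption about the layout is the prompts/ directory and the `.txt` file names listed above.

```python
from pathlib import Path


def load_prompt(name, prompts_dir="prompts"):
    """Read `<prompts_dir>/<name>.txt`, e.g. load_prompt("router_prompt")."""
    path = Path(prompts_dir) / f"{name}.txt"
    if not path.is_file():
        raise FileNotFoundError(f"Prompt file not found: {path}")
    return path.read_text(encoding="utf-8").strip()
```

Failing loudly on a missing file is a deliberate choice here: an agent that silently runs with an empty prompt is harder to debug than one that refuses to start.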

Usage

Basic Usage

from src import run_agent_system

result = run_agent_system(
    query="Your question here",
    user_id="user123",
    session_id="session456"
)

With Memory Management

from src import memory_manager

query = "Your question here"

# Check whether a similar question has been answered before
similar = memory_manager.get_similar_qa(query)

# Clear the session-specific cache
memory_manager.clear_session_cache()

Direct Graph Access

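import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver
from src import create_agent_graph

# Assumed setup; initial_state must match the graph's state schema (not shown here).
checkpointer = SqliteSaver(sqlite3.connect("checkpoints.db", check_same_thread=False))
config = {"configurable": {"thread_id": "session456"}}
workflow = create_agent_graph()
app = workflow.compile(checkpointer=checkpointer)
result = app.invoke(initial_state, config=config)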

Dependencies

Core Framework

  • langgraph>=0.4.8: Graph-based agent orchestration
  • langgraph-checkpoint-sqlite>=2.0.0: Persistence layer
  • langchain>=0.3.0: LLM and tool abstractions

Observability

  • langfuse==3.0.0: Tracing and monitoring

Memory & Storage

  • supabase>=2.8.0: Vector database backend
  • pgvector>=0.3.0: Vector similarity search

Tools & APIs

  • tavily-python>=0.5.0: Web search
  • arxiv>=2.1.0: Academic paper search
  • wikipedia>=1.4.0: Knowledge base access

Error Handling

The system implements comprehensive error handling:

  • Graceful degradation when services are unavailable
  • Fallback responses for critical failures
  • Retry logic with exponential backoff
  • Detailed error logging for debugging
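Retry with exponential backoff can be sketched as follows. The helper name, the default attempt count, and the base delay are assumptions for this example, and `sleep` is injectable so the behaviour can be checked without waiting.

```python
import time


def retry_with_backoff(func, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `func` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** attempt)
```

In practice the `except` clause would usually be narrowed to the transient errors of the service being called, and a small random jitter is often added to the delay.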

Performance Considerations

  • Vector store caching reduces duplicate searches
  • Checkpoint-based state management for conversation continuity
  • Efficient tool routing based on query analysis
  • Memory cleanup for long-running sessions

Future Enhancements

  • Additional specialized agents (e.g., Image Analysis, Code Review)
  • Enhanced memory clustering and retrieval algorithms
  • Real-time collaboration between agents
  • Advanced tool composition and chaining