GAIA-Langgraph

jash0803

docs: update readme 2

968382a 23 days ago

GAIA Multi-Agent Evaluation System

A multi-agent system built with LangGraph and LangChain to tackle the GAIA benchmark, a set of real-world questions that test AI assistants on reasoning, tool use, and multimodal understanding.

How It Works

A supervisor agent analyzes each incoming question and delegates it to one of four specialized sub-agents:

| Agent | Responsibility | Tools |
|---|---|---|
| Web Research | Factual lookups, current events, YouTube video analysis | Tavily Search, Wikipedia, Gemini 2.5 Pro Video |
| Code Execution | Python programming, algorithms, data processing | Python REPL |
| File Processing | Excel, CSV, PDF, audio, image analysis | GAIA File Downloader, Pandas, Whisper, GPT-5-mini Vision |
| Math/Reasoning | Arithmetic, algebra, calculus, statistics | Calculator, Python REPL |

See ARCHITECTURE.md for detailed diagrams and data flow.
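
The delegation step can be pictured with a small, dependency-free sketch. This is not the project's supervisor: the real one is a LangGraph graph driven by an LLM, whereas the keyword rules, the agent names, and the `route` and `run` helpers below are all hypothetical stand-ins chosen only to show the shape of "one question in, one sub-agent out".

```python
from typing import Callable, Dict

# Hypothetical agent names that mirror the four sub-agents described above.
AGENTS = ("web_research", "code_execution", "file_processing", "math_reasoning")


def route(question: str, has_attachment: bool = False) -> str:
    """Pick one sub-agent using keyword rules (a stand-in for the LLM supervisor)."""
    q = question.lower()
    if has_attachment:
        # Questions that ship with a file go to the file-processing agent.
        return "file_processing"
    if any(k in q for k in ("write a program", "python", "algorithm")):
        return "code_execution"
    if any(k in q for k in ("calculate", "integral", "derivative", "solve", "mean of")):
        return "math_reasoning"
    # Default: treat everything else as a factual lookup.
    return "web_research"


def run(question: str,
        handlers: Dict[str, Callable[[str], str]],
        has_attachment: bool = False) -> str:
    """Route the question, then call the chosen agent's handler."""
    return handlers[route(question, has_attachment)](question)


# Toy handlers that only echo which agent was chosen.
handlers = {name: (lambda q, n=name: f"[{n}] {q}") for name in AGENTS}
```

With these toy handlers, `run("Calculate 2 + 2", handlers)` is dispatched to the math handler. An LLM-based supervisor replaces the keyword rules with a model call that returns the agent name.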

Project Structure

├── app.py                  # Gradio UI + submission logic
├── agent.py                # GAIAAgent class (supervisor wrapper)
├── prompts.py              # Shared GAIA answer format prompt
├── agents/
│   ├── supervisor.py       # LangGraph supervisor graph
│   ├── web_research.py     # Web search + video agent
│   ├── code_agent.py       # Code execution agent
│   ├── file_agent.py       # File processing agent
│   └── math_agent.py       # Math/reasoning agent
├── tools/
│   ├── search_tools.py     # Tavily + Wikipedia
│   ├── video_tools.py      # Gemini YouTube video analysis
│   ├── code_tools.py       # Python REPL
│   ├── file_tools.py       # File download, Excel, audio, image, PDF
│   └── math_tools.py       # Calculator + Python REPL
├── requirements.txt
└── test_agent.py           # Local testing script

Setup

Environment Variables

Set these in a local .env file:

| Variable | Purpose |
|---|---|
| `OPENAI_API_KEY` | GPT-5-mini for reasoning, vision, and Whisper transcription |
| `TAVILY_API_KEY` | Web search via Tavily |
| `GOOGLE_API_KEY` | Gemini 2.5 Pro for YouTube video analysis |
| `HF_TOKEN` | HuggingFace token for downloading GAIA dataset files |
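
A quick pre-flight check can catch a missing key before any agent runs. The sketch below uses only the standard library; the `missing_vars` helper is hypothetical and not part of the repository, and it assumes the variables have already been loaded into the process environment (how the project itself reads `.env` is not stated here).

```python
import os
from typing import List, Mapping

# The four variables listed in the table above.
REQUIRED_VARS = ("OPENAI_API_KEY", "TAVILY_API_KEY", "GOOGLE_API_KEY", "HF_TOKEN")


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_vars()` with no arguments inspects the current environment; an empty list means all four variables are set.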

Local Development

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python test_agent.py      # test on a random GAIA question
python app.py             # launch Gradio UI

Scoring

The GAIA benchmark uses exact-match scoring. The agent uses the official GAIA answer format prompt: it reasons through each question, then produces a concise FINAL ANSWER (a number, a few words, or a comma-separated list) with no articles, abbreviations, or units unless specified.
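
To see why the answer format matters under exact-match scoring, here is a simplified sketch of pulling the final answer out of a response and comparing it to a reference. It is not the official GAIA scorer, which applies its own normalization rules. The `FINAL ANSWER:` marker with a trailing colon, the lowercase and whitespace normalization, and all three function names are assumptions made for illustration.

```python
import re


def extract_final_answer(text: str) -> str:
    """Return the text after the last 'FINAL ANSWER:' marker, up to the end of that line."""
    matches = re.findall(r"FINAL ANSWER:\s*(.*)", text, flags=re.IGNORECASE)
    # Fall back to the whole response if the marker is absent.
    return matches[-1].strip() if matches else text.strip()


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (an assumed, simplified normalization)."""
    return " ".join(answer.strip().lower().split())


def is_exact_match(model_output: str, ground_truth: str) -> bool:
    """Compare the extracted final answer to the reference after normalization."""
    return normalize(extract_final_answer(model_output)) == normalize(ground_truth)
```

Under this sketch a response ending in `FINAL ANSWER: Paris` matches the reference `paris`, while `FINAL ANSWER: The city of Paris` does not, which is why the prompt asks for answers without articles or extra words.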