GAIA-Langgraph

jash0803

docs: update readme 2

968382a 23 days ago

	# GAIA Multi-Agent Evaluation System

	A multi-agent system built with LangGraph and LangChain to tackle the [GAIA benchmark](https://huggingface.co/spaces/gaia-benchmark/leaderboard) — a set of real-world questions that test AI assistants on reasoning, tool use, and multimodal understanding.

	## How It Works

	A supervisor agent analyzes each incoming question and delegates it to one of four specialized sub-agents:

	| Agent | Responsibility | Tools |
	|---|---|---|
	| Web Research | Factual lookups, current events, YouTube video analysis | Tavily Search, Wikipedia, Gemini 2.5 Pro Video |
	| Code Execution | Python programming, algorithms, data processing | Python REPL |
	| File Processing | Excel, CSV, PDF, audio, image analysis | GAIA File Downloader, Pandas, Whisper, GPT-5-mini Vision |
	| Math/Reasoning | Arithmetic, algebra, calculus, statistics | Calculator, Python REPL |

	See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed diagrams and data flow.
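
The delegation step described above can be sketched in plain Python. This is only an illustration of the routing idea, not the code in `agents/supervisor.py`: the agent names, the `ROUTING_HINTS` keywords, and both functions are hypothetical, and the keyword matching stands in for the LLM-based analysis the supervisor actually performs.

```python
from typing import Callable, Dict, Tuple

# Hypothetical agent names mirroring the table above; the real identifiers
# used in agents/supervisor.py may differ.
AGENTS = ("web_research", "code_execution", "file_processing", "math_reasoning")

# Keyword hints are an assumption made for this sketch. They stand in for
# the supervisor's LLM-based analysis of the question.
ROUTING_HINTS: Dict[str, Tuple[str, ...]] = {
    "file_processing": ("attached", ".xlsx", ".csv", ".pdf", ".mp3", ".png"),
    "code_execution": ("python", "algorithm", "script"),
    "math_reasoning": ("calculate", "integral", "derivative", "probability"),
}


def route_question(question: str) -> str:
    """Pick exactly one sub-agent for a question."""
    text = question.lower()
    for agent, hints in ROUTING_HINTS.items():
        if any(hint in text for hint in hints):
            return agent
    # Anything that matches no hint is treated as a factual lookup.
    return "web_research"


def run_supervisor(question: str, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Delegate the question to the chosen sub-agent and return its answer."""
    return handlers[route_question(question)](question)
```

With stub handlers such as `{name: (lambda q, n=name: f"[{n}] {q}") for name in AGENTS}`, a question that mentions an attached `.xlsx` file goes to the file processing handler, while a plain factual question falls through to web research.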

	## Project Structure

	```
	├── app.py                 # Gradio UI + submission logic
	├── agent.py               # GAIAAgent class (supervisor wrapper)
	├── prompts.py             # Shared GAIA answer format prompt
	├── agents/
	│   ├── supervisor.py      # LangGraph supervisor graph
	│   ├── web_research.py    # Web search + video agent
	│   ├── code_agent.py      # Code execution agent
	│   ├── file_agent.py      # File processing agent
	│   └── math_agent.py      # Math/reasoning agent
	├── tools/
	│   ├── search_tools.py    # Tavily + Wikipedia
	│   ├── video_tools.py     # Gemini YouTube video analysis
	│   ├── code_tools.py      # Python REPL
	│   ├── file_tools.py      # File download, Excel, audio, image, PDF
	│   └── math_tools.py      # Calculator + Python REPL
	├── requirements.txt
	└── test_agent.py          # Local testing script
	```

	## Setup

	### Environment Variables

	Set these in a local `.env` file:

	| Variable | Purpose |
	|---|---|
	| `OPENAI_API_KEY` | GPT-5-mini (reasoning and vision) and Whisper (audio transcription) |
	| `TAVILY_API_KEY` | Web search via Tavily |
	| `GOOGLE_API_KEY` | Gemini 2.5 Pro for YouTube video analysis |
	| `HF_TOKEN` | Hugging Face token for downloading GAIA dataset files |
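
A missing key usually surfaces only when the corresponding tool is first called, so a quick check at startup can save a failed run. The sketch below is an assumption rather than part of the repository: `REQUIRED_VARS` and `missing_env_vars` are hypothetical names. It inspects the process environment only, so the `.env` file must already have been loaded (for example with `python-dotenv`, if the project uses it).

```python
import os
from typing import List, Mapping

# The four variables listed in the table above.
REQUIRED_VARS = ("OPENAI_API_KEY", "TAVILY_API_KEY", "GOOGLE_API_KEY", "HF_TOKEN")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank, in table order."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` before launching `app.py` and stopping when the list is non-empty gives an early, readable error instead of an API failure in the middle of a question.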

	### Local Development

	```bash
	python -m venv .venv
	source .venv/bin/activate
	pip install -r requirements.txt
	python test_agent.py # test on a random GAIA question
	python app.py # launch Gradio UI
	```

	## Scoring

	The GAIA benchmark uses exact match scoring, so the format of an answer matters as much as its content. The agent therefore uses the official GAIA answer format prompt: it reasons through each question, then produces a concise `FINAL ANSWER` (a number, a few words, or a comma-separated list) with no articles, abbreviations, or units unless the question specifies them.
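
To make the answer format concrete, here is a minimal sketch of pulling the final answer out of a model response and comparing it to a reference. It assumes the response marks the answer with the literal prefix `FINAL ANSWER:`. The helper names are hypothetical, and the strict string comparison is a simplification: the official scorer may normalize answers before comparing them.

```python
from typing import Optional

# Assumed marker; the colon is part of this sketch's assumption.
MARKER = "FINAL ANSWER:"


def extract_final_answer(response: str) -> Optional[str]:
    """Return the text after the last marker, up to the end of that line."""
    index = response.rfind(MARKER)
    if index == -1:
        return None
    remainder = response[index + len(MARKER):].strip()
    if not remainder:
        return None
    return remainder.splitlines()[0].strip()


def is_exact_match(predicted: Optional[str], expected: str) -> bool:
    """Strict comparison after trimming surrounding whitespace."""
    return predicted is not None and predicted == expected.strip()
```

Under this strict comparison, a response ending in `FINAL ANSWER: the Paris` does not match the reference `Paris`, which is why the prompt forbids articles and other filler.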