
metadata

title: GAIA Benchmark Agent
emoji: 🤖
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 4.27.0
app_file: app_safe.py
pinned: false
hf_oauth: true

GAIA Benchmark Agent

This project is an AI agent designed to tackle the GAIA benchmark, featuring multi-step reasoning, tool use (web search, Wikipedia, data analysis, file handling), and a Gradio web interface for evaluation and submission.

Features

LangGraph-based agent with robust tool integration
Wikipedia, Tavily (web search), data analysis, and file handling tools
Automatic file download for file-based questions
Gradio interface for user interaction and answer submission
Error handling and graceful fallback for recursion/tool loops
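
As an illustration of the automatic file download feature, here is a minimal sketch using only the standard library. It is not the code in tools.py or agent.py: the function name download_task_file, its parameters, and the /files/{task_id} URL layout are all assumptions made for this example, so substitute the endpoint your question source actually exposes.

```python
from pathlib import Path
from typing import Callable, Optional
from urllib.request import urlopen


def _http_get(url: str) -> bytes:
    """Fetch a URL and return the response body as bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_task_file(
    task_id: str,
    file_name: str,
    base_url: str,
    dest_dir: Path,
    fetch: Callable[[str], bytes] = _http_get,
) -> Optional[Path]:
    """Save the attachment for one question, or return None if it has none."""
    if not file_name:
        # Questions without an attachment need no download.
        return None
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so the name cannot escape dest_dir.
    target = dest_dir / Path(file_name).name
    if not target.exists():
        # Hypothetical URL layout, assumed for this sketch only.
        target.write_bytes(fetch(f"{base_url}/files/{task_id}"))
    return target
```

The fetch parameter exists so the helper can be exercised without network access; an agent tool could then hand the returned path to the data analysis or file handling tools.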

Setup & Deployment

1. Install Dependencies

pip install -r requirements.txt

2. Environment Variables

Create a .env file (do not commit it) or set these variables as secrets in your Hugging Face Space:

OPENAI_API_KEY (for OpenAI LLM and transcription)
TAVILY_API_KEY (for Tavily web search)
(Optional) SPACE_ID (for Hugging Face Space integration)
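
To catch a missing key before the app starts, a check like the following may help. This is a sketch under assumptions, not part of the project: the helper name missing_env_vars is hypothetical, and it only inspects the process environment, so a .env file would have to be loaded first by whatever mechanism the app uses.

```python
import os
from typing import List, Mapping, Optional, Sequence

# The two keys the README lists as required; SPACE_ID is optional and not checked.
REQUIRED_VARS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```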

3. Run Locally

python app_safe.py

This starts the Gradio interface. app_safe.py is also the app_file named in the Space metadata, so the same file is what the Space launches.

4. Deploy to Hugging Face Spaces

Push your code to a public Hugging Face Space repository.
Set your API keys as secrets in the Space settings.
The Gradio app will launch automatically.

Project Structure

app_safe.py — Main Gradio app for full agent evaluation
agent.py — Agent logic and tool orchestration
tools.py — Tool definitions (Tavily, Wikipedia, data analysis, etc.)
requirements.txt — All dependencies
README.md — This file

Notes

The agent returns a fallback answer if it cannot finish within the recursion or tool-call limits.
For best results, ensure all environment variables are set and dependencies are installed.
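
The fallback behaviour described above can be sketched as a thin wrapper around the agent call. Everything named here is an assumption for illustration: StepLimitExceeded is a local stand-in for whatever exception the agent framework raises at its step limit (with LangGraph that would presumably be GraphRecursionError from langgraph.errors, though this README does not say which exception the app catches), and the fallback wording is invented.

```python
from typing import Callable

# Hypothetical wording; the project's actual fallback text is not documented here.
FALLBACK_ANSWER = "I could not determine the answer."


class StepLimitExceeded(Exception):
    """Stand-in for the error an agent framework raises at its step limit."""


def answer_with_fallback(
    run_agent: Callable[[str], str],
    question: str,
    fallback: str = FALLBACK_ANSWER,
) -> str:
    """Run the agent once and return the fallback if it hits the step limit."""
    try:
        answer = run_agent(question)
    except StepLimitExceeded:
        # The agent looped through tools without converging on an answer.
        return fallback
    # Treat an empty or whitespace-only answer as a failure too.
    return answer.strip() or fallback
```

Returning a fixed string instead of raising keeps a batch evaluation running when a single question exhausts its limit.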

Good luck on the GAIA benchmark!