---
title: GAIA Agent - Certification
emoji: 🤖
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.25.2
app_file: evaluation_app.py
pinned: false
hf_oauth: true
hf_oauth_expiration_minutes: 480
---

# GAIA Agent - Hugging Face Agents Course Certification

This is a LangGraph-based AI agent built to answer questions from the GAIA benchmark for the Hugging Face Agents Course Unit 4 certification.

## Goal

Achieve **30%+ accuracy** on the GAIA benchmark to earn the certification.

## Agent Architecture

The agent is built from three parts:
- **LLM**: Llama 3.3 70B served through Groq (fast, with a free tier)
- **Framework**: LangGraph for agent orchestration
- **Tools**: Five tools covering search, computation, and file reading

### Tools Implemented

1. **Web Search** (Tavily) - Search the internet for current information
2. **Wikipedia Search** - Access encyclopedic knowledge (Wikipedia API)
3. **Calculator** - Perform mathematical calculations
4. **Python Executor** - Execute Python code for complex computations
5. **File Reader** - Read CSV, JSON, and text files
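As an illustration of what a tool like the Calculator might look like, here is a minimal sketch that evaluates arithmetic by walking the parsed expression instead of calling `eval()`. The function name `safe_calculate` and the set of allowed operators are assumptions made for this example; the actual implementation in `agent.py` may differ.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelisted operators (an assumption for this sketch); anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> Number:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node: ast.AST) -> Number:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept numeric literals only (bool is a subclass of int, so exclude it).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))
```

Rejecting everything except numeric literals and a fixed operator set keeps model-generated input such as `__import__('os')` from running as code, which is why this sketch avoids `eval()`.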

## Answer Format Rules

The agent follows GAIA's strict formatting requirements:
- **Numbers**: No commas, no units (unless requested)
- **Text**: No articles (a, an, the), no abbreviations
- **Lists**: Comma-separated with one space after commas
- **Dates**: ISO format (YYYY-MM-DD) unless specified
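The rules above could be applied as small post-processing helpers, sketched below. All four function names are hypothetical, and the handling is deliberately naive: `format_number` only removes commas (it does not strip units), and `strip_articles` does not handle edge cases such as hyphenated words.

```python
import re
from datetime import datetime
from typing import Iterable

# Matches standalone English articles plus any whitespace that follows them.
_ARTICLES = re.compile(r"\b(a|an|the)\b\s*", flags=re.IGNORECASE)


def format_number(raw: str) -> str:
    """Remove thousands separators from a numeric string."""
    return raw.replace(",", "").strip()


def strip_articles(text: str) -> str:
    """Drop 'a', 'an', 'the' and collapse leftover whitespace."""
    return " ".join(_ARTICLES.sub("", text).split())


def format_list(items: Iterable[str]) -> str:
    """Join items with a comma followed by exactly one space."""
    return ", ".join(item.strip() for item in items)


def format_date(raw: str, input_format: str) -> str:
    """Parse a date in a known strptime format and return YYYY-MM-DD."""
    return datetime.strptime(raw, input_format).date().isoformat()
```

For example, `format_date("05/03/2021", "%d/%m/%Y")` yields `2021-03-05`. The input format has to be known in advance, which is one reason a real agent may prefer to have the LLM emit ISO dates directly.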

## Usage

### Local Testing

```bash
# Install dependencies
pip install -r requirements.txt

# Create a .env file with your API keys
cat > .env <<'EOF'
GROQ_API_KEY=your_key_here
TAVILY_API_KEY=your_key_here
EOF

# Test the agent
python test_agent.py
```

### Running Evaluation

1. Open the Space URL
2. Log in with your Hugging Face account
3. Click "Run Evaluation & Submit All Answers"
4. Wait for results (takes ~1-2 hours due to rate limiting)
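The long runtime comes from pacing requests to stay under API rate limits. How the evaluation app does this is not described here, so the following is only a sketch of one common approach: enforce a minimum interval between consecutive calls. The names `run_throttled`, `answer_fn`, and `min_interval_s` are hypothetical.

```python
import time
from typing import Callable, Iterable, List


def run_throttled(
    questions: Iterable[str],
    answer_fn: Callable[[str], str],
    min_interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Answer each question, waiting so calls start at least min_interval_s apart."""
    answers: List[str] = []
    last_call = None
    for question in questions:
        if last_call is not None:
            # Sleep only for the part of the interval that has not elapsed yet.
            wait = min_interval_s - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        answers.append(answer_fn(question))
    return answers
```

The `sleep` and `clock` parameters exist only so the pacing logic can be tested without real delays; in normal use the defaults are enough.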

## Project Structure

```
.
├── agent.py              # Main agent implementation
├── evaluation_app.py     # Gradio app for evaluation
├── test_agent.py         # Local testing script
├── requirements.txt      # Python dependencies
├── .env                  # API keys (not committed)
└── README.md             # This file
```

## Required API Keys

- **GROQ_API_KEY**: Get from [console.groq.com](https://console.groq.com)
- **TAVILY_API_KEY**: Get from [tavily.com](https://tavily.com)
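A quick startup check can catch a missing key before the agent makes its first request. The sketch below assumes the variables are already present in the process environment (for example, loaded from `.env` by a package such as python-dotenv); the helper name `missing_api_keys` is hypothetical.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required keys that are unset or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        raise SystemExit(f"Missing API keys: {', '.join(missing)}")
    print("All required API keys are set.")
```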

## Expected Performance

Expected accuracy as tools are added cumulatively:
- **Web Search + Wikipedia + Calculator**: ~25-30%
- **+ File Processing**: ~35-40%
- **+ Python Execution**: ~40-45%

## Course Information

This project is part of the [Hugging Face Agents Course](https://huggingface.co/learn/agents-course) Unit 4 certification.

## License

MIT License - Feel free to use and modify for your own certification!