---
title: GAIA Agent - Certification
emoji: 🤖
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.25.2
app_file: evaluation_app.py
pinned: false
hf_oauth: true
hf_oauth_expiration_minutes: 480
---

# GAIA Agent - Hugging Face Agents Course Certification

This is a LangGraph-based AI agent built to answer questions from the GAIA benchmark for the Hugging Face Agents Course Unit 4 certification.

## Goal

Achieve **30%+ accuracy** on the GAIA benchmark to earn the certification.

## Agent Architecture

The agent is built using:
- **LLM**: Groq's Llama 3.3 70B (fast and free)
- **Framework**: LangGraph for agent orchestration
- **Tools**: Five tools covering search, computation, and file access

### Tools Implemented

1. **Web Search** (Tavily) - Search the internet for current information
2. **Wikipedia Search** - Access encyclopedic knowledge (Wikipedia API)
3. **Calculator** - Perform mathematical calculations
4. **Python Executor** - Execute Python code for complex computations
5. **File Reader** - Read CSV, JSON, and text files
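
As an illustration of how a tool like the Calculator could be kept safe, here is a minimal sketch that evaluates arithmetic by walking the parsed syntax tree instead of calling `eval()`. The function name `safe_calculate` and the operator whitelist are assumptions made for this example; the actual implementation in `agent.py` may differ.

```python
import ast
import operator

# Whitelisted operators (an assumption for this sketch); anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept int and float literals only (bool is a subclass of int, so exclude it).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))
```

For example, `safe_calculate("2 + 3 * 4")` evaluates the product first and returns 14, while an input such as `__import__('os')` raises `ValueError`. The sketch does not cap exponent size, so a production tool would likely also want a limit on `**` operands.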

## Answer Format Rules

The agent follows GAIA's strict formatting requirements:
- **Numbers**: No commas, no units (unless requested)
- **Text**: No articles (a, an, the), no abbreviations
- **Lists**: Comma-separated with one space after commas
- **Dates**: ISO format (YYYY-MM-DD) unless specified
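
The rules above could be applied as a post-processing step on the model's raw answer. The helpers below are a hedged sketch of that idea; the function names are hypothetical and are not taken from `agent.py`, which may enforce the format through the prompt instead.

```python
import re
from datetime import date


def format_number(value: float) -> str:
    """Render a number with no thousands separators and no units."""
    if float(value).is_integer():
        return str(int(value))
    return repr(float(value))


def strip_articles(text: str) -> str:
    """Drop the articles a, an, the and collapse the leftover whitespace."""
    without = re.sub(r"\b(?:a|an|the)\b", " ", text, flags=re.IGNORECASE)
    return " ".join(without.split())


def format_list(items) -> str:
    """Join items with a comma followed by exactly one space."""
    return ", ".join(str(item).strip() for item in items)


def format_date(value: date) -> str:
    """Render a date in ISO format (YYYY-MM-DD)."""
    return value.isoformat()
```

Note that `strip_articles` is deliberately naive: it would also remove a standalone "A" in a name such as "Vitamin A", so a real agent would need to decide when article stripping is safe.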

## Usage

### Local Testing

```bash
# Install dependencies
pip install -r requirements.txt

# Write the API keys to .env (replace the placeholders with your real keys)
cat > .env <<'EOF'
GROQ_API_KEY=your_key_here
TAVILY_API_KEY=your_key_here
EOF

# Test the agent
python test_agent.py
```

### Running Evaluation

1. Open the Space URL
2. Log in with your Hugging Face account
3. Click "Run Evaluation & Submit All Answers"
4. Wait for the results (a full run takes roughly 1-2 hours because of rate limiting)
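
Since rate limiting dominates the run time, one common way to cope with it is to retry a failed call with exponential backoff. The sketch below shows that pattern in isolation; `call_with_backoff` and its parameters are hypothetical, and whether `evaluation_app.py` uses this approach is not stated here.

```python
import time


def call_with_backoff(func, *, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponentially growing delays when it raises."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the delays can be checked in tests without actually waiting. In practice it would be better to catch only the provider's rate-limit exception instead of every `Exception`.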

## Project Structure

```
.
├── agent.py              # Main agent implementation
├── evaluation_app.py     # Gradio app for evaluation
├── test_agent.py         # Local testing script
├── requirements.txt      # Python dependencies
├── .env                  # API keys (not committed)
└── README.md             # This file
```

## Required API Keys

- **GROQ_API_KEY**: Get from [console.groq.com](https://console.groq.com)
- **TAVILY_API_KEY**: Get from [tavily.com](https://tavily.com)
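
A missing key tends to show up only as a failed API call midway through a run, so it can help to check for both keys at startup. This is a minimal sketch using only the standard library; the helper name `missing_api_keys` is an assumption for illustration and is not part of the project files listed above.

```python
import os

REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY")


def missing_api_keys(env=None):
    """Return the required key names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

Calling `missing_api_keys()` before building the agent and exiting with a clear message when the list is non-empty would make configuration mistakes obvious early.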

## Expected Performance

Estimated GAIA accuracy as tools are added cumulatively:
- **Web Search + Wikipedia + Calculator**: ~25-30%
- **+ File Processing**: ~35-40%
- **+ Python Execution**: ~40-45%

## Course Information

This project is part of the [Hugging Face Agents Course](https://huggingface.co/learn/agents-course) Unit 4 certification.

## License

MY License - Feel free to use and modify for your own certification!