---
title: Agents Course Final Assignment
emoji: 🕵🏻‍♂️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.34.0
app_file: app.py
pinned: true
hf_oauth: true
hf_oauth_expiration_minutes: 480
license: mit
short_description: Developed for the Agents Course final project.
---

# 🕵🏻‍♂️ Agents Course Final Assignment

<div align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/660d652e2461f72aa268bb8c/-q8C143P4TkNiLxCJS8cp.png" alt="certification" width="600"/>
</div>


> **⚠️ Important Notice**: <br/>
> Now that this project is public, the `OPENAI_API_KEY` in the Hugging Face Space settings has been set to an invalid value <br/>
> to keep the real key private. To run this project, use your own OpenAI API key.

This is a multi-agent system developed for the [Hugging Face Agents Course](https://huggingface.co/learn/agents-course/en/unit4/introduction) Unit 4 final project.
The system is designed to be evaluated on questions from the [GAIA benchmark](https://huggingface.co/spaces/gaia-benchmark/leaderboard).



Because this agent system lacks file processing, multimodal reasoning, and other advanced features, it cannot answer questions such as the following:


> The attached Excel file contains the sales of menu items for a local fast-food chain. What were the total sales that the chain made from food (not including drinks)? Express your answer in USD with two decimal places.

> Review the chess position provided in the image. It is black's turn. Provide the correct next move for black which guarantees a win. Please provide your response in algebraic notation.

> In the video https://www.youtube.com/watch?v=L1vXCYZAYYM, what is the highest number of bird species to be on camera simultaneously?


Even so, this agent system passes the evaluation with a score of 50%, correctly answering 10 of 20 questions.

> Submission Successful! <br/>
> User: Hemimoon <br/>
> Overall Score: 50.0% (10/20 correct) <br/>
> Message: Score calculated successfully: 10/20 total questions answered correctly (20 valid tasks attempted). High score updated on leaderboard.


## 🏗️ Architecture Design

This project adopts a multi-agent architecture: two specialized sub-agents coordinated by a main agent.

### 1. Web Search Agent
- **Functions**: Web search, webpage access, Wikipedia queries
- **Tools**: 
  - `WebSearchTool()`: Web search
  - `VisitWebpageTool()`: Access specific webpages
  - `WikipediaSearchTool()`: Wikipedia search
- **Authorized Imports**: `requests`, `beautifulsoup4`

### 2. Calculation Agent
- **Functions**: Mathematical calculations, data analysis, statistical computing
- **Authorized Imports**: `pandas`, `numpy`, `math`, `statistics`, `scipy`
- **Features**: Specialized in numerical computation and data processing tasks

### 3. Main Agent
- **Role**: Coordinate and manage sub-agents
- **Functions**: Task distribution, result integration, decision making
- **Features**: Uses the smolagents `managed_agents` parameter to manage both sub-agents from a single agent
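
The architecture above could be wired together roughly as follows. This is a minimal sketch, not the contents of `agent.py`: the function name `build_main_agent`, the agent names, and the descriptions are hypothetical, and `gpt-4.1` is assumed as the model id based on the Tech Stack section. The README lists `beautifulsoup4`, whose importable module is `bs4`, so the sketch assumes `bs4` is the name passed to the agent.

```python
# Import lists taken from this README. The beautifulsoup4 package is
# imported as `bs4`, so that module name is assumed here.
WEB_AGENT_IMPORTS = ["requests", "bs4"]
CALC_AGENT_IMPORTS = ["pandas", "numpy", "math", "statistics", "scipy"]


def build_main_agent(api_key: str, model_id: str = "gpt-4.1"):
    """Build a main agent that manages a web search and a calculation agent."""
    # Imported lazily so this module loads even where smolagents is missing.
    from smolagents import (
        CodeAgent,
        OpenAIServerModel,
        VisitWebpageTool,
        WebSearchTool,
        WikipediaSearchTool,
    )

    model = OpenAIServerModel(model_id=model_id, api_key=api_key)

    # Sub-agent 1: web search, webpage access, Wikipedia queries.
    web_agent = CodeAgent(
        tools=[WebSearchTool(), VisitWebpageTool(), WikipediaSearchTool()],
        model=model,
        additional_authorized_imports=WEB_AGENT_IMPORTS,
        name="web_search_agent",  # hypothetical name
        description="Searches the web, visits pages, and queries Wikipedia.",
    )

    # Sub-agent 2: numerical computation, no external tools.
    calc_agent = CodeAgent(
        tools=[],
        model=model,
        additional_authorized_imports=CALC_AGENT_IMPORTS,
        name="calculation_agent",  # hypothetical name
        description="Performs calculations and data analysis.",
    )

    # Main agent: delegates to the sub-agents via `managed_agents`.
    return CodeAgent(
        tools=[],
        model=model,
        managed_agents=[web_agent, calc_agent],
    )
```

With a valid key, `build_main_agent(api_key).run("your question")` would start a run in which the main agent decides when to delegate to each sub-agent.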

## 📁 Project Structure

```
├── agent.py          # Main agent definition and configuration
├── prompt.py         # Agent prompt templates and system instructions
├── app.py            # Gradio interface and GAIA evaluation logic
├── test_agent.py     # Agent functionality test scripts
├── requirements.txt  # Python dependencies
├── pyproject.toml    # Project configuration file
└── README.md         # Project documentation
```

## 🔧 Tech Stack

- **AI Framework**: [smolagents](https://github.com/huggingface/smolagents) - Hugging Face's agent framework
- **LLM**: OpenAI GPT-4.1 (via OpenAI API)
- **Frontend Interface**: Gradio 5.34.0
- **Data Processing**: pandas, numpy, scipy
- **Web Requests**: requests, beautifulsoup4
- **Environment Management**: python-dotenv

## 🚀 Quick Start

### Environment Setup

1. **Clone the project**
```bash
git clone https://huggingface.co/spaces/Hemimoon/Agents-Course-Final-Assignment
cd Agents-Course-Final-Assignment
```

2. **Install dependencies**
```bash
pip install -r requirements.txt
```

3. **Configure API Key**
Create a `.env` file and add your OpenAI API key:
```bash
OPENAI_API_KEY=your_openai_api_key_here
```
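
One way the key might then be read at startup is sketched below. It assumes `python-dotenv` (listed in the Tech Stack) is used to load the `.env` file; the helper name `get_openai_api_key` is hypothetical and not taken from the project code.

```python
import os


def get_openai_api_key() -> str:
    """Return the OpenAI API key from the environment or a local .env file."""
    try:
        # python-dotenv copies entries from .env into os.environ.
        # By default it does not override variables that are already set.
        from dotenv import load_dotenv

        load_dotenv()
    except ImportError:
        # Without python-dotenv, fall back to the process environment only.
        pass

    key = os.getenv("OPENAI_API_KEY", "").strip()
    # Reject a missing key and the placeholder from the example above.
    if not key or key == "your_openai_api_key_here":
        raise RuntimeError("Set OPENAI_API_KEY to a valid OpenAI API key.")
    return key
```

Failing early here gives a clearer error than letting the first model call fail, which is what would happen with the invalid key configured in the public Space.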

## 🔗 Related Links

- [Hugging Face Agents Course](https://huggingface.co/learn/agents-course/en/unit4/introduction)
- [GAIA Benchmark](https://huggingface.co/spaces/gaia-benchmark/leaderboard)
- [Student Leaderboard](https://huggingface.co/spaces/agents-course/Students_leaderboard)
- [smolagents Documentation](https://github.com/huggingface/smolagents)


---

*Created with ❤️ for the Hugging Face Agents Course Final Assignment*