---
title: Template Final Assignment
emoji: 🕵🏻‍♂️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_expiration_minutes: 480
---

# AI Agent for GAIA Dataset

This repository contains the code and solution developed for the final assignment of the Hugging Face course on AI Agents. It demonstrates the capabilities of an AI agent on the GAIA dataset.

## Repository Structure

- **`app.py`**: Fetches the test questions from the Hugging Face server and runs the AI agent on them.
- **`tools/`**: Custom tools created specifically to help the agent solve tasks.
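The exact tool interface depends on the agent framework used; as a rough sketch only (the decorator, registry, and example tool below are illustrative, not this repository's actual API), a minimal tool registry that an agent loop could dispatch against might look like:

```python
# Minimal sketch of a tool registry an agent loop could dispatch against.
# All names here are illustrative, not the repository's actual API.
TOOLS = {}

def tool(fn):
    """Register a plain function as an agent-callable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def reverse_text(text: str) -> str:
    """Reverse a string (GAIA includes reversed-text questions)."""
    return text[::-1]

def call_tool(name: str, **kwargs):
    """Dispatch a tool call by name, as an agent loop would."""
    return TOOLS[name](**kwargs)
```

For example, `call_tool("reverse_text", text="abc")` returns `"cba"`; the agent only needs to emit the tool name and arguments.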
## Performance and Considerations

The implemented solution achieved a **70% score** on the GAIA benchmark using the following models:

- Gemini-2.0-flash
- GPT-4.1-mini

However, due to the inherent non-determinism of these large language models (LLMs), outputs can occasionally vary even with the temperature set to `0`; batched inference is likely the main contributor to this variability.

Additionally, a substantial number of tools were needed to compensate for the limitations of these more economical models. For instance, a dedicated chess tool was implemented because both GPT-4.1-mini and the Gemini models consistently misread chess positions from images.
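To illustrate why a dedicated tool helps here: even a small deterministic check, such as validating the piece-placement field of a FEN string before reasoning about a position, catches errors that a vision model makes silently. This sketch is illustrative only and is not the actual chess tool in `tools/`:

```python
def fen_board_is_valid(fen: str) -> bool:
    """Check that the piece-placement field of a FEN string describes
    exactly 8 ranks of 8 squares each. Illustrative sketch only; the
    real chess tool in tools/ may work differently."""
    ranks = fen.split()[0].split("/")
    if len(ranks) != 8:
        return False
    for rank in ranks:
        squares = 0
        for ch in rank:
            if ch.isdigit():
                squares += int(ch)      # digit = run of empty squares
            elif ch in "prnbqkPRNBQK":
                squares += 1            # one piece, one square
            else:
                return False            # illegal character
        if squares != 8:
            return False
    return True
```

A model that drops a rank or a piece when transcribing a board image fails this check immediately, instead of producing a plausible-looking but wrong answer downstream.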
## The certificate

<img src="certificate.jpg">

## Installation
1. **Clone the repository**

```bash
git clone <repository-url>
```
2. **Set up a Python virtual environment**

**Unix or macOS**:
```bash
python -m venv .venv
source .venv/bin/activate
```

**Windows**:
```batch
python -m venv .venv
.\.venv\Scripts\activate
```
3. **Install dependencies**

```bash
pip install -r requirements.txt
```
4. **Configure environment variables**

**Unix or macOS**:
```bash
export HF_TOKEN="your_huggingface_token"
export OPENAI_API_KEY="your_openai_api_key"
export GEMINI_API_KEY="your_gemini_api_key"
```

**Windows**:
```batch
set HF_TOKEN=your_huggingface_token
set OPENAI_API_KEY=your_openai_api_key
set GEMINI_API_KEY=your_gemini_api_key
```
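`app.py` presumably reads these variables at startup; a fail-fast check along these lines (an illustrative sketch, not necessarily the code in `app.py`) makes a missing or empty key obvious immediately instead of surfacing later as an authentication error:

```python
import os

# The three keys the README's setup step exports.
REQUIRED_VARS = ("HF_TOKEN", "OPENAI_API_KEY", "GEMINI_API_KEY")

def load_api_keys() -> dict:
    """Read the required API keys from the environment, raising a
    clear error if any are missing or empty. Illustrative sketch."""
    missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in REQUIRED_VARS}
```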
5. **Run the application**

```bash
python app.py
```

Once the app starts, log in when prompted and run the provided questions.