Final_Assignment_Template

AheedTahir

Final Working Implementation

223e45d 3 months ago

	---
	title: GAIA Agent - Certification
	emoji: 🤖
	colorFrom: indigo
	colorTo: purple
	sdk: gradio
	sdk_version: 5.25.2
	app_file: evaluation_app.py
	pinned: false
	hf_oauth: true
	hf_oauth_expiration_minutes: 480
	---

	# GAIA Agent - Hugging Face Agents Course Certification

	This is a LangGraph-based AI agent built to answer questions from the GAIA benchmark for the Hugging Face Agents Course Unit 4 certification.

	## Goal

	Achieve 30%+ accuracy on the GAIA benchmark to earn the certification.

	## Agent Architecture

	The agent is built using:
	- LLM: Llama 3.3 70B, served through Groq's API (fast and free)
	- Framework: LangGraph for agent orchestration
	- Tools: five tools covering search, computation, and file access

	### Tools Implemented

	1. Web Search (Tavily) - Search the internet for current information
	2. Wikipedia Search - Access encyclopedic knowledge (Wikipedia API)
	3. Calculator - Perform mathematical calculations
	4. Python Executor - Execute Python code for complex computations
	5. File Reader - Read CSV, JSON, and text files
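
As an illustration of what a tool like the Calculator might look like, here is a minimal sketch that evaluates arithmetic by walking the parsed syntax tree instead of calling `eval()`. The function name `safe_calculate` and the set of allowed operators are assumptions made for this example; the actual implementation in `agent.py` may differ.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelisted operators; any other syntax is rejected.
# Note: this sketch does not bound large exponents such as 9 ** 9 ** 9.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> Number:
    """Evaluate a plain arithmetic expression (hypothetical helper)."""

    def _eval(node: ast.AST) -> Number:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))
```

Restricting the evaluator to a whitelist of node types keeps model-generated input from reaching arbitrary Python, which is the job of the separate Python Executor tool.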

	## Answer Format Rules

	The agent follows GAIA's strict formatting requirements:
	- Numbers: No commas, no units (unless requested)
	- Text: No articles (a, an, the), no abbreviations
	- Lists: Comma-separated with one space after commas
	- Dates: ISO format (YYYY-MM-DD) unless specified
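
The rules above could be applied as a post-processing step on the model's raw answer. The helpers below are a rough sketch under that assumption; their names are hypothetical, they do not strip units or expand abbreviations, and the agent may instead enforce these rules through its prompt.

```python
import re
from datetime import date

# Matches the English articles as whole words, plus trailing whitespace.
_ARTICLES = re.compile(r"\b(a|an|the)\b\s*", flags=re.IGNORECASE)


def format_number(text: str) -> str:
    """Drop thousands separators from a numeric string."""
    return text.replace(",", "").strip()


def format_text(text: str) -> str:
    """Remove articles and collapse runs of whitespace."""
    without_articles = _ARTICLES.sub("", text)
    return " ".join(without_articles.split())


def format_list(items: list[str]) -> str:
    """Join items with a comma followed by exactly one space."""
    return ", ".join(item.strip() for item in items)


def format_date(value: date) -> str:
    """Render a date in ISO format (YYYY-MM-DD)."""
    return value.isoformat()
```

For example, `format_number("1,234,567")` gives `"1234567"` and `format_list(["apple ", " banana"])` gives `"apple, banana"`.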

	## Usage

	### Local Testing

	```bash
	# Install dependencies
	pip install -r requirements.txt

	# Write the API keys to .env (replace the placeholders with your keys)
	echo "GROQ_API_KEY=your_key_here" > .env
	echo "TAVILY_API_KEY=your_key_here" >> .env

	# Test the agent
	python test_agent.py
	```

	### Running Evaluation

	1. Open the Space URL
	2. Log in with your Hugging Face account
	3. Click "Run Evaluation & Submit All Answers"
	4. Wait for results (takes ~1-2 hours due to rate limiting)
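
The long runtime comes from waiting out API rate limits. One common way to handle them is to retry with exponentially growing delays, sketched below. This is not taken from `evaluation_app.py`: `RateLimitError` is a hypothetical stand-in for whatever exception the LLM client raises when it is throttled, and the retry count and delays are illustrative defaults.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's rate-limit exception."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), waiting base_delay * 2**attempt seconds after each rate-limit error."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")  # The loop always returns or raises.
```

The `sleep` parameter is injectable so the retry schedule can be checked in tests without actually waiting.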

	## Project Structure

	```
	.
	├── agent.py            # Main agent implementation
	├── evaluation_app.py   # Gradio app for evaluation
	├── test_agent.py       # Local testing script
	├── requirements.txt    # Python dependencies
	├── .env                # API keys (not committed)
	└── README.md           # This file
	```

	## Required API Keys

	- GROQ_API_KEY: Get from [console.groq.com](https://console.groq.com)
	- TAVILY_API_KEY: Get from [tavily.com](https://tavily.com)
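
A quick startup check can catch a missing key before the agent makes its first API call. The sketch below assumes the keys are already present in the process environment (for example, loaded from `.env` by a package such as python-dotenv); the helper name `missing_api_keys` is hypothetical.

```python
import os
from typing import Mapping

REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required keys that are unset or blank in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        raise SystemExit(f"Missing API keys: {', '.join(missing)}")
    print("All required API keys are set.")
```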

	## Expected Performance

	Estimated accuracy as tools are added (cumulative, approximate):
	- Web Search + Wikipedia + Calculator: ~25-30%
	- + File Reader: ~35-40%
	- + Python Executor: ~40-45%

	## Course Information

	This project is part of the [Hugging Face Agents Course](https://huggingface.co/learn/agents-course) Unit 4 certification.

	## License

	MIT License - Feel free to use and modify for your own certification!