GAIA AI Agent - Hugging Face Space Setup

This directory contains an optimized GAIA AI agent designed for the Hugging Face Unit 4 final assignment.

🎯 Goal

Score at least 30% on GAIA Level 1 questions to earn certification.

🚀 Quick Setup

1. Create a Hugging Face Space

  1. Go to Hugging Face Spaces
  2. Click "Create new Space"
  3. Choose "Gradio" as the SDK
  4. Upload all files from this hf_space directory

2. Set up API Keys

  1. Get a free Groq API key from console.groq.com
  2. (Optional) Get a Tavily API key from tavily.com
  3. In your Space settings, add these as secrets:
    • GROQ_API_KEY: Your Groq API key
    • TAVILY_API_KEY: Your Tavily API key (optional)
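Spaces secrets are exposed to the running app as environment variables, so the keys can be read at startup. The following is a minimal sketch, not the code in agent.py; `load_api_keys` is a hypothetical helper name introduced here for illustration.

```python
import os


def load_api_keys(env=None):
    """Return (groq_key, tavily_key); the Tavily key may be None.

    Hypothetical helper for illustration, not taken from agent.py.
    """
    env = os.environ if env is None else env
    groq_key = env.get("GROQ_API_KEY")
    if not groq_key:
        # The Groq key is required, so fail early with a clear message.
        raise RuntimeError("GROQ_API_KEY is not set; add it as a Space secret.")
    # The Tavily key is optional; without it, web search is unavailable.
    tavily_key = env.get("TAVILY_API_KEY") or None
    return groq_key, tavily_key
```

Passing a plain dict as `env` makes the helper easy to check locally without touching real secrets.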

3. Run the Evaluation

  1. Open your Space
  2. Log in with your Hugging Face account
  3. Click "Run Evaluation & Submit All Answers"
  4. Wait for results (usually 2-5 minutes)

🧠 Agent Features

  • Fast LLM: Uses Llama 3.1 70B via Groq for quick responses
  • Web Search: Real-time information via Tavily API
  • Math Tools: Built-in calculator for numerical problems
  • Optimized: Streamlined for speed and accuracy
  • Error Handling: Robust error management
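As an illustration of the calculator feature, here is one way such a tool could be written so that it never calls `eval()` on model output. This is a sketch under assumptions: the function name `calculate` and the set of allowed operators are chosen here and may not match agent.py.

```python
import ast
import operator

# Whitelisted operators; any other syntax is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculate(expression):
    """Evaluate a plain arithmetic expression by walking its syntax tree."""

    def _eval(node):
        if isinstance(node, ast.Constant):
            # Accept only int/float literals (bool is excluded explicitly).
            if type(node.value) in (int, float):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)
```

Function calls, names, and attribute access all raise `ValueError`. Exponentiation is allowed in this sketch, so very large powers are not guarded against.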

πŸ“ Files Overview

  • app.py: Main Gradio application
  • agent.py: Core GAIA agent implementation
  • requirements.txt: Python dependencies
  • system_prompt.txt: Agent instructions
  • README.md: Space documentation
  • .env.example: Environment variable template

🔧 Technical Details

The agent uses a multi-step approach:

  1. Analysis: Determines if tools are needed
  2. Tool Usage: Applies calculations or web search
  3. Reasoning: Combines information for final answer
  4. Formatting: Ensures proper "FINAL ANSWER:" format
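The four steps above can be sketched as a single function. Everything here is illustrative: `choose_tool`, `run_agent`, the keyword heuristic, and the prompt wording are assumptions, and the real logic in agent.py may differ. The LLM and the tools are passed in as plain callables, so the flow can be exercised without any API keys.

```python
def choose_tool(question):
    """Step 1, analysis: pick a tool name or None (toy keyword heuristic)."""
    has_digit = any(ch.isdigit() for ch in question)
    has_operator = any(op in question for op in "+-*/")
    if has_digit and has_operator:
        return "calculator"
    if any(word in question.lower() for word in ("latest", "current", "who is")):
        return "search"
    return None


def run_agent(question, llm, tools):
    """Run analysis, tool usage, reasoning, and formatting for one question.

    `llm` maps a prompt string to a reply string; `tools` maps a tool name
    to a callable that takes the question and returns a string.
    """
    tool_name = choose_tool(question)          # 1. Analysis
    evidence = ""
    if tool_name in tools:                     # 2. Tool usage
        evidence = tools[tool_name](question)
    prompt = (
        f"Question: {question}\n"
        f"Evidence: {evidence}\n"
        "End with 'FINAL ANSWER: <answer>'."
    )
    reply = llm(prompt)                        # 3. Reasoning
    if "FINAL ANSWER:" not in reply:           # 4. Formatting
        reply = f"FINAL ANSWER: {reply.strip()}"
    return reply
```

For example, `run_agent("What is 6 * 7?", llm, {"calculator": my_calc})` would route the question through the calculator before asking the model for the final answer.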

🎯 Optimization for GAIA

  • Focused on Level 1 questions (basic reasoning)
  • Model selection: Llama 3.1 70B for capability, served via Groq for speed
  • Minimal tool overhead
  • Direct answer extraction
  • Error recovery mechanisms
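Direct answer extraction can be as simple as taking the text after the last "FINAL ANSWER:" marker. The sketch below assumes a hypothetical helper named `extract_final_answer`; the fallback behaviour (returning the whole reply when the marker is missing) is a design choice made here, not something confirmed by agent.py.

```python
import re


def extract_final_answer(reply):
    """Return the first line after the last 'FINAL ANSWER:' marker.

    If the marker is missing, fall back to the whole reply, stripped.
    """
    parts = re.split(r"FINAL ANSWER:", reply, flags=re.IGNORECASE)
    if len(parts) == 1:
        # No marker found: return the reply rather than failing.
        return reply.strip()
    answer = parts[-1].strip()
    # Keep only the first line so trailing commentary is dropped.
    return answer.splitlines()[0].strip() if answer else ""
```

Using the last marker rather than the first helps when the model restates or corrects its answer partway through a reply.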

📊 Expected Performance

Target: 30%+ accuracy on GAIA Level 1 questions

  • Mathematical problems: High accuracy
  • Web search questions: Good accuracy with Tavily
  • Reasoning tasks: Moderate to high accuracy
  • Overall: Should reach the 30% certification threshold
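To make the 30% target concrete, the check reduces to simple arithmetic. The sketch below uses hypothetical function names, and the question count in the usage note is an illustrative assumption rather than a figure taken from this guide.

```python
def accuracy_percent(correct, total):
    """Return the share of correct answers as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


def meets_threshold(correct, total, threshold=30.0):
    """True if the accuracy reaches the certification threshold."""
    return accuracy_percent(correct, total) >= threshold
```

Assuming, purely for illustration, an evaluation set of 20 questions, 6 correct answers would meet the target and 5 would not.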

πŸ› οΈ Customization

You can improve the agent by:

  • Adjusting the system prompt
  • Adding more specialized tools
  • Fine-tuning the answer extraction
  • Implementing caching mechanisms
  • Adding more robust error handling
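As one example of the caching idea, a search function can be wrapped with an in-memory cache so that repeated queries do not trigger repeated API calls. This is a sketch, not part of the current agent: `make_cached_search` is a hypothetical name, and `search_fn` stands for whatever callable performs the web search.

```python
from functools import lru_cache


def make_cached_search(search_fn, maxsize=128):
    """Wrap `search_fn` (query -> result) with an in-memory LRU cache."""

    @lru_cache(maxsize=maxsize)
    def cached(normalized_query):
        return search_fn(normalized_query)

    def search(query):
        # Normalize case and whitespace so near-identical queries share an entry.
        return cached(" ".join(query.lower().split()))

    return search
```

Note that the wrapped function receives the normalized (lowercased) query. The cache lives only as long as the process, so it is cleared whenever the Space restarts.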

Good luck with your certification! 🎉