metadata
title: GAIA Unit 4 Agent
emoji: 🧭
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0

GAIA Unit 4: Hugging Face Agents Course (final assignment)

This folder is a drop-in replacement for the course Space agents-course/Final_Assignment_Template.

One-time: create your Space

  1. On Hugging Face, Duplicate the template Space above (or create a new Gradio Space and copy these files into the repo root).
  2. In the Space Settings → Repository secrets, add:
    • HF_TOKEN: a Hugging Face access token with read permission (for Inference API / serverless models).
  3. Optional Variables (or secrets) to tune models:
    • HF_INFERENCE_PROVIDER: leave unset by default so the client uses auto, i.e. the first inference provider that supports your chosen model on the Hub. Do not set hf-inference unless the model lists it: many chat models (e.g. Qwen2.5-7B-Instruct) only support together / featherless-ai, and forcing hf-inference yields a 404. If the auto order hits a provider that returns 401 (e.g. Novita), reorder the providers in your HF settings or pin one, e.g. HF_INFERENCE_PROVIDER=together.
    • GAIA_TEXT_MODEL: default Qwen/Qwen2.5-7B-Instruct (broad provider mapping via Together).
    • GAIA_ASR_MODEL: default openai/whisper-large-v3
    • GAIA_VISION_MODEL: default meta-llama/Llama-3.2-11B-Vision-Instruct
    • GAIA_API_URL: default https://agents-course-unit4-scoring.hf.space
    • GAIA_USE_CACHE: 1 (default) or 0 to disable gaia_answers_cache.json
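
The variables above could be read along the following lines. This is a minimal sketch, not the Space's actual code: the GaiaSettings class and load_settings function are hypothetical names, and only the variable names and default values are taken from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GaiaSettings:
    """Hypothetical container for the settings listed above."""
    hf_token: Optional[str]
    provider: str
    text_model: str
    asr_model: str
    vision_model: str
    api_url: str
    use_cache: bool


def load_settings(env: Mapping[str, str] = os.environ) -> GaiaSettings:
    """Read settings from a mapping, falling back to the documented defaults."""
    return GaiaSettings(
        hf_token=env.get("HF_TOKEN"),
        # An unset or empty provider means "auto": let the client pick.
        provider=env.get("HF_INFERENCE_PROVIDER") or "auto",
        text_model=env.get("GAIA_TEXT_MODEL", "Qwen/Qwen2.5-7B-Instruct"),
        asr_model=env.get("GAIA_ASR_MODEL", "openai/whisper-large-v3"),
        vision_model=env.get(
            "GAIA_VISION_MODEL", "meta-llama/Llama-3.2-11B-Vision-Instruct"
        ),
        api_url=env.get(
            "GAIA_API_URL", "https://agents-course-unit4-scoring.hf.space"
        ).rstrip("/"),
        # Assumption: only the literal value "0" disables the answer cache.
        use_cache=env.get("GAIA_USE_CACHE", "1") != "0",
    )
```

Passing the environment as a mapping keeps the function easy to exercise without touching real secrets, e.g. load_settings({"HF_INFERENCE_PROVIDER": "together"}).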

Keep the Space public so the agent_code link (…/tree/main) can be verified for the leaderboard.

Local dry-run (no submission)

cd gaia_unit4_space
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export HF_TOKEN=hf_...
python run_local_eval.py

This fetches /questions, runs the agent, prints answers, and writes local_eval_answers.json. It does not call /submit.
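
The flow described above might look roughly like this. It is a sketch under stated assumptions, not the contents of run_local_eval.py: the function names are hypothetical, the "question" field and the "submitted_answer" key are assumed shapes for the question and answer records, and the agent is any callable that maps a question string to an answer string.

```python
import json
import urllib.request
from typing import Callable, Dict, List

API_URL = "https://agents-course-unit4-scoring.hf.space"


def fetch_questions(api_url: str = API_URL) -> List[Dict]:
    """GET /questions and decode the JSON body."""
    with urllib.request.urlopen(f"{api_url}/questions", timeout=30) as resp:
        return json.load(resp)


def dry_run(
    questions: List[Dict],
    agent: Callable[[str], str],
    out_path: str = "local_eval_answers.json",
) -> List[Dict]:
    """Run the agent on every question, print and save answers; never submits."""
    answers = []
    for q in questions:
        answer = agent(q["question"])  # assumed field name
        print(f"{q['task_id']}: {answer}")
        answers.append({"task_id": q["task_id"], "submitted_answer": answer})
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(answers, fh, indent=2)
    return answers


# Example call (needs network access and a real agent):
# dry_run(fetch_questions(), agent=my_agent)
```

Keeping the fetch and the run in separate functions means the loop can be tried offline with a hand-written question list.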

What was fixed vs the stock template

  • Downloads attachments when file_name is set (GET /files/{task_id}).
  • Tool-using agent (web, Wikipedia, Python, Excel, ASR, vision, YouTube transcripts).
  • Deterministic shortcuts for the reversed-English puzzle, Cayley-table commutativity, .py stdout, and .xlsx food-sales heuristic.
  • Optional Crypto tab (BTC/USD demo only; not used for GAIA).
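
Two of the fixes listed above, the attachment download and the deterministic shortcuts, can be illustrated with a short sketch. None of these helpers are taken from the Space's code: the function names are hypothetical, and the Cayley table is assumed to be already parsed into a nested dict where table[a][b] holds the product of a and b.

```python
import urllib.request
from typing import Dict, List, Optional

API_URL = "https://agents-course-unit4-scoring.hf.space"


def attachment_url(question: Dict, api_url: str = API_URL) -> Optional[str]:
    """Return the /files/{task_id} URL, or None when no file_name is set."""
    if not question.get("file_name"):
        return None
    return f"{api_url.rstrip('/')}/files/{question['task_id']}"


def download_attachment(question: Dict, api_url: str = API_URL) -> Optional[bytes]:
    """Fetch the attachment bytes if the question has one."""
    url = attachment_url(question, api_url)
    if url is None:
        return None
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def unreverse(text: str) -> str:
    """Reversed-English shortcut: read the characters back to front."""
    return text[::-1]


def non_commutative_elements(table: Dict[str, Dict[str, str]]) -> List[str]:
    """List every element that appears in a pair (a, b) with a*b != b*a."""
    involved = set()
    for a in table:
        for b in table:
            if table[a][b] != table[b][a]:
                involved.update((a, b))
    return sorted(involved)
```

Shortcuts like these answer a question without calling a model, which makes the result repeatable across runs.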

Leaderboard

Submit scores via the Gradio app after logging in. Student leaderboard: agents-course/Students_leaderboard.