GAIA Final Assignment – Baseline Specification

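This project is developed as part of the Hugging Face AI Agents Course (Unit 4).
The goal is to implement an agent that fetches GAIA tasks from the course scoring API, answers them, and submits the answers for scoring. The baseline decisions are: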

| Component | Decision |
| --- | --- |
| LLM | Qwen/Qwen1.5-1.8B-Chat |
| Hardware | CPU-only (both locally and in deployment) |
| Development Flow | Code is tested in Google Colab (CPU mode), then ported to the HF Space |
| Submission Interface | Uses the provided endpoints: `/questions` and `/submit` |
| UI | Gradio-based interface with OAuth login (`gr.LoginButton`) |
| Logging / Observability | Step-by-step logging with [REASONING], [ACTION], [OBSERVATION], [ANSWER] blocks |
| Agent Framework (Phase 1) | Manual ReAct implementation with full control for transparency and debugging |
| Agent Framework (Phase 2) | Planned upgrade to smolagents for simplified tool integration and scaling logic |
| Tooling Strategy | Begin with a calculator; add web search, Python code execution, and Wikipedia access incrementally |
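
The tagged logging described in the table could be sketched as follows. This is a minimal illustration, not the project's actual logger: only the four tag names come from the specification, while the `TraceLog` class and its methods are hypothetical names introduced here.

```python
from dataclasses import dataclass, field

# The four tags come from the specification; everything else is illustrative.
TAGS = ("REASONING", "ACTION", "OBSERVATION", "ANSWER")


@dataclass
class TraceLog:
    """Collects tagged trace lines for one task (hypothetical helper)."""

    lines: list = field(default_factory=list)

    def add(self, tag: str, text: str) -> str:
        """Append one '[TAG] text' line and return it."""
        if tag not in TAGS:
            raise ValueError(f"unknown tag: {tag}")
        line = f"[{tag}] {text.strip()}"
        self.lines.append(line)
        return line

    def render(self) -> str:
        """Return the full trace, one tagged line per step."""
        return "\n".join(self.lines)

    def final_answer(self):
        """Return the text of the most recent [ANSWER] line, or None."""
        prefix = "[ANSWER] "
        for line in reversed(self.lines):
            if line.startswith(prefix):
                return line[len(prefix):]
        return None


log = TraceLog()
log.add("REASONING", "The question asks for 12 * 7, so the calculator is needed.")
log.add("ACTION", "calculator: 12 * 7")
log.add("OBSERVATION", "84")
log.add("ANSWER", "84")
print(log.render())
```

Keeping the trace in a separate object makes it straightforward to show the full log for debugging while submitting only the value returned by `final_answer()`.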

| GAIA Level | Description | Covered in Plan |
| --- | --- | --- |
| Level 1 | ReAct agent with one tool | Included in Phase 1 baseline |
| Level 2 | Robust instruction parsing | Planned via prompt engineering |
| Level 3 | Self-reflection and retry | Planned in Phase 2 and 3 upgrades |
| Level 4 | Tool chaining | Planned in Phase 3 |
| Level 5 | Multimodal or complex tasks | Currently out of scope |

HF OAuth Login enabled via Gradio
Agent receives tasks from: https://agents-course-unit4-scoring.hf.space/questions
Submits answers to: https://agents-course-unit4-scoring.hf.space/submit
Submission includes:
- username (from login)
- agent_code (this Space URL)
- answer list (one per task)
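
The submission described above might be assembled as in the sketch below. The two endpoint URLs and the `username` and `agent_code` fields come from this specification; the `answers` key and the per-item keys `task_id` and `submitted_answer` are assumptions that should be checked against the scoring API, and the function names are hypothetical. The network call is defined but not executed here.

```python
import json
import urllib.request

# Endpoints as given in this specification.
QUESTIONS_URL = "https://agents-course-unit4-scoring.hf.space/questions"
SUBMIT_URL = "https://agents-course-unit4-scoring.hf.space/submit"


def build_submission(username: str, space_url: str, answers: dict) -> dict:
    """Assemble the submission body from a {task_id: answer} mapping.

    ASSUMPTION: the keys "answers", "task_id" and "submitted_answer" are
    illustrative; verify the exact schema against the scoring API.
    """
    return {
        "username": username.strip(),
        "agent_code": space_url,
        "answers": [
            {"task_id": task_id, "submitted_answer": str(answer)}
            for task_id, answer in answers.items()
        ],
    }


def post_submission(payload: dict, timeout: float = 60.0) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values for illustration only; nothing is sent to the server.
payload = build_submission(
    username="example-user",
    space_url="https://huggingface.co/spaces/example-user/example-space",
    answers={"task-001": "84", "task-002": "Paris"},
)
print(json.dumps(payload, indent=2))
```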

| Feature | Description |
| --- | --- |
| ReAct loop | Simple reasoning + single tool use |
| Tools | Calculator (initial) |
| Output format | Clean, final answers (no trace steps included) |
| Logging | Inline reasoning log to support debugging |
| Model behavior | Deterministic generation (low temperature) |
| Deployment | Fully compatible with HF Space CPU runtime |
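
A single-tool ReAct loop of the kind listed above could look roughly like this. It is a sketch under stated assumptions, not the project's implementation: the `Action: calculator[...]` and `Final Answer:` line formats, the function names, and the `scripted_llm` stand-in (used here instead of the real Qwen model) are all hypothetical.

```python
import ast
import operator
import re

# Arithmetic operators the calculator tool accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def calculator(expression: str):
    """Evaluate +, -, *, / and unary minus on numeric literals only."""

    def _eval(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression.strip(), mode="eval").body)


# ASSUMPTION: these output formats are illustrative prompt conventions.
ACTION_RE = re.compile(r"^Action:\s*calculator\[(.+)\]\s*$", re.MULTILINE)
ANSWER_RE = re.compile(r"^Final Answer:\s*(.+)$", re.MULTILINE)


def react(question: str, llm, max_steps: int = 3) -> str:
    """Run the loop and return only the final answer, without trace steps."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        answer = ANSWER_RE.search(reply)
        if answer:
            return answer.group(1).strip()
        action = ACTION_RE.search(reply)
        if action:
            try:
                observation = calculator(action.group(1))
            except (ValueError, SyntaxError, ZeroDivisionError) as exc:
                observation = f"error: {exc}"
            transcript += f"Observation: {observation}\n"
    return ""  # no final answer within the step budget


def scripted_llm(transcript: str) -> str:
    """Stand-in for the real model: deterministic canned replies."""
    if "Observation:" not in transcript:
        return "Thought: I need to multiply.\nAction: calculator[12 * 7]"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"


print(react("What is 12 * 7?", scripted_llm))
```

Parsing the expression with `ast` rather than calling `eval` keeps the tool restricted to plain arithmetic, and returning only the matched final answer keeps trace steps out of the submitted output.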

Maintained by: Igor Pavlov