# Multi-Model Agentic AI: GAIA Benchmark Solver
This project was developed as part of the Hugging Face Agents Course. It features an advanced autonomous agent designed to solve complex, multi-step tasks from the GAIA (General AI Assistants) benchmark (Level 1).
The agent leverages the Re-Act (Reasoning + Acting) framework via LangGraph to navigate through tools, manage long-form reasoning, and handle diverse data formats including web content, spreadsheets, audio, and video.
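The Thought-Action-Observation loop that Re-Act prescribes can be sketched in plain Python. This is an illustrative stand-in, not the LangGraph implementation in `react_agent.py`; the `llm` callable and the tool signatures are hypothetical:

```python
def react_loop(question, llm, tools, max_steps=5):
    """Minimal Thought-Action-Observation loop (a sketch, not the
    LangGraph graph used in react_agent.py).

    `llm` is a hypothetical callable: scratchpad -> (thought, action, arg).
    The action "final" ends the loop; any other action names a tool.
    """
    scratchpad = f"Question: {question}"
    for _ in range(max_steps):
        thought, action, arg = llm(scratchpad)
        scratchpad += f"\nThought: {thought}\nAction: {action}({arg})"
        if action == "final":
            return arg  # the agent's final answer
        observation = tools[action](arg)  # run the chosen tool
        scratchpad += f"\nObservation: {observation}"
    return None  # step budget exhausted
```

Each iteration appends the model's reasoning and the tool's observation to a growing scratchpad, which is exactly what makes the loop traceable in LangFuse.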
## 🚀 Key Features
- Hybrid Multi-Model Orchestration: To overcome the rate limits of free-tier plans, the system implements a robust fallback mechanism. It primarily uses Gemini 2.0 Pro, with automated failover to Gemini 2.0 Flash, Mistral Large, and various models on Groq (Llama 3.3, DeepSeek R1, Qwen).
- Advanced Toolset:
  - Web Semantic Search: Intelligent web browsing and information extraction.
  - Data Manipulation: Tools for processing and analyzing Excel/CSV spreadsheets.
  - Audio & Video Analysis: Custom-built logic to transcribe audio and analyze video content without relying on expensive, dedicated video APIs.
- Custom RAG: A Retrieval-Augmented Generation pipeline using ChromaDB for efficient context injection.
- Observability: Integrated with LangFuse (hosted locally) to monitor agent traces, evaluate performance, and debug the Thought-Action-Observation loops.
- User Interface: A clean, interactive UI built with Gradio and hosted on Hugging Face Spaces.
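The retrieval step of the Custom RAG feature can be illustrated with a pure-Python stand-in for ChromaDB's nearest-neighbor query. The bag-of-words `embed` below is a deliberate toy; the real pipeline in `web_semantic_search_tool.py` uses ChromaDB with proper embeddings:

```python
from collections import Counter
import math

def embed(text):
    # Toy "embedding": bag-of-words term counts (stand-in for a real model).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]
```

The retrieved chunks are then injected into the agent's context, which is the "context injection" the feature list refers to.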
## 🏗️ Architecture & Project Structure
The project is organized to separate the agent logic from the core utility functions, ensuring the agent doesn't get "confused" by an over-saturated toolset.
### File Map
- `app.py`: The entry point. Manages the Gradio UI, Hugging Face OAuth, and the multi-model fallback loop for the evaluation runner.
- `react_agent.py`: Contains the core logic for the LangGraph agent and the Re-Act prompt engineering.
- `custom_tools.py`: Definitions of the high-level tools available to the agent.
- `utils.py`: The "engine room" containing complex functions (video analysis logic, audio transcription, file processing) called by the tools.
- `web_semantic_search_tool.py`: Specialized module for RAG and semantic web queries using ChromaDB.
- `requirements.txt`: List of dependencies including `langgraph`, `chromadb`, `gradio`, and model SDKs.
- `*.ipynb`: Testing sandboxes for Mistral, LangChain, and agent components.
## 🛠️ Technical Challenges & Solutions
### 1. Free-Plan Resilience
The biggest challenge was maintaining execution during the 20-question GAIA evaluation without crashing due to API quotas.
Solution: I implemented a recursive retry strategy in app.py. If one provider (e.g., Google) returns a 429 or 500 error, the agent automatically re-instantiates using a different provider (Mistral or Groq) and continues from the same task.
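That failover loop could look roughly like this. The `providers` list and `build_agent` factory are illustrative stand-ins; the real logic lives in `app.py`:

```python
def run_evaluation(questions, providers, build_agent):
    """Resume the evaluation run from the failing task after switching provider.

    `build_agent` is a hypothetical factory: provider name -> object with an
    .answer(question) method. On a quota/server error (modeled here as
    RuntimeError, standing in for 429/500 responses) we rebuild the agent
    with the next provider and retry the SAME question, rather than
    restarting the whole 20-question run.
    """
    answers = []
    p = 0
    agent = build_agent(providers[p])
    i = 0
    while i < len(questions):
        try:
            answers.append(agent.answer(questions[i]))
            i += 1  # only advance on success
        except RuntimeError:
            p += 1
            if p >= len(providers):
                raise  # every provider exhausted
            agent = build_agent(providers[p])  # re-instantiate, keep index i
    return answers
```

Keeping the question index outside the try/except is what lets the run continue "from the same task" instead of from scratch.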
### 2. Video Analysis Without Video APIs
Since free video analysis tools are scarce, I developed a custom "Video-to-Insight" pipeline in utils.py that breaks down video tasks into manageable image and text analysis steps that standard LLMs can process.
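In outline, the pipeline reduces a video to per-frame descriptions plus a transcript and lets a text-only model do the final reasoning. The function names here are hypothetical sketches of the logic in `utils.py`:

```python
def video_to_insight(frame_paths, transcript, describe_frame, summarize):
    """Sketch of a "Video-to-Insight" pipeline (names are illustrative).

    Instead of a dedicated video API, the video is reduced to:
      (a) sampled frames, each described by an image-capable model
          via the `describe_frame` callable, and
      (b) the audio transcript.
    A final text-only call (`summarize`) merges both into an answer.
    """
    descriptions = [describe_frame(p) for p in frame_paths]
    context = "\n".join(descriptions) + "\nTranscript:\n" + transcript
    return summarize(context)
```

The key idea is decomposition: each step is something a standard multimodal or text LLM can already do, so no specialized (and costly) video endpoint is needed.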
### 3. Tool Optimization
To prevent the agent from losing focus, I followed the "Thin Tool, Fat Utility" pattern. Instead of giving the agent 20 simple tools, I gave it 5 powerful, "smart" tools that utilize complex logic hidden in utils.py.
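The pattern can be shown with a hypothetical spreadsheet tool: the agent-facing wrapper stays thin and well-documented, while the heavy logic would sit in `utils.py`:

```python
def analyze_spreadsheet_impl(path, question):
    # "Fat utility": in the real project, complex parsing, cleaning, and
    # aggregation would happen here (this body is a placeholder).
    return f"answer to {question!r} from {path}"

def spreadsheet_tool(path: str, question: str) -> str:
    """Answer a question about an Excel/CSV file.

    "Thin tool": this docstring is what the agent 'sees'. It stays short
    and precise so a small set of tools does not dilute the agent's focus.
    """
    return analyze_spreadsheet_impl(path, question)
```

Because the agent only ever reasons over the short tool signatures and docstrings, five "smart" tools keep its decision space small while the utilities behind them can grow arbitrarily complex.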
## 🚦 Getting Started
### Prerequisites
- Python 3.10+
- API Keys for: Google (AI Studio), Mistral AI, and Groq.
- A local LangFuse instance (optional, for tracing).
### Installation
- Clone the repository:
  `git clone https://huggingface.co/spaces/[YOUR_USERNAME]/[YOUR_SPACE_NAME]`
  `cd [YOUR_SPACE_NAME]`
- Install dependencies:
  `pip install -r requirements.txt`
- Run the app:
  `python app.py`
## 🎓 Certification
This project was completed for the Hugging Face Agents Course, covering:
- Theory: LLM Mechanics, Re-Act, LangGraph, RAG, and Benchmarking (GAIA).
- Practice: Building and deploying a functional agent capable of autonomous tool use.
---
title: Template Final Assignment
emoji: 🕵🏻♂️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
hf_oauth: true
# optional, default duration is 8 hours/480 minutes. Max duration is 30 days/43200 minutes.
hf_oauth_expiration_minutes: 480
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference