Create README.md
README.md
CHANGED
|
@@ -1,103 +1,23 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
-
|
| 4 |
|
| 5 |
-
The
|
| 6 |
|
| 7 |
-
|
| 8 |
|
| 9 |
-
- **
|
| 10 |
-
- **Advanced Toolset**:
|
| 11 |
-
- **Web Semantic Search**: Intelligent web browsing and information extraction.
|
| 12 |
-
- **Data Manipulation**: Tools for processing and analyzing Excel/CSV spreadsheets.
|
| 13 |
-
- **Audio & Video Analysis**: Custom-built logic to transcribe audio and analyze video content without relying on expensive, dedicated video APIs.
|
| 14 |
-
- **Custom RAG**: A Retrieval-Augmented Generation pipeline using **ChromaDB** for efficient context injection.
|
| 15 |
-
- **Observability**: Integrated with **LangFuse** (hosted locally) to monitor agent traces, evaluate performance, and debug the Thought-Action-Observation loops.
|
| 16 |
-
- **User Interface**: A clean, interactive UI built with **Gradio** and hosted on **Hugging Face Spaces**.
|
| 17 |
|
| 18 |
-
--
|
| 19 |
|
| 20 |
-
|
| 21 |
|
| 22 |
-
|
| 23 |
|
| 24 |
-
|
| 25 |
|
| 26 |
-
-
|
| 27 |
-
- `react_agent.py`: Contains the core logic for the **LangGraph** agent and the Re-Act prompt engineering.
|
| 28 |
-
- `custom_tools.py`: Definitions of the high-level tools available to the agent.
|
| 29 |
-
- `utils.py`: The "engine room" containing complex functions (video analysis logic, audio transcription, file processing) called by the tools.
|
| 30 |
-
- `web_semantic_search_tool.py`: Specialized module for RAG and semantic web queries using ChromaDB.
|
| 31 |
-
- `requirements.txt`: List of dependencies including `langgraph`, `chromadb`, `gradio`, and model SDKs.
|
| 32 |
-
- `*.ipynb`: Testing sandboxes for Mistral, LangChain, and agent components.
|
| 33 |
|
| 34 |
-
-
|
| 35 |
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
### 1. The "Free Plan" Resilience
|
| 39 |
-
|
| 40 |
-
The biggest challenge was maintaining execution during the 20-question GAIA evaluation without crashing due to API quotas.
|
| 41 |
-
**Solution:** I implemented a recursive retry strategy in `app.py`. If one provider (e.g., Google) returns a 429 or 500 error, the agent automatically re-instantiates using a different provider (Mistral or Groq) and continues from the same task.
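The retry strategy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in `app.py`: `ProviderError`, `run_with_fallback`, and the `(name, callable)` provider list are hypothetical names introduced here, and real provider SDKs raise their own exception types.

```python
class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP-like status code."""

    def __init__(self, status):
        super().__init__(f"provider failed with status {status}")
        self.status = status


# Status codes treated as "switch provider and try again" (429 and 500, as in the text).
RETRYABLE = {429, 500}


def run_with_fallback(task, providers, start=0):
    """Run `task` on providers[start]; on a retryable error, recurse to the next one.

    `providers` is an ordered list of (name, callable) pairs (an assumption of
    this sketch). Returns (provider_name, result) for the first success.
    """
    if start >= len(providers):
        raise RuntimeError("all providers exhausted")
    name, call = providers[start]
    try:
        return name, call(task)
    except ProviderError as err:
        if err.status not in RETRYABLE:
            raise  # non-quota errors are not masked by the fallback
        # Same task, next provider: the question being answered is not lost.
        return run_with_fallback(task, providers, start + 1)
```

Because the task is passed along unchanged, a failure on one provider resumes the same question on the next one instead of restarting the whole evaluation run.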
|
| 42 |
-
|
| 43 |
-
### 2. Video Analysis Without Video APIs
|
| 44 |
-
|
| 45 |
-
Since free video analysis tools are scarce, I developed a custom "Video-to-Insight" pipeline in `utils.py` that breaks down video tasks into manageable image and text analysis steps that standard LLMs can process.
|
| 46 |
-
|
| 47 |
-
### 3. Tool Optimization
|
| 48 |
-
|
| 49 |
-
To prevent the agent from losing focus, I followed the "Thin Tool, Fat Utility" pattern. Instead of giving the agent 20 simple tools, I gave it 5 powerful, "smart" tools that utilize complex logic hidden in `utils.py`.
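A minimal sketch of what the "Thin Tool, Fat Utility" split might look like for a spreadsheet tool. The function names `summarize_csv` and `spreadsheet_tool` are hypothetical and chosen for illustration; the actual tools in `custom_tools.py` and `utils.py` may be organized differently.

```python
import csv
import io


# "Fat" utility layer (the kind of function that would live in utils.py; hypothetical).
def summarize_csv(text, column):
    """Parse CSV text and return count, sum, and mean of one numeric column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # Skip rows where the column is missing or empty; non-numeric cells raise ValueError.
    values = [float(r[column]) for r in rows if r.get(column) not in (None, "")]
    total = sum(values)
    mean = total / len(values) if values else 0.0
    return {"count": len(values), "sum": total, "mean": mean}


# "Thin" tool layer (the kind of wrapper that would live in custom_tools.py; hypothetical).
def spreadsheet_tool(csv_text: str, column: str) -> str:
    """Tool exposed to the agent: delegates to the utility and formats a short answer."""
    try:
        stats = summarize_csv(csv_text, column)
    except ValueError as err:
        # Return the error as text so the agent can observe it and react.
        return f"error: {err}"
    return f"{column}: n={stats['count']}, sum={stats['sum']:g}, mean={stats['mean']:g}"
```

The agent only sees one tool with a short signature and a plain-text result, while parsing and arithmetic stay in the utility layer where they can be tested separately.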
|
| 50 |
-
|
| 51 |
-
---
|
| 52 |
-
|
| 53 |
-
## 🚦 Getting Started
|
| 54 |
-
|
| 55 |
-
### Prerequisites
|
| 56 |
-
|
| 57 |
-
- Python 3.10+
|
| 58 |
-
- API Keys for: Google (AI Studio), Mistral AI, and Groq.
|
| 59 |
-
- A local LangFuse instance (optional, for tracing).
|
| 60 |
-
|
| 61 |
-
### Installation
|
| 62 |
-
|
| 63 |
-
1. Clone the repository:
|
| 64 |
-
```bash
|
| 65 |
-
git clone https://huggingface.co/spaces/[YOUR_USERNAME]/[YOUR_SPACE_NAME]
|
| 66 |
-
cd [YOUR_SPACE_NAME]
|
| 67 |
-
```
|
| 68 |
-
|
| 69 |
-
2. Install dependencies:
|
| 70 |
-
```bash
|
| 71 |
-
pip install -r requirements.txt
|
| 72 |
-
```
|
| 73 |
-
|
| 74 |
-
3. Run the app:
|
| 75 |
-
```bash
|
| 76 |
-
python app.py
|
| 77 |
-
```
|
| 78 |
-
|
| 79 |
-
---
|
| 80 |
-
|
| 81 |
-
## 🎓 Certification
|
| 82 |
-
|
| 83 |
-
This project was completed for the **Hugging Face Agents Course**, covering:
|
| 84 |
-
|
| 85 |
-
- **Theory**: LLM Mechanics, Re-Act, LangGraph, RAG, and Benchmarking (GAIA).
|
| 86 |
-
- **Practice**: Building and deploying a functional agent capable of autonomous tool use.
|
| 87 |
-
|
| 88 |
-
---
|
| 89 |
-
|
| 90 |
-
title: Template Final Assignment
|
| 91 |
-
emoji: 🕵🏻‍♂️
|
| 92 |
-
colorFrom: indigo
|
| 93 |
-
colorTo: indigo
|
| 94 |
-
sdk: gradio
|
| 95 |
-
sdk_version: 5.25.2
|
| 96 |
-
app_file: app.py
|
| 97 |
-
pinned: false
|
| 98 |
-
hf_oauth: true
|
| 99 |
-
# optional, default duration is 8 hours/480 minutes. Max duration is 30 days/43200 minutes.
|
| 100 |
-
hf_oauth_expiration_minutes: 480
|
| 101 |
-
---
|
| 102 |
-
|
| 103 |
-
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
|
| 1 |
+
# 🤖 Autonomous Agentic System – GAIA Benchmark Solver
|
| 2 |
|
| 3 |
+
Final project for the **Hugging Face Agents Course**. I developed a high-level autonomous agent capable of solving complex, multi-step tasks from the **GAIA Benchmark** (General AI Assistants), involving real-world tool usage and multimodal reasoning.
|
| 4 |
|
| 5 |
+
**The concept:** A robust agentic workflow built with **LangGraph** that follows a Thought-Action-Observation cycle to decompose 20 validation queries into executable steps, navigating through technical constraints like API rate limits and data extraction challenges.
|
| 6 |
|
| 7 |
+
**Technical highlights:**
|
| 8 |
|
| 9 |
+
- **Resilient Model Orchestration:** Implemented a **fallback & routing strategy** with Gemini 2.5 Pro as the primary model and automatic switching to Gemini Flash, Mistral, or Groq-hosted models when free-tier rate limits are hit, so the execution flow is not interrupted.
|
| 10 |
|
| 11 |
+
- **Advanced Tool Engineering:** Instead of overloading the context window with many small tools, I developed a `utils.py` library of complex functions. The agent uses a refined set of "Super-Tools" (Web Search, Excel manipulation, Audio Transcription, API interaction) that handle internal logic complexity autonomously.
|
| 12 |
|
| 13 |
+
- **Multimodal Innovation:** Engineered a **custom Video Analysis sub-agent**. Since no free direct video-to-text API was available, I built a pipeline that intelligently extracts frames and metadata to reconstruct temporal context for the LLM.
|
| 14 |
|
| 15 |
+
- **Custom RAG Architecture:** Integrated **ChromaDB** with a specialized retrieval algorithm optimized for the specific nuances of the GAIA dataset, ensuring the agent retrieves only the most relevant context for its reasoning steps.
|
| 16 |
|
| 17 |
+
- **Observability & Evaluation:** Ran a self-hosted **LangFuse** instance to monitor traces, evaluate agent costs, and debug the Reasoning-and-Acting (Re-Act) loops without incurring cloud platform fees.
|
| 18 |
|
| 19 |
+
- **Full-Stack Deployment:** Interface built with **Gradio** and hosted on Hugging Face Spaces, managed via Git for version control and CI/CD.
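To make the "Multimodal Innovation" highlight above more concrete, here is one way the frame-selection and temporal-context step could be sketched. It is an assumption-laden illustration, not the project's pipeline: `sample_timestamps` and `build_timeline_prompt` are hypothetical names, frame decoding and captioning are left out, and the captions are assumed to come from a separate image-capable model.

```python
def sample_timestamps(duration_s, n_frames):
    """Return n_frames timestamps (seconds) at the midpoints of equal-length segments."""
    if duration_s <= 0 or n_frames <= 0:
        return []
    step = duration_s / n_frames
    return [round(step * (i + 0.5), 2) for i in range(n_frames)]


def build_timeline_prompt(question, captions):
    """Turn (timestamp_s, caption) pairs into a chronologically ordered text prompt.

    Ordering the captions by time is what gives a text-only LLM a rough sense
    of what happened when in the video.
    """
    lines = [f"[{t:06.2f}s] {text}" for t, text in sorted(captions)]
    return (
        "Frames in chronological order:\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}"
    )
```

Sampling at segment midpoints avoids the often uninformative first and last frames; a real pipeline would likely also deduplicate near-identical frames and include the audio transcript.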
|
| 20 |
|
| 21 |
+
**Results:** Successfully validated 16 "Level 1" GAIA tasks, demonstrating a high degree of autonomy in tool selection and the ability to maintain long-term state across multiple reasoning cycles.
|
| 22 |
|
| 23 |
+
[View certification](https://cas-bridge.xethub.hf.co/xet-bridge-us/6800ea554845e4edbca48825/5348431f62a3761b560f14e536cde6005f7dcd9eeda8ac8c7d5835edebe00c15?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=cas%2F20260118%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20260118T175600Z&X-Amz-Expires=3600&X-Amz-Signature=27ccefa0283d59c99512a9117a28a66f52bfb9e73c32ffe509ae1a9dfefc4504&X-Amz-SignedHeaders=host&X-Xet-Cas-Uid=65c927db2ba32c95416eb25d&response-content-disposition=inline%3B+filename*%3DUTF-8%27%272025-07-06.png%3B+filename%3D%222025-07-06.png%22%3B&response-content-type=image%2Fpng&x-id=GetObject&Expires=1768762560&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc2ODc2MjU2MH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2FzLWJyaWRnZS54ZXRodWIuaGYuY28veGV0LWJyaWRnZS11cy82ODAwZWE1NTQ4NDVlNGVkYmNhNDg4MjUvNTM0ODQzMWY2MmEzNzYxYjU2MGYxNGU1MzZjZGU2MDA1ZjdkY2Q5ZWVkYThhYzhjN2Q1ODM1ZWRlYmUwMGMxNSoifV19&Signature=S5%7EtuLDo36TB8V5mk8x03P2Pqo5NIOqCLS2XlFkJglZGz%7EOx6ePM8d0he166d%7E6s-KzLXenUv86%7EdSfJ8VWhDpZc7hpsrNsFqltLFYMGXAcmnflST0sZcReTqC3qx3gUlJ1H7%7Ea8geI55JvmcF36RiU-N5fQyBb-oFkOv8A47WjgEngEwSDMrGxq8FmYnKT3vDMu98HNSVQJoVDoBQG5uQxzYn2KmGTLwzWUqVHmRAMMXPoqxwCtRLsu7ZdyP1H0qQDJkD0TvTAegl3fLC2m0I1S0kSW3MQhT2SzOTOFHKKtn10lrPG7GG4iDmW487sZ7g-gU1rFoaGVezvc-W63dw__&Key-Pair-Id=K2L8F4GPSG1IFC)