Isateles committed on
Commit
512bc6b
·
verified ·
1 Parent(s): c57d4a8

Update README.md

Files changed (1)
  1. README.md +5 -271
README.md CHANGED
@@ -1,5 +1,6 @@
1
- title: Template Final Assignment
2
- emoji: 🕵🌟
 
3
  colorFrom: indigo
4
  colorTo: indigo
5
  sdk: gradio
@@ -7,273 +8,6 @@ sdk_version: 5.25.2
7
  app_file: app.py
8
  pinned: false
9
  hf_oauth: true
 
10
  hf_oauth_expiration_minutes: 480
11
-
12
-
13
-
14
- # 🎯 GAIA Benchmark Agent - Course Final Project
15
-
16
- A comprehensive AI agent that demonstrates course learning while achieving 30%+ score on GAIA benchmark to earn your course certificate.
17
-
18
- ## 🌟 What This Agent Demonstrates
19
-
20
- This project combines all major concepts from the course:
21
-
22
- ### 📚 **Course Learning Applied**
23
- - **🔧 Tools Integration**: Multiple tool types working together
24
- - **📖 RAG Implementation**: Persona database with 5K diverse individuals using vector embeddings
25
- - **🤖 Agent Workflows**: LlamaIndex agent orchestration
26
- - **🧠 LLM Integration**: Fallback options for accessibility
27
- - **Modular Architecture**: Clean separation of concerns
28
-
29
- ### 🎯 **GAIA Benchmark Optimized**
30
- - **🔍 Web Search**: For current information and facts
31
- - **🧮 Calculator**: For mathematical accuracy (critical for GAIA)
32
- - **📊 File Analysis**: For data processing questions
33
- - **💬 Conversational**: Natural language interaction
34
-
35
- ## 🗂️ Project Structure
36
-
37
- ```
38
- your-space/
39
- ├── app.py # Main application with Gradio interface
40
- ├── tools.py # All agent tools (web search, calculator, etc.)
41
- ├── retriever.py # RAG implementation with guest database
42
- ├── requirements.txt # Python dependencies
43
- └── README.md # This file
44
- ```
45
-
46
- ### **File Explanations**
47
-
48
- **`app.py`** - Main Application
49
- - Gradio interface for GAIA evaluation
50
- - Agent initialization with error handling
51
- - Question processing and answer submission
52
- - Results display and certificate status
53
-
54
- **`tools.py`** - Agent Tools
55
- - **Web Search Tool**: DuckDuckGo integration for current info
56
- - **Calculator Tool**: Safe mathematical expression evaluation
57
- - **File Analysis Tool**: Process CSV, text, and data files
58
- - All tools have detailed documentation and error handling
59
-
60
- **`retriever.py`** - Advanced RAG System
61
- - Persona database with 5K diverse individuals from HuggingFace
62
- - Vector embeddings with ChromaDB for semantic search
63
- - IngestionPipeline for document processing
64
- - Demonstrates state-of-the-art RAG concepts
65
-
66
- ## 🚀 Quick Setup Guide
67
-
68
- ### 1. **Clone or Duplicate This Space**
69
- ```bash
70
- # If cloning locally
71
- git clone https://huggingface.co/spaces/your-username/your-space
72
- cd your-space
73
-
74
- # Or duplicate this space to your HF account
75
- ```
76
-
77
- ### 2. **Set API Keys** ⚡ **CRITICAL STEP**
78
-
79
- In your HuggingFace Space:
80
- 1. Go to **Settings** → **Repository secrets**
81
- 2. Add **at least one** of these:
82
-
83
- **Option A: OpenAI (Recommended)**
84
- - Name: `OPENAI_API_KEY`
85
- - Value: `sk-...` (your OpenAI API key)
86
- - **Why**: Better performance on GAIA benchmark
87
-
88
- **Option B: HuggingFace (Free Alternative)**
89
- - Name: `HF_TOKEN`
90
- - Value: `hf_...` (your HF token)
91
- - **Why**: Free alternative, works without OpenAI credits
92
-
93
- **Get API Keys:**
94
- - **OpenAI**: https://platform.openai.com/api-keys
95
- - **HuggingFace**: https://huggingface.co/settings/tokens
96
-
97
- ### 3. **Ensure Public Space**
98
- - Your space must be **public** for leaderboard verification
99
- - Go to Settings → Change from Private to Public
100
-
101
- ### 4. **Run Evaluation**
102
- 1. Click the HuggingFace login button
103
- 2. Click "Run GAIA Evaluation & Submit Results"
104
- 3. Wait 5-10 minutes for completion
105
- 4. Check your score - need 30%+ to pass! 🏆
106
-
107
- ## 🔧 Why Each Tool Matters for GAIA
108
-
109
- ### 🌐 **Web Search Tool**
110
- ```python
111
- # Example GAIA questions this helps with:
112
- "Who is the current president of France?"
113
- "What was Tesla's stock price yesterday?"
114
- "Recent developments in AI research"
115
- ```
116
- **Why needed**: GAIA questions often require current information beyond LLM training data.
117
-
118
- ### 🧮 **Calculator Tool**
119
- ```python
120
- # Example GAIA questions this helps with:
121
- "What is 15% of 847?"
122
- "Calculate the area of a circle with radius 23.7m"
123
- "If I invest $5000 at 3.2% annual interest for 7 years..."
124
- ```
125
- **Why needed**: LLMs can make arithmetic errors. GAIA requires exact numerical accuracy.
126
-
127
- ### 📊 **File Analysis Tool**
128
- ```python
129
- # Example GAIA questions this helps with:
130
- "Analyze this CSV file and tell me the average..."
131
- "What is the most common value in column 3?"
132
- "Process this data file and extract..."
133
- ```
134
- **Why needed**: Some GAIA questions include file attachments requiring analysis.
135
-
136
- ### 📚 **Persona RAG Tool**
137
- ```python
138
- # Example questions this demonstrates:
139
- "Find writers and authors"
140
- "Who are the scientists?"
141
- "People interested in travel"
142
- "Creative professionals at the event"
143
- ```
144
- **Why included**: Demonstrates advanced RAG with 5K real personas, vector embeddings, and semantic search.
145
-
146
- ## 📖 Course Concepts Demonstrated
147
-
148
- ### 🔧 **Components** (From Course Unit 2)
149
- - **LLM Integration**: OpenAI + HuggingFace fallback
150
- - **Document Processing**: Text chunking and metadata
151
- - **Response Synthesis**: Clean answer formatting
152
-
153
- ### 🛠️ **Tools** (From Course Unit 3)
154
- - **FunctionTool Creation**: Multiple tool types
155
- - **Tool Descriptions**: Proper LLM guidance
156
- - **Error Handling**: Graceful tool failures
157
-
158
- ### 🤖 **Agents** (From Course Unit 4)
159
- - **AgentWorkflow**: Multi-tool orchestration
160
- - **System Prompts**: GAIA-optimized instructions
161
- - **Async Processing**: Efficient question handling
162
-
163
- ### 📖 **RAG Implementation** (From Course Unit 5)
164
- - **Dataset Integration**: 5K personas from HuggingFace
165
- - **Vector Embeddings**: Semantic search with BAAI/bge-small-en-v1.5
166
- - **ChromaDB Storage**: Persistent vector database
167
- - **Ingestion Pipeline**: Document processing and chunking
168
-
169
- ### 🏗️ **Workflows** (From Course Unit 6)
170
- - **Event-Driven**: Tool selection and execution
171
- - **State Management**: Context preservation
172
- - **Error Recovery**: Robust failure handling
173
-
174
- ## 🎓 Why This Approach Works for GAIA
175
-
176
- ### ✅ **Accuracy First**
177
- - Calculator prevents math errors
178
- - Web search provides current facts
179
- - Low temperature LLM settings for consistency
180
-
181
- ### ✅ **Comprehensive Coverage**
181
- - Factual questions → Web search
182
- - Mathematical questions → Calculator
183
- - Data questions → File analysis
184
- - Knowledge questions → RAG system
186
-
187
- ### ✅ **Robust Error Handling**
188
- - Graceful API failures
189
- - Tool availability checking
190
- - Fallback responses
191
-
192
- ### ✅ **GAIA-Specific Optimizations**
193
- - Direct, concise answers
194
- - Exact match optimization
195
- - Minimal extra text
196
-
197
- ## 🔧 Troubleshooting
198
-
199
- ### ❌ **"No LLM available" Error**
200
- **Problem**: No API keys set
201
- **Solution**: Add `OPENAI_API_KEY` or `HF_TOKEN` to Space secrets
202
-
203
- ### ❌ **Import Errors**
204
- **Problem**: Dependencies not installed
205
- **Solution**: Check requirements.txt is in root directory, restart Space
206
-
207
- ### ❌ **Low GAIA Score**
208
- **Problem**: Agent giving wrong answers
209
- **Solutions**:
210
- - Check API key is working (OpenAI generally performs better)
211
- - Review agent logs for tool usage
212
- - Ensure web search and calculator are working
213
-
214
- ### ❌ **"Could not submit" Error**
215
- **Problem**: Network or authentication issue
216
- **Solution**:
217
- - Ensure logged in to HuggingFace
218
- - Check space is public
219
- - Try again (temporary network issues)
220
-
221
- ### ❌ **Tools Not Working**
222
- **Problem**: Missing dependencies or API issues
223
- **Solution**: Check Space logs, verify all packages installed
224
-
225
- ## 📊 Expected Performance
226
-
227
- ### 🎯 **Target Scores**
228
- - **Minimum for Certificate**: 30%
229
- - **Good Performance**: 40-50%
230
- - **Excellent Performance**: 60%+
231
-
232
- ### 📈 **Performance Factors**
233
- - **API Choice**: OpenAI typically scores higher than HuggingFace
234
- - **Tool Usage**: Questions requiring tools score better when tools work
235
- - **Answer Format**: Direct answers score better than verbose responses
236
-
237
- ## 🚀 Getting Better Scores
238
-
239
- ### 💡 **Optimization Tips**
240
- 1. **Use OpenAI**: Generally more accurate than HuggingFace for GAIA
241
- 2. **Check Tool Functionality**: Test web search and calculator work
242
- 3. **Review Failed Questions**: Look at specific errors in results table
243
- 4. **Adjust System Prompt**: Fine-tune for your specific weak areas
244
-
245
- ### 🔄 **Iterative Improvement**
246
- 1. Run evaluation and check results
247
- 2. Identify patterns in failed questions
248
- 3. Adjust tools or prompts accordingly
249
- 4. Re-run evaluation
250
-
251
- ## 🏆 Certificate Achievement
252
-
253
- **To earn your course certificate:**
254
- 1. ✅ Score 30% or higher on GAIA evaluation
255
- 2. ✅ Keep your space public for verification
256
- 3. ✅ Submit through the official interface
257
-
258
- **When you pass:**
259
- - You'll see "✅ PASSED - Certificate Earned!" in results
260
- - Your score will appear on the student leaderboard
261
- - You can download your official certificate
262
-
263
- ## 🤝 Getting Help
264
-
265
- **If you're stuck:**
266
- 1. Check the troubleshooting section above
267
- 2. Review Space logs for specific errors
268
- 3. Test individual components (tools.py, retriever.py)
269
- 4. Ask in the course Discord for community help
270
-
271
- ## 🎉 Good Luck!
272
-
273
- This agent represents everything you've learned in the course. The modular design makes it easy to understand, debug, and improve. Focus on getting those API keys set up correctly, and you'll be well on your way to earning your certificate!
274
-
275
- **Remember**: The goal isn't just to pass the benchmark, but to demonstrate your understanding of modern AI agent development. This codebase serves as a portfolio piece showing your skills in RAG, tool integration, and agent orchestration.
276
-
277
- ---
278
-
279
- *Built with ❤️ using LlamaIndex and course concepts*
 
1
+ ---
2
+ title: Isadora Final Assignment
3
+ emoji: 🕵🏻‍♂️
4
  colorFrom: indigo
5
  colorTo: indigo
6
  sdk: gradio
 
8
  app_file: app.py
9
  pinned: false
10
  hf_oauth: true
11
+ # optional, default duration is 8 hours/480 minutes. Max duration is 30 days/43200 minutes.
12
  hf_oauth_expiration_minutes: 480
13
+ ---
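
The comment added in this commit states that `hf_oauth_expiration_minutes` defaults to 480 and may not exceed 43200. A minimal sketch of checking that range before pushing a front-matter change could look like the following. The helper name `resolve_oauth_expiration`, the dict-shaped input, and the lower bound of 1 are assumptions made for illustration; none of them come from the commit or from Hugging Face tooling.

```python
# Limits taken from the front-matter comment in this commit.
DEFAULT_MINUTES = 480   # 8 hours
MAX_MINUTES = 43200     # 30 days


def resolve_oauth_expiration(front_matter: dict) -> int:
    """Return the effective OAuth expiration in minutes.

    Hypothetical helper: `front_matter` is assumed to be the README's
    YAML header already parsed into a dict.
    """
    value = front_matter.get("hf_oauth_expiration_minutes", DEFAULT_MINUTES)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("hf_oauth_expiration_minutes must be an integer")
    # The lower bound of 1 is an assumption; the comment only gives the maximum.
    if not 1 <= value <= MAX_MINUTES:
        raise ValueError(
            f"hf_oauth_expiration_minutes must be between 1 and {MAX_MINUTES}"
        )
    return value
```

With the values in this commit, `resolve_oauth_expiration({"hf_oauth_oauth": True, "hf_oauth_expiration_minutes": 480})` returns the configured 480, and omitting the key falls back to the same default.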