added new system prompt
app.py
CHANGED
@@ -347,24 +347,32 @@ def run_and_submit_all(profile: gr.OAuthProfile | None):
 
 # --- Build Gradio Interface using Blocks ---
 with gr.Blocks() as demo:
-    gr.Markdown("# SmolAgent Evaluation Runner
+    gr.Markdown("# SmolAgent GAIA Evaluation Runner")
     gr.Markdown(
         """
-        **
-
-
-
-
-
-
-
-
+        **Enhanced Agent for GAIA Dataset:**
+
+        🛠️ **Tools Available:**
+        - **DuckDuckGoSearchTool**: Real-time web search capabilities
+        - **VisitWebpageTool**: Can visit and analyze web pages
+        - 🧮 **Math Calculator**: Safe mathematical calculations
+        - **Data Analysis**: Basic data analysis capabilities
+        - ✅ **Fact Checker**: Helps verify claims with authoritative sources
+        - 🧠 **Advanced Reasoning**: Structured problem-solving approach
 
-
-
+        🎯 **GAIA Format Compliance:**
+        - Numbers without commas or units (unless specified)
+        - Strings without articles or abbreviations
+        - Proper comma-separated lists
+        - Extracts only the final answer for submission
+
+        **Instructions:**
+        1. Log in to your Hugging Face account using the button below.
+        2. Click 'Run Evaluation & Submit All Answers' to start the evaluation.
+        3. The agent will process all questions using multiple tools and reasoning steps.
 
         ---
-        **Note:** This agent
+        **Note:** This agent follows GAIA's strict answer formatting requirements and uses advanced reasoning with multiple tools.
         """
     )
 
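Review note: the "GAIA Format Compliance" bullets in the new Markdown text describe a post-processing step that this hunk does not show. The following is a minimal sketch of what such a step could look like, not the code in app.py. The function names, the `FINAL ANSWER:` marker, and the choice to strip only a leading currency symbol and a trailing percent sign are all assumptions made for illustration. Abbreviation expansion is not attempted.

```python
import re

# Hypothetical marker: the diff does not show how app.py delimits the final answer.
FINAL_ANSWER_MARKER = "FINAL ANSWER:"

# A plain number, optionally with thousands separators, a leading currency
# symbol, or a trailing percent sign (assumed to be the only "units" handled).
_NUMBER = re.compile(
    r"^[$€£]?\s*(-?\d{1,3}(?:,\d{3})+(?:\.\d+)?|-?\d+(?:\.\d+)?)\s*%?$"
)

# Only a leading article is removed, so words inside the answer are untouched.
_LEADING_ARTICLE = re.compile(r"^(?:a|an|the)\s+", flags=re.IGNORECASE)


def extract_final_answer(response: str) -> str:
    """Return the text after the last marker, or the whole response if absent."""
    _, found, tail = response.rpartition(FINAL_ANSWER_MARKER)
    return tail.strip() if found else response.strip()


def normalize_item(item: str) -> str:
    """Normalize one answer element: bare number, or string without leading article."""
    item = item.strip()
    match = _NUMBER.match(item)
    if match:
        # Drop thousands separators; currency and percent signs are already excluded.
        return match.group(1).replace(",", "")
    return _LEADING_ARTICLE.sub("", item).strip()


def normalize_answer(response: str) -> str:
    """Extract the final answer and apply the formatting bullets to it."""
    answer = extract_final_answer(response)
    # Check for a single number first so "1,234" is not mistaken for a list.
    if _NUMBER.match(answer):
        return normalize_item(answer)
    items = [normalize_item(part) for part in answer.split(",")]
    return ", ".join(item for item in items if item)


if __name__ == "__main__":
    print(normalize_answer("Some reasoning...\nFINAL ANSWER: the Eiffel Tower"))
```

Checking for a whole-answer number before splitting on commas is a deliberate choice in this sketch. An answer such as `1,234, 5` stays ambiguous and would be treated as a three-item list.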