jebaponselvasingh committed
Commit · e95a92e · 1 Parent(s): 93b72dc
Add application file
app.py
CHANGED
@@ -224,6 +224,7 @@ def submit_to_leaderboard(username: str, space_url: str, answers_json: str):
 result = submit_answers(username, space_url, answers)
 
 score = result.get("score", 0)
+print(result)
 correct = result.get("correct_count", 0)
 total = result.get("total_attempted", 0)
 
@@ -397,28 +398,6 @@ This agent uses **LangGraph** to solve GAIA benchmark questions. It has access t
 
 gr.Markdown("""
 ---
-### 📋 Tips for Better Scores
-
-**Answer Formatting:**
-- Answers are matched **exactly** (character-for-character), so precision is critical
-- Do NOT include prefixes like "FINAL ANSWER:" or "The answer is:"
-- For lists: use comma-separated format with NO spaces (e.g., "item1,item2,item3")
-- For numbers: just the number, no units unless specified
-- Check the validation status in the test tab
-
-**Agent Capabilities:**
-- Uses GPT-4o for optimal reasoning
-- Automatically reads files (PDFs, Excel, text) when available
-- Web search for current information
-- Wikipedia for factual lookups
-- Python execution for calculations
-
-**Best Practices:**
-1. Test with a single question first to verify the agent works
-2. Run the full benchmark (takes ~10-15 minutes)
-3. Review answers before submission
-4. Ensure your Space is public for verification
-
 ### 🔗 Links
 - [GAIA Benchmark](https://huggingface.co/spaces/gaia-benchmark/leaderboard)
 - [Student Leaderboard](https://huggingface.co/spaces/agents-course/Students_leaderboard)
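
Note on the removed "Answer Formatting" tips: because answers are matched character-for-character, the rules in that block (no "FINAL ANSWER:" style prefix, comma-separated lists without spaces) could be applied in code before submission. The following is only a minimal sketch under that assumption. `normalize_answer` and `_PREFIX` are hypothetical names that do not appear in `app.py`, and the sketch does not handle units or other benchmark-specific rules.

```python
import re

# Hypothetical helper, not part of app.py: strips the prefixes that the
# removed tips say must not appear in a submitted answer.
_PREFIX = re.compile(r"^\s*(final answer|the answer is)\s*:?\s*", re.IGNORECASE)


def normalize_answer(raw: str) -> str:
    """Apply the formatting rules listed in the removed tips."""
    # Drop a leading "FINAL ANSWER:" / "The answer is:" prefix, if present,
    # then trim surrounding whitespace.
    text = _PREFIX.sub("", raw).strip()
    # For comma-separated lists, remove the spaces around each item.
    if "," in text:
        text = ",".join(part.strip() for part in text.split(","))
    return text
```

A call such as `normalize_answer("FINAL ANSWER: item1, item2 ,item3")` would yield a string in the "item1,item2,item3" shape the tips describe. Answers that legitimately contain commas inside prose would need separate handling.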