sabonzo committed on
Commit
5315214
·
verified ·
1 Parent(s): b39e102

Update README.md

  1. README.md +33 -5
README.md CHANGED
@@ -1,14 +1,42 @@
1
  ---
2
- title: GAIA Agent Evaluator (Display Only) # Or your title
3
  emoji: 🚀
4
  colorFrom: blue
5
  colorTo: green
6
  sdk: gradio
7
- sdk_version: 5.25.2 # Use your Gradio version
8
  app_file: app.py
9
  pinned: false
10
- hf_oauth: true
11
- # Add this section to install system packages:
12
  packages:
13
  - ffmpeg
14
- - stockfish
 
1
  ---
2
+ title: GAIA Agent Evaluator
3
  emoji: 🚀
4
  colorFrom: blue
5
  colorTo: green
6
  sdk: gradio
7
+ sdk_version: 5.25.2
8
  app_file: app.py
9
  pinned: false
10
+ hf_oauth: true # Enable Login button
 
11
  packages:
12
  - ffmpeg
13
+ - stockfish
14
+ ---
15
+
16
+ # GAIA Agent Evaluation Runner
17
+
18
+ This Space runs an AI agent designed to answer questions from the GAIA benchmark (Level 1 subset).
19
+
20
+ **Dependencies:**
21
+
22
+ This Space requires the Python packages listed in `requirements.txt`.
23
+
24
+ It also requires the following system packages:
25
+ * `ffmpeg`: For processing audio files (used by Whisper).
26
+ * `stockfish`: The chess engine used for Question 4.
27
+
28
+ In a Docker Space, install both in your Dockerfile, e.g. with `apt-get install -y ffmpeg stockfish`. In other setups, use whatever system-package mechanism is available (some Spaces allow a startup script or Docker commands).
29
+ On the default Spaces runtime, you may need to install them differently, for example by bundling a Stockfish binary or checking whether ffmpeg is already pre-installed.
30
+
31
+ **Setup:**
32
+
33
+ 1. Add your `OPENAI_API_KEY` as a Secret in the Space settings.
34
+ 2. (Optional) Add `TAVILY_API_KEY` as a Secret for Tavily search.
35
+ 3. Ensure Stockfish is installed and accessible via the `stockfish` command, or set the `STOCKFISH_PATH` secret to the location of the binary.
36
+
37
+ **Usage:**
38
+
39
+ 1. Log in using the Hugging Face Login button.
40
+ 2. Click "Run Evaluation & Submit All Answers".
41
+ 3. Wait for the agent to process all questions (this can take several minutes).
42
+ 4. View the results and score.
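
The Setup steps in the new README can be checked at startup with a small preflight routine. The sketch below is not taken from this Space's `app.py`. It assumes the secrets are exposed to the app as environment variables and that `STOCKFISH_PATH`, when set, takes priority over a `PATH` lookup. The function names `resolve_stockfish` and `preflight` are hypothetical.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def resolve_stockfish(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return a Stockfish path, or None if none can be found."""
    # Assumption: an explicit STOCKFISH_PATH secret wins over the PATH lookup.
    explicit = env.get("STOCKFISH_PATH")
    if explicit:
        return explicit
    return which("stockfish")


def preflight(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Collect human-readable setup problems; an empty list means ready."""
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY secret is not set")
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if resolve_stockfish(env, which) is None:
        problems.append("stockfish not found; install it or set STOCKFISH_PATH")
    # TAVILY_API_KEY is optional per the README, so it is not checked here.
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Calling `preflight()` before launching the Gradio app would surface a missing secret or system package early, rather than partway through an evaluation run. The `env` and `which` parameters exist only so the checks can be exercised without touching the real environment.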