YI Zhongyue committed on
Commit d27d1e8 ·
2 Parent(s): 682a11a 46a52d9

Merge remote-tracking branch 'origin'

Files changed (1)
  1. README.md +17 -5
README.md CHANGED
@@ -9,29 +9,41 @@ app_file: app.py
 pinned: true
 hf_oauth: true
 hf_oauth_expiration_minutes: 480
+license: mit
+short_description: Developed for the Agents Course final project.
 ---
 
 # 🕵🏻‍♂️ Agents Course Final Assignment
 
-> **⚠️ Important Notice**: After this project is made public, the `OPENAI_API_KEY` in Hugging Face Space settings has been set to an invalid value to protect API key security. To run this project, please use your own OpenAI API key.
+<div align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/660d652e2461f72aa268bb8c/-q8C143P4TkNiLxCJS8cp.png" alt="certification" width="600"/>
+</div>
 
-This is a multi-agent system developed for the [Hugging Face Agents Course](https://huggingface.co/learn/agents-course/en/unit4/introduction) Unit 4 final project. The system is designed to evaluate AI agent performance through the [GAIA benchmark](https://huggingface.co/spaces/gaia-benchmark/leaderboard).
+
+> **⚠️ Important Notice**: <br/>
+> After this project is made public, the `OPENAI_API_KEY` in Hugging Face Space settings has been set to an invalid value <br/>
+> to protect API key security. To run this project, please use your own OpenAI API key.
+
+This is a multi-agent system developed for the [Hugging Face Agents Course](https://huggingface.co/learn/agents-course/en/unit4/introduction) Unit 4 final project.
+The system is designed to evaluate AI agent performance through the [GAIA benchmark](https://huggingface.co/spaces/gaia-benchmark/leaderboard).
 
 
 
 Due to the fact that this agent system does not provide file processing capabilities, multimodal reasoning, and other advanced features, some questions like the following cannot be answered by this agent system:
 
+
 > The attached Excel file contains the sales of menu items for a local fast-food chain. What were the total sales that the chain made from food (not including drinks)? Express your answer in USD with two decimal places.
 
 > Review the chess position provided in the image. It is black's turn. Provide the correct next move for black which guarantees a win. Please provide your response in algebraic notation.
 
 > In the video https://www.youtube.com/watch?v=L1vXCYZAYYM, what is the highest number of bird species to be on camera simultaneously?
 
+
 Even so, this agent system is able to pass the GAIA benchmark with a score of 50% by correctly answering 10 out of 20 questions.
 
-> Submission Successful!
-> User: Hemimoon
-> Overall Score: 50.0% (10/20 correct)
+> Submission Successful! <br/>
+> User: Hemimoon <br/>
+> Overall Score: 50.0% (10/20 correct) <br/>
 > Message: Score calculated successfully: 10/20 total questions answered correctly (20 valid tasks attempted). High score updated on leaderboard.
 
 