| | task_id | Question | Level | Final answer | file_name | file_path | Annotator Metadata |
|---|---|---|---|---|---|---|---|
| 0 | e1fc63a2-da7a-432f-be78-7c4a95598703 | If Eliud Kipchoge could maintain his record-ma... | 1 | 17 | | | {'Steps': '1. Googled Eliud Kipchoge marathon ... |
| 1 | 8e867cd7-cff9-4e6c-867a-ff5ddc2550be | How many studio albums were published by Merce... | 1 | 3 | | | {'Steps': '1. I did a search for Mercedes Sosa... |
| 2 | ec09fa32-d03f-4bf8-84b0-1f16922c3ae4 | Here's a fun riddle that I think you'll enjoy.... | 1 | 3 | | | {'Steps': 'Step 1: Evaluate the problem statem... |
| 3 | 5d0080cb-90d7-4712-bc33-848150e917d3 | What was the volume in m^3 of the fish bag tha... | 1 | 0.1777 | | | {'Steps': '1. Searched '"Can Hiccup Supply Eno... |
| 4 | a1e91b78-d3d8-4675-bb8d-62741b4b68a6 | In the video https://www.youtube.com/watch?v=L... | 1 | 3 | | | {'Steps': '1. Navigate to the YouTube link.\n2.... |

| | Question | file_path | Agent response | Final answer | is_correct |
|---|---|---|---|---|---|
| 22 | You are a telecommunications engineer who want... | /home/santiagoal/.cache/huggingface/hub/datase... | None | 3 | None |

| | Question | file_path | Agent response | Final answer | is_correct |
|---|---|---|---|---|---|
| 22 | You are a telecommunications engineer who want... | /home/santiagoal/.cache/huggingface/hub/datase... | 2 | 3 | 0 |

| | iteration | experiment | agent | tools | accuracy |
|---|---|---|---|---|---|
| 0 | 1 | Implement Calculator Tool | React agent | Arithmetic | 0.00 |
| 1 | 2 | Implement Search and Code tools | React agent | Arithmetic, Search, Code | 0.17 |
| 2 | 3 | Integrate Whisper Audio Transcriber | React agent | Arithmetic, Search, Code, Audio Transcriber | 0.15 |
| 3 | 4 | Test Workflow | React agent | Arithmetic, Search, Code, Audio Transcriber, | 0.20 |
| 4 | 5 | Integrate Text processing tool | React agent | Arithmetic, Search, Code, Audio Transcriber, , ... | 0.10 |
| 5 | 6 | Integrate Text Handler tool | React agent | Arithmetic, Search, Code, Audio Transcriber, , ... | 0.20 |
| 6 | 7 | Evaluate Agent Performance against tasks with ... | React agent | Arithmetic, Search, Code, Audio Transcriber, , ... | NaN |
| 7 | 8 | Test Agent performance against tasks with atta... | React agent | Arithmetic, Search, Code, Audio Transcriber, , ... | 0.00 |

| | Question | file_path | Agent response | Final answer | is_correct |
|---|---|---|---|---|---|
| 23 | If there is anything that doesn't make sense i... | | None | Guava | None |
| 2 | Here's a fun riddle that I think you'll enjoy.... | | None | 3 | None |
| 26 | Examine the video at https://www.youtube.com/w... | | None | Extremely | None |
| 1 | How many studio albums were published by Merce... | | None | 3 | None |
| 14 | In the fictional language of Tizin, basic sent... | | None | Maktay mato apple | None |
| 16 | Review the chess position provided in the imag... | /home/santiagoal/.cache/huggingface/hub/datase... | None | Rd5 | None |
| 31 | In the Scikit-Learn July 2017 changelog, what ... | | None | BaseLabelPropagation | None |
| 49 | What country had the least number of athletes ... | | None | CUB | None |
| 15 | In Nature journal's Scientific Reports confere... | | None | diamond | None |
| 3 | What was the volume in m^3 of the fish bag tha... | | None | 0.1777 | None |
| 5 | Of the authors (First M. Last) that worked on ... | | None | Mapping Human Oriented Information to Software... | None |
| 37 | Pull out the sentence in the following 5x7 blo... | | None | The seagull glided peacefully to my chair. | None |
| 24 | How many slides in this PowerPoint presentatio... | /home/santiagoal/.cache/huggingface/hub/datase... | None | 4 | None |
| 6 | In Series 9, Episode 11 of Doctor Who, the Doc... | | None | THE CASTLE | None |
| 47 | Where were the Vietnamese specimens described ... | | None | Saint Petersburg | None |
| 43 | In Audre Lorde’s poem “Father Son and Holy Gho... | | None | 2 | None |
| 48 | A standard Rubik’s cube has been broken into c... | | None | green, white | None |
| 44 | Hi, I was out sick from my classes on Friday, ... | /home/santiagoal/.cache/huggingface/hub/datase... | None | 132, 133, 134, 197, 245 | None |
| 30 | Hi, I'm making a pie but I could use some help... | /home/santiagoal/.cache/huggingface/hub/datase... | None | cornstarch, freshly squeezed lemon juice, gran... | None |
| 13 | Under DDC 633 on Bielefeld University Library'... | | None | Guatemala | None |

| | Question | file_path | Agent response | Final answer | is_correct |
|---|---|---|---|---|---|
| 23 | If there is anything that doesn't make sense i... | | Guava | Guava | 1 |
| 2 | Here's a fun riddle that I think you'll enjoy.... | | To maximize your odds of winning the grand pri... | 3 | 0 |
| 26 | Examine the video at https://www.youtube.com/w... | | I can't access or analyze video content direct... | Extremely | 0 |
| 1 | How many studio albums were published by Merce... | | I couldn't find the specific number of studio ... | 3 | 0 |
| 14 | In the fictional language of Tizin, basic sent... | | "Maktay Zapple Pa" | Maktay mato apple | 0 |
| 16 | Review the chess position provided in the imag... | /home/santiagoal/.cache/huggingface/hub/datase... | Rd5 | Rd5 | 1 |
| 31 | In the Scikit-Learn July 2017 changelog, what ... | | Please provide the file path to the Scikit-Lea... | BaseLabelPropagation | 0 |
| 49 | What country had the least number of athletes ... | | To find the country with the least number of a... | CUB | 0 |
| 15 | In Nature journal's Scientific Reports confere... | | Please provide the file containing the confere... | diamond | 0 |
| 3 | What was the volume in m^3 of the fish bag tha... | | Please provide the file containing the Univers... | 0.1777 | 0 |
| 5 | Of the authors (First M. Last) that worked on ... | | To assist you, I need the attached files that ... | Mapping Human Oriented Information to Software... | 0 |
| 37 | Pull out the sentence in the following 5x7 blo... | | Please provide the file containing the 5x7 blo... | The seagull glided peacefully to my chair. | 0 |
| 24 | How many slides in this PowerPoint presentatio... | /home/santiagoal/.cache/huggingface/hub/datase... | I can only process text files. Please convert ... | 4 | 0 |
| 6 | In Series 9, Episode 11 of Doctor Who, the Doc... | | Please provide the file path to the script of ... | THE CASTLE | 0 |
| 47 | Where were the Vietnamese specimens described ... | | Please provide the file containing Nedoshivina... | Saint Petersburg | 0 |
| 43 | In Audre Lorde’s poem “Father Son and Holy Gho... | | Please provide the text of the poem "Father So... | 2 | 0 |
| 48 | A standard Rubik’s cube has been broken into c... | | The removed cube has two colors on its faces: ... | green, white | 0 |
| 44 | Hi, I was out sick from my classes on Friday, ... | /home/santiagoal/.cache/huggingface/hub/datase... | 132, 133, 134, 197, 245 | 132, 133, 134, 197, 245 | 1 |
| 30 | Hi, I'm making a pie but I could use some help... | /home/santiagoal/.cache/huggingface/hub/datase... | cornstarch, freshly squeezed lemon juice, gran... | cornstarch, freshly squeezed lemon juice, gran... | 1 |
| 13 | Under DDC 633 on Bielefeld University Library'... | | Please provide the attached files so I can ass... | Guatemala | 0 |

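
The `accuracy` column reported for each experiment appears to be the share of evaluated tasks whose `is_correct` flag is 1. The sketch below works under that assumption: the flags are copied from the results table above in row order, and the helper name `accuracy` is hypothetical, not taken from the notebook.

```python
# is_correct flags copied from the results table above, in row order
# (tasks 23, 2, 26, 1, 14, 16, 31, 49, 15, 3, 5, 37, 24, 6, 47, 43, 48, 44, 30, 13).
is_correct = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0]


def accuracy(flags):
    """Fraction of evaluated tasks marked correct.

    Assumption: rows still holding None (not yet evaluated) are skipped,
    and an empty evaluation yields NaN rather than raising.
    """
    scored = [f for f in flags if f is not None]
    if not scored:
        return float("nan")
    return sum(scored) / len(scored)


# 4 correct answers out of 20 sampled tasks
print(f"{accuracy(is_correct):.2f}")
```
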
| | iteration | experiment | agent | tools | accuracy |
|---|---|---|---|---|---|
| 0 | 1 | Implement Calculator Tool | React agent | Arithmetic | 0.00 |
| 1 | 2 | Implement Search and Code tools | React agent | Arithmetic, Search, Code | 0.17 |
| 2 | 3 | Integrate Whisper Audio Transcriber | React agent | Arithmetic, Search, Code, Audio Transcriber | 0.15 |
| 3 | 4 | Test Workflow | React agent | Arithmetic, Search, Code, Audio Transcriber, | 0.20 |
| 4 | 5 | Integrate Text processing tool | React agent | Arithmetic, Search, Code, Audio Transcriber, Te... | 0.10 |
| 5 | 6 | Integrate Text Handler tool | React agent | Arithmetic, Search, Code, Audio Transcriber, Te... | 0.20 |
| 6 | 7 | Test Agent performance against tasks with atta... | React agent | Arithmetic, Search, Code, Audio Transcriber, Te... | 0.00 |
| 7 | 8 | Integrate Chess Tool | React agent | Arithmetic, Search, Code, Audio Transcriber, Te... | 0.20 |

| | task_id | Question | Level | Final answer | file_name | file_path | Annotator Metadata |
|---|---|---|---|---|---|---|---|
| 16 | cca530fc-4052-43b2-b130-b30968d8aa44 | Review the chess position provided in the imag... | 1 | Rd5 | cca530fc-4052-43b2-b130-b30968d8aa44.png | /home/santiagoal/.cache/huggingface/hub/datase... | {'Steps': 'Step 1: Evaluate the position of th... |

| | index | task_id | Question | Level | Final answer | file_name | file_path | Annotator Metadata |
|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 5d0080cb-90d7-4712-bc33-848150e917d3 | What was the volume in m^3 of the fish bag tha... | 1 | 0.1777 | | | {'Steps': '1. Searched '"Can Hiccup Supply Eno... |
| 1 | 5 | 46719c30-f4c3-4cad-be07-d5cb21eee6bb | Of the authors (First M. Last) that worked on ... | 1 | Mapping Human Oriented Information to Software... | | | {'Steps': '1. Searched "Pie Menus or Linear Me... |
| 2 | 12 | b816bfce-3d80-4913-a07d-69b752ce6377 | In Emily Midkiff's June 2014 article in a jour... | 1 | fluffy | | | {'Steps': '1. Searched "Hreidmar's sons" on Go... |
| 3 | 15 | b415aba4-4b68-4fc6-9b89-2c812e55a3e1 | In Nature journal's Scientific Reports confere... | 1 | diamond | | | {'Steps': '1. Searched "nature scientific repo... |
| 4 | 17 | 935e2cff-ae78-4218-b3f5-115589b19dae | In the year 2022, and before December, what do... | 1 | research | | | {'Steps': '1. Searched "legume wikipedia" on G... |
| 5 | 19 | 5188369a-3bbe-43d8-8b94-11558f909a08 | What writer is quoted by Merriam-Webster for t... | 1 | Annie Levin | | | {'Steps': '1. Search "merriam-webster word of ... |
| 6 | 38 | 7673d772-ef80-4f0f-a602-1bf4485c9b43 | On Cornell Law School website's legal informat... | 1 | inference | | | {'Steps': '1. Searched "Cornell Law School leg... |
| 7 | 39 | c365c1c7-a3db-4d5e-a9a1-66f56eae7865 | Of the cities within the United States where U... | 1 | Braintree, Honolulu | | | {'Steps': '1. Searched "cities where us presid... |
| 8 | 40 | 7d4a7d1d-cac6-44a8-96e8-ea9584a70825 | According to Girls Who Code, how long did it t... | 1 | 22 | | | {'Steps': '1. Searched "Girls Who Code" on Goo... |
| 9 | 42 | 3f57289b-8c60-48be-bd80-01f8099ca449 | How many at bats did the Yankee with the most ... | 1 | 519 | | | {'Steps': '1. Search "yankee stats" to find th... |
| 10 | 43 | 23dd907f-1261-4488-b21c-e9185af91d5e | In Audre Lorde’s poem “Father Son and Holy Gho... | 1 | 2 | | | {'Steps': '1. Search the web for “Audre Lorde ... |
| 11 | 45 | 840bfca7-4f7b-481a-8794-c560c340185d | On June 6, 2023, an article by Carolyn Collins... | 1 | 80GSFC21M0002 | | | {'Steps': '1. Google "June 6, 2023 Carolyn Col... |
| 12 | 46 | a0068077-79f4-461a-adfe-75c1a4148545 | What was the actual enrollment count of the cl... | 1 | 90 | | | {'Steps': '1. Searched "nih" on Google search.... |
| 13 | 50 | a0c07678-e491-4bbc-8f0b-07405144218f | Who are the pitchers with the number before an... | 1 | Yoshida, Uehara | | | {'Steps': '1. Look up Taishō Tamai on Wikipedia... |