# InfiAgent-DABench
This example uses Data Interpreter (DI) to solve InfiAgent-DABench and achieves 94.93% accuracy with gpt-4o.
## Dataset download
```
# Run from the repository root
cd examples/di/InfiAgent-DABench
git clone https://github.com/InfiAgent/InfiAgent.git
mv InfiAgent/examples/DA-Agent/data ./
```
## Special note
When running DABench tests, change the `__init__` of `ExecuteNbCode` so that it creates a new notebook for each instance instead of using the original default `nb` initialization:
```
class ExecuteNbCode(Action):
    """execute notebook code block, return result to llm, and display it."""

    nb: NotebookNode
    nb_client: NotebookClient
    console: Console
    interaction: str
    timeout: int = 600

    def __init__(
        self,
        nb=nbformat.v4.new_notebook(),
        timeout=600,
    ):
        super().__init__(
            nb=nbformat.v4.new_notebook(),  # changed from `nb`: create a fresh notebook per instance
            nb_client=NotebookClient(nb, timeout=timeout),
            timeout=timeout,
            console=Console(),
            interaction=("ipython" if self.is_ipython() else "terminal"),
        )
```
`ExecuteNbCode` is defined in:
```
metagpt.actions.di.execute_nb_code
```
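A likely reason for this change is how Python handles default arguments: a default such as `nb=nbformat.v4.new_notebook()` is evaluated once, when the function is defined, so every instance created without an explicit `nb` would share the same notebook object. The sketch below illustrates that behavior with a plain list standing in for the notebook. The class names `SharedDefault` and `FreshDefault` are hypothetical and are not part of MetaGPT.

```python
class SharedDefault:
    """Hypothetical stand-in: the default list is created once, at definition time."""

    def __init__(self, cells=[]):
        self.cells = cells


class FreshDefault:
    """Hypothetical stand-in: a new list is created on every instantiation."""

    def __init__(self, cells=None):
        self.cells = [] if cells is None else cells


# Two instances built with the shared default see each other's cells.
a, b = SharedDefault(), SharedDefault()
a.cells.append("x = 1")
print(b.cells)  # b sees the cell that was appended through a

# Two instances built with a fresh default stay independent.
c, d = FreshDefault(), FreshDefault()
c.cells.append("x = 1")
print(d.cells)  # d is still empty
```

Under this reading, creating the notebook inside `__init__` keeps cells from one benchmark task from carrying over into the next.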
## How to run
```
python run_InfiAgent-DABench_single.py --id x  # Run a single task; x is the id of the question to test
python run_InfiAgent-DABench_all.py  # Run all tasks serially
python run_InfiAgent-DABench.py --k x  # Run all tasks in parallel; x is the number of tasks run at a time
```
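To illustrate what running at most `x` tasks at a time can look like, here is a minimal sketch that caps concurrency with `asyncio.Semaphore`. It is not taken from `run_InfiAgent-DABench.py`, which may batch tasks differently. `run_task` and `run_all` are hypothetical names, and the sleep is a placeholder for solving one question.

```python
import asyncio


async def run_task(task_id: int) -> int:
    """Hypothetical placeholder for solving one DABench question."""
    await asyncio.sleep(0.01)
    return task_id


async def run_all(task_ids, k: int):
    """Run every task, allowing at most k to be in flight at once."""
    semaphore = asyncio.Semaphore(k)

    async def bounded(task_id: int) -> int:
        # Each task waits here until one of the k slots is free.
        async with semaphore:
            return await run_task(task_id)

    # gather returns results in the order the tasks were submitted.
    return await asyncio.gather(*(bounded(t) for t in task_ids))


results = asyncio.run(run_all(range(6), k=2))
print(results)
```

A semaphore starts the next task as soon as a slot frees up, whereas fixed-size batches wait for the slowest task in each batch before continuing.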