mcpmark / docs /datasets /task.md
haochengsama's picture
Add files using upload-large-folder tool
a2ec7b6 verified
|
Raw
History Blame Contribute Delete
3.68 kB
# Task
The tasks in MCPMark follows two major principles
- The tasks are based on realistic digital environments that are also used by human programmers.
- The task outcome can be robustly verified in python scripts.
Therefore, each MCPMark task consists of three files
- `meta.json`
- `description.md`
- `verify.py`
Here, `metadata.json` includes the meta information of the task, `description.md` describes the purpose and setting of the task, as well as the instruction to complete the task. `verify.py` checks whether the task is completed successfully.
For example, you can ask the model agent to create a file with specific name and write specific content to the file, which belongs to the category of operating the file context. The structure looks like
```
tasks
└───filesystem
└───standard # task_suite (also supports `easy`)
└───file_context # category_id
└───create_file_write
│ meta.json
│ description.md
│ verify.py
```
All tasks live under `tasks/<mcp>/<task_suite>/<category>/<task_id>/`. `filesystem` refers to the MCP service and `task_suite` captures the difficulty slice (`standard` benchmark vs `easy` smoke tests).
`meta.json` includes the meta information about the task, including the following key
- task_id: the id of the task.
- task_name: full name of the task.
- description: task description.
- category_id: the id of task category.
- category_name: the full name of task categeory.
- author: the author of the task.
- difficulty: the task difficulty level.
- created_at: the timestamp of task creation.
- tags: a list of tags that describe the task.
- mcp: a list of MCP services it belongs to.
- metadata: other meta information.
Here `category_name` describes the shared feature or the environment across different tasks (e.g. the github repository or notion page the task is built on). In this running example, `category_name` refers to `file_context`.
`description.md` could include the following information
- Task name
- Create and Write File.
- Task description
- Use the filesystem MCP tools to create a new file and write content to it.
- Task Objectives
- Create a new file named `hello_world.txt` in the test directory.
- Write the following content to the file: ``` Hello, World```
- Verify the file was created successfully
- Verification Criteria
- File `hello_world.txt` exists in the test directory
- File contains the expected content structure
- File includes "Hello, World!" on the first line
- Tips
- Use the `write_file` tool to create and write content to the file
- The test directory path will be provided in the task context
The entire content of `description.md` will be read by the model agent for completing the task.
Accordingly, the `verify.py` contains the following functionalities
- Check whether the target directory exists. [![Check Target Directory](https://i.postimg.cc/SQfBYvby/task-sample-verify-get-test-dir.png)](https://postimg.cc/4nnLrw3M)
- Check whether the target directory contains the file with target file name. [![Check Target File Existence](https://i.postimg.cc/Qx0Zwnf6/task-sample-verify-file-existence.png)](https://postimg.cc/7fGRTX87)
- Check whether the target file contains the desired content `EXPECTED_PATTERNS = ["Hello Wolrd"]`. [![Check Content in Target File](https://i.postimg.cc/JzzMhWyV/task-sample-verify-check-content.png)](https://postimg.cc/w7ZSWZc0)
- If the outcome passes **all the above verification functionalities**, the task would be marked as successfully completed.