"""Synthetic filesystem-task data pipeline for MCPMark.

Generates benchmark-style filesystem tasks (test environment +
``description.md`` + ``verify.py`` + ``meta.json``) with deterministic,
*recomputable* verifiers. An LLM is used only to make file content/names
realistic and diverse; it never authors the verification logic. Every generated
task is validated with a built-in oracle: the pipeline solves it
programmatically and asserts ``verify.py`` exits 0.
"""